transcripts do not catch errors #1185

duck-rh · 2021-12-22T13:51:30Z

Quack,

I'm using cmd2 2.3.3 on Python 3.9.9 on Debian unstable.

I am trying to test my application using transcripts. Depending on the gravity of the situation it uses pwarning() or perror() to alert the user but these calls are not caught by transcript and looking at the code of _generate_transcript is seems only stdout is captured.

I believe testing application checks are useful and in most test framework you can catch exceptions; would you consider capturing stderr too in the transcript?

Regards.
\_o<

The text was updated successfully, but these errors were encountered:

crotwell · 2022-02-24T00:42:17Z

Possible this could be resolved by PR #1208
Also see issue #1207

kmvanbrunt · 2022-02-24T16:12:28Z

@tleonhardt @anselor
Can you think of a reason why capturing stderr would be problematic?

anselor · 2022-02-24T16:48:28Z

@kmvanbrunt The transcripts feature is an area of cmd2 I haven't really touched. At a high level I see no reason why it would be a problem.

crotwell · 2022-02-28T14:14:27Z

@kmvanbrunt I would be happy to see a more robust solution, but currently cmd2 transcripts do not actually capture stdout, they only capture the output of self.poutput().

In other words, if you have two commands like:

    def do_print(self, statement):
        """print something"""
        print("printing something")

    def do_poutput(self, statement):
        """poutput something"""
        self.poutput("poutput something")

and run you see this:

(Cmd) print
printing something
(Cmd) poutput
poutput something

but if you create a transcript with:

(Cmd) history -t transcript.txt -- -2:

then the transcript.txt looks like:

(Cmd) print
(Cmd) poutput
poutput something

so the print output is missing.

If you can figure out how to actually capture stdout, that would be great. But if not, maybe capturing self.perror() as in PR #1208 is at least sufficient so transcripts fail if there are exceptions raised?

kmvanbrunt · 2022-02-28T15:43:19Z

@crotwell Transcripts capture everything written to self.stdout, which is not limited to self.output() calls.

Transcripts also already capture self.stderr via a cmd2.utils.StdSim and unittest.TextTestRunner. That data is printed to sys.stderr when a test is successful. Upon failure, the unittest class also writes to sys.stderr data describing why the test failed.

See this code here: https://github.com/python-cmd2/cmd2/blob/master/cmd2/cmd2.py#L4998

Your original PR probably has the simplest approach for filtering user-printed messages from other sys.stderr data, but should these cases be considered failed tests if you are expecting the error text?

@duck-rh Can you clarify why you want sys.stderr captured? Are you using transcripts like unit tests to exercise good and bad code paths?

If we start evaluating both streams in a transcript test, what happens when a test case writes to both streams? How will the transcript differentiate what text in a transcript belongs to self.stdout and what text belongs to sys.stderr? I believe the approach by @crotwell addresses this concern by storing user-printed error messages in another class member, but it limits transcripts to treating all error messages as failed tests.

@tleonhardt Thoughts?

crotwell · 2022-02-28T16:47:04Z

OK, I see the clarification. You said capturing stderr, but sounds like you actually meant capturing self.stderr.

I do not care about stderr being in the transcript, intermingling stdout and stderr is probably hard and error prone. And I guess there could be cases where self.stderr was not an actual failure, so perror is not the right way to go. As I said in issue #1207, what I actually care about is that a transcript test should not "pass" if there is a failure situation in cmd2. In particular, a test of a transcript that contains a command that does not actually exist should not report "ok". If it was easier or less invasive to fail on exceptions or other error handling instead of stderr, that is probably better.

Basically, I went with the simple perror() solution just because onecmd() calls default() which just sends the not a recognized command message directly to perror here, ie no exception raised:

cmd2/cmd2/cmd2.py

Line 2871 in 5ce3a64

self.perror(err_msg, apply_style=False)

Currently there is an odd mixture of sometimes "bad things" are exceptions and sometimes they are just perror() prints, making it a lot harder for the transcript tests to know when something failed. Perhaps onecmd and default should not capture the unknown command failure situation, instead raise an exception that can be caught at a higher level, such as by the transcript tests. This may also require refactoring onecmd_plus_hooks to separate running a command from the error handling. For a user typing commands, perror() is the right error handling, but for running a transcript as a test it might not be.

kmvanbrunt · 2022-02-28T17:17:25Z

cmd2 does not have a self.stderr member like it does self.stdout and self.stdin which are inherited from Python's cmd library. We capture sys.stderr during transcripts, command redirection/piping, and pyscripts.

The question I have is about the purpose of transcripts. I've not really used them nor have I contributed much to the transcript code, so my thoughts here might be incorrect.

It seems like transcripts are about capturing expected output. When you generate them using history -t, it writes the output of a command line to a transcript file. If I type a command that does not exist, then I expect nothing in self.stdout and a specific error string in sys.stderr. If that happens, then I would call it a success because actual output matched expected output. If anything else happens which does not match the transcript, then I would call it a failure.

crotwell · 2022-02-28T17:48:28Z

That purpose for transcripts seems correct to me.

But it is unclear to me if a transcript of something that writes to both stdout and stderr would be usable as a test in that how stdout and stderr are intermingled could be non-deterministic. It might be worth trying, but I don't know enough about python internals to know if you might effectively create a race condition between stdout and stderr if you try to combine them, especially in cases where there was a call to the shell.

duck-rh · 2022-03-01T04:32:29Z

@kmvanbrunt yes, that's exactly it, I'm running unit tests. I don't see any other way to test the UI part of my application. I create a complete test environment and I want to make sure the flow of a command ends up with the proper result, whether it's a result or an error. I also have scenarios with a suite of commands having a specific outcome. You can have a look at what it looks like here: https://vcs-git.duckcorp.org/duckcorp/ldapwalker (branch test).

anselor · 2022-03-01T05:14:23Z

@duck-rh
Have you looked at the external test plug-in?
https://github.com/python-cmd2/cmd2/tree/master/plugins/ext_test

duck-rh · 2022-03-01T09:32:09Z

@anselor thanks for pointing this out I must have missed it. I made a quick test and that seem to work fine.

I noticed that plugins are not included in the distribution and I had to copy the module in my sources to make it work. Would it be possible to distribute them?

anselor · 2022-03-01T14:53:18Z

@duck-rh It's distributed as a separate wheel you can import for your tests.
https://pypi.org/project/cmd2-ext-test/

crotwell · 2022-03-01T16:40:14Z

PR #1210 I think addresses part of this, at least my concern about a transcript with a non-existant command returning "ok". Does not take care of capturing stderr when creating a transcript with history, not sure how important that is. Basically I borrowed code from cmd2-ext-test to use pybridge to grab stderr and concat it with stdout for the result. Simple concat might not be the right way to go, but at least something is there.

Anyway, hope this helps...

This was referenced Feb 24, 2022

transcripts do not fail on missing commands #1207

Closed

No cmd fail transcript #1208

Closed

crotwell mentioned this issue Feb 28, 2022

run_script continues after not a recognized command #1209

Open

crotwell mentioned this issue Mar 1, 2022

use pybridge to capture stderr for transcript tests #1210

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

transcripts do not catch errors #1185

transcripts do not catch errors #1185

duck-rh commented Dec 22, 2021

crotwell commented Feb 24, 2022

kmvanbrunt commented Feb 24, 2022

anselor commented Feb 24, 2022

crotwell commented Feb 28, 2022

kmvanbrunt commented Feb 28, 2022

crotwell commented Feb 28, 2022

kmvanbrunt commented Feb 28, 2022

crotwell commented Feb 28, 2022

duck-rh commented Mar 1, 2022

anselor commented Mar 1, 2022

duck-rh commented Mar 1, 2022

anselor commented Mar 1, 2022

crotwell commented Mar 1, 2022

transcripts do not catch errors #1185

transcripts do not catch errors #1185

Comments

duck-rh commented Dec 22, 2021

crotwell commented Feb 24, 2022

kmvanbrunt commented Feb 24, 2022

anselor commented Feb 24, 2022

crotwell commented Feb 28, 2022

kmvanbrunt commented Feb 28, 2022

crotwell commented Feb 28, 2022

kmvanbrunt commented Feb 28, 2022

crotwell commented Feb 28, 2022

duck-rh commented Mar 1, 2022

anselor commented Mar 1, 2022

duck-rh commented Mar 1, 2022

anselor commented Mar 1, 2022

crotwell commented Mar 1, 2022