Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

transcripts do not catch errors #1185

Open
duck-rh opened this issue Dec 22, 2021 · 13 comments
Open

transcripts do not catch errors #1185

duck-rh opened this issue Dec 22, 2021 · 13 comments

Comments

@duck-rh
Copy link

duck-rh commented Dec 22, 2021

Quack,

I'm using cmd2 2.3.3 on Python 3.9.9 on Debian unstable.

I am trying to test my application using transcripts. Depending on the gravity of the situation it uses pwarning() or perror() to alert the user but these calls are not caught by transcript and looking at the code of _generate_transcript is seems only stdout is captured.

I believe testing application checks are useful and in most test framework you can catch exceptions; would you consider capturing stderr too in the transcript?

Regards.
\_o<

@crotwell
Copy link

Possible this could be resolved by PR #1208
Also see issue #1207

@kmvanbrunt
Copy link
Member

@tleonhardt @anselor
Can you think of a reason why capturing stderr would be problematic?

@anselor
Copy link
Contributor

anselor commented Feb 24, 2022

@kmvanbrunt The transcripts feature is an area of cmd2 I haven't really touched. At a high level I see no reason why it would be a problem.

@crotwell
Copy link

@kmvanbrunt I would be happy to see a more robust solution, but currently cmd2 transcripts do not actually capture stdout, they only capture the output of self.poutput().

In other words, if you have two commands like:

    def do_print(self, statement):
        """print something"""
        print("printing something")

    def do_poutput(self, statement):
        """poutput something"""
        self.poutput("poutput something")

and run you see this:

(Cmd) print
printing something
(Cmd) poutput
poutput something

but if you create a transcript with:

(Cmd) history -t transcript.txt -- -2:

then the transcript.txt looks like:

(Cmd) print
(Cmd) poutput
poutput something

so the print output is missing.

If you can figure out how to actually capture stdout, that would be great. But if not, maybe capturing self.perror() as in PR #1208 is at least sufficient so transcripts fail if there are exceptions raised?

@kmvanbrunt
Copy link
Member

@crotwell Transcripts capture everything written to self.stdout, which is not limited to self.output() calls.

Transcripts also already capture self.stderr via a cmd2.utils.StdSim and unittest.TextTestRunner. That data is printed to sys.stderr when a test is successful. Upon failure, the unittest class also writes to sys.stderr data describing why the test failed.

See this code here: https://github.com/python-cmd2/cmd2/blob/master/cmd2/cmd2.py#L4998

Your original PR probably has the simplest approach for filtering user-printed messages from other sys.stderr data, but should these cases be considered failed tests if you are expecting the error text?

@duck-rh Can you clarify why you want sys.stderr captured? Are you using transcripts like unit tests to exercise good and bad code paths?

If we start evaluating both streams in a transcript test, what happens when a test case writes to both streams? How will the transcript differentiate what text in a transcript belongs to self.stdout and what text belongs to sys.stderr? I believe the approach by @crotwell addresses this concern by storing user-printed error messages in another class member, but it limits transcripts to treating all error messages as failed tests.

@tleonhardt Thoughts?

@crotwell
Copy link

OK, I see the clarification. You said capturing stderr, but sounds like you actually meant capturing self.stderr.

I do not care about stderr being in the transcript, intermingling stdout and stderr is probably hard and error prone. And I guess there could be cases where self.stderr was not an actual failure, so perror is not the right way to go. As I said in issue #1207, what I actually care about is that a transcript test should not "pass" if there is a failure situation in cmd2. In particular, a test of a transcript that contains a command that does not actually exist should not report "ok". If it was easier or less invasive to fail on exceptions or other error handling instead of stderr, that is probably better.

Basically, I went with the simple perror() solution just because onecmd() calls default() which just sends the not a recognized command message directly to perror here, ie no exception raised:

cmd2/cmd2/cmd2.py

Line 2871 in 5ce3a64

self.perror(err_msg, apply_style=False)

Currently there is an odd mixture of sometimes "bad things" are exceptions and sometimes they are just perror() prints, making it a lot harder for the transcript tests to know when something failed. Perhaps onecmd and default should not capture the unknown command failure situation, instead raise an exception that can be caught at a higher level, such as by the transcript tests. This may also require refactoring onecmd_plus_hooks to separate running a command from the error handling. For a user typing commands, perror() is the right error handling, but for running a transcript as a test it might not be.

@kmvanbrunt
Copy link
Member

cmd2 does not have a self.stderr member like it does self.stdout and self.stdin which are inherited from Python's cmd library. We capture sys.stderr during transcripts, command redirection/piping, and pyscripts.

The question I have is about the purpose of transcripts. I've not really used them nor have I contributed much to the transcript code, so my thoughts here might be incorrect.

It seems like transcripts are about capturing expected output. When you generate them using history -t, it writes the output of a command line to a transcript file. If I type a command that does not exist, then I expect nothing in self.stdout and a specific error string in sys.stderr. If that happens, then I would call it a success because actual output matched expected output. If anything else happens which does not match the transcript, then I would call it a failure.

@crotwell
Copy link

That purpose for transcripts seems correct to me.

But it is unclear to me if a transcript of something that writes to both stdout and stderr would be usable as a test in that how stdout and stderr are intermingled could be non-deterministic. It might be worth trying, but I don't know enough about python internals to know if you might effectively create a race condition between stdout and stderr if you try to combine them, especially in cases where there was a call to the shell.

@duck-rh
Copy link
Author

duck-rh commented Mar 1, 2022

@kmvanbrunt yes, that's exactly it, I'm running unit tests. I don't see any other way to test the UI part of my application. I create a complete test environment and I want to make sure the flow of a command ends up with the proper result, whether it's a result or an error. I also have scenarios with a suite of commands having a specific outcome. You can have a look at what it looks like here: https://vcs-git.duckcorp.org/duckcorp/ldapwalker (branch test).

@anselor
Copy link
Contributor

anselor commented Mar 1, 2022

@duck-rh
Have you looked at the external test plug-in?
https://github.com/python-cmd2/cmd2/tree/master/plugins/ext_test

@duck-rh
Copy link
Author

duck-rh commented Mar 1, 2022

@anselor thanks for pointing this out I must have missed it. I made a quick test and that seem to work fine.

I noticed that plugins are not included in the distribution and I had to copy the module in my sources to make it work. Would it be possible to distribute them?

@anselor
Copy link
Contributor

anselor commented Mar 1, 2022

@duck-rh It's distributed as a separate wheel you can import for your tests.
https://pypi.org/project/cmd2-ext-test/

@crotwell
Copy link

crotwell commented Mar 1, 2022

PR #1210 I think addresses part of this, at least my concern about a transcript with a non-existant command returning "ok". Does not take care of capturing stderr when creating a transcript with history, not sure how important that is. Basically I borrowed code from cmd2-ext-test to use pybridge to grab stderr and concat it with stdout for the result. Simple concat might not be the right way to go, but at least something is there.

Anyway, hope this helps...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants