Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for combined test output #1088

Closed
wants to merge 22 commits into from

Conversation

Jake-Shadle
Copy link
Contributor

@Jake-Shadle Jake-Shadle commented Nov 2, 2023

This PR is branched off of #1086, but can be split out to its own PR separately, just wanted to show that this is working with all of the current tests, as well as allowing improved capabilities for better emulating libtest style output.

Basically, instead of capturing stdout and stderr into their own separate buffers, test output (other process capturing like test listing is not touched in this PR) is captured into a single interleaved buffer, where each chunk of data is recorded with a timestamp when it was received, its byte range in the interleaved buffer, and which stream it came from. With this information it's possible to satisfy all of the existing code and tests by reconstructing each individual stream.

The libtest format can then just used the interleaved buffer for its output, more closely matching the output of libtest.

{"type":"test","event":"failed","name":"myplayground::myplayground$fails_expectedly_with_wrong_value","exec_time":0.015041828,"stdout":"\nrunning 1 test\nthis is stdout\nthis is stderr\nthread 'fails_expectedly_with_wrong_value' panicked at src/lib.rs:52:5:\nfiddlesticks\nstack backtrace:\n   0: rust_begin_unwind\n             at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:595:5\n   1: core::panicking::panic_fmt\n             at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/panicking.rs:67:14\n   2: myplayground::fails_expectedly_with_wrong_value\n             at ./src/lib.rs:52:5\n   3: myplayground::fails_expectedly_with_wrong_value::{{closure}}\n             at ./src/lib.rs:48:40\n   4: core::ops::function::FnOnce::call_once\n             at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/ops/function.rs:250:5\n   5: core::ops::function::FnOnce::call_once\n             at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/ops/function.rs:250:5\nnote: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.\ntest fails_expectedly_with_wrong_value - should panic ... FAILED\n\nfailures:\n\n---- fails_expectedly_with_wrong_value stdout ----\nnote: panic did not contain expected string\n      panic message: `\"fiddlesticks\"`,\n expected substring: `\"fiddlesticksy\"`\n\nfailures:\n    fails_expectedly_with_wrong_value\n\ntest result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 7 filtered out; finished in 0.01s\n\n"}

Copy link

codecov bot commented Nov 2, 2023

Codecov Report

Attention: 214 lines in your changes are missing coverage. Please review.

Comparison is base (1f45396) 78.21% compared to head (d94785d) 78.13%.

Files Patch % Lines
nextest-runner/src/test_command/unix.rs 0.00% 103 Missing ⚠️
nextest-runner/src/reporter/structured/libtest.rs 48.93% 48 Missing ⚠️
nextest-runner/src/test_output.rs 94.08% 22 Missing ⚠️
nextest-runner/src/reporter/aggregator.rs 0.00% 20 Missing ⚠️
nextest-runner/src/runner.rs 75.00% 6 Missing ⚠️
nextest-runner/src/test_command/imp.rs 73.91% 6 Missing ⚠️
nextest-runner/src/list/test_list.rs 54.54% 5 Missing ⚠️
cargo-nextest/src/dispatch.rs 81.81% 2 Missing ⚠️
nextest-runner/src/helpers.rs 0.00% 1 Missing ⚠️
nextest-runner/src/reporter.rs 96.42% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1088      +/-   ##
==========================================
- Coverage   78.21%   78.13%   -0.09%     
==========================================
  Files          69       72       +3     
  Lines       17432    17971     +539     
==========================================
+ Hits        13635    14042     +407     
- Misses       3797     3929     +132     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Jake-Shadle
Copy link
Contributor Author

Oh, looks like my assumption that MacOS would work since FIONREAD does work correctly on FreeBSD was...incorrect. Will fix tomorrow.

nextest-runner/src/test_output.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@NobodyXu NobodyXu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if I sound rude, but I am honestly shocked to see unsafe and raw platform-dependent syscalls used instead of tokio to achieve async read.

nextest-runner/src/test_output.rs Outdated Show resolved Hide resolved
nextest-runner/src/test_output.rs Outdated Show resolved Hide resolved
nextest-runner/src/test_output.rs Outdated Show resolved Hide resolved
nextest-runner/src/test_output.rs Show resolved Hide resolved
nextest-runner/src/test_output.rs Outdated Show resolved Hide resolved
nextest-runner/src/test_output.rs Outdated Show resolved Hide resolved
nextest-runner/src/test_output.rs Outdated Show resolved Hide resolved
@sunshowers
Copy link
Member

Thanks for the PR!

I haven't looked at it yet but I have to ask again (#1086 (comment)):

For stdout/stderr, can we just do something like:

*** STDOUT ***
<stdout>
*** STDERR ***
<stderr>

@Jake-Shadle
Copy link
Contributor Author

We could, I'm just not sure why? The split output can be extremely confusing when looking at a failed test because the relationship between the streams is completely broken.

@sunshowers
Copy link
Member

Hmm, so I guess the same argument would apply to nextest's own output as well. Why should we not provide combined output across stdout and stderr? (Maybe we should!)

@Jake-Shadle
Copy link
Contributor Author

I mean, this PR enables either of those styles of output since it records which stream the output came from so that the libtest output can be combined and the rest of the code still functions as it does on main, so it wouldn't be a stretch to either convert it all or allow users to decide. The code is quite trivial.

@nuke-web3
Copy link

Trying out this PR (very excited to see this getting done!!) and I am a big fan of the separation of stdout and stderr - so just wanted to be sure that output would be the raw reporter message in stdout enabling easy capture/piping of it without necessarily writing the report to disk (as is the case with JUnit XML files, that are the only alt. approach for me to do report post processsing). It is also handy to be able to screen out stderr with all the logs from nextest and only print the raw report with `2> /dev/null.

I am working now to get a workflow that would enable a terminal (local or CI logs) to print out all details and also capturing JSON for further processing AFAIK. As the PR here works now, I am able to use the command cargo nextest r --message-format libtest-json -E"kind(test)" > test-report.json and almost get the exact behavior I want (terminal prints useful logs, json not printed, but is in stored in the piped file for integration test only).

One request: each item in the report is valid json, but the overall set of these events flushed to stdout IMHO should be a proper json array, so in my case I would not need to post process into a valid json file by wrapping these into an array. 🙏 That seem reasonable? As it is now you get:

{"type":"suite",...}
{"type":"test",...}
...

But I would want:

[
{"type":"suite",...},
{"type":"test",...},
...
]

@Jake-Shadle
Copy link
Contributor Author

One request: each item in the report is valid json, but the overall set of these events flushed to stdout IMHO should be a proper json array, so in my case I would not need to post process into a valid json file by wrapping these into an array. 🙏 That seem reasonable? As it is now you get:

That's not how libtest works, I think if you want an improved JSON/etc output you should comment on #20 about what format would suit you, #1086 set it up so that adding new machine readable formats should be easy.

@sunshowers
Copy link
Member

sunshowers commented Nov 8, 2023

(I think newline-delimited JSON is correct and other future formats should also use it.)

@Jake-Shadle Jake-Shadle marked this pull request as ready for review December 12, 2023 11:18
@Jake-Shadle
Copy link
Contributor Author

I've marked this as ready for review as I think this is good enough to bring in, especially considering that the libtest-json is marked as experimental.

The test output is now always combined, and for non-json test output still preserves the current behavior of emitting stdout and stderr parts separately, passing all tests, however the combined output is now used as the stdout value when using --message-format libtest-json for failed tests, including the filtering out of the pieces of data emitted by libtest in the human format that are not part of the output when using the libtest json format.

The TestOutput additionally has a lines iterator that can help in future formats that may want to emit the output itself in a structured way, with eg. timing data, stdout vs stderr, etc.

@sunshowers
Copy link
Member

Thanks. This makes sense.

Copy link
Member

@sunshowers sunshowers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of comments and then this is good to go. Thanks!

nextest-runner/src/reporter/structured/libtest.rs Outdated Show resolved Hide resolved
nextest-runner/src/test_output.rs Outdated Show resolved Hide resolved
@Jake-Shadle Jake-Shadle marked this pull request as draft December 20, 2023 15:19
@Jake-Shadle
Copy link
Contributor Author

FYI still working on this, while adding some tests I noticed that my naive assumption that combining 2 different streams would mostly be correct in most situations was completely wrong on both linux and windows (and most likely mac/bsd/etc) so I've fixed that for unix and am working on the Windows implementation now, just converted to draft to be more clear of the state.

@Jake-Shadle Jake-Shadle marked this pull request as ready for review December 20, 2023 16:28
@Jake-Shadle
Copy link
Contributor Author

Oh, that was easier than I thought it would be.

Basically, we now have 3 separate CaptureStrategy:

  • None - Used when --no-capture is used, stdout/stderr are not piped
  • Split - Used when not outputting JSON/structured data, this is the same as the current code, which is that each stream is captured separately, so when combined in a single buffer they will ordered correctly only in the same stream, but may have quite a different ordering if streams are mixed
  • Combined - Used when outputting JSON, I went for the most minimally invasive change without changing too much of the current code around process spawning and output capturing, by configuring the Command::stdout/err to use the same pipe (pipe on unix, named pipe/file on windows) and then keeping the single read end as the output from the program. This usage of a single pipe/file for the child process means that the output is as accurate as it can be, but does come with the consequence that the output is always attributed to stdout since we can no longer differentiate between streams.

This is using methods that were only just added in 1.74.0 which is why all of the 1.70.0 checks are failing. If moving to that MSRV isn't acceptable then I can remove the combined output capture, it would just need to be noted on the libtest output that the interleaved output in the stdout field can be wildly inaccurate even in simple situations where stdout and stderr printing are mixed together.

Note I also replaced windows with windows-sys, I hadn't noticed this crate was the sole user of windows in the crate graph, there was no reason for using windows since no COM is used by this crate, and windows is ridiculously larger (11.8MiB uncompressed) and slower to compile than windows-sys. If you want to keep windows I can revert that change.

@sunshowers
Copy link
Member

FYI still working on this, while adding some tests I noticed that my naive assumption that combining 2 different streams would mostly be correct in most situations was completely wrong on both linux and windows (and most likely mac/bsd/etc)

Could you elaborate on what made this completely wrong? At a high level it seems like it would be fine, given that tokio::select! randomly picks the next future to poll from.

Also re Rust 1.74, stderr and stdout are just file descriptors (HANDLEs on Windows) so we should be able to convert them into OwnedFd/OwnedHandle and from there to File instances, unless I'm missing something. We can definitely bump up the MSRV to 1.72 right now, but based on my read of the docs that shouldn't be necessary.

Replacing windows with windows-sys is fine and was something I'd been wanting to do at some point, so thanks for doing that!

@Jake-Shadle
Copy link
Contributor Author

Could you elaborate on what made this completely wrong? At a high level it seems like it would be fine, given that tokio::select! randomly picks the next future to poll from.

Given something like this:

#[test]
fn ordering_is_hard() {
    eprintln!("first stderr");
    println!("this is stdout");
    eprintln!("second stderr");
    panic!("oh no");
}

Running this test with output capturing would basically give something like

first stderr\nsecond stderr\nthis is stdout\n<panic output> about 80% of the time, at least on my Linux machine. In fact in some cases the this is stdout\n would even appear within the <panic output> as it is written in multiple separate writes between the panic header, the panic message, and the potential backtrace. It was only correctly ordered infrequently. I noticed the same issue on Windows, so figured it would be best to fix this in the PR rather than get bugs reported on the libtest output being wrong.

Also re Rust 1.74, stderr and stdout are just file descriptors (HANDLEs on Windows) so we should be able to convert them into OwnedFd/OwnedHandle and from there to File instances, unless I'm missing something. We can definitely bump up the MSRV to 1.72 right now, but based on my read of the docs that shouldn't be necessary.

Oh yes, that will probably work, I went for the "match std exactly" option initially just to prove out the solution without needing to worry about "minor" divergences making it not work for non-obvious reasons. I'll do that after break.

Replacing windows with windows-sys is fine and was something I'd been wanting to do at some point, so thanks for doing that!

Excellent.

@NobodyXu
Copy link
Contributor

Maybe you can just set stdout and stderr to point to the same pipe?

That should make things a lot easier, isn't it?

@Jake-Shadle
Copy link
Contributor Author

That's what this change does.

Checkpointing so I can check Windows impl on an actual windows machine
Adds basic tests for TestOutput, particularly the line iterator
While adding tests for the libtestjson output, I noticed that my assumption that stdout/stderr ordering would be _mostly_ correct except in "extreme" scenarios was...completely wrong.

With even the most simple mixed stdout/stderr output (no multithreading, no unbuffered output, no huge single writes) that the ordering of the output was wrong on my machine more often than not. After playing around with the idea of using a pseudo terminal, similar to how alacritty or other terminal emulators work, I checked the even easier case of...just passing the same file descriptor for both stdout and stderr...which works.

Committing this in a separate branch so that I can push and add the Windows implementation before merging this back to the combined output branch.
@Jake-Shadle
Copy link
Contributor Author

Also re Rust 1.74, stderr and stdout are just file descriptors (HANDLEs on Windows) so we should be able to convert them into OwnedFd/OwnedHandle and from there to File instances, unless I'm missing something. We can definitely bump up the MSRV to 1.72 right now, but based on my read of the docs that shouldn't be necessary.

Done.

@sunshowers
Copy link
Member

I made some changes to prep this to land, but I wasn't able to push to this PR. So I created a new PR #1212 for this.

Thanks for all the work!

@sunshowers sunshowers closed this Jan 9, 2024
@sunshowers
Copy link
Member

I'm planning to release a version of nextest containing support for this within the next 1-2 days.

@sunshowers
Copy link
Member

All right, this is part of cargo-nextest 0.9.67 which will be out within 20 minutes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants