
feat(max-fail): introduce --max-fail runner option #1926

Merged: 1 commit merged into nextest-rs:main from feat/max-fail on Dec 4, 2024

Conversation

AJamesyD (Contributor) commented Nov 30, 2024:

TL;DR

Introduce a --max-fail flag, inspired by pytest's own --maxfail flag.

Motivation

The --max-fail option provides a flexible middle-ground between the efficiency of --fail-fast and the comprehensive coverage of --no-fail-fast.

It allows users to uncover multiple issues without running the entire test suite, which is particularly valuable for tightening the write/test/debug iteration loop when running integration tests and/or large test suites.

Implementation

This PR

Replace all references to fail_fast: bool (except those in the profile config) with:

// Some(1) -> equivalent to fail_fast = true
// Some(n) -> end the run after n test failures
// None    -> equivalent to fail_fast = false / --no-fail-fast
max_fail: Option<usize>,

If the maintainers believe it would be more readable and/or maintainable to introduce an enum for max_fail, I'm all ears. I'd appreciate any other feedback you have as well.
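
For reference, here is a minimal sketch of what such a dedicated enum might look like (a hypothetical illustration; the PR's actual MaxFail type in max_fail.rs may differ):

/// Hypothetical dedicated type for the max-fail setting.
#[derive(Clone, Copy, Debug)]
enum MaxFail {
    /// Cancel the test run after this many failures.
    Count(usize),
    /// Never cancel on failure; equivalent to --no-fail-fast.
    All,
}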

Potential Future Work

  • Expose max_fail option in profile.
    • Potentially deprecate the fail_fast option to reduce confusion and shrink the exposed surface area for the same underlying behavior.

Open Questions

  • The above note about creating a dedicated enum for max_fail
  • If we're not relying on negating run_statuses.last_status().result.is_success(), should self.run_stats.exec_failed and self.run_stats.timed_out count towards max_fail?

AJamesyD (Contributor, Author):

Need to rebase and do some fixes, apologies

codecov bot commented Nov 30, 2024:

Codecov Report

Attention: Patch coverage is 72.58065% with 17 lines in your changes missing coverage. Please review.

Project coverage is 80.11%. Comparing base (f0d557a) to head (e9c8200).
Report is 19 commits behind head on main.

Files with missing lines                    Patch %   Lines
cargo-nextest/src/dispatch.rs               61.11%    7 Missing ⚠️
nextest-runner/src/config/max_fail.rs       78.78%    7 Missing ⚠️
nextest-runner/src/runner.rs                77.77%    2 Missing ⚠️
nextest-runner/src/reporter/displayer.rs    50.00%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1926      +/-   ##
==========================================
+ Coverage   79.79%   80.11%   +0.32%     
==========================================
  Files          81       83       +2     
  Lines       21002    21251     +249     
==========================================
+ Hits        16758    17026     +268     
+ Misses       4244     4225      -19     


sunshowers (Member) left a comment:

Thank you for the PR! I agree that this idea makes a lot of sense.

/// Number of tests that can fail before exiting test run
#[arg(
    long,
    name = "max-fail",
sunshowers (Member) commented Nov 30, 2024:

  • It would be interesting to also support a value here that's equivalent to --no-fail-fast, e.g. --max-fail=all or --max-fail=disabled.

  • Could you make this require_equals = true? For newer options we've been preferring that to avoid confusion with test filters.

AJamesyD (Contributor, Author):

It would be interesting to also support a value here that's equivalent to --no-fail-fast, e.g. --max-fail=all or --max-fail=disabled

Great idea, will do!

Could you make this require_equals = true? For newer options we've been preferring that to avoid confusion with test filters.

Ack!
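
For illustration, a self-contained sketch of how the two suggestions might combine in the clap derive (the surrounding struct, field set, and main are hypothetical simplifications; supporting --max-fail=all would additionally need the custom MaxFail type discussed below):

use clap::Parser;

/// Hypothetical, trimmed-down runner options for illustration only.
#[derive(Parser)]
struct RunOpts {
    /// Compile, but don't run tests
    #[arg(long, name = "no-run")]
    no_run: bool,

    /// Run all tests regardless of failure
    #[arg(long, name = "no-fail-fast")]
    no_fail_fast: bool,

    /// Number of tests that can fail before exiting test run
    #[arg(
        long,
        name = "max-fail",
        value_name = "N",
        require_equals = true, // accept only --max-fail=N, not --max-fail N
        conflicts_with_all = &["no-run", "no-fail-fast"],
    )]
    max_fail: Option<usize>,
}

fn main() {
    // e.g. `--max-fail=3` parses; `--max-fail 3` is rejected by require_equals.
    let opts = RunOpts::parse();
    println!("{:?}", opts.max_fail);
}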

name = "max-fail",
value_name = "N",
conflicts_with_all = &["no-run", "no-fail-fast"],
overrides_with = "fail-fast"
sunshowers (Member):

Hmm is there a reason this overrides --fail-fast? I would expect this to conflict with --fail-fast, not override it. (Unlike --no-fail-fast which is a clear negation.)

AJamesyD (Contributor, Author):

Hmm is there a reason this overrides --fail-fast? I would expect this to conflict with --fail-fast, not override it. (Unlike --no-fail-fast which is a clear negation.)

This may come from my lack of knowledge of CLI best practices and common patterns/idioms, but to me it isn't immediately obvious that --no-fail-fast should override --fail-fast either. My rationalization for why this is the case today went as follows:

  1. CLI flags should win out over settings in a config file. Doing so provides the best balance of being able to set preferred defaults while making those defaults easy to override at runtime.
  2. .config/nextest.toml exposes the fail-fast=(true/false) option. --no-fail-fast and --fail-fast flags exist to override fail-fast=true and fail-fast=false in the config file respectively.
  3. Failing fast is the default behavior, and therefore the fail-fast=true option in the config file and the --fail-fast flag are implied.
  4. fail-fast=false must override the default value (fail-fast=true), and similarly --no-fail-fast must override --fail-fast.

My argument for why --max-fail overrides --fail-fast is similarly that --fail-fast is implied as it is the default behavior, and therefore both --no-fail-fast and --max-fail should override it.

I'm open to being persuaded otherwise.

sunshowers (Member):

Hmm, so --fail-fast and --no-fail-fast overriding each other is part of a long-standing CLI convention that in case of that kind of binary choice, the last argument wins. I feel like --max-fail is slightly different from them, though I do see the argument. Ultimately it's a judgment call -- if we change our mind in the future, going from conflicts to overrides is easy while going from overrides to conflicts is harder.

@@ -2420,6 +2437,7 @@ mod tests {
             "cargo nextest run --no-run --no-fail-fast",
             ArgumentConflict,
         ),
+        ("cargo nextest run --no-run --max-fail 3", ArgumentConflict),
sunshowers (Member):

Nice -- thanks for doing this!

@@ -1982,14 +1982,15 @@ fn write_final_warnings(
     if cancel_status == Some(CancelReason::TestFailure) {
         writeln!(
             writer,
-            "{}: {}/{} {} {} not run due to {} (run with {} to run all tests)",
+            "{}: {}/{} {} {} not run due to {} (run with {} to run all tests, or run with {})",
sunshowers (Member):

Could you reword this to "specify --no-fail-fast to run all tests, or --max-fail"?

AJamesyD (Contributor, Author):

Ack!

Comment on lines 2081 to 2083
let fail_cancel = self
    .max_fail
    .map_or(false, |mf| self.run_stats.failed >= mf);
sunshowers (Member):

hmm, so failed is one of several kinds of test errors -- there's also timed-out and success. I would add a non_success_count method to RunStats that sums them up and use that here.

AJamesyD (Contributor, Author):

there's also timed-out and success.

Do you mean timed-out and exec_errors? Regardless, I agree this is a good idea. Will do!

sunshowers (Member):

Yeah, sorry, I meant timed-out and exec errors. Thanks for catching that!

@@ -187,6 +187,9 @@ cargo nextest run -E 'platform(host)'
 `--no-fail-fast`
 : Do not exit the test run on the first failure. Most useful for CI scenarios.
 
+`--max-fail`
+: Number of tests that can fail before aborting the test run. Useful for uncovering multiple issues without having to run the whole test suite. Mutually exclusive with `--no-fail-fast`.
sunshowers (Member) commented Nov 30, 2024:

A few suggested changes here:

  1. Move --max-fail to above --no-fail-fast, and write it out as --max-fail=N.
  2. I'd just drop the "mutually exclusive" bit, it's something people can discover for themselves :) Instead, to --no-fail-fast you can add something like "Equivalent to --max-fail=1."
  3. Could you tag this with <!-- md:version 0.9.86 --> like --run-ignored only below? (We should have a tag for "unreleased" at some point but just saying the next version is okay for now).

AJamesyD (Contributor, Author):

Ack on all counts!

AJamesyD (Contributor, Author):

Misread point 2. I think you meant to say either

  1. --no-fail-fast is equivalent to --max-fail=all
  2. --fail-fast is equivalent to --max-fail=1

I think both are worth saying somewhere. What do you think of doing the following?

  1. Modify the --no-fail-fast entry to say:

Do not exit the test run on the first failure. Most useful for CI scenarios. Equivalent to --max-fail=all

  2. Modify the --max-fail description to say:

Number of tests that can fail before aborting the test run. Useful for uncovering multiple issues without having to run the whole test suite. --fail-fast is equivalent to --max-fail=1

Instead of (2), we could add an entry under Other runner options for --fail-fast and say it's equivalent to --max-fail=1.


use crate::errors::MaxFailParseError;

/// Type for the max-fail flag
sunshowers (Member):

Could you put this file inside config? That's where this usually lives (and would align with future work to make config support max-fail as well).

AJamesyD (Contributor, Author):

Good call!

AJamesyD force-pushed the feat/max-fail branch 2 times, most recently from 2447db3 to 7eb4c44, on December 3, 2024 at 22:10.

sunshowers (Member) left a comment:

Thanks!

Comment on lines 1620 to +1642

 pub fn has_failures(&self) -> bool {
-    self.setup_scripts_failed > 0
-        || self.setup_scripts_exec_failed > 0
-        || self.setup_scripts_timed_out > 0
-        || self.failed > 0
-        || self.exec_failed > 0
-        || self.timed_out > 0
+    self.failed_setup_script_count() > 0 || self.failed_count() > 0
 }

+/// Returns count of setup scripts that did not pass.
+pub fn failed_setup_script_count(&self) -> usize {
+    self.setup_scripts_failed + self.setup_scripts_exec_failed + self.setup_scripts_timed_out
+}
+
+/// Returns count of tests that did not pass.
+pub fn failed_count(&self) -> usize {
+    self.failed + self.exec_failed + self.timed_out
+}
+
 /// Summarizes the stats as an enum at the end of a test run.
 pub fn summarize_final(&self) -> FinalRunStats {
     // Check for failures first. The order of setup scripts vs tests should not be important,
     // though we don't assert that here.
-    if self.setup_scripts_failed > 0
-        || self.setup_scripts_exec_failed > 0
-        || self.setup_scripts_timed_out > 0
-    {
+    if self.failed_setup_script_count() > 0 {
         FinalRunStats::Failed(RunStatsFailureKind::SetupScript)
     } else if self.setup_scripts_initial_count > self.setup_scripts_finished_count {
         FinalRunStats::Cancelled(RunStatsFailureKind::SetupScript)
-    } else if self.failed > 0 || self.exec_failed > 0 || self.timed_out > 0 {
+    } else if self.failed_count() > 0 {
sunshowers (Member):

Wonderful, thanks for this cleanup!

Comment on lines +2083 to +2085
let fail_cancel = self
    .max_fail
    .map_or(false, |mf| self.run_stats.failed_count() >= mf);
sunshowers (Member):

Oh actually, one change required here -- I think there might be an off-by-one issue here.

What if --max-fail=0 is passed in? I would expect this to mean "cancel test run after the first failure". But that seems to be --max-fail=1, and --max-fail=0 would I think always trigger this condition.

Looking at pytest, though, it seems like they take --maxfail=2 to mean "cancel the test run after 2 failures".

The pytest source code seems to suggest that they treat --maxfail=0 as --no-fail-fast.

So we have a few options:

  1. Do what pytest does. Remove --max-fail=all, use --max-fail=0 to indicate --no-fail-fast, and keep everything else the same.
  2. Use > rather than >= here, shifting the meaning by 1.
  3. Make --max-fail=0 the same as 1.
  4. Ban --max-fail=0, require that the minimum be 1.

Out of these, 1 and 4 are the most appealing. I think 1 makes the most sense just for similarity with pytest.

AJamesyD (Contributor, Author):

We should already be doing 4 in the impl FromStr for MaxFail in max_fail.rs. I should definitely add tests, though; update coming soon.
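
For reference, a minimal sketch of the kind of FromStr impl being described, using the hypothetical MaxFail enum sketched earlier and a plain String as a stand-in for the PR's MaxFailParseError:

use std::str::FromStr;

impl FromStr for MaxFail {
    // Stand-in error type; the PR uses MaxFailParseError.
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        if s.eq_ignore_ascii_case("all") {
            return Ok(MaxFail::All);
        }
        match s.parse::<usize>() {
            // Option 4: reject --max-fail=0 at parse time.
            Ok(0) => Err("--max-fail must be at least 1".to_owned()),
            Ok(n) => Ok(MaxFail::Count(n)),
            Err(_) => Err(format!("invalid --max-fail value: {s}")),
        }
    }
}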

sunshowers (Member) left a comment:

Could you update your PR with 1 or 4 (I'd lean towards 1 but your call), and also add a test for this?

sunshowers (Member):

Looks great, thanks!

sunshowers merged commit dbce2e6 into nextest-rs:main on Dec 4, 2024
17 of 18 checks passed

AJamesyD (Contributor, Author) commented Dec 5, 2024:

Thank you! Pleasure doing business 🤝
