
Update --dry-run output #272

Merged: giovanni-guidini merged 1 commit into master on Sep 29, 2023

Conversation

giovanni-guidini (Contributor):

related to codecov/engineering-team#555

These changes update the output of the `--dry-run` option so that:

  1. Users can get a list of the tests that are being skipped
  2. Users have the option to parse the output as JSON

> ⚠️
> The JSON output doesn't include the runner options in the tests-to-run list.
> Instead, there is a 'runner_options' key with that info.
> This is to make sharding the list of tests easier (see the sketch below).

It's considerably more output, but given that we don't have clear guidelines on the best way to do things, I think it's better to provide all the options. Eventually adoption will converge on one approach, and we'll prioritize that.
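As an illustration of the sharding point above, a consumer of the JSON output could split `ats_tests_to_run` into shards and re-attach `runner_options` to each one. The key names come from this PR's tests; reading from stdin and the shard count are assumptions made only for this sketch, not part of the CLI.

```python
import json
import sys

# Sketch only: shard the dry-run JSON output. The key names match this PR's tests;
# the input source (stdin) and the shard count are illustrative assumptions.
dry_run_json = json.loads(sys.stdin.readline())

shard_count = 2
tests = dry_run_json["ats_tests_to_run"]
shards = [tests[i::shard_count] for i in range(shard_count)]

for shard in shards:
    # each shard gets the shared runner options prepended, e.g. "--labels"
    print(" ".join(dry_run_json["runner_options"] + shard))
```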


codecov bot commented Sep 29, 2023

Codecov Report

Merging #272 (2a9fb88) into master (01986b6) will increase coverage by 0.01%.
Report is 1 commit behind head on master.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #272      +/-   ##
==========================================
+ Coverage   94.95%   94.97%   +0.01%     
==========================================
  Files          78       78              
  Lines        2657     2667      +10     
==========================================
+ Hits         2523     2533      +10     
  Misses        134      134              
| Flag | Coverage Δ |
|------|------------|
| python3.10 | 95.29% <100.00%> (+0.01%) ⬆️ |
| python3.11 | 95.29% <100.00%> (+0.01%) ⬆️ |
| python3.8 | 95.29% <100.00%> (+0.01%) ⬆️ |
| python3.9 | 95.29% <100.00%> (+0.01%) ⬆️ |
| smart-labels | 94.96% <100.00%> (+0.01%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Files | Coverage Δ |
|-------|------------|
| codecov_cli/commands/labelanalysis.py | 96.21% <100.00%> (+0.31%) ⬆️ |

@matt-codecov (Contributor) left a comment:

wait, so, we print a bunch of non-json info, and then we also print the json info? should that be a separate argument, like `--dry-run --format=json`? otherwise you have to copy-paste or write logic to process the output

if this is still beta and we reserve the right to change the output format, won't block on that. but i think it's tidier to let the user say "just put the json (or envvar format) in stdout and i'll pipe it"
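For illustration, the kind of separate option being suggested could look roughly like this. The flag name `--dry-run-format` and its wiring are hypothetical and not something this PR adds:

```python
import json

import click

# Hypothetical sketch of the reviewer's suggestion: an explicit output-format flag
# so --dry-run can emit only machine-readable JSON on stdout for piping.
@click.command()
@click.option("--dry-run", "dry_run", is_flag=True, default=False)
@click.option("--dry-run-format", type=click.Choice(["json", "human"]), default="human")
def label_analysis(dry_run, dry_run_format):
    output_as_dict = {
        "runner_options": ["--labels"],
        "ats_tests_to_run": [],
        "ats_tests_to_skip": [],
    }
    if dry_run and dry_run_format == "json":
        # JSON only, nothing else on stdout
        click.echo(json.dumps(output_as_dict))
```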

Comment on lines +346 to +347
print(result)
print(result.output)
Contributor:

maybe clean the prints up, but there are other tests that leave em in so idc

Contributor Author:

The prints only show up if the test fails. In that case they are very useful for debugging why the test failed. That's why some tests leave them in.

@giovanni-guidini (Contributor Author):

> wait, so, we print a bunch of non-json info, and then we also print the json info? should that be a separate argument, like `--dry-run --format=json`? otherwise you have to copy-paste or write logic to process the output
>
> if this is still beta and we reserve the right to change the output format, won't block on that. but i think it's tidier to let the user say "just put the json (or envvar format) in stdout and i'll pipe it"

It is in beta, and we do reserve the right to change the output format.
I agree that it's tidier that way, but we're still discovering which option is best (which will eventually become the default) and what this interface is going to be.

Right now the plan is to "put all cards on the table".
In any case, the output on stdout is dirty (meaning there are logs prior to the dry-run output), so some level of shell-fu is required to clean it up before using it anyway.

@giovanni-guidini giovanni-guidini merged commit 4be3f77 into master Sep 29, 2023
10 checks passed
@giovanni-guidini giovanni-guidini deleted the gio/change-dry-run-output branch September 29, 2023 17:23
@ghost left a comment:

Currently you have:

if not dry_run:
    runner.process_labelanalysis_result(request_result)
else:
    _dry_run_output(...)

I recommend thinking about / modelling it a different way: categorizing the tests and running the tests are separate and equally important actions. Chaining list+run together can be the default for simple use cases, but giving me the ability to process the data between list and run is very powerful.

I'd model it as three subcommands:

    automated-test-selection, ats -- based on your diff and past coverage, run the relevant tests
        categorize -- leverage automated test selection to generate test lists
        run -- run an automated-test-selection test list

The parent subcommand would be implemented as

@click.command()
def automated_test_selection():
  testlist = ats.list()
  return ats.run(testlist)

    assert fd.readlines() == ['--labels "test_present"\n']
    with open(json_output, "r") as fd:
        assert fd.readlines() == [
            '{"runner_options": ["--labels"], "ats_tests_to_run": ["\\"test_absent\\"", "\\"test_global\\"", "\\"test_in_diff\\""], "ats_tests_to_skip": ["\\"test_present\\""]}\n'

Is there any reason to double-quote the test names? Presumably there's no test named "test_absent" (including the quotes).

This looks over-quoted to me.

Contributor Author:

We recently had a customer run into trouble because one of their parametrized tests looked something like `path/to/test::TestClass::test_name[option -or another]`. When those were piped into pytest, it assumed the `-or` was a command-line option for pytest itself and failed.

That's why we are now double-quoting all labels.
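To make the quoting concrete, here is a small sketch of the idea; the helper name is invented for illustration and is not the PR's code:

```python
# Illustrative helper (name made up for this sketch): wrapping each label in
# literal double quotes keeps pytest from treating bracketed parameters such as
# "[option -or another]" as its own command-line flags.
def wrap_labels_in_double_quotes(labels):
    return [f'"{label}"' for label in labels]

labels = ["path/to/test::TestClass::test_name[option -or another]", "test_present"]
print(" ".join(["--labels"] + wrap_labels_in_double_quotes(labels)))
# --labels "path/to/test::TestClass::test_name[option -or another]" "test_present"
```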

Comment on lines +362 to 364
    assert fd.readlines() == [
        '{"runner_options": ["--labels"], "ats_tests_to_run": ["\\"test_absent\\"", "\\"test_global\\"", "\\"test_in_diff\\""], "ats_tests_to_skip": ["\\"test_present\\""]}\n'
    ]

This represents the intended value more clearly, and should(?) yield better failure messages. It has the added side-effect of ignoring any whitespace changes, which I don't believe you intend to care about in this particular test.

Suggested change

```diff
-    assert fd.readlines() == [
-        '{"runner_options": ["--labels"], "ats_tests_to_run": ["\\"test_absent\\"", "\\"test_global\\"", "\\"test_in_diff\\""], "ats_tests_to_skip": ["\\"test_present\\""]}\n'
-    ]
+    assert json.load(fd) == {
+        "runner_options": ["--labels"],
+        "ats_tests_to_run": [
+            "\"test_absent\"",
+            "\"test_global\"",
+            "\"test_in_diff\""
+        ],
+        "ats_tests_to_skip": [
+            "\"test_present\"",
+        ]
+    }
```

Written this way, it's much clearer that the included quotes are pretty weird.

    help=(
        "Print list of tests to run AND tests skipped (and options that need to be added to the test runner) to stdout. "
        + "Also prints the same information in JSON format. "
        + "JSON will have keys 'ats_tests_to_run', 'ats_tests_to_skip' and 'runner_options'. "

I think most users (on small-to-medium projects) will see "tests-to-skip" as unnecessary noise.
I'd expect it to be added to the output behind a (default false) option; `--show-tests-skipped`, perhaps.

Entirely up to you. I'm happy either way.

Contributor Author:

I agree with you. I think that will eventually be the case.

As I mentioned though, while the product is still in beta we are outputting everything to understand what is indeed considered "noise", and what usage patterns arise. Those datapoints will help us define better interfaces in the future.

Comment on lines +364 to +370
    click.echo(json.dumps(output_as_dict))
    click.echo(
        f"ATS_TESTS_TO_RUN={' '.join(runner.dry_run_runner_options + labels_run_wrapped_double_quotes)}"
    )
    click.echo(
        f"ATS_TESTS_TO_SKIP={' '.join(runner.dry_run_runner_options + labels_skip_wrapped_double_quotes)}"
    )

This stdout is nearly, but not quite, JSON-parseable.
Most tools will do one of two things:

  1. print json on stdout, and non-json on stderr
  2. print a human-friendly form by default and only json if given --json

If you choose one of these, the `--dry-run-output-path` option can go away: tee(1) fills that need.
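As a rough sketch of option 1 (not what this PR implements): human-oriented text goes to stderr while stdout carries only the JSON, so piping or `tee` works without a dedicated output-path option. The dictionary contents here are illustrative.

```python
import json

import click

# Sketch only, not the PR's behaviour: keep stdout machine-readable.
output_as_dict = {
    "runner_options": ["--labels"],
    "ats_tests_to_run": ['"test_absent"'],
    "ats_tests_to_skip": ['"test_present"'],
}

click.echo("Label analysis dry run finished.", err=True)  # human text -> stderr
click.echo(json.dumps(output_as_dict))                     # JSON only -> stdout
```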

@@ -55,13 +56,23 @@
@click.option(
    "--dry-run",
    "dry_run",

It's not clear (to me) what `label-analysis` does on a "wet" run. Add a function docstring, or else pass `@command(help=...)`, to help users understand.
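For example, a `help=` along these lines would do it. The wording below is only an editor's guess at what a "wet" run does, based on the code paths shown in this PR (`process_labelanalysis_result` vs. `_dry_run_output`), not the text that shipped:

```python
import click

# Hedged sketch: illustrative help text, not the CLI's actual wording.
@click.command(
    help=(
        "Run label analysis: request the relevant test labels for this commit and "
        "pass them to the configured runner, which executes the tests. "
        "With --dry-run, only print the lists of tests instead of running them."
    )
)
def label_analysis():
    ...
```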

Contributor Author:

I suppose you are right... possibly we should not have built it this way, but instead created a new runner to do what the dry-run option does. That would reinforce the intended way of using it.

@giovanni-guidini (Contributor Author):

> I recommend thinking about / modelling it a different way: categorizing the tests and running the tests are separate and equally important actions. Chaining list+run together can be the default for simple use cases, but giving me the ability to process the data between list and run is very powerful.

That is an interesting suggestion. We do want this tool to be flexible and to give users a lot of power over how they do things. The solution we came to was to bake that power and flexibility into it: the ability for you to process the data however you like.

This is done with runners. You can create your own and do whatever you want with the output of ATS. That is the reason that a runner is given the "raw output" of all lists.

I think it's important that we have this "plugin-like" structure for the adapter between ATS and the user's code if we want ATS to support many languages and testing tools.
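To picture the plugin shape being described: a custom runner only needs to expose the hooks that appear in this PR's diff (`process_labelanalysis_result` and `dry_run_runner_options`); everything else in this sketch, including the exact shape of the result object, is an assumption.

```python
# Sketch of a custom runner. Only the two attributes below appear in this PR's
# diff; the body and the structure of request_result are illustrative assumptions.
class MyCustomRunner:
    dry_run_runner_options = ["--labels"]

    def process_labelanalysis_result(self, request_result):
        # A real runner would reshape the raw lists from ATS here and invoke
        # whatever test tool it wants (pytest, jest, etc.).
        ...
```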
