
Update --dry-run output #272

Merged: giovanni-guidini merged 1 commit into master on Sep 29, 2023

Conversation

giovanni-guidini (Contributor):

related to codecov/engineering-team#555

These changes update the output of the `--dry-run` option so that:

  1. Users can get a list of the tests that are being skipped
  2. Users have the option to parse the output as JSON

> ⚠️
> The JSON output doesn't include the runner options in the tests-to-run list.
> Instead, there is a 'runner_options' key with that info.
> This is to make sharding the list of tests easier (see the sketch below).

It's considerably more output, but given that we don't have clear guidelines on the best way to do things, I think it's better to provide all the options. Eventually adoption will converge on one approach, and we'll prioritize that.
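As an illustration of the sharding point above, a consumer of the JSON output could split `ats_tests_to_run` into shards and re-attach `runner_options` to each one. The key names come from this PR's tests; reading from stdin and the shard count are assumptions made only for this sketch, not part of the CLI.

```python
import json
import sys

# Sketch only: shard the dry-run JSON output. The key names match this PR's tests;
# the input source (stdin) and the shard count are illustrative assumptions.
dry_run_json = json.loads(sys.stdin.readline())

shard_count = 2
tests = dry_run_json["ats_tests_to_run"]
shards = [tests[i::shard_count] for i in range(shard_count)]

for shard in shards:
    # each shard gets the shared runner options prepended, e.g. "--labels"
    print(" ".join(dry_run_json["runner_options"] + shard))
```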


codecov bot commented Sep 29, 2023

Codecov Report

Merging #272 (2a9fb88) into master (01986b6) will increase coverage by 0.01%.
Report is 1 commit behind head on master.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #272      +/-   ##
==========================================
+ Coverage   94.95%   94.97%   +0.01%     
==========================================
  Files          78       78              
  Lines        2657     2667      +10     
==========================================
+ Hits         2523     2533      +10     
  Misses        134      134              
| Flag | Coverage Δ |
|------|------------|
| python3.10 | 95.29% <100.00%> (+0.01%) ⬆️ |
| python3.11 | 95.29% <100.00%> (+0.01%) ⬆️ |
| python3.8 | 95.29% <100.00%> (+0.01%) ⬆️ |
| python3.9 | 95.29% <100.00%> (+0.01%) ⬆️ |
| smart-labels | 94.96% <100.00%> (+0.01%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Files | Coverage Δ |
|-------|------------|
| codecov_cli/commands/labelanalysis.py | 96.21% <100.00%> (+0.31%) ⬆️ |

@matt-codecov (Contributor) left a comment:

wait, so, we print a bunch of non-json info, and then we also print the json info? should that be a separate argument, like `--dry-run --format=json`? otherwise you have to copy-paste or write logic to process the output

if this is still beta and we reserve the right to change the output format, won't block on that. but i think it's tidier to let the user say "just put the json (or envvar format) in stdout and i'll pipe it"
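For illustration, the kind of separate option being suggested could look roughly like this. The flag name `--dry-run-format` and its wiring are hypothetical and not something this PR adds:

```python
import json

import click

# Hypothetical sketch of the reviewer's suggestion: an explicit output-format flag
# so --dry-run can emit only machine-readable JSON on stdout for piping.
@click.command()
@click.option("--dry-run", "dry_run", is_flag=True, default=False)
@click.option("--dry-run-format", type=click.Choice(["json", "human"]), default="human")
def label_analysis(dry_run, dry_run_format):
    output_as_dict = {
        "runner_options": ["--labels"],
        "ats_tests_to_run": [],
        "ats_tests_to_skip": [],
    }
    if dry_run and dry_run_format == "json":
        # JSON only, nothing else on stdout
        click.echo(json.dumps(output_as_dict))
```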

Comment on lines +346 to +347
print(result)
print(result.output)
Contributor:

maybe clean the prints up, but there are other tests that leave em in so idc

Contributor Author:

The prints only show up if the test fails. In that case they are very useful for debugging why the test failed. That's why some tests leave them in.

@giovanni-guidini (Contributor Author):

> wait, so, we print a bunch of non-json info, and then we also print the json info? should that be a separate argument, like `--dry-run --format=json`? otherwise you have to copy-paste or write logic to process the output
>
> if this is still beta and we reserve the right to change the output format, won't block on that. but i think it's tidier to let the user say "just put the json (or envvar format) in stdout and i'll pipe it"

It is in beta, and we do reserve the right to change the output format.
I agree that it's tidier that way, but we're still discovering which option is best (which will eventually become the default) and what this interface is going to be.

Right now the plan is to "put all cards on the table".
In any case, the output on stdout is dirty (meaning there are logs prior to the dry-run output), so some level of shell-fu is required to clean it up before using it anyway.

@giovanni-guidini giovanni-guidini merged commit 4be3f77 into master Sep 29, 2023
10 checks passed
@giovanni-guidini giovanni-guidini deleted the gio/change-dry-run-output branch September 29, 2023 17:23
@ghost left a comment:

Currently you have:

if not dry_run:
    runner.process_labelanalysis_result(request_result)
else:
    _dry_run_output(...)

I recommend thinking about / modelling it a different way: categorizing the tests and running the tests are separate and equally important actions. Chaining list+run together can be the default for simple use cases, but giving me the ability to process the data between list and run is very powerful.

I'd model it as three subcommands:

    automated-test-selection, ats -- based on your diff and past coverage, run the relevant tests
        categorize -- leverage automated test selection to generate test lists
        run -- run an automated-test-selection test list

The parent subcommand would be implemented as

@click.command()
def automated_test_selection():
  testlist = ats.list()
  return ats.run(testlist)

    assert fd.readlines() == ['--labels "test_present"\n']
    with open(json_output, "r") as fd:
        assert fd.readlines() == [
            '{"runner_options": ["--labels"], "ats_tests_to_run": ["\\"test_absent\\"", "\\"test_global\\"", "\\"test_in_diff\\""], "ats_tests_to_skip": ["\\"test_present\\""]}\n'

Is there any reason to double-quote the test names? Presumably there's no test named "test_absent" (including the quotes).

This looks over-quoted to me.

Contributor Author:

We recently had a customer run into trouble because one of their parametrized tests looked something like `path/to/test::TestClass::test_name[option -or another]`. When those were piped into pytest, it assumed the `-or` was a command-line option for pytest itself and failed.

That's why we are now double-quoting all labels.
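To make the quoting concrete, here is a small sketch of the idea; the helper name is invented for illustration and is not the PR's code:

```python
# Illustrative helper (name made up for this sketch): wrapping each label in
# literal double quotes keeps pytest from treating bracketed parameters such as
# "[option -or another]" as its own command-line flags.
def wrap_labels_in_double_quotes(labels):
    return [f'"{label}"' for label in labels]

labels = ["path/to/test::TestClass::test_name[option -or another]", "test_present"]
print(" ".join(["--labels"] + wrap_labels_in_double_quotes(labels)))
# --labels "path/to/test::TestClass::test_name[option -or another]" "test_present"
```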

Comment on lines +362 to 364
    assert fd.readlines() == [
        '{"runner_options": ["--labels"], "ats_tests_to_run": ["\\"test_absent\\"", "\\"test_global\\"", "\\"test_in_diff\\""], "ats_tests_to_skip": ["\\"test_present\\""]}\n'
    ]

This represents the intended value more clearly, and should(?) yield better failure messages. It has the added side-effect of ignoring any whitespace changes, which I don't believe you intend to care about in this particular test.

Suggested change

```diff
-    assert fd.readlines() == [
-        '{"runner_options": ["--labels"], "ats_tests_to_run": ["\\"test_absent\\"", "\\"test_global\\"", "\\"test_in_diff\\""], "ats_tests_to_skip": ["\\"test_present\\""]}\n'
-    ]
+    assert json.load(fd) == {
+        "runner_options": ["--labels"],
+        "ats_tests_to_run": [
+            "\"test_absent\"",
+            "\"test_global\"",
+            "\"test_in_diff\""
+        ],
+        "ats_tests_to_skip": [
+            "\"test_present\"",
+        ]
+    }
```

Written this way, it's much clearer that the included quotes are pretty weird.

    help=(
        "Print list of tests to run AND tests skipped (and options that need to be added to the test runner) to stdout. "
        + "Also prints the same information in JSON format. "
        + "JSON will have keys 'ats_tests_to_run', 'ats_tests_to_skip' and 'runner_options'. "

I think most users (on small-to-medium projects) will see "tests-to-skip" as unnecessary noise.
I'd expect it to be added to the output behind a (default false) option; `--show-tests-skipped`, perhaps.

Entirely up to you. I'm happy either way.

Contributor Author:

I agree with you. I think that will eventually be the case.

As I mentioned though, while the product is still in beta we are outputting everything to understand what is indeed considered "noise", and what usage patterns arise. Those datapoints will help us define better interfaces in the future.

Comment on lines +364 to +370
    click.echo(json.dumps(output_as_dict))
    click.echo(
        f"ATS_TESTS_TO_RUN={' '.join(runner.dry_run_runner_options + labels_run_wrapped_double_quotes)}"
    )
    click.echo(
        f"ATS_TESTS_TO_SKIP={' '.join(runner.dry_run_runner_options + labels_skip_wrapped_double_quotes)}"
    )

This stdout is nearly, but not quite, JSON-parseable.
Most tools will do one of two things:

  1. print json on stdout, and non-json on stderr
  2. print a human-friendly form by default and only json if given --json

If you choose one of these, the `--dry-run-output-path` option can go away: tee(1) fills that need.
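As a rough sketch of option 1 (not what this PR implements): human-oriented text goes to stderr while stdout carries only the JSON, so piping or `tee` works without a dedicated output-path option. The dictionary contents here are illustrative.

```python
import json

import click

# Sketch only, not the PR's behaviour: keep stdout machine-readable.
output_as_dict = {
    "runner_options": ["--labels"],
    "ats_tests_to_run": ['"test_absent"'],
    "ats_tests_to_skip": ['"test_present"'],
}

click.echo("Label analysis dry run finished.", err=True)  # human text -> stderr
click.echo(json.dumps(output_as_dict))                     # JSON only -> stdout
```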

@@ -55,13 +56,23 @@
@click.option(
    "--dry-run",
    "dry_run",

It's not clear (to me) what `label-analysis` does on a "wet" run. Add a function docstring, or else pass `@command(help=...)`, to help users understand.
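For example, a `help=` along these lines would do it. The wording below is only an editor's guess at what a "wet" run does, based on the code paths shown in this PR (`process_labelanalysis_result` vs. `_dry_run_output`), not the text that shipped:

```python
import click

# Hedged sketch: illustrative help text, not the CLI's actual wording.
@click.command(
    help=(
        "Run label analysis: request the relevant test labels for this commit and "
        "pass them to the configured runner, which executes the tests. "
        "With --dry-run, only print the lists of tests instead of running them."
    )
)
def label_analysis():
    ...
```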

Contributor Author:

I suppose you are right... possibly we should not have built it this way, but instead created a new runner to do what the dry-run option does. That would reinforce the intended way of using it.

@giovanni-guidini (Contributor Author):

> I recommend thinking about / modelling it a different way: categorizing the tests and running the tests are separate and equally important actions. Chaining list+run together can be the default for simple use cases, but giving me the ability to process the data between list and run is very powerful.

That is an interesting suggestion. We do want this tool to be flexible and to give users a lot of power over how they do things. The solution we came to was to bake that power and flexibility into it: the ability for you to process the data however you like.

This is done with runners. You can create your own and do whatever you want with the output of ATS. That is the reason that a runner is given the "raw output" of all lists.

I think it's important that we have this "plugin-like" structure for the adapter between ATS and the user's code if we want ATS to support many languages and testing tools.
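To picture the plugin shape being described: a custom runner only needs to expose the hooks that appear in this PR's diff (`process_labelanalysis_result` and `dry_run_runner_options`); everything else in this sketch, including the exact shape of the result object, is an assumption.

```python
# Sketch of a custom runner. Only the two attributes below appear in this PR's
# diff; the body and the structure of request_result are illustrative assumptions.
class MyCustomRunner:
    dry_run_runner_options = ["--labels"]

    def process_labelanalysis_result(self, request_result):
        # A real runner would reshape the raw lists from ATS here and invoke
        # whatever test tool it wants (pytest, jest, etc.).
        ...
```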
