[ci] Introduce a test orchestrator written in bazel #23505
Conversation
Where do you think would be the best place to document this?
I can imagine confusion if somebody adds a new test or execution environment and wonders why it's not running in CI
How should the orchestrator interact with the
Yes, that's my thinking (although it's not implemented yet). Otherwise it would require a lot of manual fiddling for broken tests. Thanks for raising this point :)
LGTM, will approve when it is marked as Ready to review.
Force-pushed from 96a67c1 to 68c086b
Force-pushed from 8ca0bf3 to b0b60ab
One of the issues I am running into is that some jobs can end up having no tests to run due to the orchestration. Unfortunately, bazel will error out if the test tag filters produce 0 tests, and I don't think this behaviour can be turned off.
Is there a way to ask Bazel for a list of tests before running them? Then we can have a pre-step which exits early if there are none.
We can of course replace the tag filters by a query to create a list and then run it. The downside is that the test tag filter syntax is not supported by the query command (as far as I know), so it would require some parsing, which is not ideal. I can see one possible alternative: currently, even with the CI orchestrator, we still split the tests across different jobs, which is error-prone. We could make the CI orchestration a two-step process:
The advantage would be that each CI job now becomes trivial to filter: all tests with a specific tag, not skipped in CI, and not broken. It would also ensure that we don't run duplicate tests across jobs (or forget some tests) due to logic errors in the filtering.
That sounds good, but I had actually thought that was what your current implementation does. Each test is tagged by its execution environment, right? And the Azure jobs run the tests with that exec env tag plus `-broken,-skip_in_ci`. Wouldn't this still have the same problem, though? The job applies these filters but gets zero tests back, and the job fails?
Currently it is not quite the case that each job corresponds to an execution environment, because we have manufacturing tests that run in their own job. To make it clear what I am suggesting, there would be a bazel function:

```python
def ci_fpga_job(env, tags):
    """Return the FPGA job on which this test should run, based on its
    execution environment and its tags.
    """
    if "cw340_sival_rom_ext" in tags and "manuf" not in tags:
        return "fpga_job_blabla"
    # and so on
```

Or maybe we should just keep one job per exec_env and call it a day :)
What's the reason that we currently run manufacturing tests as separate jobs? Probably just because we want to split into more jobs for parallelism?
I think it might be because they have custom execution environments since they run at unusual OTP stages ( |
Force-pushed from 1b91773 to 5265492
Force-pushed from 036d729 to 1f4b9ba
This looks really good bar the hyperdebug update problem and the broken tests. Very nice work.
ci/fpga-job.yml (outdated):

```yaml
# never detected. But if we run the tool another time, the device list is queried again
# and opentitantool can finish the update. The device will now reboot in normal mode
# and work for the hyperdebug job.
# FIXME: only execute for hyper310 interface
```
I wonder if `opentitanlib` should accept `update-firmware` for all platforms but make it a no-op if not hyperdebug? Erroring seems a bit extreme. Even then, I'm not sure how to manually specify `--interface hyperdebug_dfu` for these...
If the update step is really quick at detecting when no update is required, maybe we can move the update step into the commands for the execution environments?
Yes, we discussed that in a previous PR and decided to do it later... This would be very convenient for this PR!
I have now implemented the update step: it only runs for the `hyper310` interface. I will let the CI run to make sure that it works.
Force-pushed from 5cfc401 to cd585a2
This commit fundamentally changes how tests are considered for running in CI. Previously, any test not marked as "manual", "broken" or "skip_in_ci" would be run in CI in all execution environments for which it is defined. We then came up with some clever bazel queries to try and avoid running tests in multiple execution environments when not necessary, but this is quite fragile.

The new approach is very different: when creating the tests using opentitan_test, we call a new function, called "ci_orchestrator", with the list of all exec_envs for which the test is defined. This function then returns the subset of that list that should be marked with skip_in_ci. In the current implementation, there is a list of mutually exclusive exec_envs (_ONLY_RUN_ONE_IN_CI_SORTED) and all but one will be skipped: the only one to run will be the highest one in the list. With this approach, we are guaranteed to never run duplicate tests. If more complicated logic is required, it can always be added to this function.

As a byproduct of this, we can significantly simplify the filtering logic when running FPGA jobs.

Signed-off-by: Amaury Pouly <[email protected]>
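The skip-list computation this commit message describes could look roughly like the sketch below. Only the `_ONLY_RUN_ONE_IN_CI_SORTED` name and the "highest in the list wins" rule come from the message itself; the exec_env names and the function signature are assumptions for illustration. Since Starlark is a subset of Python, essentially the same function could live in a `.bzl` file.

```python
# Hypothetical priority list: environments that are mutually exclusive in CI,
# ordered from highest priority (keep) to lowest (skip). Names are made up.
_ONLY_RUN_ONE_IN_CI_SORTED = [
    "fpga_cw340_sival_rom_ext",
    "fpga_cw340_sival",
    "fpga_cw310_sival_rom_ext",
    "fpga_cw310_sival",
]

def ci_orchestrator(test_name, exec_envs):
    """Return the subset of exec_envs that should be tagged skip_in_ci.

    Among the mutually exclusive environments, only the one appearing first
    (highest) in _ONLY_RUN_ONE_IN_CI_SORTED keeps running in CI; all other
    mutually exclusive environments are skipped. Environments not in the
    list (e.g. simulators) are left untouched.
    """
    exclusive = [e for e in _ONLY_RUN_ONE_IN_CI_SORTED if e in exec_envs]
    # Keep the highest-priority environment, skip the rest.
    return set(exclusive[1:])
```

`opentitan_test` would then tag every environment in the returned set with `skip_in_ci`, which is exactly why no duplicate can survive: at most one exclusive environment is ever left unskipped.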
The `--test_tag_filters` option of `bazel test` is very convenient but is only available for testing, not for querying. This script takes as input a list of tags in the same format as `--test_tag_filters` and runs an equivalent bazel query to list all tests that match.

Signed-off-by: Amaury Pouly <[email protected]>
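A minimal sketch of such a translation is shown below. It is an assumption about how the script might work, not its actual contents; it relies on the common `bazel query` trick of matching serialized tag lists with an `attr()` regex.

```python
def tag_filters_to_query(pattern, filters):
    """Build a `bazel query` expression roughly equivalent to
    `bazel test --test_tag_filters=<filters> <pattern>`.

    Mirroring --test_tag_filters semantics: a test is kept if it carries at
    least one positive tag (when any are given) and none of the negative ones.
    """
    base = "tests({})".format(pattern)

    def tag_attr(tag):
        # Tags are serialized as a list like "[fpga, broken]", so anchor the
        # regex on the surrounding brackets/spaces to avoid substring matches.
        return "attr(tags, '[\\[ ]{}[,\\]]', {})".format(tag, base)

    tags = [t.strip() for t in filters.split(",") if t.strip()]
    pos = [t for t in tags if not t.startswith("-")]
    neg = [t[1:] for t in tags if t.startswith("-")]
    # Positive tags are OR'd together; negative tags are subtracted.
    expr = " union ".join(tag_attr(t) for t in pos) or base
    for t in neg:
        expr = "({}) except {}".format(expr, tag_attr(t))
    return expr
```

The resulting expression can then be fed to `bazel query` to obtain the target list before any test job starts, which also solves the "zero tests" problem discussed above.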
Force-pushed from 1042296 to 394531d
Force-pushed from 394531d to fe8d7ac
```shell
ci/scripts/run-bazel-test-query.sh \
  "${{ parameters.target_pattern_file }}" \
  "${{ parameters.tag_filters }}",-manual,-broken,-skip_in_ci \
  //... @manufacturer_test_hooks//...
# Run FPGA tests.
if [ -s "${{ parameters.target_pattern_file }}" ]; then
  ci/scripts/run-fpga-tests.sh "${{ parameters.interface }}" "${{ parameters.target_pattern_file }}" || { res=$?; echo "To reproduce failures locally, follow the instructions at https://opentitan.org/book/doc/getting_started/setup_fpga.html#reproducing-fpga-ci-failures-locally"; exit "${res}"; }
else
  echo "No tests to run after filtering"
fi
```
Can this logic be part of run-fpga-tests.sh?
I would prefer not, because this way we have two scripts for two different purposes: run-bazel-test-query does the query, and run-fpga-tests runs the tests. You can more easily repurpose the scripts this way. Or I could make it a third script if you really want (run-fpga-tests-query).
Force-pushed from fe8d7ac to 1122705
Signed-off-by: Amaury Pouly <[email protected]>
Those tests can run in the ROM_EXT stage, which should be preferred. Signed-off-by: Amaury Pouly <[email protected]>
Previously we only tagged them with the exec_env name, but for the CI it is convenient to have a global way of querying all FPGA tests. Signed-off-by: Amaury Pouly <[email protected]>
With the tag filtering logic, it is very easy to create a situation where some tests are run in several jobs, or to forget to run tests because of a non-exhaustive set of filters. This commit adds a final CI job that checks for both conditions. This is achieved by downloading the list of tests run by every job and then making sure that there are no duplicates. We also check for missing tests by running a query on all non-broken, non-manual, non-skip_in_ci FPGA tests. Signed-off-by: Amaury Pouly <[email protected]>
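The check this commit message describes amounts to verifying that the per-job test lists partition the set of CI-eligible tests. A hedged sketch of that core check (the function and variable names are made up; the real job operates on test lists downloaded from the other CI jobs):

```python
def check_test_partition(tests_per_job, all_ci_tests):
    """Return (duplicates, missing): tests appearing in more than one job's
    list, and CI-eligible tests that appear in no job at all.

    tests_per_job maps a job name to the list of test labels it ran;
    all_ci_tests is the query result for all non-broken, non-manual,
    non-skip_in_ci tests.
    """
    seen = {}        # test label -> first job that ran it
    duplicates = set()
    for job, tests in tests_per_job.items():
        for test in tests:
            if test in seen:
                duplicates.add(test)
            seen[test] = job
    missing = set(all_ci_tests) - set(seen)
    return duplicates, missing
```

The CI job would fail whenever either returned set is non-empty, turning the "no duplicate, no forgotten test" property into an enforced invariant rather than a hope.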
Those tests require the host interface to support serial break, but the CW310 boards used in CI do not support it, so they always fail. These tests will need to be changed to the hyperdebug interface. Signed-off-by: Amaury Pouly <[email protected]>
The test rv_dm_mem_access_dev_rv_dm_delayed_enabled is broken. Signed-off-by: Amaury Pouly <[email protected]>
Force-pushed from 1122705 to 1c44deb
Successfully created backport PR for
Prototype of the test orchestrator as discussed in the software WG. The implementation follows what we have discussed: `opentitan_test` calls a `ci_orchestrator` function with the list of all execution environments. This function then decides which environment(s) to run in CI and returns this list. Then `opentitan_test` marks every test not in that list as `skip_in_ci`.

This PR also removes the old complicated logic in the azure pipeline and fpga runner script. This is done by moving the FPGA test job to its own template file to avoid duplication.