Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ci] Introduce a test orchestrator written in bazel #23505

Merged
merged 8 commits into from
Oct 1, 2024

Conversation

pamaury
Copy link
Contributor

@pamaury pamaury commented Jun 5, 2024

Prototype of the test orchestrator as discussed in the software WG. The implementation follows what we have discussed: opentitan_test calls a ci_orchetrator function with the list of all execution environments. This function then decides which environment(s) to run in CI and returns this list. Then opentitan_test marks every test not in that list as skip_in_ci.

This PR also removes the old complicated logic in the azure pipeline and fpga runner script. This is done by moving the FPGA test job to its own template file to avoid duplication.

@pamaury pamaury requested review from cfrantz, nbdd0121, moidx and jwnrt June 5, 2024 13:34
Copy link
Contributor

@jwnrt jwnrt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do you think would be the best place to document this?

I can imagine confusion if somebody adds a new test or execution environment and wonders why it's not running in CI

rules/opentitan/ci.bzl Show resolved Hide resolved
@jwnrt
Copy link
Contributor

jwnrt commented Jun 5, 2024

How should the orchestrator interact with the broken tag? Should it look for this and try to run the next available environment that isn’t broken?

@pamaury
Copy link
Contributor Author

pamaury commented Jun 5, 2024

How should the orchestrator interact with the broken tag? Should it look for this and try to run the next available environment that isn’t broken?

Yes that's my thinking (although it's not implemented yet). Otherwise it wiuld require a lot of manual fiddling for broken tests. Thanks for raising this point :)

@pamaury pamaury force-pushed the ci_orchestrator branch from 32f6857 to 64a53d6 Compare June 5, 2024 21:40
Copy link
Contributor

@engdoreis engdoreis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, will approve when is marked as Ready to review.

rules/opentitan/ci.bzl Outdated Show resolved Hide resolved
@pamaury pamaury force-pushed the ci_orchestrator branch 8 times, most recently from 96a67c1 to 68c086b Compare June 7, 2024 08:27
rules/opentitan/ci.bzl Outdated Show resolved Hide resolved
azure-pipelines.yml Show resolved Hide resolved
@pamaury pamaury force-pushed the ci_orchestrator branch 2 times, most recently from 8ca0bf3 to b0b60ab Compare August 14, 2024 12:53
@pamaury
Copy link
Contributor Author

pamaury commented Aug 14, 2024

One of the issue I am running into is that it can happen that some jobs end up having no tests to run due to the orchestration. Unfortunately, bazel will error out if the test tag filters produces 0 tests and I don't think this behaviour can be turned off.

@jwnrt
Copy link
Contributor

jwnrt commented Aug 14, 2024

Is there a way to ask Bazel for a list of tests before running them? Then we can have a pre-step which exits early if there are none

@pamaury
Copy link
Contributor Author

pamaury commented Aug 14, 2024

Is there a way to ask Bazel for a list of tests before running them? Then we can have a pre-step which exits early if there are none

We can of course replace the tag filters by a query to create a list and then run it. The downside is that the test tag filter syntax is not supported by the query command (as far as I know) so it would require some parsing which is not ideal.

I can see one possible alternative: currently, even with the CI orchestrator, we still split the tests accross different jobs which is error-prone. We could make the CI orchestration a two-step process:

  • first tag some test with skip_in_ci as we currently do, to avoid running tests that bring no value
  • then we tag each non-skipped test with a unique tag to identify on which FPGA job it should run.
    This would effectively move the "job filtering" to the orchestrator which is easier to do in starlark.

The advantage would be that each CI job now becomes trivial to filter: all test with a specific tag, not skipped in ci and not broken. It would also ensure that we don't run duplicate tests accross (or forget some tests) jobs due to logic errors in the filtering.

@jwnrt
Copy link
Contributor

jwnrt commented Aug 14, 2024

That sounds good, but I had actually thought that was what your current implementation does. Each test is tagged by its execution environment right? And the Azure jobs run the tests with that exec env tag -broken,-skip_in_ci.

Wouldn’t this still have the same problem though? The job applies these filters but gets zero tests back and the job fails?

@pamaury
Copy link
Contributor Author

pamaury commented Aug 14, 2024

Currently it is not quite the case that each job corresponds to an execution environment because we have manufacturing tests that run in their own job.

To make it clear what I am suggesting, there would be bazel function:

def ci_fpga_job(env,tags):
   """Return the FPGA job on which this tests should run, based on its the execution environment
   and its tags.
  """
   if "cw340_sival_rom_ext" in tag and not "manuf" in tags:
      return "fpga_job_blabla"
   # and so on

and in rules/defs.bzl we could call this function to add the tag to the list. Then the azure file simply specifies the (unique) tag of each fpga job. For example we have some FPGA job filters like cw310_rom_with_fake_keys,cw310_rom_with_real_keys,-manuf.

Or maybe we should just keep one job per exec_env and call it a day :)

@nbdd0121
Copy link
Contributor

What's the reason that we currently run manufacturing tests as separate jobs? Probably just because we want to split into more jobs for parallelism?

@jwnrt
Copy link
Contributor

jwnrt commented Aug 14, 2024

I think it might be because they have custom execution environments since they run at unusual OTP stages (test_unlocked, rma, etc.). If we mix them into other tests, we might be reprogramming bitstreams more times in the middle of the run which will slow things down slightly.

@pamaury pamaury force-pushed the ci_orchestrator branch 3 times, most recently from 1b91773 to 5265492 Compare August 20, 2024 13:20
@pamaury pamaury force-pushed the ci_orchestrator branch 4 times, most recently from 036d729 to 1f4b9ba Compare August 27, 2024 13:13
Copy link
Contributor

@jwnrt jwnrt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good bar the hyperdebug update problem and the broken tests. Very nice work.

ci/fpga-job.yml Outdated
# never detected. But if we run the tool another time, the device list is queried again
# and opentitantool can finish the update. The device will now reboot in normal mode
# and work for the hyperdebug job.
# FIXME: only execute for hyper310 interface
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if opentitanlib should accept update-firmware for all platforms but make it a no-op if not hyperdebug? Erroring seems a bit extreme.

Even then, I'm not sure how to manually specify --interface hyperdebug_dfu for these...

If the update step is really quick at detecting when no update is required, maybe we can move the update step into the commands for the execution environments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we discussed that in a previous PR and decided to do it later... This would be very convenient for this PR!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have now implemented the update step: it only runs for the hyper310 interface. I will let the CI run to make sure that it works.

sw/device/tests/BUILD Outdated Show resolved Hide resolved
@pamaury
Copy link
Contributor Author

pamaury commented Aug 29, 2024

@cfrantz I have marked the relevant rom_ext e2e tests are broken in this PR. I suggest we mark them as unbroken when we backport your fixes in #24430

The commit fundamentally changes how tests are considered for running
in CI. Previously, any test not marked as "manual", "broken" and "skip_in_ci"
would be run in CI in all execution environment for which it is defined. We
then came up with some clever bazel queries to try and avoid running them
in multiple execution environment when not necessary but this is quite fragile.

The approach is very different: when creating the tests using opentitan_test,
we call a new function, called "ci_orchestrator" with the list of all exec_env
for which it is defined. This function then returns a subset of that list that
should be marked with skip_in_ci. The current implementation is that there is
a list of exec_env that are mutually-exclusive (_ONLY_RUN_ONE_IN_CI_SORTED)
and all but one will be skipped: the only one to run will the highest one in
the list. With this approach, we are guaranteed to never run duplicate tests.
If more complicated logic is required, it can always be added to this function.

As a byproduct of this, we can significantly simplify the filtering logic when
running FPGS jobs.

Signed-off-by: Amaury Pouly <[email protected]>
The --test_tag_filters option of `bazel test` is very convenient
but is only available for testing and not for querying. This script
takes as input a list of tags in the same fort as test_tag_filters
and runs an equivalent bazel query to list all tests that match.

Signed-off-by: Amaury Pouly <[email protected]>
@pamaury pamaury force-pushed the ci_orchestrator branch 3 times, most recently from 1042296 to 394531d Compare August 30, 2024 17:58
@rswarbrick rswarbrick removed their request for review August 31, 2024 08:26
@pamaury pamaury changed the title [WIP][ci] Introduce a test orchestrator written in bazel [ci] Introduce a test orchestrator written in bazel Sep 26, 2024
azure-pipelines.yml Outdated Show resolved Hide resolved
Comment on lines +78 to +87
ci/scripts/run-bazel-test-query.sh \
"${{ parameters.target_pattern_file }}" \
"${{ parameters.tag_filters }}",-manual,-broken,-skip_in_ci \
//... @manufacturer_test_hooks//...
# Run FPGA tests.
if [ -s "${{ parameters.target_pattern_file }}" ]; then
ci/scripts/run-fpga-tests.sh "${{ parameters.interface }}" "${{ parameters.target_pattern_file }}" || { res=$?; echo "To reproduce failures locally, follow the instructions at https://opentitan.org/book/doc/getting_started/setup_fpga.html#reproducing-fpga-ci-failures-locally"; exit "${res}"; }
else
echo "No tests to run after filtering"
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this logic be part of run-fpga-tests.sh?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer not because this way, we have two scripts for two different purposes: run-bazel-test-query does the query, and run-fpga-tests run the tests. You can more easily repurpose the scripts this way. Or I would make it third script if you really want to (run-fpga-tests-query).

Those tests can run in the ROM_EXT stage which should be preferred.

Signed-off-by: Amaury Pouly <[email protected]>
Previously we only tagged then with the exec_env name but for the CI
it is convenient to have a global way of querying all FPGA tests.

Signed-off-by: Amaury Pouly <[email protected]>
With the filter tags filtering logic, it is very easily to create a
situation where some tests are run in several jobs, or to forget running
tests because of a non-exhaustive set of filters.

This commit adds a final CI job that checks the above condition. This
is achieved by downaloding the list of tests run by every job and then
making sure that there are no duplicates. We also check for missing tests
by running a query on all non-broken, non-manual, non-skip_in_ci FPGA tests.

Signed-off-by: Amaury Pouly <[email protected]>
Those tests require the host interface to support serial break but
the CW310 used in the CI do not support it so they always fail. These
tests will need to be changed to the hyperdebug interface.

Signed-off-by: Amaury Pouly <[email protected]>
The test rv_dm_mem_access_dev_rv_dm_delayed_enabled is broken.

Signed-off-by: Amaury Pouly <[email protected]>
@pamaury pamaury merged commit c5787fb into lowRISC:master Oct 1, 2024
43 checks passed
@jwnrt jwnrt added the CherryPick:earlgrey_1.0.0 This PR should be cherry-picked to earlgrey_1.0.0 label Oct 24, 2024
Copy link

Successfully created backport PR for earlgrey_1.0.0:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CherryPick:earlgrey_1.0.0 This PR should be cherry-picked to earlgrey_1.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants