[test] Enable uart_tx_rx_test for fpga sival environments #20698
The cw310_sival_rom_ext configuration was still on the SAM3X bitstream variant. Move it to hyper310 to make all the SiVal I/O available. Signed-off-by: Alexander Williams <[email protected]>
They pass locally. Also, remove the cw310_test_rom environment from the list, as it doesn't support this test. Signed-off-by: Alexander Williams <[email protected]>
This test is a little flaky still, but I haven't worked out what the problem is:
[jw opentitan]$ bazel test --runs_per_test=50 --cache_test_results=no //sw/device/tests:uart0_tx_rx_test_fpga_cw310_sival
[...]
//sw/device/tests:uart0_tx_rx_test_fpga_cw310_sival TIMEOUT in 1 out of 50 in 60.1s
Stats over 50 runs: max = 60.1s, min = 1.3s, avg = 2.8s, dev = 8.6s
/home/jw/.cache/bazel/_bazel_jw/7e172ccc9b1d5908d84fded9a2ecf9f4/execroot/lowrisc_opentitan/bazel-out/k8-fastbuild-ST-2cc462681f62/testlogs/sw/device/tests/uart0_tx_rx_test_fpga_cw310_sival/run_27_of_50/test.log
Executed 1 out of 1 test: 1 fails locally.
The logs for the failed runs are always the same:
It seems the RX overflow interrupt isn't firing sometimes. I'm still debugging, but so far adding more synchronisation and sending way more data hasn't triggered the overflow interrupt yet.
Ah, this test's code does not follow good practices for avoiding races, and it needs to stop using the SPI console (generic mode is going to be removed entirely). It will need some fixes...
Do you know what changes are needed for proper synchronisation? I couldn't find the race when I was poking around this morning. All I found was that adding artificial delay to the data-processing loop improved pass rates, and adding delay to the ISR significantly reduced them. We used the SPI console to avoid having to use the UART for OTTF (since it will be under test), but we may be able to rework the test so that
Does generic mode mean using the SPI device raw (as opposed to something like its flash emulation mode)? We were thinking of using the flash emulation mode to implement RX (it currently only supports TX), but this would be synchronous and driven by the host, which made everything more complicated, so we abandoned that plan.
For example: opentitan/sw/device/tests/uart_tx_rx_test.c Lines 468 to 471 in 143098e
This sequence is an improper use of wfi. The condition check for entering wfi and the call to wfi itself should not happen while interrupts are unmasked. It's not clear to me if this test could actually get wedged here, though, since the RX overflow interrupt should probably break it out while we're attempting to force it from the host side.
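To illustrate the point about wfi, here is a minimal host-side sketch of the race-free pattern: mask interrupts, check the condition, and only then wait. Every name in it (`irq_global_ctrl_stub`, `wait_for_interrupt_stub`, `isr_stub`, `rx_overflow_seen`, `wait_for_rx_overflow`) is a hypothetical stand-in and not the test's actual code. On the device, the stubs would correspond to toggling the global interrupt enable and executing the `wfi` instruction. The sketch relies on the RISC-V rule that `wfi` resumes when an enabled interrupt becomes pending even while interrupts are globally masked.

```c
#include <assert.h>
#include <stdbool.h>

// Hypothetical host-side model of the hart's interrupt state.
static bool irq_enabled = true;                 // global interrupt enable
static bool irq_pending = false;                // an interrupt is waiting
static volatile bool rx_overflow_seen = false;  // flag set by the ISR
static int wfi_calls = 0;

// Hypothetical ISR: records that the RX overflow interrupt fired.
static void isr_stub(void) { rx_overflow_seen = true; }

// Hypothetical global interrupt control. Unmasking delivers any pending
// interrupt, as the hardware would.
static void irq_global_ctrl_stub(bool en) {
  irq_enabled = en;
  if (en && irq_pending) {
    irq_pending = false;
    isr_stub();
  }
}

// Hypothetical wfi. It must only be entered with interrupts masked, so the
// ISR cannot run between the condition check and the wait.
static void wait_for_interrupt_stub(void) {
  assert(!irq_enabled);
  wfi_calls++;
  irq_pending = true;  // simulate the interrupt arriving while asleep
}

// Race-free wait: the check and the wfi both happen with interrupts masked.
void wait_for_rx_overflow(void) {
  irq_global_ctrl_stub(false);
  while (!rx_overflow_seen) {
    wait_for_interrupt_stub();
    irq_global_ctrl_stub(true);   // the pending ISR runs here
    irq_global_ctrl_stub(false);  // mask again before re-checking
  }
  irq_global_ctrl_stub(true);
}
```

If the check and the wait ran with interrupts unmasked instead, the ISR could fire after the check but before the wfi, and the hart could then sleep with nothing left to wake it.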
Yes. This mode is scheduled for deletion once the window for RTL changes opens. Also, note that every SPI mode is synchronous and driven by the host. However, generic mode allowed for simultaneous TX + RX communication ...not that hyperdebug's OCTOSPI peripheral supports it. If we bring in a USB console (like #19249), there's possibly that option too. However, usbdev handling in CI is a lot more "exciting" with the decision to hide / rename device nodes, and opentitanlib won't necessarily be able to find the correct device on its own.
Thanks @a-will, that makes sense
Actually, it looks like the test completes on the device side, but the message over the SPI console does not get communicated. I tried running in the RMA state, and during an apparent failure, gdb showed the test had returned from test_main() and was reporting the test status. Then, for the original sival conditions, I added a GPIO indicator (set pin to 0 on entry, set pin to 1 on test pass), and the GPIO also said the device thought the test passed. Ultimately, it seems like the problem is in the SPI console domain.
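The GPIO indicator described above could look roughly like the following host-side sketch. The names (`gpio_write_stub`, `gpio_read_stub`, `kStatusPin`, `run_with_gpio_status`, and the two example test bodies) are all hypothetical; the real debugging change would go through the device's GPIO driver, and the pin number here is arbitrary.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical model of the GPIO output register (all pins start high).
static uint32_t gpio_out = 0xffffffffu;

// Hypothetical pin write; on the device this would be a GPIO driver call.
static void gpio_write_stub(unsigned pin, bool level) {
  if (level) {
    gpio_out |= (1u << pin);
  } else {
    gpio_out &= ~(1u << pin);
  }
}

// Hypothetical pin read, standing in for the host sampling the pin.
static bool gpio_read_stub(unsigned pin) { return (gpio_out >> pin) & 1u; }

enum { kStatusPin = 0 };  // arbitrary pin choice for illustration

// Example test bodies, used only to exercise the sketch.
bool passing_body(void) { return true; }
bool failing_body(void) { return false; }

// Drive the pin to 0 on entry and to 1 only if the test body passes, so the
// result is observable without relying on the console.
bool run_with_gpio_status(bool (*test_body)(void)) {
  gpio_write_stub(kStatusPin, false);
  bool ok = test_body();
  if (ok) {
    gpio_write_stub(kStatusPin, true);
  }
  return ok;
}
```

Because the pin is driven low on entry, a pin that stays low means the test failed or never finished, while a high pin means the device itself considered the test passed.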
Interesting, that makes sense since all my debugging was done through printf over the console. The only communication between the device and testrunner are a sync point before the data is sent and the final Could we use GPIO for those to run the test? We can leave the console enabled for whatever debugging value it provides, but at least the test will pass. Alternatively, we move to using another UART or USB as the console as you suggest.
It seems the changes to hw/top_earlgrey/BUILD are already merged, so they will be wiped when you rebase.
LGTM
Note that this PR depends on #20696 to bring the fpga_cw310_sival_rom_ext config over to hyper310.