Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[base,ottf,opentitanlib,test] Impl spi device ottf RX console #24345

Conversation

anthonychen1251
Copy link
Member

This PR has three commits that:

  1. Implement a new spi_device Rx console by using the SPI device flash mode feature.
  2. Update the opentitanlib SPI console module (added in#17817) to support OTTF spi_device console by using SPI device flash mode feature instead of the removed generic mode .
  3. Add a test for the new OTTF spi_device console.

This PR addresses #21726 partially. The TX (input) direction will be added in another PR.

@anthonychen1251 anthonychen1251 requested review from a team as code owners August 16, 2024 10:23
@anthonychen1251 anthonychen1251 requested review from jwnrt and engdoreis and removed request for a team August 16, 2024 10:23
@anthonychen1251 anthonychen1251 force-pushed the impl-spi-device-ottf-console-read-buffer branch from c484642 to 0703e30 Compare August 16, 2024 10:47
@timothytrippel timothytrippel requested review from timothytrippel and removed request for engdoreis August 16, 2024 17:45
@timothytrippel
Copy link
Contributor

Looks like you have some build errors. Trying seeing what they are locally by running: bazel build //sw/host/opentitanlib

@anthonychen1251 anthonychen1251 force-pushed the impl-spi-device-ottf-console-read-buffer branch from 0703e30 to ff86aa2 Compare August 17, 2024 07:41
@anthonychen1251
Copy link
Member Author

Looks like you have some build errors. Trying seeing what they are locally by running: bazel build //sw/host/opentitanlib

Thanks for the instructions. Those build errors have been addressed in the new commits. Not sure why those errors didn't appear when I ran the newly added test for OTTF spi_device console though.

The spi_host_config_test failed in the most recent CI pipeline, but this test seems to be unrelated to the commits included in this PR.
https://dev.azure.com/lowrisc/opentitan/_build/results?buildId=158355&view=logs&j=d6ba1a9c-e7a8-5acf-e6fb-86a516d856bf&t=e16cc888-fc04-5078-0841-ecafa9afe0ed&l=559

I also tried to run this test locally
bazel test --test_output=streamed //sw/device/tests:spi_host_config_test_fpga_cw340_sival_rom_ext
and the result was passed.

@anthonychen1251 anthonychen1251 force-pushed the impl-spi-device-ottf-console-read-buffer branch 2 times, most recently from f9154bb to badba4b Compare August 19, 2024 08:22
Copy link
Contributor

@engdoreis engdoreis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, we were look for forwarding for this feature.
I left a few comments below.

sw/device/lib/runtime/print.c Outdated Show resolved Hide resolved
sw/device/lib/runtime/print.c Outdated Show resolved Hide resolved
sw/device/lib/runtime/print.c Outdated Show resolved Hide resolved
@anthonychen1251 anthonychen1251 force-pushed the impl-spi-device-ottf-console-read-buffer branch 4 times, most recently from 161d7c4 to 5a918c6 Compare August 20, 2024 08:32
@anthonychen1251 anthonychen1251 force-pushed the impl-spi-device-ottf-console-read-buffer branch from 5a918c6 to 728fdde Compare August 21, 2024 06:07
@timothytrippel timothytrippel self-requested a review August 21, 2024 06:23
Copy link
Contributor

@timothytrippel timothytrippel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple more minor improvements, otherwise this looks great! Thanks @anthonychen1251 . Once you make the last couple improvements, and this passes CI. I can merge it.

sw/device/lib/testing/test_framework/ottf_console.c Outdated Show resolved Hide resolved
sw/device/tests/BUILD Show resolved Hide resolved
sw/device/tests/spi_device_ottf_console_test.c Outdated Show resolved Hide resolved
This implements a new spi_device RX console by using the SPI device
flash mode feature. The flash read buffer is used as a circular buffer.

Reuses framing protocol from lowRISC#17494 with minor modifications to
distinguish fresh data from stale values in the read buffer.

Signed-off-by: Anthony Chen <[email protected]>
This updates the console module to use SPI device flash mode feature to
support SPI device console. It's currently a one way console.

Signed-off-by: Anthony Chen <[email protected]>
@anthonychen1251 anthonychen1251 force-pushed the impl-spi-device-ottf-console-read-buffer branch from 728fdde to 36406c9 Compare August 21, 2024 07:49
This tests the OTTF spi_device console, which uses the OT spi_device in
flash mode, along with the flash read buffer and a lightweight framing
protocol to implement a one-way (RX) console.

Signed-off-by: Anthony Chen <[email protected]>
@anthonychen1251 anthonychen1251 force-pushed the impl-spi-device-ottf-console-read-buffer branch from 36406c9 to c01bbc7 Compare August 21, 2024 08:19
@engdoreis
Copy link
Contributor

Thanks for the changes.

After a discussion with @jwnrt @HU90m, we came up with a suggestion for the protocol that we believe would simplify the implementation and be more robust, by leveraging the spi_flash features to control the data flow.

  1. The device would always write the header to the beginning of the buffer.
  2. The Host must read the status register to check the busy bit is cleared before issuing a kSpiDeviceFlashOpReadNormal command.
  3. After reading, the host must issue a kSpiDeviceFlashOpSectorErase to signal to the device that the frame has been read via the busy bit.
  4. The device must check that the busy bit is set before writing a new frame to the buffer.
  5. The device writes a new frame to the buffer and clear the busy bit.

Does that make sense?

One downside thought is that we could have a deadlock in the device if the host does not poll the device. But that can be sorted with a timeout.

@jwnrt
Copy link
Contributor

jwnrt commented Aug 21, 2024

One downside thought is that we could have a deadlock in the device if the host does not poll the device. But that can be sorted with a timeout.

Minor nit: I don't think this protocol would deadlock, it would just be blocked on the host doing its own thing

@timothytrippel
Copy link
Contributor

timothytrippel commented Aug 21, 2024

Thanks for the changes.

After a discussion with @jwnrt @HU90m, we came up with a suggestion for the protocol that we believe would simplify the implementation and be more robust, by leveraging the spi_flash features to control the data flow.

  1. The device would always write the header to the beginning of the buffer.
  2. The Host must read the status register to check the busy bit is cleared before issuing a kSpiDeviceFlashOpReadNormal command.
  3. After reading, the host must issue a kSpiDeviceFlashOpSectorErase to signal to the device that the frame has been read via the busy bit.
  4. The device must check that the busy bit is set before writing a new frame to the buffer.
  5. The device writes a new frame to the buffer and clear the busy bit.

Does that make sense?

One downside thought is that we could have a deadlock in the device if the host does not poll the device. But that can be sorted with a timeout.

This seems like it would lower the potential throughput always having to align the frame header at the beginning of the buffer no?

Let's get this merged before making any further improvements to it. It is blocking critical provisioning work.

@jwnrt
Copy link
Contributor

jwnrt commented Aug 21, 2024

This seems like it would lower the potential throughput always having to align the frame header at the beginning of the buffer no?

You mean because the current implementation can queue up multiple frames in the circular buffer without having to wait for the host to read them out?

@timothytrippel
Copy link
Contributor

timothytrippel commented Aug 21, 2024

Yes, can your proposal achieve the same?

@jwnrt
Copy link
Contributor

jwnrt commented Aug 21, 2024

If we're sending lots of small frames at once, then no. If it's large amounts of data filling the buffer then it should be the same.

@engdoreis
Copy link
Contributor

engdoreis commented Aug 21, 2024

I'm happy to merge as it is, and we can experiment other protocols later.

This removes 2 flash ECC error test cases that were brittle since the
test was self-corrupting itself (by design) in locations where test code
could be executing.

Signed-off-by: Tim Trippel <[email protected]>
@timothytrippel
Copy link
Contributor

Looks like the test that was failing was a flaky test that provided no additional value to the test scenario it was exercising so I added a commit to this PR that should fix the issue and get this to pass CI.

@timothytrippel timothytrippel merged commit e0a468f into lowRISC:master Aug 21, 2024
40 checks passed
@anthonychen1251
Copy link
Member Author

Thanks for the changes.

After a discussion with @jwnrt @HU90m, we came up with a suggestion for the protocol that we believe would simplify the implementation and be more robust, by leveraging the spi_flash features to control the data flow.

  1. The device would always write the header to the beginning of the buffer.
  2. The Host must read the status register to check the busy bit is cleared before issuing a kSpiDeviceFlashOpReadNormal command.
  3. After reading, the host must issue a kSpiDeviceFlashOpSectorErase to signal to the device that the frame has been read via the busy bit.
  4. The device must check that the busy bit is set before writing a new frame to the buffer.
  5. The device writes a new frame to the buffer and clear the busy bit.

Does that make sense?

One downside thought is that we could have a deadlock in the device if the host does not poll the device. But that can be sorted with a timeout.

Thanks for your thoughtful suggestions! Before creating this PR, I did experiment with a similar approach where the host issued an upload command to signal the device that the frame has been read. In this protocol, each data chunk transfer is a blocking operation, even if there is free space in the read buffer to accommodate the next chunk.

While the proposed protocol in this PR utilizes the LAST_READ_ADDR to coordinate Device and Host activities, so the device is able to push data chunks into the read buffer non-blocking and only block when there is insufficient space to accommodate the next chunk.

To illustrate this potential advantage, I conducted a simple experiment by logging a test message This is a SPI device OTTF console test. on the device side with varying iterations. While not a rigorous evaluation, the results suggest a throughput improvement with the non-blocking approach.

Iterations blocking (seconds) non-blocking (seconds) ratio
50 0.381s 0.215s 1.77
100 0.737s 0.426s 1.73
500 3.735s 2.241s 1.67
1000 7.496s 4.512s 1.66
5000 37.598s 22.478s 1.67

I acknowledge that the suggested protocol offers implementation simplicity. It seems there's a trade-off here between potential throughput performance and implementation simplicity/robustness.

Copy link

Backport failed for earlgrey_1.0.0, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin earlgrey_1.0.0
git worktree add -d .worktree/backport-24345-to-earlgrey_1.0.0 origin/earlgrey_1.0.0
cd .worktree/backport-24345-to-earlgrey_1.0.0
git switch --create backport-24345-to-earlgrey_1.0.0
git cherry-pick -x 4f8b2a75623dc5c52790ad469592433a25d6837c 0d95dfc9a5784115df75d3d267c352b8a6373dea c01bbc7c43c042d4389f570081d046c99898e314 39b3f76a0daf287ec257987d6994333240f1eef2

@github-actions github-actions bot added the Manually CherryPick This PR should be manually cherry picked. label Oct 22, 2024
@timothytrippel timothytrippel removed Manually CherryPick This PR should be manually cherry picked. CherryPick:earlgrey_1.0.0 This PR should be cherry-picked to earlgrey_1.0.0 labels Oct 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants