[opentitantool] Introduce binary protocol for HyperDebug gpio monitoring #20672

jesultra · 2023-12-18T05:01:51Z

HyperDebug supports logic analyzer functionality, in which it will record events on a given set of gpio pins.
opentitantool can then later be used to retrieve a transcript of every level change with microsecond timestamp.

This has been used by the GSC team to verify the reaction time of firmware under test. Such testing typically involves a few handfuls of events, which can easily be transmitted via the textual protocol. However, we now plan to use the functionality in cases where 30000 events must be retrieved, which would take many tens of seconds to transmit via the console (which runs slowly enough that the physical UART can keep up).

To improve performance, this CL introduces another Google-specific extension to the binary CMSIS-DAP protocol, for GPIO operations, and adds code to replicate the gpio monitoring read functionality. (Starting and stopping the monitoring can still only be done through the textual protocol, since those commands do not carry a large amount of data. However, there may be an 80-character limit on a single command, which could impact the ability to monitor 5 or more signals at once, so in the future we may want to allow starting monitoring through the binary protocol as well.)

Corresponding HyperDebug functionality implemented here:
https://chromium-review.googlesource.com/c/chromiumos/platform/ec/+/5128914

pamaury · 2024-01-09T15:08:28Z

Can you confirm our understanding of the USB protocol used here (it's hard to figure it out from the device code since I am not familiar with the HyperDebug code): the device will send one or more USB bulk packets (full speed so up to 64 bytes each). The first packet contains the header and some data, and the remaining packets contain the rest of the data?
Also, will the device send a ZLP if the total transfer length is a multiple of 64 bytes, or not? (I am assuming not since your code treats a ZLP as an error).

nbdd0121

The lowRISC software team reviewed this pull request together in a review session.

We have some detailed suggestions below, but also at the high level have some questions about how this is implemented.

Firstly this is missing comments. Please include comments detailing the protocol so that code readers do not have to get it from another source. If there is a canonical description of the protocol, please also include a link.

We are skeptical about the way the struct is organised. It's named header but it is clearly including more things than just the header, i.e. it contains a protocol identification byte and data bytes. The data bytes also do not reflect the actual maximum size, and if we understand correctly that's only for ensuring that USB bulk reads are not overshort.

Another thing is that this is a wire protocol, it needs to be conscious of the byte order. You can either use zerocopy's byte order primitives, or perhaps just parse the bytes using byteorder crate directly without creating a struct. Given the small number of fields it might even aid readability. Although, given that we don't really support BE machines, feel free to ignore this suggestion.
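A minimal sketch of the second option above, parsing the fields straight from the byte buffer with explicit little-endian conversions. It uses only the standard library rather than the byteorder crate. The field names appear in this PR, but the field widths, their order, and the `ParsedHeader` and `parse_header` names are assumptions made for illustration, not the actual HyperDebug layout.

```rust
/// Hypothetical header; field widths and order are assumptions.
#[derive(Debug, PartialEq)]
struct ParsedHeader {
    struct_size: u16,
    status: u16,
    transcript_size: u32,
    start_timestamp: u64,
}

/// Parses the header from the start of `buf`, reading every field as
/// little-endian regardless of the host byte order. Returns the header
/// and the bytes that follow it.
fn parse_header(buf: &[u8]) -> Result<(ParsedHeader, &[u8]), String> {
    const LEN: usize = 16; // 2 + 2 + 4 + 8 bytes in this assumed layout
    if buf.len() < LEN {
        return Err(format!("short header: got {} bytes, need {}", buf.len(), LEN));
    }
    let mut ts = [0u8; 8];
    ts.copy_from_slice(&buf[8..16]);
    let header = ParsedHeader {
        struct_size: u16::from_le_bytes([buf[0], buf[1]]),
        status: u16::from_le_bytes([buf[2], buf[3]]),
        transcript_size: u32::from_le_bytes([buf[4], buf[5], buf[6], buf[7]]),
        start_timestamp: u64::from_le_bytes(ts),
    };
    Ok((header, &buf[LEN..]))
}
```

Because no struct is overlaid on the buffer, this approach also sidesteps any alignment requirement on the incoming bytes.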

PS: In https://chromium-review.googlesource.com/c/chromiumos/platform/ec/+/5128914, gpio.c L1168, it converts tx_buffer + 1 (with alignment 1) to struct gpio_monitoring_header_t (with alignment 8), which is undefined behaviour.

nbdd0121 · 2024-01-09T13:28:08Z

sw/host/opentitanlib/src/transport/hyperdebug/gpio.rs

+                .borrow()
+                .write_bulk(cmsis_interface.out_endpoint, &pkt)?;
+
+            let mut resp = RspGpioMonitoringHeader::new();


Instead of creating a sentinel value and reading into it, create a buffer, read into it, and then use zerocopy's FromBytes::read_from to convert that buffer into a filled struct.

nbdd0121 · 2024-01-09T13:29:21Z

sw/host/opentitanlib/src/transport/hyperdebug/gpio.rs

+            match resp.status {
+                0 => (),
+                n => bail!(TransportError::CommunicationError(format!(
+                    "HyperDebug error: {}",
+                    n
+                ))),
+            }


I think a simple if is easier to read here.

It just happens that I added handling for one specific error code, which deserves a descriptive error message, since it can be triggered by the operator, and not only by bugs in opentitantool/HyperDebug.

nbdd0121 · 2024-01-09T13:32:43Z

sw/host/opentitanlib/src/transport/hyperdebug/gpio.rs

+                    n
+                ))),
+            }
+            let skip_bytes = resp.struct_size as usize - (GPIO_MONITORIN_HEADER_SIZE - 1);


I would suggest removing the constant and using memoffset::offset_of!(RspGpioMonitoringHeader, data) where you need it.

Also, this code is not guarding against overflows when struct_size is smaller.

I have addressed both points.
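One way to guard the subtraction, shown as a sketch rather than the code that was merged: `checked_sub` turns an undersized `struct_size` into an error instead of an arithmetic underflow. The `skip_bytes` function and its `known_header_size` parameter are hypothetical names standing in for the locally known header size.

```rust
/// Number of bytes between the end of the header layout this tool knows
/// about and the start of the transcript. `known_header_size` is a
/// hypothetical stand-in for the size of the locally defined header.
fn skip_bytes(struct_size: usize, known_header_size: usize) -> Result<usize, String> {
    struct_size.checked_sub(known_header_size).ok_or_else(|| {
        format!(
            "header too short: device reported {} bytes, expected at least {}",
            struct_size, known_header_size
        )
    })
}
```

A device reporting a larger header than expected yields a positive skip count, which allows newer firmware to append fields without breaking older tools.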

nbdd0121 · 2024-01-09T13:38:16Z

sw/host/opentitanlib/src/transport/hyperdebug/gpio.rs

+            let mut signal_bits = 0;
+            while (pin_names.len() - 1) >> signal_bits != 0 {
+                signal_bits += 1;
+            }


Suggested change

-            let mut signal_bits = 0;
-            while (pin_names.len() - 1) >> signal_bits != 0 {
-                signal_bits += 1;
-            }
+            let signal_bits = (pin_names.len() * 2 - 1).ilog2();

or

Suggested change

-            let mut signal_bits = 0;
-            while (pin_names.len() - 1) >> signal_bits != 0 {
-                signal_bits += 1;
-            }
+            let signal_bits = 32 - (pin_names.len() as u32 - 1).leading_zeros();

I chose the latter of the two suggestions, since I think the name of the method leading_zeros() explains everything to any casual reader, who like me may not be familiar with either ilog2() or leading_zeros().
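The three forms discussed in this thread can be compared side by side. The sketch below wraps each one in a small function (the function names are made up for illustration) so that they can be checked against each other; all three assume at least one pin, since `n - 1` underflows for `n == 0`.

```rust
/// Original loop from the PR: bits needed to encode indices 0..n.
fn bits_loop(n: usize) -> u32 {
    let mut bits = 0;
    while (n - 1) >> bits != 0 {
        bits += 1;
    }
    bits
}

/// First suggestion: ceil(log2(n)) computed via ilog2 (Rust 1.67 or later).
fn bits_ilog2(n: usize) -> u32 {
    (n * 2 - 1).ilog2()
}

/// Second suggestion: position of the highest set bit of n - 1.
fn bits_leading_zeros(n: usize) -> u32 {
    32 - (n as u32 - 1).leading_zeros()
}
```

For example, five pins need indices 0 through 4, and each of the three functions returns 3 for `n = 5`.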

nbdd0121 · 2024-01-09T13:40:45Z

sw/host/opentitanlib/src/transport/hyperdebug/gpio.rs

+            let signal_mask = (1u64 << signal_bits) - 1;
+
+            let mut cur_time: u64 = resp.start_timestamp;
+            let mut cur_level = Vec::new();


Suggested change

-            let mut cur_level = Vec::new();
+            let mut cur_level = Vec::with_capacity(pin_names.len());

Although I think we could just be using bitmasks in the loop itself?

Yes, I chose Vec<bool> because I thought that the code in the loop would become cleaner, but it seems that it is causing more trouble than it is saving. I have changed the code to use a bitmask in the format of start_levels.

nbdd0121 · 2024-01-09T15:12:16Z

sw/host/opentitanlib/src/transport/hyperdebug/gpio.rs

+            }
+            let skip_bytes = resp.struct_size as usize - (GPIO_MONITORIN_HEADER_SIZE - 1);
+
+            let mut databytes: Vec<u8> = Vec::new();


Suggested change

-            let mut databytes: Vec<u8> = Vec::new();
+            let mut data_bytes = Vec::with_capacity(resp.transcript_size as usize);

I have reworked the logic to do just a single resize().

nbdd0121 · 2024-01-09T15:13:00Z

sw/host/opentitanlib/src/transport/hyperdebug/gpio.rs

+            let skip_bytes = resp.struct_size as usize - (GPIO_MONITORIN_HEADER_SIZE - 1);
+
+            let mut databytes: Vec<u8> = Vec::new();
+            databytes.extend_from_slice(&resp.data[..bytecount - GPIO_MONITORIN_HEADER_SIZE]);


You should just skip the bytes here, rather than using skip_bytes as a starting index or loop condition later.

nbdd0121 · 2024-01-09T15:14:56Z

sw/host/opentitanlib/src/transport/hyperdebug/gpio.rs

+
+            while databytes.len() < skip_bytes + resp.transcript_size as usize {
+                let original_length = databytes.len();
+                databytes.resize(original_length + 64, 0u8);


Instead of growing and shrinking, perhaps resize to full expected size ahead of time and just keep filling it.
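A sketch of what filling a preallocated buffer could look like. The `read_packet` closure is a hypothetical stand-in for a bulk-read call that writes one packet into the slice it is given and returns the packet length; it is not the opentitanlib API. The loop tolerates short packets in the middle of the stream and treats a zero-length packet as an error, as the code under review does.

```rust
const USB_MAX_SIZE: usize = 64;

/// Reads `expected` bytes of a byte stream delivered in packets of up to
/// USB_MAX_SIZE bytes. `read_packet` is a hypothetical bulk-read stand-in.
fn read_stream<F>(expected: usize, mut read_packet: F) -> Result<Vec<u8>, String>
where
    F: FnMut(&mut [u8]) -> Result<usize, String>,
{
    // Single allocation: the expected size plus room for one full packet,
    // so every read can be offered a USB_MAX_SIZE slice.
    let mut buf = vec![0u8; expected + USB_MAX_SIZE];
    let mut filled = 0;
    while filled < expected {
        let n = read_packet(&mut buf[filled..filled + USB_MAX_SIZE])?;
        if n == 0 {
            return Err("unexpected zero-length packet".to_string());
        }
        filled += n;
    }
    // Any bytes received past `expected` are discarded in this sketch.
    buf.truncate(expected);
    Ok(buf)
}
```

The buffer is sized once up front, so the loop only advances an offset instead of growing and shrinking the vector on every packet.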

nbdd0121 · 2024-01-09T15:17:59Z

sw/host/opentitanlib/src/transport/hyperdebug/gpio.rs

+                // since the previous event (on any signal, not necessarily on that same one).
+                cur_time += value >> signal_bits;
+                events.push(MonitoringEvent {
+                    signal_index: (value & signal_mask) as u8,


(value & signal_mask) is used many times. Perhaps save it to a variable first.

nbdd0121 · 2024-01-09T15:18:33Z

sw/host/opentitanlib/src/transport/hyperdebug/gpio.rs

+                events.push(MonitoringEvent {
+                    signal_index: (value & signal_mask) as u8,
+                    edge: if cur_level[(value & signal_mask) as usize] {
+                        cur_level[(value & signal_mask) as usize] = false;


This could be a cur_level[index] ^= true; after the push, or, cur_level ^= 1 << index; if you opted to use bitmask instead of creating an array.
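To make the bitmask variant concrete, here is a sketch that applies one decoded event. It computes the signal index once, advances the running timestamp by the delta stored in the upper bits, and flips the level bit with XOR. The `Edge` enum, its variant names, and the `apply_event` function are assumptions for illustration; `signal_bits` is assumed to be smaller than 64.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Edge {
    Rising,
    Falling,
}

/// Applies one transcript value: the low `signal_bits` bits select the
/// signal, the remaining bits hold the time elapsed since the previous
/// event. Returns the signal index and the edge that occurred.
fn apply_event(
    value: u64,
    signal_bits: u32,
    cur_time: &mut u64,
    cur_levels: &mut u64,
) -> (u8, Edge) {
    let signal_mask = (1u64 << signal_bits) - 1;
    let index = value & signal_mask; // computed once, reused below
    *cur_time += value >> signal_bits;
    let bit = 1u64 << index;
    // A signal that is currently high can only produce a falling edge.
    let edge = if *cur_levels & bit != 0 {
        Edge::Falling
    } else {
        Edge::Rising
    };
    *cur_levels ^= bit; // toggle the recorded level
    (index as u8, edge)
}
```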

jesultra · 2024-01-10T00:17:33Z

Thank you for the very detailed review.

Can you confirm our understanding of the USB protocol used here (it's hard to figure it out from the device code since I am not familiar with the HyperDebug code): the device will send one or more USB bulk packets (full speed so up to 64 bytes each). The first packet contains the header and some data, and the remaining packets contain the rest of the data? Also, will the device send a ZLP if the total transfer length is a multiple of 64 bytes, or not? (I am assuming not since your code treats a ZLP as an error).

Yes, on the HyperDebug firmware side, I use a USB-queueing library that is also used with e.g. the UART USB interfaces. This library does not deal with "transfers" terminated with a short (possibly zero-length) packet, but treats the data as a single stream, and may in theory send any number of bytes in each USB packet. Since I enqueue bytes very quickly in this case, unlike a physical UART, they will practically always arrive in 64-byte packets, except for the last one. My code here is still prepared for short packets "in the middle" of the response, though.

In the future, I may choose to change the HyperDebug firmware to provide the usual transmission boundaries, if we find that desirable. I chose the above mostly because the USB streaming code library was there already, and is well tested, allowing me to spend time on the higher level functions.

Firstly this is missing comments. Please include comments detailing the protocol so that code readers do not have to get it from another source. If there is a canonical description of the protocol, please also include a link.

I do not have a document describing the protocol, only comments in the HyperDebug source code. I have added more comments in this file, and a reference to the HyperDebug file.

We are skeptical about the way the struct is organised. It's named header but it is clearly including more things than just the header, i.e. it contains a protocol identification byte and data bytes. The data bytes also do not reflect the actual maximum size, and if we understand correctly that's only for ensuring that USB bulk reads are not overshort.

Yes, I could not do read_bulk(size_of::<...>) without losing the remaining bytes from the first 64-byte packet being sent, which is why I added the extra data field to the header struct. Using FromBytes as you suggested is much cleaner.

Another thing is that this is a wire protocol, it needs to be conscious of the byte order. You can either use zerocopy's byte order primitives, or perhaps just parse the bytes using byteorder crate directly without creating a struct. Given the small number of fields it might even aid readability. Although, given that we don't really support BE machines, feel free to ignore this suggestion.

Yes, I used to be much more conscious of wire protocols, preferring big endian on the wire. But the EC codebase makes no such attempt, and it seems that BE machines are not common anymore, so I think I will ignore this issue.

nbdd0121

Thanks for updating. This looks much better now.

nbdd0121 · 2024-01-10T06:05:58Z

sw/host/opentitanlib/src/transport/hyperdebug/gpio.rs

+
+/// Read 7 bits from each byte, least significant byte first.  High bit of one indicates more
+/// bytes belong to the same value.
+fn decode_leb128(idx: &mut usize, databytes: &[u8]) -> u64 {


This function doesn't guard against the data bytes ending with a byte that has its MSB set. We would like a meaningful error message instead of a panic if that happens.

You could switch this to take a &mut impl std::io::Read and then iterate using std::io::Cursor::new(&databytes[header_bytes..]). It'll make the use site cleaner, but it might make this function a bit uglier (not sure if it is worth the trade-off).

I have modified this routine to return Result<u64>, and for good measure made it not mutate idx in the case of errors.
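A sketch of a decoder with the properties described in this thread: it returns a `Result`, leaves `idx` untouched on failure, reports truncated input, and rejects encodings longer than nine bytes. It follows the fragments quoted in this review but is not the merged code; in particular, the `String` error type and the error messages are assumptions chosen to keep the example free of external crates.

```rust
/// Decodes one LEB128 value starting at `*idx`: 7 payload bits per byte,
/// least significant group first, high bit set on every byte except the
/// last. On success `*idx` points past the value; on error it is unchanged.
fn decode_leb128(idx: &mut usize, databytes: &[u8]) -> Result<u64, String> {
    let mut i = *idx;
    let mut value: u64 = 0;
    let mut shift: u32 = 0;
    while i < databytes.len() {
        let byte = databytes[i];
        i += 1;
        value |= u64::from(byte & 0x7F) << shift;
        if byte & 0x80 == 0 {
            // Final byte of this value: commit the new position.
            *idx = i;
            return Ok(value);
        }
        shift += 7;
        if shift + 7 > 64 {
            // A further 7-bit group could no longer fit in a u64.
            return Err("LEB128 value exceeds 63 bits".to_string());
        }
    }
    Err("LEB128 data ends in the middle of a value".to_string())
}
```

Working on a local copy `i` and writing it back only on success is what keeps `idx` stable when an error is returned.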

pamaury · 2024-01-10T09:02:24Z

sw/host/opentitanlib/src/transport/hyperdebug/gpio.rs

+                vec![0u8; 1 + size_of::<RspGpioMonitoringHeader>() + USB_MAX_SIZE];
+            let mut bytecount = 0;
+
+            while bytecount < 1 + size_of::<RspGpioMonitoringHeader>() {


The header always fits in one packet. Would it make sense to change the code to first do an initial read_bulk to get the header + part of the data. With this, we now know the size of the full data so we can allocate a buffer for it and read it with several read_bulk?

From my interpretation of @jesultra's comment, it sounds like they're using USB as a stream and don't guarantee that the header will be sent in a single packet?

Ah sorry, I missed @jesultra 's comment on the USB protocol.

nbdd0121

LGTM apart from a final nit

nbdd0121 · 2024-01-11T10:20:27Z

sw/host/opentitanlib/src/transport/hyperdebug/gpio.rs

+    let mut shift = 0;
+    while i < databytes.len() {
+        let byte = databytes[i];
+        value |= ((byte & 0x7F) as u64) << shift;


This may overflow if there are 10 bytes without MSB set.

I have inserted a check to return an error in case of a tenth byte in the encoding, meaning that the routine accepts at most 63 bits of data for a u64. I do not think it is worth supporting the most significant 64th bit being set through the lowest bit in the tenth byte of the encoding, as it would make error handling more complicated.

The Ti50 team occasionally sees "communication error" from opentitansession, most often during `transport init`, but that could be just because there are dozens of GPIOs being manipulated in a short time. This PR aims to help a little bit in diagnosing, by making each error message unique. Change-Id: I28e804429d153c1cd527ea467729364bc7c1ec15 Signed-off-by: Jes B. Klinke <[email protected]>

nbdd0121 · 2024-01-17T16:54:02Z

sw/host/opentitanlib/src/transport/hyperdebug/gpio.rs

+            *idx = i;
+            return Ok(value);
+        }
+        if shift + 7 > 64 {


shift is already incremented by 7 above?

Ah, this is to error when shift == 63

Yes, but even having a shift value of e.g. 60 for next round would allow an overflow, as a value of 0x7F shifted 60 bits would need 67 bits of storage. That is why this line ensures that you can shift any 7-bit value by shift bits, and it still does not exceed 64 bits.
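The arithmetic behind this check can be verified directly by doing the shift in 128 bits. The two helper functions below are made up for this illustration and are not part of the PR.

```rust
/// True if any 7-bit payload shifted left by `shift` still fits in 64 bits.
fn payload_fits(shift: u32) -> bool {
    shift + 7 <= 64
}

/// Largest value a 7-bit payload can reach at `shift`, computed in
/// 128 bits so that nothing is lost.
fn max_payload_at(shift: u32) -> u128 {
    0x7Fu128 << shift
}
```

With `shift = 60`, the largest payload occupies 67 bits and exceeds `u64::MAX`, whereas with `shift = 56` (the ninth byte) it occupies exactly 63 bits. This matches the statement that the routine accepts at most 63 bits of data.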

pamaury · 2024-01-18T17:10:12Z

sw/host/opentitanlib/src/transport/hyperdebug/mod.rs

+            .read_bulk(cmsis_interface.in_endpoint, &mut resp)?;
+        let resp = &resp[..bytecount];
+        ensure!(
+            bytecount >= 4 && resp[0] == Self::CMSIS_DAP_CUSTOM_COMMAND_GOOGLE_INFO && resp[1] >= 2,


maybe document what the resp[1] >= 2 is? I assume that this is the number of bytes that follow the header?

When verifying the ITE bootloader waveform, 30000 events will be captured, which takes a long time to transmit inefficiently via the text console. This CL leverages the binary CMSIS-DAP protocol to allow an alternative way of retrieving a transcript of captured GPIO events. Corresponding functionality in opentitantool proposed here: lowRISC/opentitan#20672 BUG=b:266832220 TEST=make tast (with local changes to opentitantool, to use new protocol) Change-Id: Id8c472ef29100f1a89743575b1b7ec00aaef449b Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/platform/ec/+/5128914 Code-Coverage: Zoss <[email protected]> Commit-Queue: Jes Klinke <[email protected]> Reviewed-by: Jett Rink <[email protected]> Tested-by: Jes Klinke <[email protected]>

Current code assumes that any flavor of a HyperDebug debugger which implements the CMSIS-DAP protocol will also implement the Google extensions for I2C host and device control. Although this is the case now, this CL introduces code to properly query for Google extensions. Change-Id: I48a88d6917935d45d716fff6c743f9b2f0a52202 Signed-off-by: Jes B. Klinke <[email protected]>

HyperDebug supports logic analyzer functionality, in which it will record events on a given set of gpio pins, and `opentitantool` can later be used to retrieve a transcript of every level change with microsecond timestamp. This has been used by the GSC team to verify the reaction time of firmware under test. Such testing typically involves a few handfuls of events, which can easily be transmitted via the textual protocol. However, we now plan to use the functionality in cases where 30000 events must be retrieved, which would take many tens of seconds to transmit via the console (which runs slowly enough that the physical UART can keep up). To improve performance, this CL introduces another Google-specific extension to the binary CMSIS-DAP protocol, for GPIO operations, and adds code to replicate the `gpio monitoring read` functionality. (Starting and stopping the monitoring can still only be done through the textual protocol, since those commands do not carry a large amount of data. However, there may be an 80-character limit on a single command, which could impact the ability to monitor 5 or more signals at once, so in the future we may want to allow starting monitoring through the binary protocol as well.) Change-Id: I3c075f2960b4d4a38bff8cd7d8e270a3a1211a9b Signed-off-by: Jes B. Klinke <[email protected]>

Updated to new HyperDebug firmware with support for binary GPIO monitoring protocol through "vendor extension" to CMSIS-DAP. Change-Id: I29fb35c39e835e59763e0d0a935b087dd14ba2ae Signed-off-by: Jes B. Klinke <[email protected]>

github-actions · 2024-02-07T23:00:32Z

Successfully created backport PR for earlgrey_es_sival:

Cherry-pick to earlgrey_es_sival: [opentitantool] Introduce binary protocol for HyperDebug gpio monitoring #21251

jesultra added the SW:opentitantool label Dec 18, 2023

jesultra force-pushed the hyp_debug branch from 39fd002 to f7751a9 Compare December 18, 2023 18:23

jesultra changed the title ~~[opentitantool] Introduce binary protocol for HyperDebug gpio~~ [opentitantool] Introduce binary protocol for HyperDebug gpio monitoring Dec 18, 2023

jesultra force-pushed the hyp_debug branch from f7751a9 to fa4ed16 Compare December 18, 2023 20:58

jesultra requested review from cfrantz, pamaury, mundaym and jwnrt December 18, 2023 20:58

jesultra marked this pull request as ready for review December 18, 2023 21:01

jesultra requested a review from a team as a code owner December 18, 2023 21:01

nbdd0121 requested changes Jan 9, 2024

jesultra force-pushed the hyp_debug branch from fa4ed16 to d521f5b Compare January 9, 2024 23:42

jesultra force-pushed the hyp_debug branch from d521f5b to 6e747b3 Compare January 10, 2024 00:18

jesultra requested a review from nbdd0121 January 10, 2024 00:19

nbdd0121 reviewed Jan 10, 2024

pamaury reviewed Jan 10, 2024

jesultra force-pushed the hyp_debug branch from 6e747b3 to 6461da6 Compare January 10, 2024 18:51

jesultra requested a review from nbdd0121 January 10, 2024 18:52

jesultra force-pushed the hyp_debug branch from 6461da6 to 3b7efdb Compare January 10, 2024 19:54

nbdd0121 approved these changes Jan 11, 2024

jesultra force-pushed the hyp_debug branch from 3b7efdb to e35dd75 Compare January 16, 2024 17:51

nbdd0121 reviewed Jan 17, 2024

nbdd0121 approved these changes Jan 17, 2024

jesultra requested a review from pamaury January 18, 2024 15:01

pamaury reviewed Jan 18, 2024

pamaury approved these changes Jan 18, 2024

jesultra added the kokoro:rebuild label Jan 26, 2024

opentitan-github-bot removed the kokoro:rebuild label Jan 26, 2024

jesultra added 3 commits January 26, 2024 09:39

[opentitantool] Updated HyperDebug firmware

04ef809

jesultra force-pushed the hyp_debug branch from 6f5ad70 to 04ef809 Compare January 26, 2024 17:40

jesultra merged commit cf256a0 into lowRISC:master Jan 26, 2024
32 checks passed

jesultra deleted the hyp_debug branch January 26, 2024 21:00

timothytrippel added the CherryPick:earlgrey_es_sival This PR should be cherry-picked to earlgrey_es_sival label Feb 7, 2024

github-actions bot mentioned this pull request Feb 7, 2024

Cherry-pick to earlgrey_es_sival: [opentitantool] Introduce binary protocol for HyperDebug gpio monitoring #21251

Merged
