[rv_core_ibex, top_earlgrey] Enable SecureIbex for CW340 #25146

nasahlpa · 2024-11-14T16:34:47Z

As the SecureIbex configuration is the one we use for the Earl Grey ASIC, we should also enable it for the FPGA boards.

Due to resource constraints, this is possible only for the CW340, not the CW310.

Report for CW340 with SecureIbex enabled:
Resource utilization stays well below 70%:

And timing is met:

Report for CW340 with SecureIbex disabled:

Closes #25137

vogelpi

Thanks @nasahlpa for looking into this. Do you know roughly by how much the resource utilization increases?

The changes look mostly good, but there are FPGA CI failures and we should make sure all tests pass before merging this.

vogelpi · 2024-11-15T09:31:43Z

hw/top_earlgrey/templates/chiplevel.sv.tpl

 % elif target["name"] == "cw310":
    .KmacEnMasking(0),
    .KmacSwKeyMasked(1),
    .KeymgrKmacEnMasking(0),
    .SecKmacCmdDelay(0),
    .SecKmacIdleAcceptSwMsg(1'b0),
+    .RvCoreIbexSecureIbex(0),


You probably also should add that for the cw305.

Ah thanks, done.

nasahlpa · 2024-11-15T09:36:55Z

Regarding the FPGA CI failures: I get the same error as in CI on a local CW340 with the new bitstream:
Command result: SFDP header contains incorrect signature: 0xffffffff
I am a bit stuck here. Do you know how I can debug this?

nasahlpa · 2024-11-15T12:29:18Z

I've updated the PR description with area and timing numbers before and after this PR.

vogelpi · 2024-11-15T12:46:43Z

Thanks for adding the area and timing numbers. I think the change is reasonable for the big board. About the SFDP header issue, I have no idea about this. Maybe someone from the software team knows?

a-will · 2024-11-15T15:21:10Z

That likely means bootstrapping failed at the very first step, and the chip gave no response to SPI. MISO was left high when the tester tried to read the SFDP table.

For the prior PR state and its associated CI jobs, I found no evidence any software had run on CW340's Ibex. The ROM did not seem to print any identification messages.
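
The reasoning behind that diagnosis can be sketched in a few lines. This is a minimal illustration, assuming the tester reads the first four SFDP bytes and compares them with the JEDEC signature "SFDP" (0x50444653 as a little-endian word). The function name and return strings are hypothetical and are not taken from the actual test tooling.

```python
SFDP_SIGNATURE = 0x50444653  # ASCII "SFDP" read as a little-endian 32-bit word


def classify_sfdp_header(first_bytes: bytes) -> str:
    """Classify the first four bytes returned for an SFDP read (hypothetical helper)."""
    if len(first_bytes) < 4:
        raise ValueError("need at least 4 bytes of SFDP header")
    signature = int.from_bytes(first_bytes[:4], "little")
    if signature == SFDP_SIGNATURE:
        return "ok"
    if signature == 0xFFFFFFFF:
        # All ones: the data line stayed high, so nothing drove the bus.
        return "no response (line idle high)"
    return f"unexpected signature 0x{signature:08x}"
```

A value of 0xffffffff therefore points at a device that never answered, rather than at a device that answered with corrupted data.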

nasahlpa · 2024-11-15T15:36:06Z

Thanks @a-will. Do you know how I can debug this?

GregAC · 2024-11-15T16:43:35Z

@nasahlpa what test failed? Can't find the older CI logs and still awaiting the run for the newer one.

At a guess, the bootloader has immediately fallen over because we get an instant alert from Ibex, maybe because the lockstep hasn't come out of reset properly? There are some prims involved that may have FPGA-specific implementations, which could explain the behaviour difference.

nasahlpa · 2024-11-15T16:45:43Z

All of them. Not a single test on CW340 passed. Here are the old results from CI.

nasahlpa · 2024-11-17T10:24:14Z

I investigated this and discovered the following:

When enabling the SecureIbex parameter for CW340, a major Ibex alert is triggered by a register file ECC check. Hence, the core does not boot up and we cannot bootstrap software.

When I either set RegFileECC = 1'b0 in the ibex_top module or force the ECC error signal to 0, the core boots up and I can successfully run SW on CW340 with the SecureIbex parameter on.

My assumption is that the ibex_register_file_fpga module does not work correctly with RegFileECC = 1'b1. I had a quick look into the FPGA register file. It seems, though, that we are correctly initializing the RF content with the ECC-encoded 0.

In the long term, we should investigate this and fix it.

As a temporary solution, we could bypass the error in one of two ways:

- Disable the RegFileECC option.
  - Requires decoupling this option from the SecureIbex parameter, i.e., changes are needed in the Ibex repository as well as in top_earlgrey and chip_earlgrey_cw340 in the OpenTitan repository.
- Use the FF register file.
  - When the SecureIbex parameter is enabled for CW340, use the FF register file instead of the FPGA register file.
  - Probably the easiest solution for now, as we only need to modify chip_earlgrey_cw340 in the OpenTitan repository.

I tested both solutions and they work.

WDYT @GregAC?

Edit: ah, after discussing this with @vogelpi, I think the error is here:
https://github.com/lowRISC/ibex/blob/169785d0711335c94561a93146e069766eec138c/rtl/ibex_register_file_fpga.sv#L46
We should use WordZeroVal instead of 0 here. I'll give it a try and report back.

GregAC · 2024-11-18T10:23:25Z

@nasahlpa nice work on diagnosing the issue. I'd favour just using the FF register file in FPGA, assuming the FPGA implementation still works.

The reason to use the FPGA RF is efficient resource utilization. I don't think switching to the FF version will have a meaningful effect, given that the full Earl Grey design is huge.

We should work out why the FPGA version is broken in this instance. I'm happy to take a look at this.

And having written that, I've just seen the edit! If that's the quick fix, then great, go for it. Otherwise, switching to the FF version would be my vote.

We should use `WordZeroVal` instead of `0` for reads from register `x0` in the FPGA register file.
This bug was discovered when enabling the `RegFileECC` parameter. When this is enabled, the core performs ECC checks, expecting that `WordZeroVal` is returned for `x0`. Otherwise, we get a major alert.
Fixes lowRISC/opentitan#25146
Signed-off-by: Pascal Nasahl <[email protected]>
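
Why a plain `0` for `x0` trips the check can be illustrated with a toy model. This sketch assumes a code in which the all-zero word is not a valid codeword. It uses a single inverted parity bit purely for illustration and is not the ECC scheme Ibex implements. All names below are hypothetical stand-ins.

```python
DATA_WIDTH = 32
DATA_MASK = (1 << DATA_WIDTH) - 1


def toy_encode(data: int) -> int:
    """Append one inverted even-parity bit above the data bits (toy code only)."""
    parity = bin(data & DATA_MASK).count("1") & 1
    check = parity ^ 1  # inversion makes the all-zero word an invalid codeword
    return (check << DATA_WIDTH) | (data & DATA_MASK)


def toy_check_ok(codeword: int) -> bool:
    """Return True when the stored check bit matches the data bits."""
    return toy_encode(codeword & DATA_MASK) == codeword


# Stand-in for the role WordZeroVal plays: the encoded form of the value 0.
WORD_ZERO_VAL = toy_encode(0)


def read_x0(return_encoded_zero: bool) -> int:
    """Model a read of x0, which is hardwired rather than stored in memory."""
    return WORD_ZERO_VAL if return_encoded_zero else 0
```

In this model, `read_x0(False)` returns a word that fails `toy_check_ok`, which corresponds to the alert, while `read_x0(True)` passes.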

nasahlpa · 2024-11-18T12:02:51Z

The error was indeed the zero value for x0.
This will be fixed once #2224 in the Ibex repository is merged.

Thanks all for helping me debug this issue.

vogelpi

LGTM!

vogelpi · 2024-11-19T14:00:26Z

hw/vendor/lowrisc_ibex/rtl/ibex_top.sv

-  localparam bit          RegFileECC            = SecureIbex;
+  localparam bit          RegFileECC            = 1'b0;


This change should be removed before merging this.

vogelpi · 2024-11-22T14:08:44Z

This got accidentally closed again by someone merging a related commit in the base repo. @nasahlpa, the reason is the "fixes #pr-no" in the commit message. I think it will happen again :-(

nasahlpa · 2024-11-24T16:44:16Z

After fixing the register file, the core still does not boot up. I had a closer look and it turns out that when I exclude the rf_wdata_wb_ecc signal from the lockstep comparison, the core boots up. When adding the comparison for this signal, it fails again with the same message as above. Even when I exclude the ECC bits of this signal:

  assign outputs_mismatch =
    (enable_cmp_q != IbexMuBiOff) & ((shadow_outputs_q.crash_dump != core_outputs_q[0].crash_dump) | 
                                     (shadow_outputs_q.double_fault_seen != core_outputs_q[0].double_fault_seen) |
                                     (shadow_outputs_q.rf_raddr_a != core_outputs_q[0].rf_raddr_a) |
                                     (shadow_outputs_q.rf_raddr_b != core_outputs_q[0].rf_raddr_b) |
                                     (shadow_outputs_q.rf_waddr_wb != core_outputs_q[0].rf_waddr_wb) |
                                     (shadow_outputs_q.rf_we_wb != core_outputs_q[0].rf_we_wb) |
                                     (shadow_outputs_q.ic_tag_req != core_outputs_q[0].ic_tag_req) |
                                     (shadow_outputs_q.ic_tag_write != core_outputs_q[0].ic_tag_write) |
                                     (shadow_outputs_q.ic_tag_addr != core_outputs_q[0].ic_tag_addr) |
                                     (shadow_outputs_q.ic_tag_wdata != core_outputs_q[0].ic_tag_wdata) |
                                     (shadow_outputs_q.ic_data_req != core_outputs_q[0].ic_data_req) |
                                     (shadow_outputs_q.ic_data_write != core_outputs_q[0].ic_data_write) |
                                     (shadow_outputs_q.ic_data_addr != core_outputs_q[0].ic_data_addr) |
                                     (shadow_outputs_q.ic_data_wdata != core_outputs_q[0].ic_data_wdata) |
                                     (shadow_outputs_q.ic_scr_key_req != core_outputs_q[0].ic_scr_key_req) |
                                     (shadow_outputs_q.instr_req != core_outputs_q[0].instr_req) |
                                     (shadow_outputs_q.instr_addr != core_outputs_q[0].instr_addr) |
                                     (shadow_outputs_q.data_req != core_outputs_q[0].data_req) |
                                     (shadow_outputs_q.data_we != core_outputs_q[0].data_we) |
                                     (shadow_outputs_q.data_be != core_outputs_q[0].data_be) |
                                     (shadow_outputs_q.data_addr != core_outputs_q[0].data_addr) |
                                     (shadow_outputs_q.data_wdata != core_outputs_q[0].data_wdata) |
                                     (shadow_outputs_q.dummy_instr_id != core_outputs_q[0].dummy_instr_id) |
                                     (shadow_outputs_q.dummy_instr_wb != core_outputs_q[0].dummy_instr_wb) |
                                     (shadow_outputs_q.irq_pending != core_outputs_q[0].irq_pending) |
                                     (shadow_outputs_q.rf_wdata_wb_ecc[31:0] != core_outputs_q[0].rf_wdata_wb_ecc[31:0]));

the core does not boot up. I've tried the following:

- Disabling RegFileECC, then rf_wdata_wb_ecc_o = rf_wdata_wb
- Disabling the WriteBack stage as rf_wdata_wb is driven inside the ibex_wb_stage module

But still no success.
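
The experiment above (dropping one signal from the comparison, or comparing only its lower 32 bits) can be mirrored in a small software model. This is a sketch under stated assumptions: it compares two dictionaries of captured output values and is not the lockstep RTL. The function name, the `exclude` and `masks` parameters, and the sample values are all hypothetical.

```python
from typing import Dict, Iterable, List, Optional


def mismatching_fields(core: Dict[str, int],
                       shadow: Dict[str, int],
                       exclude: Iterable[str] = (),
                       masks: Optional[Dict[str, int]] = None) -> List[str]:
    """List the output fields whose (optionally masked) values differ."""
    skipped = set(exclude)
    masks = masks or {}
    diffs = []
    for name, core_value in core.items():
        if name in skipped:
            continue
        mask = masks.get(name, -1)  # -1 keeps every bit of the value
        if (core_value & mask) != (shadow[name] & mask):
            diffs.append(name)
    return diffs


# Made-up sample values: only the upper (check) bits of one field differ.
core = {"rf_we_wb": 1, "rf_wdata_wb_ecc": 0x12_0000_0005}
shadow = {"rf_we_wb": 1, "rf_wdata_wb_ecc": 0x34_0000_0005}
print(mismatching_fields(core, shadow))
print(mismatching_fields(core, shadow, masks={"rf_wdata_wb_ecc": 0xFFFFFFFF}))
```

Listing the differing fields by name, rather than OR-ing them into one flag, makes it easier to see which output to probe first.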

I also checked whether all signals are properly initialized on reset, and this looks good. However, as rf_wdata_wb directly gets data over the LSU from memory, it could be that some memory is not properly initialized? Strangely, the lockstep comparison error for this signal only occurs after I splice the bitstream to add the OTP and ROM memory. When I just create a bitstream with:
./bazelisk.sh build //hw/bitstream/vivado:fpga_cw340
this bitstream works.

Before I set up the Vivado ILA debugger, maybe @vogelpi and @GregAC could have another look?

As the `SecureIbex` configuration is the one we use for the Earl Grey ASIC, we also should enable it for the FPGA boards. Due to resource constraints, this is only possible for the CW340 but not the CW310. Resource utilization stays well below 70%. Closes lowRISC#25137 Signed-off-by: Pascal Nasahl <[email protected]>

nasahlpa · 2024-12-08T17:25:48Z

The issue is resolved (lowRISC/ibex#2228) and CW340 with the SecureIbex configuration enabled now successfully builds and runs the tests. Hence, I am merging this now.

After enabling the SecureIbex parameter on CW340 (cf. lowRISC#25146), the otbn_mem_scramble_test now also can be executed on the CW340. Signed-off-by: Pascal Nasahl <[email protected]>

By enabling the SecureIbex parameter for CW340 (see lowRISC#25146), TLUL ECC errors now trigger an interrupt. This commit modifies the scram_ctrl_scrambled_access test such that the test checks whether we get the interrupt on CW340. Signed-off-by: Pascal Nasahl <[email protected]>

github-actions · 2024-12-08T17:26:31Z

Successfully created backport PR for earlgrey_1.0.0:

Cherry-pick to earlgrey_1.0.0: [rv_core_ibex, top_earlgrey] Enable SecureIbex for CW340 #25536

After enabling the SecureIbex parameter on CW340 (cf. #25146), the otbn_mem_scramble_test now also can be executed on the CW340. Signed-off-by: Pascal Nasahl <[email protected]> (cherry picked from commit 4c04dfd)

By enabling the SecureIbex parameter for CW340 (see #25146), TLUL ECC errors now trigger an interrupt. This commit modifies the scram_ctrl_scrambled_access test such that the test checks whether we get the interrupt on CW340. Signed-off-by: Pascal Nasahl <[email protected]> (cherry picked from commit 92439bb)

nasahlpa force-pushed the secure_ibex_cw340 branch from f426eb6 to c3ed16d Compare November 15, 2024 07:02

vogelpi reviewed Nov 15, 2024

nasahlpa force-pushed the secure_ibex_cw340 branch from c3ed16d to 4369e7c Compare November 15, 2024 12:26

nasahlpa mentioned this pull request Nov 18, 2024

[rtl] Fix zero value in FPGA RF lowRISC/ibex#2224

Merged

nasahlpa closed this in lowRISC/ibex#2224 Nov 18, 2024

nasahlpa reopened this Nov 19, 2024

nasahlpa force-pushed the secure_ibex_cw340 branch from 4369e7c to 60e27e4 Compare November 19, 2024 08:43

vogelpi approved these changes Nov 19, 2024

nasahlpa marked this pull request as ready for review November 19, 2024 09:11

nasahlpa added the CherryPick:earlgrey_1.0.0 This PR should be cherry-picked to earlgrey_1.0.0 label Nov 19, 2024

nasahlpa force-pushed the secure_ibex_cw340 branch 4 times, most recently from a020f8b to 5ad1420 Compare November 19, 2024 13:19

vogelpi reviewed Nov 19, 2024

marnovandermaas closed this in marnovandermaas/ibex@84232a5 Nov 22, 2024

vogelpi reopened this Nov 22, 2024

nasahlpa force-pushed the secure_ibex_cw340 branch 2 times, most recently from 21deb87 to ce82d80 Compare December 8, 2024 14:34

nasahlpa merged commit 71630d1 into lowRISC:master Dec 8, 2024
37 checks passed

nasahlpa deleted the secure_ibex_cw340 branch December 8, 2024 17:26

github-actions bot mentioned this pull request Dec 8, 2024

Cherry-pick to earlgrey_1.0.0: [rv_core_ibex, top_earlgrey] Enable SecureIbex for CW340 #25536

Merged

nasahlpa mentioned this pull request Dec 8, 2024

[sival] Enable CW340 exec. environment for ECC tests #25537

Merged
