[kmac] Rework masked SHA3 core and switch to Trivium-based PRNG implementation #21624
Since a full-width PRNG is used that can produce 800 bits in each clock cycle, the control logic for requesting fresh randomness for remasking inside the DOM multipliers of keccak_2share can be simplified: fresh randomness is now requested only when it is needed in the next clock cycle. Signed-off-by: Pirmin Vogel <[email protected]>
Even before this change, keccak_round was effectively responsible for the fine-grained control of the DOM multiplier input/output muxes, but this wasn't obvious, which led to a much more complicated design. Moving all these control signals up to keccak_round simplifies the code and, more importantly, paves the way for registering some critical signals to avoid glitches on the muxed input/output signals, which is beneficial for SCA hardening. Signed-off-by: Pirmin Vogel <[email protected]>
Flopping these signals frees them of glitches, which is beneficial for SCA hardening. Signed-off-by: Pirmin Vogel <[email protected]>
This additional buffer stage prevents glitches occurring at the PRNG output (due to the non-linear S-Box layer) from propagating into the DOM multipliers inside the Keccak/SHA3 core. This is beneficial for SCA hardening. Signed-off-by: Pirmin Vogel <[email protected]>
Depending on the PRNG architecture and control, the externally provided randomness can be guaranteed to be stable when the inputs to the DOM multipliers don't change. Not using partial intermediate results to cover these cases allows saving some silicon area (minus 800 MUX2). However, it seems that PROLEAD currently cannot successfully analyze the design with this new option enabled. For this reason, we keep the multiplexers in the design. Signed-off-by: Pirmin Vogel <[email protected]>
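For readers unfamiliar with the DOM multipliers mentioned in the commits above, the following is a minimal bit-level sketch of a first-order (2-share) DOM AND gate, following the generic domain-oriented masking construction. It is not the keccak_2share RTL: the function names are hypothetical, registers and glitch behavior are not modeled, and only functional correctness is checked, not side-channel security.

```python
from itertools import product


def dom_and_2share(a0, a1, b0, b1, z):
    """Functional model of a 2-share DOM AND gate on single bits.

    (a0, a1) and (b0, b1) are Boolean sharings of a and b;
    z is one fresh random bit used for remasking.
    """
    # Inner-domain terms only combine shares of the same domain.
    inner0 = a0 & b0
    inner1 = a1 & b1
    # Cross-domain terms are remasked with fresh randomness before
    # they are integrated (in hardware, typically after a register).
    cross0 = (a0 & b1) ^ z
    cross1 = (a1 & b0) ^ z
    return inner0 ^ cross0, inner1 ^ cross1


def check_dom_and():
    """Exhaustively check that the output shares recombine to a AND b."""
    for a0, a1, b0, b1, z in product((0, 1), repeat=5):
        q0, q1 = dom_and_2share(a0, a1, b0, b1, z)
        if (q0 ^ q1) != ((a0 ^ a1) & (b0 ^ b1)):
            return False
    return True
```

The random bit z cancels when the two output shares are recombined, so its value never affects the result. That cancellation is why the quality and stability of the externally provided randomness matter for leakage only, not for functional correctness.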
Force-pushed: a4495f6 to 7f6e23b
This commit replaces the LFSR-based PRNG with an unrolled, Trivium-based PRNG implementation to prevent brute-force attacks on the LFSR state. The overall PRNG state decreases from 800 bits to 288 bits, but due to the heavy unrolling, the primitive can still generate 800 bits per cycle, as required by the masked SHA3 core. This resolves lowRISC#20828. Signed-off-by: Pirmin Vogel <[email protected]>
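To illustrate how a 288-bit state can yield 800 output bits per cycle, here is a behavioral sketch of the published Trivium state update, with one "cycle" modeled as 800 consecutive update steps. It is an assumption-laden software model, not the RTL primitive from this PR: the function names are hypothetical, seeding and reseeding are not modeled, and the hardware computes the unrolled steps combinationally rather than in a loop.

```python
def trivium_step(s):
    """One Trivium update on a 288-element list of bits.

    Index i here corresponds to s_(i+1) in the cipher specification.
    Returns (new_state, keystream_bit).
    """
    t1 = s[65] ^ s[92]
    t2 = s[161] ^ s[176]
    t3 = s[242] ^ s[287]
    z = t1 ^ t2 ^ t3
    # The AND terms are the only non-linear part of the update.
    t1 ^= (s[90] & s[91]) ^ s[170]
    t2 ^= (s[174] & s[175]) ^ s[263]
    t3 ^= (s[285] & s[286]) ^ s[68]
    # Shift the three registers (93, 84 and 111 bits) by one position.
    new_state = [t3] + s[0:92] + [t1] + s[93:176] + [t2] + s[177:287]
    return new_state, z


def prng_cycle(state, width=800):
    """Model one clock cycle of a `width`-times unrolled generator."""
    out = []
    for _ in range(width):
        state, z = trivium_step(state)
        out.append(z)
    return state, out
```

An all-zero state is a fixed point of this update and produces only zeros, so any real use of such a generator has to guarantee a non-zero seed.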
Thanks @vogelpi, great job - LGTM! Splitting up this PR into multiple commits and the offline discussion really helped in reviewing these code changes.
LGTM, thanks for this quite extensive change and improvement @vogelpi!
(I focused my review on the RTL code and didn't notice red flags. For the changes to the other code, I think CI checks and your experiments + local tests should suffice.)
Thanks for the well structured PR @vogelpi! This is great!
Thanks everybody for your reviews, let's merge this!
Continuing catch-up, this all LGTM
This PR consists of several commits required to improve the masking in the KMAC block and to prevent brute-forcing attacks on the PRNG state. This PR resolves #20828.
The commits can be grouped into 3 groups:
From an SCA perspective, I have the following results so far:
[…] are now done as well: we have no 1st- or 2nd-order leakage with 10 million traces. Again a big improvement compared to before :-)