From b2022cc5097ff805c039b80354a5479c5c0acf8d Mon Sep 17 00:00:00 2001 From: Andrea Caforio Date: Wed, 11 Dec 2024 10:01:42 +0100 Subject: [PATCH 1/2] [hmac, sw] Revert hash stop hang workaround In line with the RTL fix in #24771 for the hash stop hang bug described in #24767, this commit reverts four commits that implemented a software workaround for the issue. This workaround and its corresponding documentation is now obsolete. Reverted commits: - e869152 #25070 - 2b072c4 #25012 - 7f32449 #24966 - f591bea #24944 Signed-off-by: Andrea Caforio Co-authored-by: Martin Velay --- hw/ip/hmac/doc/programmers_guide.md | 11 +--- sw/device/lib/crypto/drivers/BUILD | 1 - sw/device/lib/crypto/drivers/hmac.c | 85 +++++------------------------ 3 files changed, 15 insertions(+), 82 deletions(-) diff --git a/hw/ip/hmac/doc/programmers_guide.md b/hw/ip/hmac/doc/programmers_guide.md index a0cac8d829787..05948e4200f9e 100644 --- a/hw/ip/hmac/doc/programmers_guide.md +++ b/hw/ip/hmac/doc/programmers_guide.md @@ -88,24 +88,17 @@ When SW doesn't know each instant at which a full message block is available, it The context that needs to be saved and restored is in the following registers: [`CFG`](registers.md#cfg), [`DIGEST_*`](registers.md#digest), and [`MSG_LENGTH_*`](registers.md#msg_length_lower). -**Due to an RTL bug, the hardware may not be able to easily recover from a `CMD.hash_stop` command (for more details, refer to [Issue #24767](https://github.com/lowRISC/opentitan/issues/24767)). -To avoid the hardware from entering this state, a stop-gap software workaround solution should be used (see [PR #24944](https://github.com/lowRISC/opentitan/pull/24944) for details). -The example usage below includes this stop-gap workaround solution marked in italics whereas the original guidance is crossed out.** - Each new message stream needs to be started *once* by setting the `CMD.hash_start` bit and finalized *once* by setting the `CMD.hash_process` bit. -To switch from one message stream to another, ~~set the `CMD.hash_stop` bit, wait for the `hmac_done` interrupt (or status bit)~~ _at a block boundary, insert delay which should be equivalent to at least 80 clock cycles_, save one context, _trigger `CMD.hash_process` to move the FSMs into a stable state_ and restore the other context, and then set the `CMD.hash_continue` bit. +To switch from one message stream to another, set the `CMD.hash_stop` bit, wait for the `hmac_done` interrupt (or status bit), save one context and restore the other, and then set the `CMD.hash_continue` bit. Here is an example usage pattern of this feature: 1. Start processing message stream A by configuring HMAC and then setting the `CMD.hash_start` bit. 1. Write an arbitrary number of message blocks to HMAC's `MSG_FIFO`. -1. _Insert a delay of least 80 clock cycles._ -~~1. Stop HMAC by setting the `CMD.hash_stop` bit and wait for the `hmac_done` interrupt (or poll the interrupt status register).~~ +1. Stop HMAC by setting the `CMD.hash_stop` bit and wait for the `hmac_done` interrupt (or poll the interrupt status register). 1. Save the context by reading the `DIGEST_0`..`15` and `MSG_LENGTH_`{`LOWER`,`UPPER`} registers. If the operation is keyed HMAC, the values of `KEY_0`..`X` registers also need to be maintained as part of the context, where `X` is the last register used for the given key length (e.g. for HMAC-256, `X=7`). However, key registers cannot be read from SW, therefore SW must maintain key values as part of its own context during initialization. Similarly, the value of the `CFG` register must also be preserved, and SW should keep its value separately, instead of reading it from `CFG` register. -1. _Trigger `CMD.hash_process`._ -1. _Wait for the `hmac_done` interrupt (or poll the interrupt status register)._ 1. Disable `sha_en` by updating `CFG` register, in order to clear the digest from stream A. This is necessary to also prevent leakage of intermediate context of one SW thread to another. 1. Repeat steps 1-5 for message stream B. diff --git a/sw/device/lib/crypto/drivers/BUILD b/sw/device/lib/crypto/drivers/BUILD index 819f946d51830..0fe8705302d42 100644 --- a/sw/device/lib/crypto/drivers/BUILD +++ b/sw/device/lib/crypto/drivers/BUILD @@ -183,7 +183,6 @@ cc_library( "//sw/device/lib/base:macros", "//sw/device/lib/base:memory", "//sw/device/lib/crypto/impl:status", - "//sw/device/lib/runtime:ibex", ], ) diff --git a/sw/device/lib/crypto/drivers/hmac.c b/sw/device/lib/crypto/drivers/hmac.c index 0366c415ff9f4..3d10c2d962f6e 100644 --- a/sw/device/lib/crypto/drivers/hmac.c +++ b/sw/device/lib/crypto/drivers/hmac.c @@ -9,7 +9,6 @@ #include "sw/device/lib/base/hardened.h" #include "sw/device/lib/base/memory.h" #include "sw/device/lib/crypto/impl/status.h" -#include "sw/device/lib/runtime/ibex.h" #include "hmac_regs.h" // Generated. #include "hw/top_earlgrey/sw/autogen/top_earlgrey.h" @@ -68,19 +67,6 @@ OT_ASSERT_ENUM_VALUE(HMAC_DIGEST_15_REG_OFFSET, HMAC_DIGEST_14_REG_OFFSET + 4); enum { /* The beginning of the address space of HMAC. */ kHmacBaseAddr = TOP_EARLGREY_HMAC_BASE_ADDR, - /* Timeout value for the polling - * As min 3 clock cycles are required to perform a read access to the STATUS - * hmac_idle register. And as we should observe 240 cycles (80 for the inner - * key, 80 for the outer key, and 80 for the result of the first round), plus - * 80 for msg itself -> 360 cycles in total as max (when HMAC is enabled). - * Which means, 360/3=120 loops before having the IDLE state. - * Let's take a large margin and consider that 200 loops are enough. - */ - kNumIterTimeout = 200, - /* Temporary delay linked to issue #24767, which should be equivalent to at - * least 80 clock cycles of the HMAC clock plus a margin. - */ - kHmacTmpDelay = 200, }; /** @@ -89,6 +75,8 @@ enum { * It returns error if HMAC HWIP becomes idle without firing `hmac_done` * interrupt. * + * TODO(#23191): It might be beneficial to have a timeout value for the polling. + * * @return Result of the operation. */ OT_WARN_UNUSED_RESULT @@ -97,13 +85,8 @@ static status_t hmac_idle_wait(void) { // Initialize `status_reg = 0` so that the loop starts with the assumption // that HMAC HWIP is not idle. uint32_t status_reg = 0; - uint32_t attempt_cnt = 0; while (bitfield_bit32_read(status_reg, HMAC_STATUS_HMAC_IDLE_BIT) == 0) { status_reg = abs_mmio_read32(kHmacBaseAddr + HMAC_STATUS_REG_OFFSET); - attempt_cnt++; - if (attempt_cnt >= kNumIterTimeout) { - return OTCRYPTO_FATAL_ERR; - } } // Verify that HMAC HWIP raises `hmac_done` bit. @@ -273,41 +256,6 @@ static void msg_fifo_write(const uint8_t *message, size_t message_len) { } } -/** - * Temporary workaround linked to issue #24767 - * - * The HMAC HWIP is not told to stop. This will cause the HW to be in an - * unexpected state, which could be exited by trigerring a simple HASH process - * operation. Before doing that, the context should be saved (message length - * and digest). This context is only available after a duration equivalent to - * 64 clock cycles in SHA2-256 and 80 clock cycles in SHA2-384/512, after the - * message length is on a block boundary (512 for SHA2-256 or 1024 bits for - * SHA2-384/512). - * - * @param[out] ctx Context to which values are written. - */ -static status_t tmp_avoid_hw_hang(hmac_ctx_t *ctx) { - // Insert a delay - ibex_timeout_t timeout; - timeout.cycles = kHmacTmpDelay; - timeout.start = ibex_mcycle_read(); - while (!ibex_timeout_check(&timeout)) { - // NULL statement - } - - // Save current context as it is updated after each block even if stop is not - // triggered - context_save(ctx); - - // Trigger hash_process - uint32_t cmd_reg = abs_mmio_read32(kHmacBaseAddr + HMAC_CMD_REG_OFFSET); - cmd_reg = bitfield_bit32_write(cmd_reg, HMAC_CMD_HASH_PROCESS_BIT, true); - abs_mmio_write32(kHmacBaseAddr + HMAC_CMD_REG_OFFSET, cmd_reg); - - // Wait for HMAC HWIP operation to be completed. - return hmac_idle_wait(); -} - /** * For given `hmac_mode`, derive the matching CFG value and block/digest * lengths. @@ -449,24 +397,17 @@ status_t hmac_update(hmac_ctx_t *ctx, const uint8_t *data, size_t len) { // Keep writing incoming bytes msg_fifo_write(data, len - leftover_len); - /* - * TODO should be uncommented once the issue #24767 will be solved in the HW - * and tmp_avoid_hw_hang should be removed. - * - * // Time to tell HMAC HWIP to stop, because we do not have enough message - * // bytes for another round. - * uint32_t cmd_reg = - * bitfield_bit32_write(HMAC_CMD_REG_RESVAL, HMAC_CMD_HASH_STOP_BIT, 1); - * abs_mmio_write32(kHmacBaseAddr + HMAC_CMD_REG_OFFSET, cmd_reg); - * - * // Wait for HMAC HWIP operation to be completed. - * HARDENED_TRY(hmac_idle_wait()); - * - * // Store context into `ctx`. - * context_save(ctx); - */ - - tmp_avoid_hw_hang(ctx); + // Time to tell HMAC HWIP to stop, because we do not have enough message + // bytes for another round. + uint32_t cmd_reg = + bitfield_bit32_write(HMAC_CMD_REG_RESVAL, HMAC_CMD_HASH_STOP_BIT, 1); + abs_mmio_write32(kHmacBaseAddr + HMAC_CMD_REG_OFFSET, cmd_reg); + + // Wait for HMAC HWIP operation to be completed. + HARDENED_TRY(hmac_idle_wait()); + + // Store context into `ctx`. + context_save(ctx); // Write leftover bytes to `partial_block`, so that future update/final call // can feed them to HMAC HWIP. From 5b0c3956d3712f51638b361728881c2d35c874ce Mon Sep 17 00:00:00 2001 From: Andrea Caforio Date: Fri, 13 Dec 2024 15:32:20 +0100 Subject: [PATCH 2/2] [hmac, sw] Reintroduce idle polling timeout The previously reverted commit f591bea (later refined in e869152) contained a desirable timeout implementation for the HMAC idle polling (see #23191). This commit reinstantiates it. Signed-off-by: Andrea Caforio Co-authored-by: Martin Velay --- sw/device/lib/crypto/drivers/hmac.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/sw/device/lib/crypto/drivers/hmac.c b/sw/device/lib/crypto/drivers/hmac.c index 3d10c2d962f6e..8085864f1cbb0 100644 --- a/sw/device/lib/crypto/drivers/hmac.c +++ b/sw/device/lib/crypto/drivers/hmac.c @@ -67,6 +67,15 @@ OT_ASSERT_ENUM_VALUE(HMAC_DIGEST_15_REG_OFFSET, HMAC_DIGEST_14_REG_OFFSET + 4); enum { /* The beginning of the address space of HMAC. */ kHmacBaseAddr = TOP_EARLGREY_HMAC_BASE_ADDR, + /* Timeout value for the polling + * As min 3 clock cycles are required to perform a read access to the STATUS + * hmac_idle register. And as we should observe 240 cycles (80 for the inner + * key, 80 for the outer key, and 80 for the result of the first round), plus + * 80 for msg itself -> 360 cycles in total as max (when HMAC is enabled). + * Which means, 360/3=120 loops before having the IDLE state. + * Let's take a large margin and consider that 200 loops are enough. + */ + kNumIterTimeout = 200, }; /** @@ -75,8 +84,6 @@ enum { * It returns error if HMAC HWIP becomes idle without firing `hmac_done` * interrupt. * - * TODO(#23191): It might be beneficial to have a timeout value for the polling. - * * @return Result of the operation. */ OT_WARN_UNUSED_RESULT @@ -85,8 +92,13 @@ static status_t hmac_idle_wait(void) { // Initialize `status_reg = 0` so that the loop starts with the assumption // that HMAC HWIP is not idle. uint32_t status_reg = 0; + uint32_t attempt_cnt = 0; while (bitfield_bit32_read(status_reg, HMAC_STATUS_HMAC_IDLE_BIT) == 0) { status_reg = abs_mmio_read32(kHmacBaseAddr + HMAC_STATUS_REG_OFFSET); + attempt_cnt++; + if (attempt_cnt >= kNumIterTimeout) { + return OTCRYPTO_FATAL_ERR; + } } // Verify that HMAC HWIP raises `hmac_done` bit.