Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rebase randomness_generation branch on main 01-11-2024 #1961

Merged
merged 292 commits into from
Nov 1, 2024

Conversation

torben-hansen
Copy link
Contributor

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.

jargh and others added 30 commits April 25, 2024 08:04
The code now handles specially the case where P1 = (x,y,z) is the
point at infinity, i.e. has z = 0. It then returns the other point P2
augmented (since that is in affine coordinates, this being mixed
addition) with z = 1 or its Montgomery equivalent to give the more
desirable result 0 + P2 = P2. The selection is constant-time as
usual with a single code path.

s2n-bignum original commit: awslabs/s2n-bignum@72ccfda
This is analogous to the earlier changes for mixed addition. In a
point addition operation P1 + P2, the cases where P1 = 0 or P2 = 0 are
handled specially (though of course using constant-time selection) as
0 + P2 = P2 and P1 + 0 = P1. More precisely, writing P1 = (x1,y1,z1)
and P2 = (x2,y2,z2), the special-case logic is triggered when
precisely *one* of z1 = 0 or z2 = 0 holds; in the case that both
z1 = 0 and z2 = 0 the standard computation is followed and yields the
"right" result (one with its z coordinate also zero).

s2n-bignum original commit: awslabs/s2n-bignum@061ea51
As with the earlier update for doublings, the Jacobian point adidtion
and mixed addition operations for the curves P-256, P-384, P-521,
secp256k1 and SM2 now all have the usual two versions targeting
different microarchitectures, one of them called "_alt", following the
general s2n-bignum convention.

The "_alt" forms for ARM now present are just renamed versions of the
originals (which were based on "_alt" field operations), with the new
code taking over the old non-alt name. For x86 the non-alt ones are
the same as before and the "_alt" forms are new.

s2n-bignum original commit: awslabs/s2n-bignum@acd4fd3
Complete improvements to Weierstrass point additions
s2n-bignum original commit: awslabs/s2n-bignum@b9266e7
…in tactics

This patch adds `bignum_mont{mul,sqr}_p384_neon` which are slightly faster than
`bignum_mont{mul,sqr}_p384`.
They use SIMD instructions and better scheduling found with SLOTHY.
Their correctness is verified using equivalence check w.r.t. specifications of their scalar ops.
The new SUBROUTINE lemmas are added to the specification list using
```
./tools/collect-specs.sh arm >arm/proofs/specifications.txt
```

Benchmark results on Graviton2:
```
bignum_montsqr_p384             :    58.6 ns each (var  0.3%, corr  0.06) =   17053295 ops/sec
bignum_montsqr_p384_neon        :    52.6 ns each (var  0.4%, corr -0.04) =   19017192 ops/sec
bignum_montmul_p384             :    72.9 ns each (var  0.2%, corr -0.02) =   13726633 ops/sec
bignum_montmul_p384_neon        :    68.1 ns each (var  0.3%, corr  0.02) =   14680905 ops/sec
```

Test and benchmark were updated to include these & fix incorrect naming bugs
in my previous p256_neon patch.

Also, some speedups in tactics are made:

1. `ARM_STEPS'_AND_ABBREV_TAC` and `ARM_STEPS'_AND_REWRITE_TAC`.

They are tactics for symbolic execution when showing equivalence of two programs
after reordering instructions.
`ARM_STEPS'_AND_ABBREV_TAC` does symbolic execution of the 'left' program and
abbreviates every RHS of new `read comp s = RHS`s,
meaning that after the tactic is done there are a bunch of equality assumptions whose
number increases linearly to the number of instructions.
`ARM_STEPS'_AND_REWRITE_TAC` then does symbolic execution of the 'right' program
and rewrites the results using the assumptions.
This means the overall complexity of `ARM_STEPS'_AND_REWRITE_TAC` was quadratic
to the number of instructions (# assum * # insts = (# insts)^2).
This is fixed to be (close to) linear, by separately maintaining the
abbreviations as a list of theorems internally rather than assumptions.
This doesn’t mean that the therotical time complexity is now linear,
but many tactics inside `ARM_STEPS'_AND_REWRITE_TAC` that inspect assumptions
now run linearly.

2. `FIND_HOLE_TAC`

`FIND_HOLE_TAC` tactic finds the 'hole' in the memory space that can place the
machine code that is used in program equivalence. This is done by inspecting
`nonoverlapping` assumptions, properly segmenting the memory with fixed-width
ranges and doing case analysis. Previously the # splitted cases was something
like 2^((# segments)^2), but now it is reduced to (# segments)^(#segments).
Comparing these two numbers is easier if logarithm is used.

Finally, some lemmas in existing `_neon.ml` proofs are updated so that
they do not mix usage of '*_mc' and '*_core_mc'. '*_core_mc' is a machine
code that is a sub-list of '*_mc' retrieved by stripping the callee-save register
store/loads as well as the ret instruction.
If possible, a lemmas is updated to only use '*_core_mc' because this
makes the modular usage of the lemma possible in bigger theorems.

s2n-bignum original commit: awslabs/s2n-bignum@d3a7b19
Add `bignum_mont{mul,sqr}_p384_neon`, speed improvements/refactoring in tactics
s2n-bignum original commit: awslabs/s2n-bignum@cbef866
This patch adds `bignum_mont{sqr,mul}_p521_neon`.

```
bignum_montsqr_p521             :   114.7 ns each (var  0.2%, corr  0.06) =    8720010 ops/sec
bignum_montsqr_p521_neon        :    83.8 ns each (var  0.4%, corr -0.04) =   11926387 ops/sec
bignum_montmul_p521             :   130.8 ns each (var  0.2%, corr -0.00) =    7644702 ops/sec
bignum_montmul_p521_neon        :   111.4 ns each (var  0.2%, corr  0.04) =    8978421 ops/sec
```

The new subroutine specs are added to specification.txt, and test as well as benchmark are updated.

Modular squaring/multiplication functions are not included in this patch.

This patch also contains the following updates:

- A tactic for showing equivalence of loops is added (the tactic is not used yet).
- Definitions for input state equivalence are canonicalized as `.. /\ (?a. read c1 s = a /\ read c1 s' = a /\ (?b. read c2 s = b /\ read c2 s' = b /\ ( ... )))`
- Minor buggy behaviors in equiv tactics are fixed and performance improvements done

s2n-bignum original commit: awslabs/s2n-bignum@65f046e
Add `bignum_mont{sqr,mul}_p521_neon`
s2n-bignum original commit: awslabs/s2n-bignum@e6ac9bd
This adds `bignum_{sqr,mul}_p521_neon` and their proofs.

The new subroutine specs are added to specification.txt, and test as well as benchmark are updated.

Benchmark results on GV2 are:

```
bignum_mul_p521                 :   135.1 ns each (var  0.2%, corr -0.01) =    7404184 ops/sec
bignum_mul_p521_neon            :   115.5 ns each (var  0.3%, corr  0.00) =    8660108 ops/sec
bignum_sqr_p521                 :   108.9 ns each (var  0.2%, corr  0.08) =    9184994 ops/sec
bignum_sqr_p521_neon            :    78.7 ns each (var  0.3%, corr  0.06) =   12708368 ops/sec
```

s2n-bignum original commit: awslabs/s2n-bignum@02df8e4
P-256 scalar multiplication and related tweaks
s2n-bignum original commit: awslabs/s2n-bignum@2237fe8
P-256 precomputed point scalar multiplication
s2n-bignum original commit: awslabs/s2n-bignum@37c69f1
The new function bignum_montinv_p256 is a "Montgomery domain" variant
of the existing modular inverse function bignum_inv_p256, which seems
often to be better suited to the generally Montgomery-centric field
operations supplied by s2n-bignum for P-256. Given an input x it
returns a z such that x * z == 2^512 (mod p_256), or zero in cases
when x is in {0,p_256}. This does indeed correspond to inversion in
the Montgomery domain because if x == 2^256 * X and z == 2^256 * Z
(both mod p_256) then the congruence x * z == 2^512 (mod p_256) means
that X * Z == 1 (mod p_256). The code is in fact almost identical to
bignum_inv_p256, changing only the starting value for the iterations,
and the proof is very similar.

s2n-bignum original commit: awslabs/s2n-bignum@c07aee5
The new function p256_montjscalarmul[_alt] is analogous to the
existing p256_scalarmul[_alt], doing scalar multiplication n * P for a
point P on the NIST P-256 curve and a scalar n. This variant, however,
uses the Jacobian representation for both input and output points,
with the coordinates in Montgomery form. As such, it is approximately
the same as the "middle" of p256_scalarmul, excluding the mappings
from and back to affine form; it may make a more convenient
building-block for other operations.

s2n-bignum original commit: awslabs/s2n-bignum@0a8a754
Add global assumptions paragraph
s2n-bignum original commit: awslabs/s2n-bignum@d61796f
Update Arm cosimulator to check that opcodes appear, add x86_att test to CI
s2n-bignum original commit: awslabs/s2n-bignum@3eb104f
Fix cmov implementation in ML-KEM/Kyber to use our built-in
constant-time functions.
The OID values are taken from OQS's [openssl provider][1].

The only manual changes in this PR were to `crypto/obj/objects.txt`. The
rest were generated by running `go run objects.go` in the `crypto/obj/`
directory.

[1]:
https://github.com/open-quantum-safe/oqs-provider/blob/acd7181ae34ff05261d30d111eb5102d4f8c623a/ALGORITHMS.md?plain=1#L253-L254
Also some minor fixes from running doc.go over x509.h.

Bug: 426
Change-Id: I8d4cf0cc2161aa7ab8ff9461b6d99daa9d9360c6
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/65211
Reviewed-by: Bob Beck <[email protected]>
Commit-Queue: David Benjamin <[email protected]>
(cherry picked from commit 91b7df2aff7d842fa305dd9e3ebee68f9b357e7c)
Goldmont, the successor to Silvermont in the Atom line, added XSAVE, so
these "MOVBE without XSAVE" code paths are strictly for Silvermont. The
code paths will still work on Silvermont but with reduced performance.

Change-Id: I57f530f487e0f9b6b3f6aac912dbfaaa46628b9f
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/64788
Commit-Queue: Bob Beck <[email protected]>
Reviewed-by: David Benjamin <[email protected]>
Reviewed-by: Bob Beck <[email protected]>
(cherry picked from commit 77ee4e4609cfb3480e1a554790348ebcab61313e)
Per the header, these symbols were private and only exported for
decrepit. But decrepit can include internal headers.

Change-Id: I1155f4b98252004b80a53efb0a6009400a6c59ac
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/65687
Auto-Submit: David Benjamin <[email protected]>
Commit-Queue: Bob Beck <[email protected]>
Reviewed-by: Bob Beck <[email protected]>
(cherry picked from commit 7f45053d42ae0b5f4d5d96fa471d671c6d1462e9)
These macros are unnamespaced and only used in one file. Keep their
definitions more local. Also remove the #undefs at the end of des.c.
They date to when des.c was part of bcm.c, but it isn't anymore.

Change-Id: I6306929929f2304b93c3fdb2e787965c40729d6a
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/65688
Commit-Queue: David Benjamin <[email protected]>
Reviewed-by: Bob Beck <[email protected]>
(cherry picked from commit cba7adcd108e9a41a992b4c4fc18b050e4d05a66)
### Issues:

no issue is filed

### Description of changes: 

cosmetic CI speedup. 1-2 sec for each job

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and the ISC license.
These macros help users of AWS-LC to know the HMAC precomputed key sizes
without having to create an HMAC context.

Note that we did not define them as `xxx_CHAINING_LENGTH * 2`, where
`xxx` is the hash function name, because `xxx_CHAINING_LENGTH` is an
internal macro we do not want to expose. Instead, we use a static assert
in `hmac.c` to verify the macros are consistent.y.
OpenSSL's hmac command line tool uses `dgst` with a `-hmac` option.
Supporting the entire functionality of `dgst` will take a bit more work,
but we can support `-hmac` with the default `SHA256` for the time being.

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and the ISC license.
davidben and others added 25 commits October 18, 2024 13:09
This is still a bit more tedious than I'd like, but we've got three of
these and I'm about to add a fourth. Add something like Chromium's base
class. But where Chromium integrates the base class directly with
scoped_refptr (giving a place for a static_assert that you did the
subclassing right), we don't quite have that since we need to integrate
with the external C API.

Instead, use the "passkey" pattern and have RefCounted<T>'s protected
constructor take a struct that only T can construct. The passkey ensures
that only T can construct RefCounted<T>, and the protectedness ensures
that T subclassed RefCounted<T>. (I think the latter already comes from
the static_cast in DecRef, but may as well.)

Change-Id: Icf4cbc7d4168010ee46dfa3a7b0a2e7c20aaf383
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/66369
Reviewed-by: Bob Beck <[email protected]>
Commit-Queue: David Benjamin <[email protected]>
(cherry picked from commit fbf10f0d968beb56622eb4927bace53a0e931189)
…ws#1923)

### Cherry-picked commit
*
google/boringssl@72a6050

### Original commit message
This is extremely silly and a huge waste of everyone's time (I deeply
regret disambiguating illegal_parameter and decode_error for RFC 8446),
but so it goes. Technically these errors pass the TLS syntax and are an
invalid value, so they should be illegal_parameter.

   Note: TLS defines two generic alerts (see Section 6) to use upon
   failure to parse a message.  Peers which receive a message which
   cannot be parsed according to the syntax (e.g., have a length
   extending beyond the message boundary or contain an out-of-range
   length) MUST terminate the connection with a "decode_error" alert.
   Peers which receive a message which is syntactically correct but
   semantically invalid (e.g., a DHE share of p - 1, or an invalid enum)
   MUST terminate the connection with an "illegal_parameter" alert.

Update-Note: The error sent on invalid key share is now more correct.
This does not change which connections do or do not fail, only which of
two practically identical alert codes is sent to the other side.

Change-Id: If0ddf511d6cf23383c6134ad30e3ae080c4f2769
Reviewed-on:
https://boringssl-review.googlesource.com/c/boringssl/+/71627
Auto-Submit: David Benjamin <[email protected]>
Reviewed-by: Bob Beck <[email protected]>
Commit-Queue: Bob Beck <[email protected]>

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and the ISC license.

Co-authored-by: David Benjamin <[email protected]>
Could theoretically be null I suppose.
### Issues:
Resolves CryptoAlg-2696

### Description of changes: 
* Add support for callbacks in DH paramgen

### Call-outs:
* EVP_PKEY_paramgen uses the same setup for callbacks as
EVP_PKEY_keygen: https://docs.openssl.org/3.0/man3/EVP_PKEY_keygen/

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and the ISC license.
By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and the ISC license.
### Description of changes: 
Remove old and/or unsupported Intel CPU types from SDE tests.

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and the ISC license.
FIPS review: The module should enter an error state if PCT fails in EC
or RSA key generation, so there should be no retries and it aborts. This
is to avoid that other threads would continue to use the module.
### Description of changes: 
* Add back "p4p" testing.
* Increase timeout for test.
* Clarify error message.

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and the ISC license.
To better communicate our current support for PQ algorithms in AWS-LC and their integrations, we add the first draft of a PQ README documentation. This documentation is aimed to help users of AWS-LC understand which PQ algorithms are provided, give more details into their security levels, key/signature/ciphertext sizes.
### Issues:
Addresses P162082038

### Description of changes: 
Bump the CI to use MySQL 9.1.0

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and the ISC license.
### Description of changes: 
* ACVP testing for SHA2-512/224 and SHA2-512/256 for PBKDF and KDA:
HKDF.
* Update service indicator for PBKDF, HKDF, and HKDF_expand (KBKDF
feedback mode) to support SHA2-512/224 and SHA2-512/256 as approved
algorithms.

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and the ISC license.
### Issues:
CryptoAlg-2552

### Description of changes: 
This PR adds Nmap tip of main to our CI. It includes: 
1. A patch to Nmap
2. Adding an include for <errno.h> to <err.h>, OpenSSL does this but we
previously did not
3. Adding BIO_destroy_bio_pair (the functionality was already
implemented, we are simply exposing that through a new func)
4. Creating a new alias for "DES" in the EVP_CIPHER API. The "default"
mode used in OpenSSL 1.1.1 and 3.1 with DES is CBC
[(link)](https://docs.openssl.org/1.1.1/man1/enc/#supported-ciphers:~:text=des%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20Alias%20for%20des%2Dcbc).

### Testing:
Locally built and ran nmap tests with AWS-LC and verified they pass.
Conducted a dry run with the extra included file to ensure it doesn't
break any consumers in live (refer to CryptoAlg-2552 for the dryrun
link).

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and the ISC license.
### Description of changes: 
Fix FIPS.md typo - "4759"


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and the ISC license.
Ruby has tests asserting that OpenSSL encodes and decodes infinity
points to 0. This doesn't quite make sense, but our hand is sort of
forced here. It does show that there's dependence on this subtle
behavior and we might as well support it since we're trying to gain
better compatibility.

```
    assert_equal false, point.infinity?
    point.set_to_infinity!
    assert_equal true, point.infinity?
    assert_equal 0.to_bn, point.to_bn
    assert_equal B(%w{ 00 }), point.to_octet_string(:uncompressed)
    assert_equal true, point.on_curve?
```

### Call-outs:
Tweak against changes from these commits;
* d9e2702
* a7185da

### Testing:
Tweaked existing tests that checked we did not support this behavior. 

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and the ISC license.
This change introduces a new filter BIO, `BIO_f_cipher` for use in PR
1816.

The cipher BIO sits in front of a backing BIO, encrypting incoming
writes before writing to the backing BIO and decrypting data read from
the backing BIO. This implementation is almost an exact copy of
[OpenSSL's](https://github.com/openssl/openssl/blob/8e0d479b98357bb20ab1bd073cf75f7d42531553/crypto/evp/bio_enc.c#L59)
with some functionality removed. We try to change as little of the
underlying logic as possible, but rename variables and add comments for
clarity.
Ruby test-cases expect to be able to set the digest for the DSA paramgen
operation.


By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and the ISC license.
This is the first time it has come up, but Ruby depends on the
`POINT_CONVERSION_HYBRID` format in its tests and expects the underlying
libcrypto to support it. There's not too much a difference between
`POINT_CONVERSION_HYBRID` and
`POINT_CONVERSION_UNCOMPRESSED` except that the first bit contains
information about the quadratic equation y is at. We have the building
blocks to support and can get this working with some tweaking around.
Most of this change is working with the first byte
`POINT_CONVERSION_HYBRID` does differently and point the encoding to
the right places for parsing.

### Testing:
* Testing parsing from and back with `POINT_CONVERSION_HYBRID`. What
we're parsing to and from hybrid doesn't really matter here, they're
already covered by other code paths.

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and the ISC license.
Implements the basic randomness generation code path. Dependencies are
now:
```
             +-----+                   
      +----->|ube.c|<--------+         
      |      +-----+         |         
      |                      |         
      |                      |         
+-----+----+         +-----------------+
|new_rand.c|+------->|entropy_sources.c|
+----------+         +-----------------+
```
randomness generation is implemented in `new_rand.c`. It now depends on
`entropy_sources.c` for entropy needs. Eventually, these will depend on
`ube.c` that will provide services to protect volatile memory. `ube.c`
is not implemented yet, but e.g. `rand_ensure_ctr_drbg_uniquness()` will
depend on it.
Implements the uniqueness-breaking event detection backend with the new
function `get_ube_generation_number()` being the entry point. This is
basically an abstraction on top of the existing detection methods AWS-LC
currently support. The implementation in this PR includes the UBE logic
and excludes:
- Using the new function `get_ube_generation_number()` in the new
randomness generation.
- Using the fork/snapsafe event detection methods
`CRYPTO_get_fork_generation` and `CRYPTO_get_snapsafe_generation`.

These fill follow later.

The UBE subsystem is stateful maintaining the global state `ube_state`
which contain the global ube generation number. This value can be used
to protect volatile state.
This PR implements a default entropy source object for the randomness generation. This is done by defining the function table that is already assumed by the randomness generation code in new_rand.c. The configuration is similar to the one already being used but simplified. It's now assumed that an entropy source can never not be initialised which seems reasonable...

Since the entropy source method table is a const object, get_entropy_source() is changed to return a reference to it. Related code is also changed to take this into account.

Finally, the entropy source object "cleanup" function is changed to not return anything. It's called at thread exit and if it fails there is no way to recover anyway.
This PR integrates protection of the thread-local state into the new randomness generation implementation. rand_ensure_ctr_drbg_uniquness() is the function that determines whether a randomization of the thread-local state is necessary. rand_ensure_ctr_drbg_uniquness() is called inline in the core randomness generation code path (in RAND_bytes_core() and invoked on every entry.

The only mechanism currently implemented that can force a randomization is the UBE mechanism implemented in bc7aeff. Note that if UBE is "unavailable" then a randomization is forced every time.
To avoid synchronisation issues for codepoints that reads/mutates the ctr-drbg state in aws#1919, we must reorganise the reseed logic. This decouples the code that reads/mutates the ctr-drbg state and the code that gathers entropy. The latter is not an issue. The former will later be wrapped as critical code needing synchronisation as part of the global zeroisation.
@torben-hansen torben-hansen requested a review from a team as a code owner November 1, 2024 21:43
@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 78.59113% with 1009 lines in your changes missing coverage. Please review.

Please upload report for BASE (randomness_generation@73aa0bd). Learn more about missing BASE report.

Files with missing lines Patch % Lines
crypto/bio/bio.c 74.84% 121 Missing ⚠️
crypto/bio/connect.c 56.34% 117 Missing ⚠️
crypto/asn1/tasn_dec.c 83.61% 69 Missing ⚠️
crypto/bio/hexdump.c 0.00% 67 Missing ⚠️
crypto/asn1/a_utf8.c 50.39% 63 Missing ⚠️
crypto/bio/file.c 60.60% 52 Missing ⚠️
crypto/bio/fd.c 50.00% 46 Missing ⚠️
crypto/asn1/a_strex.c 83.54% 38 Missing ⚠️
crypto/base64/base64.c 85.59% 34 Missing ⚠️
crypto/asn1/a_int.c 84.28% 33 Missing ⚠️
... and 28 more
Additional details and impacted files
@@                   Coverage Diff                    @@
##             randomness_generation    #1961   +/-   ##
========================================================
  Coverage                         ?   78.77%           
========================================================
  Files                            ?      595           
  Lines                            ?   101731           
  Branches                         ?    14418           
========================================================
  Hits                             ?    80139           
  Misses                           ?    20955           
  Partials                         ?      637           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@torben-hansen torben-hansen merged commit be23ae5 into aws:randomness_generation Nov 1, 2024
113 of 116 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.