mirage-crypto-rng.unix thread safety? #249
Hi, currently, only the You can find (interesting) discussions on this subject in this PR: #227 I agree with you that we should improve the documentation regarding mirage-crypto-rng and OCaml 5 - there is a question but we have not, at this stage, found any really satisfactory solutions between OCaml 4, OCaml 5, unikernels (where |
A more practical solution would be to rewrite Pfortuna using the Mutex/Condition modules that OCaml 5 offers, and to initialise the RNG with such a module. In that case I suspect mirage-crypto-rng could be domain-safe, but we'd have to confirm with TSan.
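To illustrate the locking idea, here is a minimal sketch that serialises access to a generator with a single coarse `Mutex`. It is not Pfortuna and not the mirage-crypto API: the `GENERATOR` signature, the `Locked` functor and the toy `Counter` generator are all hypothetical names invented for this example, and `Counter` is deliberately not cryptographic. On OCaml 5 `Mutex` is in the standard library; on OCaml 4 it needs `threads.posix`.

```ocaml
(* Hypothetical signature for a generator with mutable internal state. *)
module type GENERATOR = sig
  type t
  val create : unit -> t
  val generate_into : t -> bytes -> off:int -> len:int -> unit
end

(* Wrap any generator so that every call runs under one mutex. *)
module Locked (G : GENERATOR) : GENERATOR = struct
  type t = { g : G.t; m : Mutex.t }

  let create () = { g = G.create (); m = Mutex.create () }

  let generate_into t buf ~off ~len =
    Mutex.lock t.m;
    (* Release the lock even if the underlying generator raises. *)
    Fun.protect
      ~finally:(fun () -> Mutex.unlock t.m)
      (fun () -> G.generate_into t.g buf ~off ~len)
end

(* Toy stand-in with unprotected mutable state. NOT a real RNG. *)
module Counter : GENERATOR = struct
  type t = { mutable next : int }

  let create () = { next = 0 }

  let generate_into t buf ~off ~len =
    for i = off to off + len - 1 do
      Bytes.set buf i (Char.chr (t.next land 0xff));
      t.next <- t.next + 1
    done
end

module Safe = Locked (Counter)
```

A coarse lock like this only guarantees that two callers never interleave inside `generate_into`; whether it is fast enough, and whether it is truly domain-safe, would still need benchmarking and a TSan run as mentioned above.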
Thread safety is a concern on OCaml 4 too, not just OCaml 5. On OCaml 5 it will just be a lot more obvious, because if something wasn't thread-safe on OCaml 4, it likely won't be domain-safe on OCaml 5 either.
Another possibility for -rng and -lwt is to ditch Fortuna and use getrandom/getentropy directly. That can be done for async, eio and miou as well. We use Fortuna mainly for MirageOS, where we don't have getrandom; and earlier, when mirage-crypto/nocrypto started, getrandom wasn't available. WDYT? The only issue I can think of is that code used in fewer environments breaks sooner (or a breakage is discovered only late).
That will likely be quite slow if we release the runtime lock every time. Although, since 'getrandom' is not meant to perform IO, I think we could get away with not releasing the runtime lock, and then the syscall should be quite fast. But isn't |
Hmm, I see what you mean. Indeed, getentropy is POSIX; getrandom is not, AFAIR. I don't think we need to release the runtime lock, since it does neither IO nor allocation.
Dear @edwintorok, would you mind running some tests on the hardware you have in mind? If the answer is yes, there's a branch named If you check that out, and do a Please note that on GNU/Linux, Performance on my FreeBSD 14 laptop (i7-5600U CPU @ 2.60GHz):
Note the cap at 256 bytes with respect to performance. Now, a separate question is whether this performance difference (I expect it to behave differently on GNU/Linux) is acceptable or not.
On AMD Ryzen 9 7950X (I can test on Intel hardware tomorrow):
I've added another method for comparison: reading from /dev/urandom using a mutex-protected input channel.
diff --git a/bench/dune b/bench/dune
index dec1e4f..0a071e1 100644
--- a/bench/dune
+++ b/bench/dune
@@ -2,7 +2,7 @@
(names speed)
(modules speed)
(libraries mirage-crypto mirage-crypto-rng mirage-crypto-rng.unix
- mirage-crypto-pk mirage-crypto-ec))
+ mirage-crypto-pk mirage-crypto-ec threads.posix))
; marking as "(optional)" leads to OCaml-CI failures
; marking with "(package mirage-crypto-rng-miou-unix)" only has an effect with a "public_name"
diff --git a/bench/speed.ml b/bench/speed.ml
index c9db764..672715f 100644
--- a/bench/speed.ml
+++ b/bench/speed.ml
@@ -491,6 +491,15 @@ let benchmarks = [
throughput name (fun buf ->
let buf = Bytes.unsafe_of_string buf in
Mirage_crypto_rng_unix.getrandom_into buf ~off:0 ~len:(Bytes.length buf))) ;
+ bm "urandom-channel" (fun name ->
+ In_channel.with_open_bin "/dev/urandom" @@ fun ic ->
+ let m = Mutex.create () in
+ let finally () = Mutex.unlock m in
+ throughput name (fun buf ->
+ let buf = Bytes.unsafe_of_string buf in
+ Mutex.lock m;
+ Fun.protect ~finally (fun () -> really_input ic buf 0 (Bytes.length buf))));
]
let help () =
Thanks @edwintorok. I added the urandom-channel to the branch. I also added a pfortuna, which is the miou-fortuna that is safe to use concurrently. Results (again, FreeBSD, on my laptop):
What is there to conclude? Use |
/dev/urandom needs locking too, but is relatively simple to do as shown in the benchmark.
For my purposes that is actually fast enough (as shown above), and it can be implemented without depending on Mirage_crypto (although I think other parts of our application depend on mirage-crypto indirectly). It might still be nice to have a Mirage_crypto generator that just wraps a /dev/urandom channel. OTOH all those workarounds make the code more complicated. I don't think Fortuna needs to be made thread-safe, but please add a warning to its docs that it isn't, and that its use should be avoided outside of unikernels. Correctness and security (i.e. not generating the same random number twice) are more important than performance IMHO, unless you're very sure your application will only ever have 1 thread or 1 domain, which you can only guarantee with unikernels; even with Lwt you could have regular threads if you use Lwt_preemptive.detach. To make this more robust, Fortuna could be moved to a mirage-specific opam package, so that applications can't accidentally link it (although this would be a breaking change).
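A standalone version of the benchmark's approach, without any dependency on Mirage_crypto, could look like the sketch below. The type `t` and the functions `create`, `generate_into` and `generate` are hypothetical names chosen for this example, not an existing API. It assumes a Unix-like system with `/dev/urandom`, and `Mutex` needs `threads.posix` on OCaml 4.

```ocaml
(* Hypothetical generator: a /dev/urandom channel guarded by a mutex. *)
type t = { ic : in_channel; m : Mutex.t }

(* Raises Sys_error if the device cannot be opened. *)
let create () = { ic = open_in_bin "/dev/urandom"; m = Mutex.create () }

(* Fill [len] bytes of [buf] starting at [off]; only one caller at a
   time touches the channel and its internal buffer. *)
let generate_into t buf ~off ~len =
  Mutex.lock t.m;
  Fun.protect
    ~finally:(fun () -> Mutex.unlock t.m)
    (fun () -> really_input t.ic buf off len)

(* Convenience wrapper returning a fresh string of [len] random bytes. *)
let generate t len =
  let buf = Bytes.create len in
  generate_into t buf ~off:0 ~len;
  Bytes.unsafe_to_string buf
```

One thing to keep in mind with this design: an `in_channel` reads ahead into a userspace buffer, so a process that forks after the first read would share the unread bytes between parent and child.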
Thanks for your comment. I'm still undecided:
One of these two should be "the default generator for mirage-crypto-rng-async/mirage-crypto-rng-unix/mirage-crypto-rng-eio/mirage-crypto-rng-lwt". We leave Fortuna for mirage-crypto-rng-mirage (and of course, others can explicitly request it). Also, mirage-crypto-rng-miou stays as is with its Pfortuna. Now, (a) it is still unclear whether /dev/urandom or getentropy should be the default, and (b) it is indeed a breaking API change... but neither getentropy nor /dev/urandom requires cooperation from the scheduler, so we can just put all of it into mirage-crypto-rng (.unix), and deprecate the old We can provide both devurandom and getentropy based RNGs, and the clients can pick/set their default. Still, the question is which to have as default (since 99.9999% of clients won't change the default). |
If performance is the only concern, then a fallback mechanism could be used. This way the default would be thread-safe, and if you've got urandom then also reasonably fast. And perhaps mark Fortuna as deprecated with a warning/alert that it is not thread safe.
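The thread does not spell out the order of the fallback, so the following is only a generic sketch of the idea: try a list of thread-safe sources in order and keep the first one that is available. The `source` record and the functions `urandom` and `pick` are hypothetical names for this example; a real second candidate (for instance one based on getentropy) is left out because it would need a C binding.

```ocaml
(* Hypothetical record: a named source that fills a whole buffer. *)
type source = { name : string; fill : bytes -> unit }

(* /dev/urandom behind a mutex; None if the device cannot be opened. *)
let urandom () =
  match open_in_bin "/dev/urandom" with
  | exception Sys_error _ -> None
  | ic ->
    let m = Mutex.create () in
    let fill buf =
      Mutex.lock m;
      Fun.protect
        ~finally:(fun () -> Mutex.unlock m)
        (fun () -> really_input ic buf 0 (Bytes.length buf))
    in
    Some { name = "urandom"; fill }

(* Try each constructor in order and keep the first that succeeds. *)
let pick candidates = List.find_map (fun mk -> mk ()) candidates
```

With such a helper, `pick [ urandom; other ]` would select `/dev/urandom` when it exists and fall through to `other` (a placeholder for whatever thread-safe source is chosen) otherwise.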
Although, looking at how mirage-crypto-rng ended up being used in our own application, removing Fortuna would be a breaking change:
Also, because there currently isn't a default generator, applications would still need to be changed to be thread-safe even if mirage-crypto-rng is made safe by default. But that's fine if a deprecation mechanism is used that warns when a thread-unsafe generator is used with unix.
Yes |
Provide guidance to use these by default, document that Fortuna is not thread-safe. As suggested in mirage#249
Dear @edwintorok, thanks again for your discussion here. I opened #250 with your suggestions. I'd be happy if you could review that PR and tell me whether it is fine. The |
I don't see any warnings about thread safety in mirage-crypto-rng, but looking at the implementation:
And there are various changes to mutable fields without locks, so I think there is a chance that running Mirage_crypto_rng.generate would generate the same random numbers in 2 different threads, which would be quite bad (or at least I can't prove that it wouldn't).
If the library is not meant to be thread-safe, please add a warning to the .mli
This may not be a problem for mirage (which uses Lwt, so typically single-OCaml-threaded, and not thread safe), but is a problem for multi-threaded OCaml code that would want to use Mirage_crypto_rng.