Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RNG concerns #15

Open
notwa opened this issue Sep 30, 2020 · 2 comments
Open

RNG concerns #15

notwa opened this issue Sep 30, 2020 · 2 comments

Comments

@notwa
Copy link

notwa commented Sep 30, 2020

RecipesAtHome appears to be using C's standard library for random number generation, which is sketchy for a couple of reasons:

  • the underlying implementation could be arbitrarily bad. it's almost certainly an LCG, which isn't inherently bad, but some internal constants work better than others. there's quite a few of them out there. Microsoft (msvcrt.dll) still seems to use (x=x*214013+2531011)>>16 in particular.
  • it probably uses a relatively small amount of internal state. the period of the lower bits of an LCG can be improved by mixing an internal, larger state into a smaller output integer type.
  • it's not thread-safe, unless OMP is doing something fancy here.

additionally, there's a bit of bias in the way the RNG is being used. using a non-power-of-two modulus with a power-of-two-based RNG creates bias, unless the result is suitably re-rolled. there's also an instance of rand() % 100 < 50 where a simple rand() & 1 would suffice. (I suppose you were playing with the ratio at some point)

I think what this means is… edit: I'm looking at the code again with a clearer mind, and the impact should be minimal since handleSelectAndRandom typically falls through which does not invoke the RNG, assuming default settings.

anyway, I've thrown together an ad-hoc patch utilizing the PCG C library (on github) (Apache 2.0/MIT dual-licensed) that should resolve these issues. this library is still LCG-based, but: the constants should be sound, it implements output mixing, it supports up to 128-bits of internal state (what I've chosen), and it performs re-rolls for bounded queries. there are plenty of other libraries out there; this was simply the first to catch my attention. I didn't make this a pull request since my patch is kind of a hack.

@Amphitryon0
Copy link
Contributor

As far as I'm concerned, randomise is a legacy setting that we should eventually stop supporting. The built-in randomization has a separate state for each thread, but I'm curious if you think the period of the lowest bit of the output is likely to ever coincide exactly with starting a new branch. That is the only time that I think periodicity will matter, but it will be catastrophic if it occurs (effectively wasting a thread until the program is restarted).

With that said, if there is no locking or other performance difference for a pull request with a different PRNG, I would see little problem with merging it in if it avoids that issue.

@techsy730
Copy link
Contributor

techsy730 commented Nov 8, 2021

In my current dev branch, I successfully got the C++ random number generators working (even though this is a C program, I have a very lightweight wrapper for it).

I tried quite a multitude of RNG implementations, both library standards and third party, and std::linear_congruential_engine seemed to be one of the very few that was faster then rand() for RecipesAtHome (and it can be made thread safe pretty trivially)

If it doesn't bloat the final binary by much, I'll try getting that in a pull requestable state.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants