-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Better random number generator #80
Comments
See also #36 |
I've played around with alternative RNG methods a bit, in particular by piping a few twister implementations into some Gaussian generators such as fast-rng-java. Note that in addition to entropy concerns one must also consider speed, as for e.g. monte-carlo applications speed of the generator can be a large factor in overall performance. Using or implementing our own thing in kotlin could also close #36. As mentioned on the youtrack ticket for kotlin.math.Random, Nat Pryce has an implementation of twister in pure kotlin. The author claims that "MersenneTwisterFast achieve[s] well over twice the speed of MersenneTwister. java.util.Random is about 1/3 slower than MersenneTwisterFast." I haven't verified the validity of the output distribution or run it through any statistical test suites, but it seems promising and was where I was going to start for closing out #36. We've also got an implementation of polar form Box-Muller in pure kotlin for normal distributed numbers, although ziggurat would be better. |
A search also turns up a partial implementation of PCG in kotlin. Looks like its already designed to be used as multiplatform for jvm/js. I've not tried it yet though. |
The complete PCG algorithm is basically five lines of code. Here's the whole thing. It's also a lot faster than MT. The only thing that makes java.util.Random slow is that it's thread safe. Other than that, it's completely trivial. The corollary is that the only way any RNG is going to be faster is to give up on thread safety, which isn't necessarily a good idea. Hmmm. Although now that I think about it, you could optimize it for batch generation. If your main use case is filling matrices with random variables, you could acquire a lock, generate all the values, and then release it. That would be very fast. The downside is that generating single random numbers might be slower than using atomics, as java.util.Random does. (But not necessarily, at least if there's no thread contention. Biased locking in the JVM makes that pretty fast.) |
Not necessarily-- I've been able to get very fast performance out of a fork join pool-based rng that is still thread safe. Each call to
To be clear, I'm not against PCG at all. Whatever is the easiest, most correct and fastest solution is good with me. |
Right, I was assuming you wanted a single random sequence. If you're ok with multiple independent generators, you could just use a ThreadLocal to get a separate generator for every thread. The initial seed value provided by the user can be used to initialize a simple RNG (for example, a java.util.Random) which then produces the seeds for the real generators as they're created. |
I'd love to scope out a fully concurrent RNG module (my ultimate goal with #36), where the user is able to configure exactly how concurrent it is. There's a lot of interesting things we could do. For example, we could have (disableable) low-priority background threads generate randomness into a buffer, so the pre-generated numbers are ready for the user thread when it needs them. The user could specify how many workers to spawn and whether or not they should run as an on-demand batch job or async background task. However, we could break this up into smaller tasks. Step 1 is to switch over to our own simple solution, a single-threaded implementation that uses our favored algorithms, perhaps starting with PCG and either Box-Muller/Ziggurat. |
Ok, I can implement that. Regarding thread safety, how about this? The actual generation of random numbers will be in non-threadsafe internal methods, for example |
👍 That seems like a reasonable approach for a first cut, as it would allow us to generate larger matrices without lock spam. I can think of some useful additions (thread-local seeds, generation for |
Random number generation is implemented with java.util.Random. That's a linear congruential generator, which isn't suitable for scientific applications. Or just about any other application where you care how the numbers are distributed. It would be good to replace it with something more robust.
The most popular choice is Mersenne Twister, but that's mainly for historical reasons. It has some statistical weaknesses, and it's generally inferior to newer algorithms. PCG has been getting a lot of attention lately. It's fast, and it passes every statistical test that's been thrown at it. It's also very easy to implement.
The text was updated successfully, but these errors were encountered: