[math-fun] OK, I programmed my proposed pseudo-random generator....
========= Summary results of BigCrush =========

 Version:              TestU01 1.2.3
 Generator:            warren/32/4
 Number of statistics: 160
 Total CPU time:       06:19:48.03
 The following tests gave p-values outside [0.001, 0.9990]:
 (eps  means a value < 1.0e-300):
 (eps1 means a value < 1.0e-15):

       Test                          p-value
 ----------------------------------------------
  7  CollisionOver, t = 7            3.1e-4
 ----------------------------------------------
 All other tests were passed

(This is with a revised version of Warren's code -- he found a bug -- and for a 32-bit machine, with 4 words of state.)

--does it provide a description of this test?

I'm not sure whether to believe this is really a test failure; if BigCrush does 160 tests, we'd expect 0.32 tests to have p outside [0.001, 0.999]. E.g. they say in the BigCrush paper
  http://www.iro.umontreal.ca/~lecuyer/myftp/papers/testu01.pdf
that the AES cryptosystem failed, I think, the same CollisionOver randomness test mine did, with an even worse p = 9.2e-5, but on a re-run with a different seed it easily passed, so the failure was not genuine. They also say a "clear failure" is min(p, 1-p) < 10^(-10). For example, "unix random 32" clearly failed 101 different Crush tests...

Among the generators listed in the BigCrush paper as passing all tests, the fastest included ran2, MRGk5-93, Marsa-LFIB4, SuperDuper64, KISS99, Brent-xor4096s, LFib(2^64, 55, 24, *), and ISAAC. Of those, the four best were probably Marsa-LFIB4, KISS99, Brent-xor4096s, and LFib(2^64, 55, 24, *). Note that all four of these best generators were invented by Marsaglia and/or Brent. They are definitely the Kings.

Marsa-LFIB4 is

  x[i] = x[i-55] + x[i-119] + x[i-179] + x[i-256]

using a 256-element state vector and (mod 2^power) addition (any of the + may be replaced by -). Very fast and simple (see the C sketch at the end of this message). I just timed it versus my own rand gen:

  Marsa-LFIB4:  2.83 sec
  WarrenRand:  13.23 sec

so I'm 4.7X slower. However... it depends on your computer. I presume they weren't lying in the prizewinning paper when they claimed they had the fastest crushproof generators. The only reason Marsa-LFIB4 can be slower than way-more-complicated PRNGs like those in
  http://www.thesalmons.org/john/random123/papers/random123sc11.pdf
is the memory use: with modern processors it pays to do more computing if you touch less memory.

Incidentally, if you re-run with only 3 state words (or surely with only 2), my randgen will presumably fail more tests, and it would be interesting to know the descriptions of the failed tests so that one can then attempt to strengthen the randgen.
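(A minimal C sketch of the Marsa-LFIB4 recurrence, assuming 32-bit words -- transcribed from the description above, not Marsaglia's exact code; the LCG seeding is only a placeholder, and real use should fill the state from a good independent source:)

#include <stdint.h>

/* Marsa-LFIB4: x[i] = x[i-55] + x[i-119] + x[i-179] + x[i-256] (mod 2^32).
   The unsigned char index makes the 256-word circular buffer wrap for free. */
static uint32_t t[256];
static unsigned char c = 0;

uint32_t lfib4(void) {
    /* t[c] holds x[i-256]; overwrite it in place with x[i]. */
    t[c] = t[c]
         + t[(unsigned char)(c + 201)]   /* x[i-55]:  256 - 55  = 201 */
         + t[(unsigned char)(c + 137)]   /* x[i-119]: 256 - 119 = 137 */
         + t[(unsigned char)(c +  77)];  /* x[i-179]: 256 - 179 = 77  */
    return t[c++];
}

void lfib4_seed(uint32_t s) {
    /* Placeholder seeding via a simple LCG; in practice, fill all
       256 words from a good independent generator. */
    for (int i = 0; i < 256; i++)
        t[i] = (s = 69069u * s + 1234567u);
    c = 0;
}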
[me:]
 ----------------------------------------------
  7  CollisionOver, t = 7            3.1e-4
 ----------------------------------------------
 All other tests were passed
[Warren:]
--does it provide a description of this test?
It's like Marsaglia's OPSO test -- generate lots of numbers, look at all the d-tuples for some d, chop the d-cube into some number of cells such that we expect some collisions but not all that many, and look at the actual distribution. I think I remember from the Diehard paper that this is quite a stringent test and some otherwise successful RNGs fail it.

I'm re-running BigCrush to see whether it produces another low p-value on this test, or anything else out of the ordinary. Actually, I think it's already done the CollisionOver tests. There are two with parameter t=7; I don't know exactly what the difference between them is; it looks as if the first one failed before. In the second run, the first one was clearly OK and the second one produced p=0.07, small enough that I'd be a bit suspicious if it were the same test as failed first time around.

So, I ran another 5 each of the two CollisionOver tests with t=7. The resulting p-values:

  0.76 0.66 0.30 0.59 0.25 0.22 0.19 0.89 0.32 0.13

which don't seem like evidence of any sort of problem. So the 3.1e-4 result looks like it was a coincidence.
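(In code, the collision-counting idea looks roughly like this -- not TestU01's actual CollisionOver implementation; d = 2 with 10 bits per coordinate and 2^20 cells are made-up illustrative parameters, and lfib4() stands in for whatever generator is under test:)

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <math.h>

extern uint32_t lfib4(void);   /* generator under test */

/* Throw n overlapping 2-tuples (top 10 bits of each output) into
   k = 2^20 cells and count how many land in an already-occupied cell. */
void collision_test(long n) {
    const long k = 1L << 20;
    unsigned char *cell = calloc(k, 1);
    long collisions = 0;
    uint32_t prev = lfib4() >> 22;             /* top 10 bits */
    for (long i = 0; i < n; i++) {
        uint32_t cur = lfib4() >> 22;
        long idx = ((long)prev << 10) | cur;   /* 20-bit cell index */
        if (cell[idx]) collisions++;
        else cell[idx] = 1;
        prev = cur;
    }
    free(cell);
    /* Expected collisions = n - k + k*(1 - 1/k)^n, i.e. n minus the
       expected number of occupied cells; the observed count is compared
       against (approximately) a Poisson with that mean. */
    double expected = n - k + k * pow(1.0 - 1.0 / k, (double)n);
    printf("%ld collisions, about %.1f expected\n", collisions, expected);
}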
[Warren:]
I'm not sure whether to believe this is really a test failure; if BigCrush does 160 tests, we'd expect 0.32 tests to have p outside [0.001, 0.999].
Yes. (The right thing is probably to do lots of tests, look at their p-values, and test *those* against a uniform distribution. Or something like that. See also: http://www.archim.org.uk/eureka/27/faking.html .)
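(E.g., something like a Kolmogorov-Smirnov test on the collected p-values -- a sketch, using the standard asymptotic series for the KS distribution:)

#include <stdlib.h>
#include <math.h>

static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* KS test of n p-values against Uniform(0,1).  Returns the approximate
   significance level of the KS statistic; a tiny result means the
   p-values themselves are suspiciously non-uniform. */
double ks_uniform(double *p, int n) {
    qsort(p, n, sizeof *p, cmp_double);
    double d = 0.0;   /* KS statistic: max deviation from the uniform CDF */
    for (int i = 0; i < n; i++) {
        double dplus  = (double)(i + 1) / n - p[i];
        double dminus = p[i] - (double)i / n;
        if (dplus  > d) d = dplus;
        if (dminus > d) d = dminus;
    }
    /* Asymptotic Kolmogorov distribution Q(lambda) = 2*sum_{j>=1}
       (-1)^(j-1) exp(-2 j^2 lambda^2), with the usual small-n
       correction folded into lambda. */
    double en = sqrt((double)n);
    double lambda = (en + 0.12 + 0.11 / en) * d;
    double sum = 0.0, sign = 2.0;
    for (int j = 1; j <= 100; j++) {
        double term = sign * exp(-2.0 * j * j * lambda * lambda);
        sum += term;
        if (fabs(term) < 1e-12) break;
        sign = -sign;
    }
    return sum;
}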
[Warren:]
Among the generators listed in the BigCrush paper as passing all tests, the fastest included ran2, MRGk5-93, Marsa-LFIB4, SuperDuper64, KISS99, Brent-xor4096s, LFib(2^64, 55, 24, *), and ISAAC. Of those, the four best were probably Marsa-LFIB4, KISS99, Brent-xor4096s, and LFib(2^64, 55, 24, *). Note that all four of these best generators were invented by Marsaglia and/or Brent. They are definitely the Kings.
Yup, very smart people.
[Warren:]
The only reason Marsa-LFIB4 can be slower than way-more-complicated PRNGs like those in
  http://www.thesalmons.org/john/random123/papers/random123sc11.pdf
is the memory use: with modern processors it pays to do more computing if you touch less memory.
Very much so. Though for modest amounts of state (as with Marsa-LFIB4) the RNG state is much, much smaller than L1 cache, and it seems like, if you're using the RNG enough to care how fast it is, it will (1) stay in L1 and (2) not occupy enough of it to slow everything else down much. Unless you're using lots of independent RNGs at the same time, as in that paper, but that's kinda specialized. (The paper doesn't in fact seem to say anything about Marsa-LFIB4, and I beg leave to doubt whether Marsa-LFIB4 is in fact slower than the RNGs in that paper unless you're using lots of them in parallel.)
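(If anyone wants to reproduce the timing comparison, something like this crude harness will do -- the volatile sink is just to stop the compiler from optimizing the loop away; the call count, and which generator you plug in, are up to you:)

#include <stdio.h>
#include <time.h>
#include <stdint.h>

extern uint32_t lfib4(void);          /* or any generator being timed */

int main(void) {
    const long N = 1000000000L;       /* 10^9 calls */
    volatile uint32_t sink = 0;       /* defeat dead-code elimination */
    clock_t t0 = clock();
    for (long i = 0; i < N; i++)
        sink ^= lfib4();
    double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;
    printf("%.2f sec for %ld calls (sink=%u)\n", secs, N, (unsigned)sink);
    return 0;
}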
[Warren:]
Incidentally, if you re-run with only 3 state words (or surely with only 2), my randgen will presumably fail more tests, and it would be interesting to know the descriptions of the failed tests so that one can then attempt to strengthen the randgen.
Please feel free to do the tests. :-) -- g