[math-fun] provable pseudo-randomness viewed as about information loss
The Blum Blum Shub "provably unpredictable" pseudo-random number generator is just repeated squaring modulo p*q, where p and q are large primes that are each 3 mod 4. Output just one bit per iteration, and the bits are provably hard to predict in the backwards-time direction, in the sense that any algorithm that could predict them would also be able to factor p*q.

Well, it seems to me, abstracting what is going on and regarding the prime-factoring stuff as a red herring... that what we have here is information loss. That is, squaring (mod p*q) is a many-to-one map. It loses information. If you try to go backward in time, there are several "correct answers"... except only one of them is itself a square (i.e. could have been the result of a forward iteration) and hence the "really correct" answer.

So. Let us contrast iterating a function x:=F(x) in two cases: where F is a random bijection, and where F is a random function, in both cases with domain and range equal to a universe of cardinality U.

Random bijections F: After iterating F a total of T times in a U-element universe from a random starting point, the chance the last step creates a cycle is 1/(U+1-T). So the chance that we evade creating a cycle during steps 1,2,3,...,T is
  (U-1)/U * (U-2)/(U-1) * ... * (U-T)/(U+1-T) = (U-T)/U,
so the chance that we are in a giant cycle missing <=M elements of the universe is M/U. In other words, in the best 1% of such experiments, our cycle will miss <=1% of the universe.

Random functions F: After iterating F a total of T times in a U-element universe from a random starting point, the chance the last step creates a cycle, by returning to a previously visited element, is T/U. So the chance that we evade creating a cycle during steps 1,2,3,...,T is
  (U-1)/U * (U-2)/U * (U-3)/U * ... * (U-T)/U
which is <= (1-(T+1)/(2*U))^T by Jensen's inequality, and >= (1-T/U)^T by monotonicity of the terms. Hence if U is large, the period length is more than 63% likely to be below sqrt(2*U), but also more than 50% likely to be above sqrt(U)*ln2 = 0.69*sqrt(U). The chance the period exceeds K*sqrt(2*U) is less than exp(-K). So the best 1% of experiments should have periods exceeding about ln(100)*sqrt(U), which is about 4.6*sqrt(U), but the worst among the best 1% should not exceed ln(100)*sqrt(2*U), which is about 6.5*sqrt(U).

The point I want to make is: iterating a random function F which is NOT bijective loses information, in fact about 1 bit of information per iteration. One might therefore expect that if we output 1 bit from each iterate, then essentially ANY such iteration ought to be "hard to predict" in the backwards-time direction, by any algorithm which doesn't work hard enough to know which elements of the universe are cycle-members and which aren't.

In the Blum-Blum-Shub case, the "cycle members" were the "squares," and recognizing a cycle member with 51% accuracy was equivalent to recognizing squares, which was computationally polynomial-time-equivalent to factoring p*q. But I think that in general, where F is implemented using a boolean circuit, say, distinguishing cycle members from non-members is going to be a hell of a lot harder than integer factoring. Integer factoring is soluble in polynomial time if you own a quantum computer. But cycle-member recognition should be beyond quantum assistance, since information loss is, generally speaking, a quantum-computer killer. Essentially the only way I know to test cycle-membership for X is just to iterate F until it cycles, then ask whether X was on that cycle (sketched in code below).
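A minimal Python sketch of that brute-force test (the helper names and the toy random-function setup are illustrative choices, not part of the original construction):

  # Sketch: decide cycle membership by iterating F until some value repeats.
  # F here is a random function on a U-element universe, built as a lookup table.
  import random

  def rho_walk(F, x0):
      """Iterate x := F(x) from x0 until a value repeats.
      Return (tail_length, cycle_length, set_of_cycle_members)."""
      seen = {}                      # value -> step at which it was first seen
      x, step = x0, 0
      while x not in seen:
          seen[x] = step
          x = F(x)
          step += 1
      first = seen[x]                # the walk entered the cycle at this step
      cycle = {y for y, i in seen.items() if i >= first}
      return first, step - first, cycle

  def is_cycle_member(F, x):
      # X lies on a cycle iff iterating F from X eventually returns to X,
      # i.e. iff the rho-walk started at X has tail length 0.
      tail, _, _ = rho_walk(F, x)
      return tail == 0

  if __name__ == "__main__":
      U = 10**6
      table = [random.randrange(U) for _ in range(U)]   # a random function F
      F = lambda x: table[x]
      x0 = random.randrange(U)
      tail, cyclen, _ = rho_walk(F, x0)
      print("tail:", tail, "cycle length:", cyclen, "(each typically ~ sqrt(U))")
      print("x0 is a cycle member:", is_cycle_member(F, x0))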
That brute-force approach takes exponential expected time (roughly sqrt(U) = 2^(N/2) steps for an N-bit universe) for random functions F, and it also empirically takes exponential expected time for plenty of actual F.

Now let me note that the "boolean circuit cycle recognition problem" is, in a sense, COMPLETE. PROBLEM STATEMENT: Given a boolean circuit F that inputs N bits and outputs N bits, and given an N-bit string X, the question is: "is X a cycle member under iteration of F?" Fairly obviously, any oracle capable of answering that with >=51% accuracy (i.e. with both false-positive and false-negative error rates below 49%) would also be capable of solving cycle membership for any function F computable in polynomial time, and I think it also ought to be capable of 99.9999% accuracy for "most" F.

So regard this as a new (?) computational complexity class. Can we now prove that interesting/useful subclasses of it remain complete? (Such as, as a first step, restricted classes of circuits?)

-- Warren D. Smith
http://RangeVoting.org <-- add your endorsement (by clicking "endorse" as 1st step)
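For concreteness, here is a toy Python sketch of the Blum Blum Shub iteration this post started from; the primes, the seed, and the choice of the low-order bit as the output bit below are illustrative choices of mine, far too small to be secure:

  # Toy Blum Blum Shub: x := x^2 mod (p*q), with p and q primes that are 3 mod 4,
  # emitting one low-order bit per squaring (the information-losing step).
  def bbs_bits(seed, nbits, p=100003, q=100019):   # both primes are 3 mod 4
      n = p * q
      x = seed % n                  # seed should be coprime to n
      out = []
      for _ in range(nbits):
          x = (x * x) % n           # squaring mod p*q is many-to-one
          out.append(x & 1)         # output just one bit per iteration
      return out

  print(bbs_bits(seed=3141592653, nbits=32))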
Here is a better assessment of the "information loss" when a random function F is iterated in some universe of large cardinality U.

If Y=F(X) for k different pre-images X, then the probabilities of the different k should essentially be a Poisson distribution with expectation=1,
  Prob(k) = exp(-1) / k!
whose entropy is
  -Sum[ exp(-1)/k! * log(exp(-1)/k!), k=0..infinity ] / log(2) = 1.882489 bits,
except that if we restrict to k>0 only, then the (renormalized) probability is
  Prob(k) = 1 / ((e-1) * k!)
whose entropy is
  -Sum[ 1/((e-1)*k!) * log(1/((e-1)*k!)), k=1..infinity ] / log(2) = 1.47443 bits.
So, I presume that means that each iteration, we lose about 1.47443 bits of information. In contrast, a perfect coin flip has Prob=1/2 and entropy=1 bit.

Re my conjecture about the hardness of cycle-membership (and cycle cardinality presumably also is hard), here is a specific case. Let x be a 64-bit unsigned integer with arithmetic mod 2^64. What is the cycle length of
  x := Rotate(3141592653589793239*x mod 2^64, 32);
starting from x=1? It might be beyond the ability of humankind to answer this. And that example was not even information-losing. The answer is somewhere in [2^45, 2^64], probably toward the high end.

A 128-bit info-losing example would be
  x := Rotate(31415926535897932384626433832795028841*x mod 2^128, 64) - x mod 2^128;
and what is the period of that starting from x=12345678987654321? Is x=12345678987654321 a cycle member? Can anybody see how to answer such questions any faster than about 2^64 operations? Seems like they are harder than hell, and they dwarf the hardness of factoring a 128-bit or even a 256-bit number.

-- Warren D. Smith
http://RangeVoting.org <-- add your endorsement (by clicking "endorse" as 1st step)
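Here is a Python sketch of the two example maps (my reading: Rotate(y, r) rotates the full word by r bit positions; since r is half the word width in both examples, left versus right rotation makes no difference), together with a quick numerical check of the two entropy figures quoted above:

  from math import e, log, factorial

  # Quick check of the two entropy figures (in bits):
  poisson = [e**-1 / factorial(k) for k in range(40)]             # Poisson(1)
  cond    = [1 / ((e - 1) * factorial(k)) for k in range(1, 40)]  # conditioned on k >= 1
  H = lambda ps: -sum(p * log(p, 2) for p in ps if p > 0)
  print(H(poisson), H(cond))       # approximately 1.882489 and 1.47443

  MASK64, MASK128 = 2**64 - 1, 2**128 - 1

  def rot(y, r, w):                # rotate the w-bit word y by r bit positions
      mask = (1 << w) - 1
      y &= mask
      return ((y << r) | (y >> (w - r))) & mask

  def step64(x):                   # the bijective 64-bit example
      return rot((3141592653589793239 * x) & MASK64, 32, 64)

  def step128(x):                  # the information-losing 128-bit example
      m = 31415926535897932384626433832795028841
      return (rot((m * x) & MASK128, 64, 128) - x) & MASK128

  # Brute-force period hunting means iterating until the start value recurs --
  # hopeless if the period really is anywhere near 2^64, but here is the loop:
  x, start = 1, 1
  for i in range(1, 10**6 + 1):
      x = step64(x)
      if x == start:
          print("period =", i)
          break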
On Thu, Mar 31, 2016 at 11:31 AM, Warren D Smith <warren.wds@gmail.com> wrote:
Re my conjecture about the hardness of cycle-membership (and cycle cardinality presumably also is hard) here is a specific case. Let x be a 64-bit unsigned integer with arithmetic mod 2^64. What is the cycle length of x := Rotate(3141592653589793239*x mod 2^64, 32); starting from x=1?
If there's a fast way to compute the nth iteration of a function, then a quantum computer can find the cycle length in polynomial time.

-- Mike Stay - metaweta@gmail.com
http://www.cs.auckland.ac.nz/~mike
http://reperiendi.wordpress.com
I wonder if the children of computer scientists play a game in which n-1 children sit in a circle and the nth goes around the outside, tapping players on the shoulder and saying "Blum" a pseudorandom nimber of times before shouting "Shub!".

Jim Propp

On Thursday, March 31, 2016, Warren D Smith <warren.wds@gmail.com> wrote:
The Blum Blum Shub "provably unpredictable" pseudo-random number generator is just repeated squaring modulo p*q, where p and q are large primes that are each 3 mod 4. [etc.]
participants (3)
- James Propp
- Mike Stay
- Warren D Smith