The Blum Blum Shub "provably unpredictable" pseudo-random number generator is just repeated squaring modulo p*q, where p and q are large primes, each congruent to 3 mod 4. Output just one bit per iteration; the bits are provably hard to predict in the backwards-time direction, in the sense that any algorithm that could predict them would also be able to factor p*q.

Well, it seems to me, abstracting what is going on and regarding the prime-factoring stuff as a red herring... that what we have here is information loss. That is, squaring mod p*q is a many-to-one map (four-to-one, in fact, on the invertible residues). It loses information. If you try to go backward in time, there are several "correct answers"... except only one of them is itself a square (i.e. could have been the result of a forward iteration) and hence is the "really correct" answer.

So let us contrast iterating a function x := F(x) in two cases: F a random bijection, and F a random function, both having domain and range equal to a universe of cardinality U.

Random bijections F: After iterating F a total of T times in a U-element universe from a random starting point, the chance the T-th step closes a cycle is 1/(U+1-T). So the chance that we evade creating a cycle during steps 1,2,3,...,T is

  (U-1)/U * (U-2)/(U-1) * ... * (U-T)/(U+1-T) = (U-T)/U

(the product telescopes), so the chance that we are in a giant cycle missing <=M elements of the universe is M/U. In other words, in the best 1% of such experiments, our cycle will miss <=1% of the universe.

Random functions F: After iterating F a total of T times in a U-element universe from a random starting point, the chance the T-th step creates a cycle, by returning to a previously visited element, is T/U. So the chance that we evade creating a cycle during steps 1,2,3,...,T is

  (U-1)/U * (U-2)/U * (U-3)/U * ... * (U-T)/U

which is <= (1-(T+1)/(2*U))^T by Jensen's inequality (AM-GM), and >= (1-T/U)^T since each factor is at least (U-T)/U.
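For concreteness, here is a minimal sketch of the Blum Blum Shub iteration just described. The particular primes (p=499, q=547, both 3 mod 4) and the seed are arbitrary toy choices of mine, far too small to be secure:

```python
# Toy Blum Blum Shub sketch.  Parameters are illustrative only --
# real use requires p, q to be large secret primes, each 3 mod 4.

def bbs_bits(seed, p=499, q=547, nbits=16):
    """Generate nbits pseudo-random bits by repeated squaring mod p*q."""
    assert p % 4 == 3 and q % 4 == 3
    n = p * q
    x = seed % n
    out = []
    for _ in range(nbits):
        x = (x * x) % n      # many-to-one step: each square has 4 roots mod n
        out.append(x & 1)    # emit one bit per iteration
    return out

print(bbs_bits(seed=159201))
```

Each squaring step is many-to-one, which is exactly the information loss at issue: a backwards predictor must decide which of the four square roots was the true predecessor.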
Hence if U is large, the chance of no repeat within T steps is roughly exp(-T^2/(2*U)), so the time to first repeat (tail plus cycle length, which bounds the eventual period) is more than 63% likely to be below sqrt(2*U), but also more than 50% likely to be above its median, which is about sqrt(2*ln(2)*U) = 1.18*sqrt(U). The chance it exceeds K*sqrt(2*U) is about exp(-K^2). So the longest-running 1% of experiments should have repeat times exceeding roughly 3*sqrt(U); more precisely, the two bounds above pin the 1% threshold between sqrt(ln(100)*U) = 2.1*sqrt(U) and sqrt(2*ln(100)*U) = 3.0*sqrt(U).

Point I want to make is: iterating a random function F which is NOT bijective loses information, in fact about 1 bit of information per iteration. One might therefore expect that if we output 1 bit from each iterate, then essentially ANY such iteration ought to be "hard to predict" in the backwards-time direction, by any algorithm which doesn't work hard enough to know which elements of the universe are cycle members and which aren't.

In the Blum-Blum-Shub case, the "cycle members" were the "squares," and recognizing a cycle member with 51% accuracy was equivalent to recognizing squares, which was computationally polynomial-time-equivalent to factoring p*q. But I think that in general, where F is implemented using a boolean circuit, say, distinguishing cycle members from non-members is going to be a hell of a lot harder than integer factoring. Integer factoring is soluble in polynomial time if you own a quantum computer. But cycle-member recognition should be beyond quantum assistance, since information loss is, generally speaking, a quantum-computer killer.

Essentially the only way I know to test cycle membership for X is just to iterate F until it cycles, then ask whether X was on that cycle. That takes exponential expected time for random functions F (the expected rho-length is of order sqrt(U) = 2^(N/2) for N-bit strings), and also empirically takes exponential expected time for plenty of actual F.

Now let me note that the "boolean circuit cycle recognition problem" is COMPLETE.
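The brute-force membership test just described (iterate F until it cycles, then check whether X was on that cycle) can be sketched as follows; the quadratic map on a 16-element universe is my own arbitrary toy example:

```python
# Cycle-membership test by iteration: walk x, F(x), F(F(x)), ... remembering
# everything seen, until some element repeats.  The first repeated element is
# the point where the walk re-enters its cycle, so X is a cycle member iff
# that element is X itself (i.e. the "rho" starting at X has no tail).
# For a random F this walk takes ~sqrt(U) expected steps: exponential in N
# when the universe is all N-bit strings.

def is_cycle_member(F, x):
    seen = set()
    y = x
    while y not in seen:
        seen.add(y)
        y = F(y)
    return y == x

# Toy example on the universe {0,...,15}:
F = lambda x: (x * x + 1) % 16
print(sorted(x for x in range(16) if is_cycle_member(F, x)))  # -> [5, 10]
```

Here every starting point falls into the single 2-cycle {5, 10}, so exactly those two elements pass the test.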
PROBLEM STATEMENT: Given a boolean circuit F that inputs N bits and outputs N bits, and given an N-bit string X, decide: "is X a cycle member?" (That is, does F^k(X) = X for some k >= 1?)

Fairly obviously, any oracle capable of answering that with >=51% accuracy (both false-positive and false-negative error rates below 49%) would also be capable of solving cycle membership for any function F computable in polynomial time, and I think it also ought to be capable of 99.9999% accuracy for "most" F. So regard this as a new (?) computational complexity class. Can we now prove that interesting/useful subclasses of it remain complete? (Such as, as a first step, restricted classes of circuits?)

-- Warren D. Smith
   http://RangeVoting.org <-- add your endorsement (by clicking "endorse" as 1st step)
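As a sanity check on the problem statement, here is an exhaustive "oracle" (feasible only for tiny N; exponential time and space in general) that computes the full set of cycle members at once. It repeatedly discards every element with no surviving preimage, since such an element cannot lie on a cycle; what remains is the F-invariant cyclic core. The toy F below is my own arbitrary example, not from the post:

```python
# Exhaustive cycle-member "oracle" for a function F on N-bit strings.
# The cycle members are exactly the intersection of the images of all
# iterates F^k, obtained here by iterated trimming.

def cycle_members(F, N):
    """Return the set of all cycle members of F on {0, ..., 2^N - 1}."""
    alive = set(range(1 << N))
    while True:
        survivors = alive & {F(x) for x in alive}  # drop anything with no preimage left
        if survivors == alive:
            return alive        # alive is now F-invariant: the cyclic core
        alive = survivors

# Toy example (N=4) with a non-bijective quadratic map:
F = lambda x: (x * x + 1) % 16
print(sorted(cycle_members(F, 4)))  # -> [5, 10]
```

For a bijective F every element survives (every cycle has full preimages), while for a random function only on the order of sqrt(U) of the 2^N strings are expected to remain, matching the bijection-versus-function contrast above.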