Re: [math-fun] [EXTERNAL] Stupid question about hash functions
The problem with the probabilities output from the permutation function is that they are "too" good: the probabilities themselves are ok, but the standard deviations are too small. Now the real question is how many bits is this worth, and the answer might be "not very many". The information content of a permuter is about n*logn-n = n*[(logn)-1] bits, whereas the information content of a random function is n*logn. So the difference is perhaps n bits. If logn >> 1, then this isn't going to hurt very much, whereas the existence of collisions will hurt a lot. At 02:30 PM 2/25/2014, Cordwell, William R wrote:
Interesting idea. So, if we project a permutation, the range probabilities are certainly uniformly distributed. Not quite so with a projected random function (what the distribution actually is sounds like another interesting question). But if we start with, say {0,1}^256 as the domain and range (permutation output) of our input space, and if we choose something plausible to project to, such as {0,1}^40, then there are 2^216 "unprojected" points for every projected point, from which to sample. If we also sample 2^40 times, reaching into the bag and picking out values, it seems awfully close to sampling with replacement, so one might expect the sample to look like a random function output on {0,1}^40. I suspect that it gets closer to that behavior, the larger the initial value for n.
If we could claim that permuting with a random permutation first, then truncating is a hash function, vs. truncating the result of a random function (which is also a hash function), then I think that we might be done (left-over hash lemma?).
-----Original Message----- From: math-fun-bounces@mailman.xmission.com [mailto:math-fun-bounces@mailman.xmission.com] On Behalf Of Henry Baker Sent: Tuesday, February 25, 2014 2:00 PM To: math-fun Subject: Re: [math-fun] [EXTERNAL] Stupid question about hash functions
Suppose U={0,1}^n. Then we can _project_ h(x) onto some smaller # of bits to compute a smaller hash function. But as we project further & further, the probabilities become more & more suspect, because the expected deviations aren't there.
At 11:47 AM 2/25/2014, Cordwell, William R wrote:
Detect by means other than having enough values so that not having a collision is suspect?
-----Original Message----- From: math-fun-bounces+cordwell=sandia.gov@mailman.xmission.com [mailto:math-fun-bounces+cordwell=sandia.gov@mailman.xmission.com] On Behalf Of Henry Baker Sent: Tuesday, February 25, 2014 12:17 PM To: math-fun@mailman.xmission.com Subject: [EXTERNAL] [math-fun] Stupid question about hash functions
Suppose we have a hash function h(x)=y, x,y in universe U, n=|U|.
(We have a simplified situation in which the domain = range.)
"Collisions" in hash functions are really bad, due to the Birthday Paradox, so we want h to be 1-1, which in this case means a permutation of U.
But permutations are a vanishingly small % of all the functions from U -> U; i.e., n! << n^n
n!/n^n ~ sqrt(2 pi n)/exp(n)
Shouldn't it get easier & easier to detect a non-random function from U -> U as n increases?
Shouldn't this lack of randomness be exploitable?
participants (1)
-
Henry Baker