[math-fun] biased to unbiased coin flips in real life
So I assume you know the old problem of "convert a sequence of biased coin flips (with unknown bias) into a (not necessarily same length) sequence of unbiased coin flips". I recently learned two interesting things about this problem: it was apparently first posed (and of course solved) by Johnny von Neumann, and it is used in certain Intel chips as part of a hardware random number generator. (See http://www.cryptography.com/resources/whitepapers/IntelRNG.pdf for details.)

While looking at this, I wondered if there was a more efficient scheme that processed more than two bits at a time. A little thought revealed that there is, for example this 4-bit code:

  0000, 1111, 1100, 0011 -> output nothing
  0001, 1110, 0101 -> 00
  0010, 1101, 0110 -> 01
  0100, 1011, 1001 -> 10
  1000, 0111, 1010 -> 11

This is a win because the 2-bit code would only output 1 bit for a sequence like 0001, while here we output 2 bits. In general, good codes can be obtained by mapping an n-bit block with k 1's in it to an output of floor(log2(choose(n,k))) bits, but these will only be more efficient than the 2-bit code for all biases when n is a power of two. The reason is that for very low or very high biases, the efficiency depends mostly on what you do when there is only one 0 or 1 in the input block. The 2-bit code always outputs one bit in this circumstance, while most of the others (including those for all odd n) will sometimes output no bits.

In fact, if we define the efficiency of a code as the ratio of the expected output length to the input length, then the efficiency of the 2^m block codes goes to 1 as m goes to infinity. The worst case is when p is very small (or very large); for these the efficiency is approximately p*m, since the dominant nontrivial event in a block of length 2^m is a single 1, which occurs with probability about 2^m * p and yields m output bits. On the other hand, if the bias is known to be near 0.5, then odd-length codes seem to be more efficient. (The length-7 code has efficiency 29/56 at p = 0.5, which is not beaten by any code of length less than 15.)

I'm a little curious as to why Intel didn't implement a more efficient code. One guess is that they didn't know about them. Another is that the 2-bit code was good enough to make wasting silicon on a more complicated code not worth it. Still another possible reason is that the bias of the input coin flipper isn't exactly constant, but has (say) a small linear trend over time which is small enough to ignore over 2 bits but maybe not over longer blocks.

-Thomas C
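To make the block-code construction concrete, here is a short Python sketch (my own illustration, not Intel's circuit or Thomas's exact table) that builds an n-bit code of this kind and applies it to a string of biased bits. The assignment of blocks to output words within each weight class is arbitrary; any fixed choice yields unbiased output, so this particular table need not match the 4-bit example above.

    from itertools import combinations
    from math import comb

    def make_block_code(n):
        # Group the 2^n possible blocks by their number of 1s.  Every block
        # with the same count is equally likely whatever the bias, so any
        # fixed assignment of 2^floor(log2(C(n,k))) of them to distinct
        # output words gives unbiased bits; the remaining blocks emit nothing.
        table = {}
        for k in range(n + 1):
            blocks = sorted(sum(1 << i for i in ones)
                            for ones in combinations(range(n), k))
            m = comb(n, k).bit_length() - 1   # floor(log2(C(n,k)))
            for idx, block in enumerate(blocks):
                if m > 0 and idx < (1 << m):
                    table[block] = format(idx, '0%db' % m)
                else:
                    table[block] = ''         # discarded: no output bits
        return table

    def extract(bits, n=4):
        # Apply the n-bit block code to a string of biased bits, n at a time
        # (any incomplete trailing block is dropped).
        table = make_block_code(n)
        return ''.join(table[int(bits[i:i + n], 2)]
                       for i in range(0, len(bits) - n + 1, n))

    print(extract('0001' + '1010' + '0111' + '0000'))   # -> '0000' with this assignment

For n = 2 this reduces to von Neumann's original trick: 01 -> 0, 10 -> 1, and 00, 11 -> nothing.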
On Wed, 09 Apr 2003 13:20:40 -0400, Thomas Colthurst <tcolthur@bbn.com> wrote:

> So I assume you know the old problem of "convert a sequence of biased coin
> flips (with unknown bias) into a (not necessarily same length) sequence of
> unbiased coin flips". ...
> I'm a little bit curious as to why Intel didn't implement a more efficient
> code. ...

I recall hearing a talk by Dave Gifford on using this process to produce unbiased random bits from a hardware source. I think he wrote a paper on it, circa 1980. I think it was published as an MIT LCS Tech Report. It may have an answer to why 2 bits are preferable to n > 2 (if, in fact, 2 *is* better).
A snake is a series of unit-length matches, laid head to tail at multiples of 30 degrees, never touching other matches. How long a snake can fit in a side-2 square? Oyvind Tafjord let a program run for a few days, and it gave some interesting results.

It turns out there are two length-20 snakes:

  {1, 3, 10, 7, 7, 4, 4, 1, 8, 1, 10, 5, 10, 5, 10, 1, 3, 5, 12, 9}
  {1, 4, 9, 7, 4, 11, 4, 11, 4, 2, 10, 10, 7, 7, 4, 4, 1, 10, 5, 10}

You can think of these numbers as a series of clockface headings.

There are 64 length-19 snakes. I give them here in duodecimal:

  114477A31A5A5901496, 114477A31A703A76307, 114477A3A3A703A3470, 114477A3A30707075A1,
  114477A307A3A3A3470, 114477A30703A7075A1, 114477A30A5A5A13470, 114477AB2494941BA72,
  114477AB272B4727941, 114477AB274B4B4BA72, 114477AB4B272727941, 114477BB4B472BA764B,
  11447708B418181461A, 1146277A318A3A3A753, 1162477AB164B4B479B, 1347B692B69114477A3,
  1349692B69114477A31, 1349692B69114477A30, 13A7734941196B29691, 13A7734941196B29692,
  13A774411892964B418, 13A77441196B2969430, 13A77441199494169A1, 13A7744181A5A5A1350,
  1448727AA25092529B4, 14581A5A5511AA7735A, 1461AA7743A3A3A7074, 149470707427AA11449,
  1494946BA5A11447A3A, 1494949427AA1144961, 1496169427AA1144961, 14969114477A3192969,
  14969114477A31A5A59, 1496911447701969416, 149727272B4BA774418, 14974B4B42AA77441A5,
  14974B4B42AA7744027, 14974B4B4727AA11449, 14974B4B40A77441A5A, 15692B69114477A30B4,
  15692B69114470A5694, 1585038511A5AB774B2, 161467A3A3A3A707494, 161A774411970707427,
  1625A5A83511A807722, 1663A3A05377A081166, 168303850377A3A9116, 168305830377AA1153A,
  168305058511AA7735A, 16843A3A05377AA1166, 1684411AA56B63181A5, 1684411AA636B6B8350,
  1684411AA636B836B92, 169014961479411AA76, 21AA774B14949521A58, 21AA774B4B472B4BA72,
  2478B4181884411AA72, 25A527AA11449A72529, 2727A5A03A3A7744118, 277AA1149614974B418,
  27A5A0383277AA11538, 27AA11448B638B6A125, 27AA114947070742058, 2B477AA115830585AB2

These snakes are interesting creatures. I'm working on making scalable pictures of all of them. All of them have interesting properties. 15692B69114477A30B4, for example, goes in 11 of the 12 possible directions. I plan on looking at different angles, and different bounding shapes.

--Ed Pegg Jr, www.mathpuzzle.com
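For anyone who wants to draw these, here is a small Python sketch (mine, not Oyvind Tafjord's program) that turns a heading list into vertex coordinates. The clockface convention used, 12 o'clock pointing up and headings measured clockwise, is my reading of the description and may need a rotation or reflection to match the original pictures.

    from math import cos, sin, radians

    def snake_vertices(headings, start=(0.0, 0.0)):
        # Each heading h is read as a clock direction: 12 points up (+y),
        # 3 points right (+x), so the angle is 90 - 30*h degrees.
        x, y = start
        pts = [(x, y)]
        for h in headings:
            a = radians(90 - 30 * h)
            x, y = x + cos(a), y + sin(a)
            pts.append((x, y))
        return pts

    # One of the two length-20 snakes from the post:
    snake = [1, 3, 10, 7, 7, 4, 4, 1, 8, 1, 10, 5, 10, 5, 10, 1, 3, 5, 12, 9]
    for p in snake_vertices(snake):
        print('%.3f %.3f' % p)

Since a heading of 0 and a heading of 12 point the same way, a duodecimal string can be fed in directly, e.g. snake_vertices([int(c, 12) for c in '15692B69114477A30B4']).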
participants (3)
- Ed Pegg Jr
- Michael B Greenwald
- Thomas Colthurst