On 4/16/07, James Propp <propp@math.wisc.edu> wrote:
This topic provides me with a good opportunity to raise a question that has been on my mind for a year or two:
Can anyone find an explicit construction of an infinite sequence S of 0's and 1's with the property that for any finite bit-string s, the relative frequency with which s occurs in the first N bits of S converges to (1/2)^(length of s) with error that falls like (log N)/N?
Note that the Champernowne sequence A030190 has error 1/(log N). An explicit construction of Alon, Kohayakawa, Mauduit, Moreira and R\"odl allows one to achieve error 1/N^c for any exponent c greater than 2/3 (which beats what one would get from a random sequence of bits). [*Less than* 2/3? Gareth McCaughan]
Jim Propp
Suppose we lower our sights, and just try to improve the dismal performance of the Champernowne sequence. Here's a couple of plausible-looking upgrades: firstly, instead of cheerfully dropping initial zeros, at iteration n we append a block in which every binary n-bit word in full is concatenated; secondly, instead of the conventional binary ordering, we arrange these words in Gray-code order. Also I've just noticed that my program reverses the conventional digit-order within a word: least significant bit on the left! The blocks of this "Gray-Champernowne" sequence up to span n = 4 are: 0 1 00 10 11 01 000 100 110 010 011 111 101 001 0000 1000 1100 0100 0110 1110 1010 0010 0011 1011 1111 0111 0101 1101 1001 0001 At first sight these improvements appear singularly unsuccessful: the frequency of words of size n in block n still varies wildly from 1 to 2n-1. But a closer look reveals the astonishing fact that the frequencies of all smaller sizes m < n in block n are constant! For example, continuing up to span n = 10, every word of length m = 5 occurs in the final (cyclic) block 320 times; over the concatenated initial segment of the complete sequence, the frequencies vary only between 569 and 583. Of course, to satisfy Jim's constraint, the variation in frequency throughout each block must be estimated, rather than just at the boundaries ... Fred Lunnon