[math-fun] Gray-Champernowne sequence [was Path Question]
On 4/16/07, James Propp <propp@math.wisc.edu> wrote:
This topic provides me with a good opportunity to raise a question that has been on my mind for a year or two:
Can anyone find an explicit construction of an infinite sequence S of 0's and 1's with the property that for any finite bit-string s, the relative frequency with which s occurs in the first N bits of S converges to (1/2)^(length of s) with error that falls like (log N)/N?
Note that the Champernowne sequence A030190 has error 1/(log N). An explicit construction of Alon, Kohayakawa, Mauduit, Moreira and Rödl allows one to achieve error 1/N^c for any exponent c greater than 2/3 (which beats what one would get from a random sequence of bits). [*Less than* 2/3? Gareth McCaughan]
Jim Propp
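As a warm-up, here is a small Python sketch (mine, not from the thread; the helper names are made up) that builds a prefix of the binary Champernowne sequence A030190 and measures how far the frequency of a short word strays from (1/2)^(length of the word). Per the remark above, this error decays only about as fast as 1/(log N).

def champernowne_bits(n_bits):
    """First n_bits bits of the binary Champernowne sequence A030190."""
    out, total, k = [], 0, 0
    while total < n_bits:
        b = bin(k)[2:]            # concatenate 0, 1, 10, 11, 100, ...
        out.append(b)
        total += len(b)
        k += 1
    return "".join(out)[:n_bits]

def freq_error(S, s):
    """|f(N)/N - 2^-len(s)| for the word s in the 0/1 string S, overlaps counted."""
    N, k = len(S), len(s)
    f = sum(S.startswith(s, i) for i in range(N - k + 1))
    return abs(f / N - 0.5 ** k)

if __name__ == "__main__":
    for N in (10**3, 10**4, 10**5):
        S = champernowne_bits(N)
        print(N, freq_error(S, "01"), freq_error(S, "0000"))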
Suppose we lower our sights, and just try to improve the dismal performance of the Champernowne sequence. Here are a couple of plausible-looking upgrades: firstly, instead of cheerfully dropping initial zeros, at iteration n we append a block in which every binary n-bit word is concatenated in full; secondly, instead of the conventional binary ordering, we arrange these words in Gray-code order. Also, I've just noticed that my program reverses the conventional digit order within a word: least significant bit on the left!

The blocks of this "Gray-Champernowne" sequence up to span n = 4 are:

0 1
00 10 11 01
000 100 110 010 011 111 101 001
0000 1000 1100 0100 0110 1110 1010 0010 0011 1011 1111 0111 0101 1101 1001 0001

At first sight these improvements appear singularly unsuccessful: the frequency of words of size n in block n still varies wildly from 1 to 2n - 1. But a closer look reveals the astonishing fact that the frequencies of all smaller sizes m < n in block n are constant! For example, continuing up to span n = 10, every word of length m = 5 occurs in the final (cyclic) block 320 times; over the concatenated initial segment of the complete sequence, the frequencies vary only between 569 and 583.

Of course, to satisfy Jim's constraint, the variation in frequency throughout each block must be estimated, rather than just at the boundaries ...

Fred Lunnon
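For concreteness, here is a short Python sketch of the construction as described above (the code and names are mine, not Fred's program): block n lists all 2^n n-bit words in reflected Gray-code order, written least significant bit first, and the cyclic count below should reproduce the figure of 320 quoted for 5-bit words in the block of span 10.

def gray_block(n):
    """Block n: all 2^n n-bit words in Gray-code order, LSB first, concatenated."""
    words = []
    for i in range(2 ** n):
        g = i ^ (i >> 1)                         # standard reflected Gray code
        words.append(format(g, f"0{n}b")[::-1])  # reverse so the LSB comes first
    return "".join(words)

def cyclic_counts(block, m):
    """Occurrences of each m-bit word in block, read cyclically."""
    wrapped = block + block[:m - 1]
    counts = {}
    for i in range(len(block)):
        w = wrapped[i:i + m]
        counts[w] = counts.get(w, 0) + 1
    return counts

if __name__ == "__main__":
    print(gray_block(3))                         # 000100110010011111101001
    b = gray_block(10)
    print(set(cyclic_counts(b, 5).values()))     # per the post: {320}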
On Thu, 19 Apr 2007 18:12:21 +0100, "Fred lunnon" <fred.lunnon@gmail.com> wrote:
Suppose we lower our sights, and just try to improve the dismal performance of the Champernowne sequence. Here are a couple of plausible-looking upgrades: firstly, instead of cheerfully dropping initial zeros, at iteration n we append a block in which every binary n-bit word is concatenated in full; secondly, instead of the conventional binary ordering, we arrange these words in Gray-code order. Also, I've just noticed that my program reverses the conventional digit order within a word: least significant bit on the left! ... At first sight these improvements appear singularly unsuccessful: the frequency of words of size n in block n still varies wildly from 1 to 2n - 1. But a closer look reveals the astonishing fact that the frequencies of all smaller sizes m < n in block n are constant!

I have some naive/ignorant questions while trying to understand this.

First, what does this construction buy us that the de Bruijn sequence doesn't? I see how the de Bruijn sequence gets us an even distribution of all n-bit sequences in the first N' = 2^n bits (actually n - 1 + 2^n bits, if we want to be picky), but I don't see how we know what the distribution of n-bit substrings will be *after* that (I'm missing the part that even tells us that the error (|d(N)| as defined in the earlier mail, but restricted to n-bit substrings) drops after those first N' bits). This construction seems to try to do the same thing, but in sequences that are longer by roughly a factor of n. From this I am guessing that I am missing the point of this construction.

Second, I am wondering about the intuition behind the error metric that was outlined in James Propp's email. If I understand it correctly, the idea is to find a construction for an infinite string S in which the "distributional error" [I don't know any term (let alone a *better* term) for the error computation used] over substrings of *all* sizes n < N decreases asymptotically fastest (I think someone said that (log N)/N is optimal, no?) with the length of the string prefix. So I'm wondering whether there's any intuition about why we might care equally about substrings of all lengths, even substrings pretty close to N in length, rather than focusing on strings that are small compared to N.

For a concrete example, let's say we define a "good enough" frequency distribution (to pick a random example, perhaps the distribution is "good enough" if the frequency of all strings of size n lies between 1/(k (log n) 2^n) and k (log n)/2^n for some constant k). It then makes sense to me to define a function F_S(n) over a string S that returns an N' for each n such that, for all prefixes of S with length N > N', the distributional error for substrings of size n is "good enough". Then maybe we want to find a string S such that F_S(n) increases slowly. For S = the (rotated) de Bruijn sequence, and assuming that substrings of size n behave well even after their initial 2^n de Bruijn prefix (and, as I said, I have to confess that I don't see how we can guarantee that they are evenly distributed after that), F_S(n) should grow like O(2^n), while for S the Gray-Champernowne string I think F_S(n) would grow like O(n 2^n). (But I'm out of my depth here, I fear.)

My question is really whether the two error metrics are related. Assuming they're not, why is the first more interesting than the second? Is my confusion just a symptom of my not understanding the basic thing we are trying to do?
My apologies if these questions should have obvious answers, or if I'm just grabbing hold of this problem from the wrong end. I'm just feeling dense. Thanks.
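As background for the de Bruijn comparison, here is a Python sketch (my code; the greedy "prefer one" construction, which may or may not be the rotation meant above) of a binary string of length 2^n + n - 1 in which every n-bit word occurs exactly once; this is where the "n-1+2^n bits" count comes from.

def prefer_one(n):
    """Greedy de Bruijn-style string: every n-bit word occurs exactly once."""
    seq = "0" * n
    seen = {seq}
    while True:
        for b in "10":                      # prefer appending a 1
            w = seq[len(seq) - n + 1:] + b  # the new length-n window
            if w not in seen:
                seen.add(w)
                seq += b
                break
        else:
            return seq                      # neither extension is new: done

if __name__ == "__main__":
    n = 4
    s = prefer_one(n)
    print(len(s) == 2**n + n - 1)           # True: 19 bits for n = 4
    windows = [s[i:i + n] for i in range(len(s) - n + 1)]
    print(len(windows) == 2**n == len(set(windows)))   # each 4-bit word once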
James Propp wrote:
Dan Asimov asked:
May I ask for a clarification of exactly what "error that falls like (log N)/N" means?
Good question! What I think I mean is that if we let k denote the length of s and define f(N) as the number of occurrences of s in the first N bits of S, then the discrepancy d(N) = f(N) - (N-k+1)/2^k satisfies |d(N)| < C (log N) / N for some constant C, when N is sufficiently large.
Note that I used (N-k+1)/2^k instead of N/2^k since a bit string of length k only gets N-k+1 chances to occur in a bit string of length N. ... Oops: make that "|d(N)| < C (log N)"!
[I think it was right the first time, but never mind --- I'll run with the definition below!]
(Note that the difference between (N-k+1)/2^k and N/2^k is constant and hence less than log N for N sufficiently large; so another way to state the desired inequality is |f(N)/N - 1/2^k| < C (log N) / N.)
I hope I have that right now...
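For anyone who wants to experiment, here is the discrepancy just defined, written out as a Python sketch (my code; the random test string is only there to exercise the bookkeeping). For random bits |d(N)| is typically of order sqrt(N), well above the C (log N) target, which is the sense in which an explicit construction can beat a random sequence.

import math, random

def discrepancy(S, s, N):
    """d(N) = f(N) - (N - k + 1)/2^k for the word s in the first N bits of S."""
    k = len(s)
    prefix = S[:N]
    f = sum(prefix.startswith(s, i) for i in range(N - k + 1))
    return f - (N - k + 1) / 2 ** k

if __name__ == "__main__":
    random.seed(0)
    S = "".join(random.choice("01") for _ in range(10**5))
    for N in (10**3, 10**4, 10**5):
        d = discrepancy(S, "0110", N)
        print(N, round(d, 2), round(abs(d) / math.log(N), 2))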
Sadly, in spite of the promising full-block frequencies, Gray-Champernowne examined under this microscope appears to better the original Champernowne by only a constant factor, and in particular to be noticeably worse(ly?) distributed on a local basis than random.

Sorry about that, everybody. Time to go away and crawl back under my stone, I reckon!

Fred Lunnon
About 10 years ago I became interested in the following problem: consider the set of all possible (binary, say) strings of length m, and a given (shorter) word w. Denote by g(k, m) the number of strings in which w occurs as a factor exactly k times. How can we compute g(k, m) efficiently? As m -> oo, g(k, m), appropriately scaled, converges to an analytic function which is approximately the normal distribution. What does this function look like in detail: in particular, how does its standard deviation depend on w?

To my surprise, almost nothing seemed to be known about these problems; furthermore, the probability journals proved utterly uninterested in any exploration of them! I still don't understand why such apparently natural questions remain so neglected.

A related question concerned the distribution of the intervals between successive occurrences of w in a random sequence, which offers an alternative approach to measuring the local deviation of a sequence from oo-distribution, avoiding any involvement of infinite sums or the number of terms N.

The relevant results for a binary random sequence are as follows: the maximum standard deviation of intervals between successive occurrences equals sqrt(2^m*(3*2^m - 2*m - 3)), for the word w = 0^m (or 1^m); the minimum equals sqrt(2^m*(2^m - 2*m + 1)), for w = 10^{m-1} (or any word lacking self-overlap). For example, if m = 5, then max, min = 51.54, 27.13 for 00000, 10000 respectively. Any asymptotic reduction below these deviations is "better than random". In contrast, typical values for a block of Gray-Champernowne were max, min = 111.8, 39.36 (uuurghhh!)

Fred Lunnon
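A brute-force Python sketch (mine) of the counting problem just described, feasible only for small m, together with a numerical check of the two standard deviations quoted for word length m = 5:

from itertools import product
from math import sqrt

def occurrences(s, w):
    """Number of (possibly overlapping) occurrences of w as a factor of s."""
    return sum(s.startswith(w, i) for i in range(len(s) - len(w) + 1))

def g(w, m):
    """Map k -> g(k, m): how many length-m binary strings contain w exactly k times."""
    counts = {}
    for bits in product("01", repeat=m):
        k = occurrences("".join(bits), w)
        counts[k] = counts.get(k, 0) + 1
    return counts

if __name__ == "__main__":
    print(g("00", 6))                          # occurrence-count distribution of "00"
    m = 5
    print(sqrt(2**m * (3 * 2**m - 2*m - 3)),   # ~51.54, word 00000
          sqrt(2**m * (2**m - 2*m + 1)))       # ~27.13, word 10000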
I found this paper about the Klein graph quite nice. http://www.liga.ens.fr/~deza/withFowler/AdditionKlein.pdf They call it the Plumber's Nightmare Graph, which is an interesting name. --Ed Pegg Jr
participants (3)
- Ed Pegg Jr
- Fred lunnon
- greenwald@cis.upenn.edu