[math-fun] Billiard word counts (was "Interleavings of three arithmetic progressions")

9 Aug 2010

      Called earlier "interleavings of arithmetic progressions", billiard sequences
can be viewed equivalently as generated by discrete approximation to a line
in m-dimensional space; by a weightless ball bouncing against the walls of
an m-dimensional hypercube; or by a point translated continuously through a
suitably partitioned m-dimensional torus T^m.

In each case, we ask for function f_m(n) counting the number of m-symbol words
of length n occurring as factors of possible billiard sequences. Allan Wechsler
posed the case m = 3. I have now had a look at m = 4, for which I find
    f_4(n)  =  1, 4, 16, 64, 244, 856, 2776, 8356, 23032, 59200,
        142624, 324484, 696256, 1422436, 2779900,
    when n = 0,...,14. [OK, now do your worst, Michael!]
Setting n = 14 in the expression log(f(n)/f(n-1)) / log(n/(n-1)) gives 9.04,
suggesting to the credulous optimist that perhaps f_4(n)  =  O(n^9).

The program does its best to avoid inequality analysis by first projecting
each candidate sequence onto the axis hyperplanes, yielding m new (m-1)-symbol
sequences which are necessarily also billiard, and can be validated from a
precomputed list for dimension m-1. If any of these fails, the candidate fails;
and if only one candidate remains after n-th symbols are attached to a valid
(n-1)-length sequence, there is still no need for inequality analysis ---
the ball cannot avoid bouncing next against some wall pair!

The projection criterion is surprisingly effective: without inequality
analysis,
when m = 3 it catches every dud word for n < 10; when m = 4, for n < 12.
I find it frustrating that it can't be beefed up just a little more,
since inequality analysis of those cases remaining slows the Maple program
by a factor of 10.

A (notionally slightly weaker) version of this projection constraint involves
projecting instead onto planes, yielding instead m_C_2 2-symbol sequences,
one on each pair of the m symbols. If a candidate is already known to be valid,
it may be reconstructed from just m-1 of these binary projections --- say,
just those involving symbols (1,2),...,(1,m). But it is already known that
there are O(n^3) of the latter, from which it follows immediately that
    f_m(n)  <=  O(n^(3m-3)).
The numerical results suggest that degree 3(m-1) is actually sharp!

Finally, a taste of the difficulty to be expected in beefing up projection
to achieve sufficiency is provided by the following example: the one-way
infinite (and easily made two-way?), ultimately period-6 ternary word
    1121231123 (112123)^oo
is not billiard (by inequality analysis on its length-10 prefix);
though all three of its projections
    11212112(11212)^oo, 1113113(1113)^oo, 22323(223223)^oo
are binary billiard words (aka Sturmian).

Extraordinarily long impasses are possible: the above's length-31 prefix
    1121231123 1121231121 2311212311 2
may be followed by the foreign length-85 suffix
    3121123121 1231213121 2131212131 2132112132
    1121321132 1211321211 3212131211 3212113212 13121
yielding a non-billiard length-116 ternary word, all of whose projections are
billiard words, but which cannot be further extended without invalidating some
projection.

Any prospect that the projection criterion might be strengthened via larger
samples is quashed by these examples. Another apparently promising test
is to check that the candidate, whose length-(n-1) prefix was drawn from a
precomputed list, has length-(n-1) suffix also a member of the same list.
Surprisingly, this test appears to reject no additional candidates besides
those already caught by projection: is there a (not terribly exciting) theorem
here?

Fred Lunnon

On 7/26/10, Fred lunnon <fred.lunnon@gmail.com> wrote:
...
On 7/25/10, Michael Kleber <michael.kleber@gmail.com> wrote:
...
Data update: the sequence through n=30 is
1,3,9,27,75,189,447,951,1911,3621,
 6513,11103,18267,29013,44691,67251,98547,140865,197679,272799,
 370659,497403,658371,859863,1110453,1420527,1799373,2260161,2815401,3479235,
 4269279
Looks more n^5ish than n^6ish, at a casual glance.
evalf(log(4269279/3479235)/log(30/29)); # 6.036081126
 evalf(log(3479235/2815401)/log(29/28)); # 6.033051457
No way --- it's order n^6 !!
...
The last term took something like 11 hours to count, so I think it's time to
 give Mma a break.
--Michael
Sterling effort.
I've been trying hard to upstage it by finding an explicit formula,
 but am beginning
 to understand that the 3-D case may well qualitatively harder than the 2-D.
There is a literature about generalisations of Sturmian sequences from
 2-D / 2-symbol alphabet to n-D, which connects them to MDCF's.
 This particular generalisation is called "billiard words" (or
 "billiards words") ---
 the "words" here being what we might call infinite sequences, and "factors"
 finite words occurring within them.
One reference which surveys these is the 2003 preprint available on the web:
    Laurent Vuillon "Balanced Words".
 A detailed study (also available on the web) is
  NICOLAS BEDARIDE "CLASSIFICATION OF ROTATIONS ON THE TORUS T^2"
The following may well be very relevant, but I can't get at them from home:
Jean-Pierre Borel "How to build billiard words using decimations"
  RAIRO-Theor. Inf. Appl. Volume 44, Number 1, January-March 2010 59--77
Jean-Pierre Borel "A geometrical Characterization of factors of
  multidimensional Billiards words and some Applications"
  Theoretical Computer Science 380 (2007) 286--303
  http://hal.archives-ouvertes.fr/hal-00465586/fr/
The particular difficulty afflicting billiards --- in comparison with,
 say, the more
 common Arnaud-Rauzy generalisation --- is their nonlinear complexity:
 apparently,
 the number of distinct factors of length n in any given trajectory of
 cubical billiards
 (Allan's original problem) is in general n^2 + n + 1 .
Fred Lunnon