Re: [math-fun] The game called "Set"
The goal of the problem is to determine (or merely bound) F(n), the maximum possible number of n-vectors over Z3 such that no three sum to 0 -- more precisely, such that for any three of them, call them x,y,z, it can only happen that x+y+z=0 if x=y=z.

As my first trivial remark: obviously, if no three of the vectors x,y,z exhibit ANY linear relation a*x+b*y+c*z=0 over Z3 with (a,b,c)!=(0,0,0), that suffices. For that, it in turn suffices if our set of (column) n-vectors, n>3, happens to form an "orthogonal array of strength 3." For that, it in turn suffices if the set of vectors is the dual of a hamming-distance-4 linear code over GF(3). So it seems to me that by using the "duals of the extended ternary Hamming codes" we obtain a proof that

  F( (3^r-1)/2 + 1 ) >= 3^r.

Oh. Well, that was a quite weak result: F(n) >= 2*n-1 for an infinite set of n. Supposedly it is known that F(n) > 2.21^n, a far stronger lower bound. Still, since Hamming codes are "perfect," my bound is presumably the best possible using this technique. So this demonstrates that demanding all 26 linear relations be impossible has a tremendously different effect than demanding it only (as the problem calls for) for the 7 relations (a,b,c) = (1,1,1) and perms(0,1,2). This is a very interesting problem.

-- Warren D. Smith
http://RangeVoting.org <-- add your endorsement (by clicking "endorse" as 1st step)
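[For concreteness, the defining condition and the small values of F can be checked by brute force; a minimal sketch, not from the original post -- the names is_cap and F_bruteforce are mine:]

```python
from itertools import combinations, product

def is_cap(vectors):
    """Check the defining condition: no three pairwise-distinct vectors
    sum to 0 mod 3 componentwise.  (If exactly two of x,y,z coincide,
    x+x+y = 0 mod 3 already forces y = x, so only pairwise-distinct
    triples need checking.)"""
    return not any(
        all((a + b + c) % 3 == 0 for a, b, c in zip(x, y, z))
        for x, y, z in combinations(set(vectors), 3)
    )

def F_bruteforce(n):
    """Exhaustive computation of F(n); feasible only for very small n."""
    points = list(product(range(3), repeat=n))
    for size in range(len(points), 0, -1):
        if any(is_cap(s) for s in combinations(points, size)):
            return size
    return 0
```

This confirms F(1)=2 and F(2)=4; already at n=3 (27 points) naive subset search is out of reach.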
On 4/3/15, Warren D Smith <warren.wds@gmail.com> wrote:
The goal of the problem is to determine (or merely bound) F(n), which is the maximum possible number of n-vectors over Z3, such that no three sum to 0. More precisely such that for any three, call them x,y,z, it can only happen that x+y+z=0 if x=y=z.
--Here is a simple randomized lower bound construction.

Randomly fill in an nXm matrix with {-1,0,1}. The expected number of triples of columns which sum (mod 3) to 0, not all three columns the same, is (m^2+3*m-4)*(m/6)/3^n. If m+1.5 <= 6^(1/3) * 3^(n/3), then this expected number is <1. That means there is a nonzero chance the matrix will yield a "SET," so a SET of at least this size must exist. Hence

THEOREM: F(n) > 6^(1/3) * 3^(n/3) - 3/2.

This is an exponential lower bound, although the growth constant is only 3^(1/3) =approx= 1.442, not as good as the best known 2.21.

Now we definitely can improve things by using "improved random" constructions. Here is a simple example. Randomly fill in an nXm matrix with {-1,0,1}. Let C be the number of triples of columns which sum (mod 3) to 0, not all three columns the same. Remove one column from each such triple from the matrix, the result being an nX(m-C) trit-matrix (or the horizontal size could be larger than m-C if two bad triples overlap) which is a SET. Again the expected value of C is (m^2+3*m-4)*(m/6)/3^n. There must exist a random matrix fill which causes C to be its expected value or less, hence F(n) >= m-C >= m-(m^2+3*m-4)*(m/6)/3^n. (In fact ">".) This yields

THEOREM: F(n) > sqrt((216*T^3+756*T^2+882*T+343)/T^2)/(9*sqrt(3)) - (T+1)/T, where T=3^n.

This also grows exponentially, but now with the improved growth constant 3^(2/3) =approx= 2.080. That's nearly as good as the best known constant 2.21, and it was very easy.

An alternative "repair scheme" is not to remove the C bad columns from the matrix, but rather to replace their contents with newly made random trits. If (C/2)*(m^2)/3^n <= 1, then this has a nonzero chance of being a successful repair, i.e. some successful repair must exist. That scheme sucks because the exponential growth factor we get is at most sqrt(3) =approx= 1.732.
Better is to keep doing these column replacements one bad column at a time, except that whenever a replacement fails, delete that column instead of replacing it. That scheme definitely does better than the deletion-only scheme, but I haven't figured out how much better. I also suspect better lower bounds are possible by combining randomization arguments with explicit constructions.
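[The deletion scheme above can be simulated directly; a sketch, not from the post -- the function name, parameters, and the one-at-a-time deletion order are mine (deleting one column per surviving bad triple, which can only do better than the batched count):]

```python
import random
from itertools import combinations

def random_deletion_cap(n, m, seed=0):
    """Deletion construction: fill m random n-trit columns, then
    repeatedly delete one column from any remaining triple of
    distinct columns summing to 0 mod 3."""
    rng = random.Random(seed)
    # a set automatically discards duplicate columns
    cols = {tuple(rng.randrange(3) for _ in range(n)) for _ in range(m)}
    while True:
        bad = next(
            (t for t in combinations(cols, 3)
             if all((a + b + c) % 3 == 0 for a, b, c in zip(*t))),
            None,
        )
        if bad is None:
            return cols          # no zero-sum triple survives: a SET
        cols.remove(bad[0])
```

With n=6 and m near the optimizing value ~sqrt(2*3^6) =approx= 38, the loop typically keeps most of the columns.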
--Correction. The above formula, obtained by maximizing m - (m^2+3*m-4)*(m/6)/T by choice of m>=1, actually grows exponentially with growth constant 3^(1/2) =approx= 1.732, not 2.080.
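[The corrected growth rate is easy to check numerically; my own quick check, not from the thread -- the function name and scan window are mine. Maximizing g(m) = m - m*(m^2+3*m-4)/(6*T) over integers m gives a value close to (2/3)*sqrt(2*T), i.e. growth constant sqrt(3):]

```python
def deletion_bound(n):
    """Max over integers m >= 1 of m - m*(m^2 + 3*m - 4)/(6*T), T = 3^n."""
    T = 3 ** n
    g = lambda m: m - m * (m * m + 3 * m - 4) / (6 * T)
    # the optimum sits near m = sqrt(2*T); scan a generous window
    top = int((2 * T) ** 0.5) + 10
    return max(g(m) for m in range(1, top + 1))

for n in (6, 9, 12):
    # ratio tends to (2/3)*sqrt(2) =approx= 0.943 as n grows
    print(n, deletion_bound(n) / 3 ** (n / 2))
```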
SUPERMULTIPLICATIVITY THEOREM: F(a+b) >= F(a) * F(b), because of the cartesian product. [Lemma: regard each product vector as a 2-digit mixed-radix number with bases F(a) and F(b). If three product vectors sum to 0, their first digits sum to 0, so by the SET property in the first factor they are all equal; but then their second digits also sum to 0, so those are all equal too, i.e. the three vectors coincide.] Hence

  F(k)   >= 2^k    because F(1) = 2
  F(3*k) >= 9^k    because F(3) = 9
  F(4*k) >= 20^k   because F(4) = 20
  F(5*k) >= 45^k   because F(5) = 45
  F(6*k) >= 112^k  because F(6) = 112

the lattermost yielding the growth constant 112^(1/6) = 2.1955..., which is almost as good as the best known 2.21. The 2.21 presumably arises from

  F(11*k) >= 6464^k  because F(11) >= 6464

which apparently was shown by Yves Edel: "Extensions of Generalized Product Caps," Designs, Codes and Cryptography 31,1 (January 2004) 5-14, http://link.springer.com/article/10.1023%2FA%3A102736590123120 -- but I have not seen this paper, so this is not certain. Actually 6464^(1/11) = 2.22028749761019184633 more precisely, so we could have said 2.22, not 2.21, if it really is true that F(11) >= 6464 -- improving in the last decimal place over the lower bound F(n) > 2.21^n stated in the OEIS entry, at least for all large-enough n.

Another lemma: if you have a "SET," then adding any constant to all its digits (mod 3) still yields a SET; the constant can in fact depend on the digit location. Similarly you can multiply the Kth digit by any nonzero constant (mod 3), and you can permute the n digit locations using any fixed n-permutation. These generally yield many "equivalent versions" of the SET. Still more generally, you can transform all the n-vectors by multiplying by any invertible nXn constant matrix (mod 3), or by adding on any constant n-vector (mod 3).
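[The cartesian-product step is easy to verify on small examples; a sketch of my own, with is_cap and product_cap as hypothetical names:]

```python
from itertools import combinations

def is_cap(vectors):
    """No three pairwise-distinct vectors sum to 0 mod 3 componentwise."""
    return not any(
        all((a + b + c) % 3 == 0 for a, b, c in zip(x, y, z))
        for x, y, z in combinations(set(vectors), 3)
    )

def product_cap(A, B):
    """Concatenate every a in A with every b in B.  If three product
    vectors summed to 0, the first digits would sum to 0 (forcing them
    equal by the SET property of A), and then the second digits would
    too (forcing equality by the SET property of B)."""
    return {a + b for a in A for b in B}
```

For example, a size-2 SET in dimension 1 times a size-4 SET in dimension 2 gives a size-8 SET in dimension 3, illustrating F(3) >= F(1)*F(2) = 8 (though in fact F(3) = 9).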
OK, I think I have a new lower bound for the SET problem F(n) which breaks all previous records for all large-enough n, and enjoys exponential growth factor 2.999 -- well, actually 3-epsilon for any fixed epsilon>0 -- thus matching the upper bound except for the epsilon.

Here is the construction. I am going to show F(n+m) >= 3^n for all large-enough n, where m=o(n).

Simply put all possible trits {-1,0,1} in the first n locations of the (n+m)-vector; that's 3^n vectors right there. Now the remaining m vector entries need to be filled in. Compute the sum of squares S of the first n entries of the vector (now as an integer; do not work mod 3). Note 0<=S<=n.

Now, if we were to adjoin S as an (n+1)st vector entry, where this extra entry was not handled mod 3 but was just an ordinary integer, then the resulting (n+1)-vectors would do the job, because S is a strictly convex ("concave-U") function of the first n vector entries: hence if the first n entries of three vectors happened to be collinear equispaced, then their S's could not be.

Now, instead of thinking of S as an "integer," we can represent S by its residues modulo (approximately) the first ln(2n) primes -- or they need not be primes, they could be any relatively prime set of numbers. (Chinese remainder theorem: any residue representation valid for all numbers up to 2n.) That shows that if we instead adjoin to the n initial entries of each vector (each handled mod 3) about ln(2n) additional entries, each handled with its own modulus -- where the max modulus is only about ln(2n)*lnln(2n) -- then the resulting set of vectors would do the job too.

But that still is not quite good enough, since these moduli, although small, are bigger than 3. To fix that remaining flaw, we proceed recursively using the same trick. That is, at the next level of recursion, each one of those residues R is itself represented via a residue representation valid for all numbers up to 2R.
So now we've got about ln(2*n)*ln(2*ln(2*n)) additional entries, with max modulus of lnlnln size. And so on. After about logstar(n) levels of recursion -- where I define my logstar(n) function as the number of iterations of ln(2*x) needed to reduce n below some constant -- we have a set of (n+m)-vectors where m = ln(n)^logstar(n) approximately. The first n entries of each vector are handled mod 3, but the remaining m entries are handled modulo various moduli <= C for some constant C, and these vectors do the job, i.e. no three are collinear.

Now you may still complain that 3 < C. Yah, but we can now use some constant-size dictionary lookup to do the rest. So I claim to have sketchily proved the following

ALLEGED THEOREM: For all large-enough integer n, there exists an integer m with 0 < m < ln(n)^O(logstar(n)) such that F(n+m) >= 3^n.

If true, this result is the new-record lower bound on F. It is not maximally precisely stated, e.g. it could be tightened somewhat further by anybody who wanted to struggle to be precise about the best that can be squeezed out of this argument. Caveat emptor, however, since this has been sketchy.
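[Two steps of this sketch can be checked mechanically; my own illustration, not from the post -- the names S and residues and the chosen moduli are mine. Step 1 is the convexity claim: for an integer arithmetic progression y-d, y, y+d with d != 0, the identity S(y-d)+S(y+d) = 2*S(y)+2*S(d) > 2*S(y) holds, so the three S values can never themselves be equispaced. Step 2 is the Chinese-remainder step: pairwise-coprime moduli whose product exceeds the range of S determine S uniquely from its residue tuple.]

```python
import random
from math import gcd

def S(v):
    """Sum of squares of the entries, taken as an ordinary integer."""
    return sum(x * x for x in v)

# Step 1 (strict convexity of S along arithmetic progressions):
#   S(y-d) + S(y+d) = 2*S(y) + 2*S(d) > 2*S(y)   whenever d != 0.
rng = random.Random(1)
for _ in range(1000):
    n = rng.randrange(1, 12)
    y = [rng.randrange(-1, 2) for _ in range(n)]   # trit entries {-1,0,1}
    d = [rng.randrange(-2, 3) for _ in range(n)]
    if any(d):
        lo = [a - b for a, b in zip(y, d)]
        hi = [a + b for a, b in zip(y, d)]
        assert S(lo) + S(hi) == 2 * S(y) + 2 * S(d) > 2 * S(y)

# Step 2 (Chinese remaindering): distinct values of S in 0..n get
# distinct residue tuples once the moduli product exceeds n.
def residues(s, moduli):
    assert all(gcd(a, b) == 1
               for i, a in enumerate(moduli) for b in moduli[i + 1:])
    return tuple(s % m for m in moduli)

n = 25
moduli = (2, 3, 5)   # product 30 > n, so the tuple determines S
assert len({residues(s, moduli) for s in range(n + 1)}) == n + 1
```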