[math-fun] poisoned wine bottles
From: Michael Kleber <michael.kleber@gmail.com> You have 1000 bottles of wine, which you need for a party that starts in an hour. Two of the bottles are poisoned. You have ten rats, and you can cause each rat to drink from any subset of the bottles, which will cause it to die if the bottle is poisoned. But the poison takes most of an hour to act, so you only get one round of giving wine to rats, after which you must throw away any bottle which you do not know is safe. What do you do? I'd be happy to hear either worst-case or average-case results. Obviously if only one bottle were poisoned, you'd number the bottles and assign one rat to each bit of the binary representation, and the dead rats would spell out the binary of the unique poisoned bottle number. But two poisoned bottles seems to make this much more subtle.
--Interesting puzzle. Let W be the number of wine bottles, P the number poisoned, 1<=P<=W, and R the number of rats. The task is to identify the poisoned wine bottles with a single round of feeding each rat a wine-mixture from a selected bottle-subset. Say rat j gets fed subset Sj. Consider this WxR boolean matrix M: M_ij = 1 if wine bottle i is in set Sj. Otherwise M_ij=0. The essential property demanded (for a worst-case result allowing us to identify the P poisoned bottles) is that for any P-element subset of the rows of the matrix, its logical-OR (as an R-bit binary word) is DISTINCT from that arising from any other P-element subset. (This logical OR's locations for its 1-bits give the names of the dead rats.) So evidently for a solution to exist we need that binomial(W, P) <= 2^R. It was proposed that W=1000, P=2. Hence the left hand side is 499500=1000*999/2. Hence a solution cannot exist unless there are at least 19 rats, R>=19. Since Kleber had R=10, we conclude there is no solution guaranteed to identify the two poisoned bottles. We can however adopt the weaker goal of trying to identify some subset of bottles we can guarantee are safe. In that case, just partition the bottles into R equal-size subsets each with W/R bottles. Then P rats die whereupon we've identified (R-P)*W/R safe wine bottles. With R=10, P=2, W=1000, we identify 800 safe ones this way. Returning to the original problem of identifying the poisoned bottles, Kleber's binary method shows the inequality I gave is tight (up to integer roundoff) if P=1. But what if P>=2? I will now attempt to sketch an argument that my trivial inequality is not too far off being tight, for any FIXED P. To prove this, consider selecting the matrix M at random, each bit chosen by an i.i.d. coin toss, where the coin used is biased with a particular well-chosen constant bias. Specifically, I choose the bias in such a way that the expected number of 1-bits in the R-vector got by ORing P rows of the matrix, comes out to be R/2. I claim this argument will show that R = C * lg( binomial(W,P) ) rats suffice, for some positive constant C. To prove this (i.e. to supply the details of the proof, which I have not provided since I am lazy), you need to argue that with positive probability, all those P-row-ORs will indeed be distinct. This can be done by arguing the expected number of conflicting P-row-subset pairs is <1. Then since probability>0, that proves some set of such coin tosses must exist, which have this property, hence a suitable experimental-design exists. QED. Hence, with P=2, since 2 is fixed, there is a solution with R=O(logW) rats and somebody less lazy than I could even work out a good bound on the constant C (1<C<5 sounds safe).
I didn't quite follow this, Warren. What are you saying is almost tight? Just to be clear, the puzzle I'm passing along is: Come up with a way of assigning a subset of wine bottles to each rat, such that by observing which rats die you can produce a set of wine bottles known to be safe, where the set of safe bottles is as large as possible (either expected-value or worst-case size). I don't think most of your message was about this problem -- you gave a way to get 800 non-poisoned bottles, but certainly it is possible to do much better than that. --Michael On Thu, Nov 15, 2012 at 7:50 PM, Warren Smith <warren.wds@gmail.com> wrote:
From: Michael Kleber <michael.kleber@gmail.com> You have 1000 bottles of wine, which you need for a party that starts in an hour. Two of the bottles are poisoned. You have ten rats, and you can cause each rat to drink from any subset of the bottles, which will cause it to die if the bottle is poisoned. But the poison takes most of an hour to act, so you only get one round of giving wine to rats, after which you must throw away any bottle which you do not know is safe. What do you do? I'd be happy to hear either worst-case or average-case results. Obviously if only one bottle were poisoned, you'd number the bottles and assign one rat to each bit of the binary representation, and the dead rats would spell out the binary of the unique poisoned bottle number. But two poisoned bottles seems to make this much more subtle.
--Interesting puzzle.
Let W be the number of wine bottles, P the number poisoned, 1<=P<=W, and R the number of rats. The task is to identify the poisoned wine bottles with a single round of feeding each rat a wine-mixture from a selected bottle-subset. Say rat j gets fed subset Sj.
Consider this WxR boolean matrix M: M_ij = 1 if wine bottle i is in set Sj. Otherwise M_ij=0.
The essential property demanded (for a worst-case result allowing us to identify the P poisoned bottles) is that for any P-element subset of the rows of the matrix, its logical-OR (as an R-bit binary word) is DISTINCT from that arising from any other P-element subset. (This logical OR's locations for its 1-bits give the names of the dead rats.)
So evidently for a solution to exist we need that binomial(W, P) <= 2^R.
It was proposed that W=1000, P=2. Hence the left hand side is 499500=1000*999/2. Hence a solution cannot exist unless there are at least 19 rats, R>=19.
Since Kleber had R=10, we conclude there is no solution guaranteed to identify the two poisoned bottles.
We can however adopt the weaker goal of trying to identify some subset of bottles we can guarantee are safe. In that case, just partition the bottles into R equal-size subsets each with W/R bottles. Then P rats die whereupon we've identified (R-P)*W/R safe wine bottles. With R=10, P=2, W=1000, we identify 800 safe ones this way.
Returning to the original problem of identifying the poisoned bottles, Kleber's binary method shows the inequality I gave is tight (up to integer roundoff) if P=1. But what if P>=2?
I will now attempt to sketch an argument that my trivial inequality is not too far off being tight, for any FIXED P.
To prove this, consider selecting the matrix M at random, each bit chosen by an i.i.d. coin toss, where the coin used is biased with a particular well-chosen constant bias. Specifically, I choose the bias in such a way that the expected number of 1-bits in the R-vector got by ORing P rows of the matrix, comes out to be R/2.
I claim this argument will show that R = C * lg( binomial(W,P) ) rats suffice, for some positive constant C. To prove this (i.e. to supply the details of the proof, which I have not provided since I am lazy), you need to argue that with positive probability, all those P-row-ORs will indeed be distinct. This can be done by arguing the expected number of conflicting P-row-subset pairs is <1. Then since probability>0, that proves some set of such coin tosses must exist, which have this property, hence a suitable experimental-design exists. QED.
Hence, with P=2, since 2 is fixed, there is a solution with R=O(logW) rats and somebody less lazy than I could even work out a good bound on the constant C (1<C<5 sounds safe).
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
-- Forewarned is worth an octopus in the bush.
You can trivially do better than 800 by dividing the 1000 into 11 (almost) equal subsets, and assign a rat to each but one of them. You end up guaranteeing 9/11 of the 1000 bottles safe. Presumably you can do still better on average with an algorithm that potentially has different numbers of rats dying. --ms On 2012-11-15 19:50, Warren Smith wrote:
From: Michael Kleber <michael.kleber@gmail.com> You have 1000 bottles of wine, which you need for a party that starts in an hour. Two of the bottles are poisoned. You have ten rats, and you can cause each rat to drink from any subset of the bottles, which will cause it to die if the bottle is poisoned. But the poison takes most of an hour to act, so you only get one round of giving wine to rats, after which you must throw away any bottle which you do not know is safe. What do you do? I'd be happy to hear either worst-case or average-case results. Obviously if only one bottle were poisoned, you'd number the bottles and assign one rat to each bit of the binary representation, and the dead rats would spell out the binary of the unique poisoned bottle number. But two poisoned bottles seems to make this much more subtle. --Interesting puzzle.
Let W be the number of wine bottles, P the number poisoned, 1<=P<=W, and R the number of rats. The task is to identify the poisoned wine bottles with a single round of feeding each rat a wine-mixture from a selected bottle-subset. Say rat j gets fed subset Sj.
Consider this WxR boolean matrix M: M_ij = 1 if wine bottle i is in set Sj. Otherwise M_ij=0.
The essential property demanded (for a worst-case result allowing us to identify the P poisoned bottles) is that for any P-element subset of the rows of the matrix, its logical-OR (as an R-bit binary word) is DISTINCT from that arising from any other P-element subset. (This logical OR's locations for its 1-bits give the names of the dead rats.)
So evidently for a solution to exist we need that binomial(W, P) <= 2^R.
It was proposed that W=1000, P=2. Hence the left hand side is 499500=1000*999/2. Hence a solution cannot exist unless there are at least 19 rats, R>=19.
Since Kleber had R=10, we conclude there is no solution guaranteed to identify the two poisoned bottles.
We can however adopt the weaker goal of trying to identify some subset of bottles we can guarantee are safe. In that case, just partition the bottles into R equal-size subsets each with W/R bottles. Then P rats die whereupon we've identified (R-P)*W/R safe wine bottles. With R=10, P=2, W=1000, we identify 800 safe ones this way.
Returning to the original problem of identifying the poisoned bottles, Kleber's binary method shows the inequality I gave is tight (up to integer roundoff) if P=1. But what if P>=2?
I will now attempt to sketch an argument that my trivial inequality is not too far off being tight, for any FIXED P.
To prove this, consider selecting the matrix M at random, each bit chosen by an i.i.d. coin toss, where the coin used is biased with a particular well-chosen constant bias. Specifically, I choose the bias in such a way that the expected number of 1-bits in the R-vector got by ORing P rows of the matrix, comes out to be R/2.
I claim this argument will show that R = C * lg( binomial(W,P) ) rats suffice, for some positive constant C. To prove this (i.e. to supply the details of the proof, which I have not provided since I am lazy), you need to argue that with positive probability, all those P-row-ORs will indeed be distinct. This can be done by arguing the expected number of conflicting P-row-subset pairs is <1. Then since probability>0, that proves some set of such coin tosses must exist, which have this property, hence a suitable experimental-design exists. QED.
Hence, with P=2, since 2 is fixed, there is a solution with R=O(logW) rats and somebody less lazy than I could even work out a good bound on the constant C (1<C<5 sounds safe).
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
Another way to think of the P=1 case is that each additional rat allows us to reduce the number of dangerous (possibly poisoned) bottles by 1/2. When P=2 a single rat doesn't allow us to eliminate any dangerous bottles, but Mike observed that with k rats we can reduce the number by 2/(k+1). With 4 rats we can directly reduce the number by 2/5, or we can divide the rats into 2+2 and reduce the number by 2/3 * 2/3 > 2/5. Better to use all four rats at once. With 5 rats we can directly reduce the number by 2/6 = 1/3, or we can divide the rats into 2+3 and reduce the number by 2/3 * 2/4 = 2/3. It's a wash. With > 5 rats it's always better to subdivide, and we find that the optimal division or rats within this approach depends on the number of rats mod 3: - With 3k rats, use k * 3 to reduce the number of dangerous bottles by 1/2^k - With 3k+1 rats, use (k-1) * 3 + 4 to reduce the number by 1/2^(k-1) * 2/5 = 1/2^(k-2) * 1/5 - With 3k+2 rats, use k * 3 + 2 to reduce the number by 1/2^k * 2/3 = 1/2^(k-1) * 1/3 So in particular with 10 rats, we use the division 3 + 3 + 4 to reduce the number of dangerous bottles by 1/10, meaning that with 1000 bottles we can save 900 of them. To make this approach concrete, write the bottle numbers with a mixed-radix representation using radix 4 for all digits but the last, and radix 3, 4 or 5 for the last digit depending on whether the number of rats is 2, 0 or 1 (respectively) mod 3. Each group of 2-4 rats in the partition gets assigned to one digit, and within that group a bottle is given to rat m if, for that bottle, the selected digit is m. Note that there's no claim that this approach is optimal. J.P. On Fri, Nov 16, 2012 at 9:00 AM, Mike Speciner <ms@alum.mit.edu> wrote:
You can trivially do better than 800 by dividing the 1000 into 11 (almost) equal subsets, and assign a rat to each but one of them. You end up guaranteeing 9/11 of the 1000 bottles safe. Presumably you can do still better on average with an algorithm that potentially has different numbers of rats dying.
--ms
On 2012-11-15 19:50, Warren Smith wrote:
From: Michael Kleber <michael.kleber@gmail.com>
You have 1000 bottles of wine, which you need for a party that starts in an hour. Two of the bottles are poisoned. You have ten rats, and you can cause each rat to drink from any subset of the bottles, which will cause it to die if the bottle is poisoned. But the poison takes most of an hour to act, so you only get one round of giving wine to rats, after which you must throw away any bottle which you do not know is safe. What do you do? I'd be happy to hear either worst-case or average-case results. Obviously if only one bottle were poisoned, you'd number the bottles and assign one rat to each bit of the binary representation, and the dead rats would spell out the binary of the unique poisoned bottle number. But two poisoned bottles seems to make this much more subtle.
--Interesting puzzle.
Let W be the number of wine bottles, P the number poisoned, 1<=P<=W, and R the number of rats. The task is to identify the poisoned wine bottles with a single round of feeding each rat a wine-mixture from a selected bottle-subset. Say rat j gets fed subset Sj.
Consider this WxR boolean matrix M: M_ij = 1 if wine bottle i is in set Sj. Otherwise M_ij=0.
The essential property demanded (for a worst-case result allowing us to identify the P poisoned bottles) is that for any P-element subset of the rows of the matrix, its logical-OR (as an R-bit binary word) is DISTINCT from that arising from any other P-element subset. (This logical OR's locations for its 1-bits give the names of the dead rats.)
So evidently for a solution to exist we need that binomial(W, P) <= 2^R.
It was proposed that W=1000, P=2. Hence the left hand side is 499500=1000*999/2. Hence a solution cannot exist unless there are at least 19 rats, R>=19.
Since Kleber had R=10, we conclude there is no solution guaranteed to identify the two poisoned bottles.
We can however adopt the weaker goal of trying to identify some subset of bottles we can guarantee are safe. In that case, just partition the bottles into R equal-size subsets each with W/R bottles. Then P rats die whereupon we've identified (R-P)*W/R safe wine bottles. With R=10, P=2, W=1000, we identify 800 safe ones this way.
Returning to the original problem of identifying the poisoned bottles, Kleber's binary method shows the inequality I gave is tight (up to integer roundoff) if P=1. But what if P>=2?
I will now attempt to sketch an argument that my trivial inequality is not too far off being tight, for any FIXED P.
To prove this, consider selecting the matrix M at random, each bit chosen by an i.i.d. coin toss, where the coin used is biased with a particular well-chosen constant bias. Specifically, I choose the bias in such a way that the expected number of 1-bits in the R-vector got by ORing P rows of the matrix, comes out to be R/2.
I claim this argument will show that R = C * lg( binomial(W,P) ) rats suffice, for some positive constant C. To prove this (i.e. to supply the details of the proof, which I have not provided since I am lazy), you need to argue that with positive probability, all those P-row-ORs will indeed be distinct. This can be done by arguing the expected number of conflicting P-row-subset pairs is <1. Then since probability>0, that proves some set of such coin tosses must exist, which have this property, hence a suitable experimental-design exists. QED.
Hence, with P=2, since 2 is fixed, there is a solution with R=O(logW) rats and somebody less lazy than I could even work out a good bound on the constant C (1<C<5 sounds safe).
______________________________**_________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/**cgi-bin/mailman/listinfo/math-**fun<http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun>
______________________________**_________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/**cgi-bin/mailman/listinfo/math-**fun<http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun>
This approach gets us close to 900, but not quite there. How close does it get us? On Sat, Nov 17, 2012 at 9:02 AM, J.P. Grossman <jpg@alum.mit.edu> wrote:
Another way to think of the P=1 case is that each additional rat allows us to reduce the number of dangerous (possibly poisoned) bottles by 1/2. When P=2 a single rat doesn't allow us to eliminate any dangerous bottles, but Mike observed that with k rats we can reduce the number by 2/(k+1).
With 4 rats we can directly reduce the number by 2/5, or we can divide the rats into 2+2 and reduce the number by 2/3 * 2/3 > 2/5. Better to use all four rats at once.
With 5 rats we can directly reduce the number by 2/6 = 1/3, or we can divide the rats into 2+3 and reduce the number by 2/3 * 2/4 = 2/3. It's a wash.
With > 5 rats it's always better to subdivide, and we find that the optimal division or rats within this approach depends on the number of rats mod 3:
- With 3k rats, use k * 3 to reduce the number of dangerous bottles by 1/2^k - With 3k+1 rats, use (k-1) * 3 + 4 to reduce the number by 1/2^(k-1) * 2/5 = 1/2^(k-2) * 1/5 - With 3k+2 rats, use k * 3 + 2 to reduce the number by 1/2^k * 2/3 = 1/2^(k-1) * 1/3
So in particular with 10 rats, we use the division 3 + 3 + 4 to reduce the number of dangerous bottles by 1/10, meaning that with 1000 bottles we can save 900 of them.
To make this approach concrete, write the bottle numbers with a mixed-radix representation using radix 4 for all digits but the last, and radix 3, 4 or 5 for the last digit depending on whether the number of rats is 2, 0 or 1 (respectively) mod 3. Each group of 2-4 rats in the partition gets assigned to one digit, and within that group a bottle is given to rat m if, for that bottle, the selected digit is m.
Note that there's no claim that this approach is optimal.
J.P. On Fri, Nov 16, 2012 at 9:00 AM, Mike Speciner <ms@alum.mit.edu> wrote:
You can trivially do better than 800 by dividing the 1000 into 11 (almost) equal subsets, and assign a rat to each but one of them. You end up guaranteeing 9/11 of the 1000 bottles safe. Presumably you can do still better on average with an algorithm that potentially has different numbers of rats dying.
--ms
On 2012-11-15 19:50, Warren Smith wrote:
From: Michael Kleber <michael.kleber@gmail.com>
You have 1000 bottles of wine, which you need for a party that starts in an hour. Two of the bottles are poisoned. You have ten rats, and you can cause each rat to drink from any subset of the bottles, which will cause it to die if the bottle is poisoned. But the poison takes most of an hour to act, so you only get one round of giving wine to rats, after which you must throw away any bottle which you do not know is safe. What do you do? I'd be happy to hear either worst-case or average-case results. Obviously if only one bottle were poisoned, you'd number the bottles and assign one rat to each bit of the binary representation, and the dead rats would spell out the binary of the unique poisoned bottle number. But two poisoned bottles seems to make this much more subtle.
--Interesting puzzle.
Let W be the number of wine bottles, P the number poisoned, 1<=P<=W, and R the number of rats. The task is to identify the poisoned wine bottles with a single round of feeding each rat a wine-mixture from a selected bottle-subset. Say rat j gets fed subset Sj.
Consider this WxR boolean matrix M: M_ij = 1 if wine bottle i is in set Sj. Otherwise M_ij=0.
The essential property demanded (for a worst-case result allowing us to identify the P poisoned bottles) is that for any P-element subset of the rows of the matrix, its logical-OR (as an R-bit binary word) is DISTINCT from that arising from any other P-element subset. (This logical OR's locations for its 1-bits give the names of the dead rats.)
So evidently for a solution to exist we need that binomial(W, P) <= 2^R.
It was proposed that W=1000, P=2. Hence the left hand side is 499500=1000*999/2. Hence a solution cannot exist unless there are at least 19 rats, R>=19.
Since Kleber had R=10, we conclude there is no solution guaranteed to identify the two poisoned bottles.
We can however adopt the weaker goal of trying to identify some subset of bottles we can guarantee are safe. In that case, just partition the bottles into R equal-size subsets each with W/R bottles. Then P rats die whereupon we've identified (R-P)*W/R safe wine bottles. With R=10, P=2, W=1000, we identify 800 safe ones this way.
Returning to the original problem of identifying the poisoned bottles, Kleber's binary method shows the inequality I gave is tight (up to integer roundoff) if P=1. But what if P>=2?
I will now attempt to sketch an argument that my trivial inequality is not too far off being tight, for any FIXED P.
To prove this, consider selecting the matrix M at random, each bit chosen by an i.i.d. coin toss, where the coin used is biased with a particular well-chosen constant bias. Specifically, I choose the bias in such a way that the expected number of 1-bits in the R-vector got by ORing P rows of the matrix, comes out to be R/2.
I claim this argument will show that R = C * lg( binomial(W,P) ) rats suffice, for some positive constant C. To prove this (i.e. to supply the details of the proof, which I have not provided since I am lazy), you need to argue that with positive probability, all those P-row-ORs will indeed be distinct. This can be done by arguing the expected number of conflicting P-row-subset pairs is <1. Then since probability>0, that proves some set of such coin tosses must exist, which have this property, hence a suitable experimental-design exists. QED.
Hence, with P=2, since 2 is fixed, there is a solution with R=O(logW) rats and somebody less lazy than I could even work out a good bound on the constant C (1<C<5 sounds safe).
______________________________**_________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/**cgi-bin/mailman/listinfo/math-**fun< http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun>
______________________________**_________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/**cgi-bin/mailman/listinfo/math-**fun< http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun>
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
-- -- http://cube20.org/ -- http://golly.sf.net/ --
Thanks, J.P. This is indeed the best approach I know of. Well, except that if you're thinking worst-case, you've overlooked the rounding problems that come from the fact that 1000 is not a multiple of 4*4*5=80. You can certainly arrange the bottles into 4*4*5 sets of 12-or-13 each and discard at worst 104. Someone at work argued that you could do 102 if you're clever, though I didn't quite follow how. But I don't know of a way to get down to actually only discarding 100 bottles. On the lower-bound side of the argument, there are 1000-choose-2 possible pairs of poisoned bottles, and you can at best separate them into 1024 classes based on rat mortality -- so each rat-outcome forces you to discard the bottles from at least 488 pairs, which means a bare minimum of 32 bottles (since discarding 32 bottles tosses 32-choose-2 = 496 pairs). That's obviously wildly too few, though: you'd need to partition the bottles into lots of size 31-or-32 and always have the two poisoned bottles in the same lot. Can anyone tighten these bounds? --Michael On Sat, Nov 17, 2012 at 12:02 PM, J.P. Grossman <jpg@alum.mit.edu> wrote:
Another way to think of the P=1 case is that each additional rat allows us to reduce the number of dangerous (possibly poisoned) bottles by 1/2. When P=2 a single rat doesn't allow us to eliminate any dangerous bottles, but Mike observed that with k rats we can reduce the number by 2/(k+1).
With 4 rats we can directly reduce the number by 2/5, or we can divide the rats into 2+2 and reduce the number by 2/3 * 2/3 > 2/5. Better to use all four rats at once.
With 5 rats we can directly reduce the number by 2/6 = 1/3, or we can divide the rats into 2+3 and reduce the number by 2/3 * 2/4 = 2/3. It's a wash.
With > 5 rats it's always better to subdivide, and we find that the optimal division or rats within this approach depends on the number of rats mod 3:
- With 3k rats, use k * 3 to reduce the number of dangerous bottles by 1/2^k - With 3k+1 rats, use (k-1) * 3 + 4 to reduce the number by 1/2^(k-1) * 2/5 = 1/2^(k-2) * 1/5 - With 3k+2 rats, use k * 3 + 2 to reduce the number by 1/2^k * 2/3 = 1/2^(k-1) * 1/3
So in particular with 10 rats, we use the division 3 + 3 + 4 to reduce the number of dangerous bottles by 1/10, meaning that with 1000 bottles we can save 900 of them.
To make this approach concrete, write the bottle numbers with a mixed-radix representation using radix 4 for all digits but the last, and radix 3, 4 or 5 for the last digit depending on whether the number of rats is 2, 0 or 1 (respectively) mod 3. Each group of 2-4 rats in the partition gets assigned to one digit, and within that group a bottle is given to rat m if, for that bottle, the selected digit is m.
Note that there's no claim that this approach is optimal.
J.P. On Fri, Nov 16, 2012 at 9:00 AM, Mike Speciner <ms@alum.mit.edu> wrote:
You can trivially do better than 800 by dividing the 1000 into 11 (almost) equal subsets, and assign a rat to each but one of them. You end up guaranteeing 9/11 of the 1000 bottles safe. Presumably you can do still better on average with an algorithm that potentially has different numbers of rats dying.
--ms
On 2012-11-15 19:50, Warren Smith wrote:
From: Michael Kleber <michael.kleber@gmail.com>
You have 1000 bottles of wine, which you need for a party that starts in an hour. Two of the bottles are poisoned. You have ten rats, and you can cause each rat to drink from any subset of the bottles, which will cause it to die if the bottle is poisoned. But the poison takes most of an hour to act, so you only get one round of giving wine to rats, after which you must throw away any bottle which you do not know is safe. What do you do? I'd be happy to hear either worst-case or average-case results. Obviously if only one bottle were poisoned, you'd number the bottles and assign one rat to each bit of the binary representation, and the dead rats would spell out the binary of the unique poisoned bottle number. But two poisoned bottles seems to make this much more subtle.
--Interesting puzzle.
Let W be the number of wine bottles, P the number poisoned, 1<=P<=W, and R the number of rats. The task is to identify the poisoned wine bottles with a single round of feeding each rat a wine-mixture from a selected bottle-subset. Say rat j gets fed subset Sj.
Consider this WxR boolean matrix M: M_ij = 1 if wine bottle i is in set Sj. Otherwise M_ij=0.
The essential property demanded (for a worst-case result allowing us to identify the P poisoned bottles) is that for any P-element subset of the rows of the matrix, its logical-OR (as an R-bit binary word) is DISTINCT from that arising from any other P-element subset. (This logical OR's locations for its 1-bits give the names of the dead rats.)
So evidently for a solution to exist we need that binomial(W, P) <= 2^R.
It was proposed that W=1000, P=2. Hence the left hand side is 499500=1000*999/2. Hence a solution cannot exist unless there are at least 19 rats, R>=19.
Since Kleber had R=10, we conclude there is no solution guaranteed to identify the two poisoned bottles.
We can however adopt the weaker goal of trying to identify some subset of bottles we can guarantee are safe. In that case, just partition the bottles into R equal-size subsets each with W/R bottles. Then P rats die whereupon we've identified (R-P)*W/R safe wine bottles. With R=10, P=2, W=1000, we identify 800 safe ones this way.
Returning to the original problem of identifying the poisoned bottles, Kleber's binary method shows the inequality I gave is tight (up to integer roundoff) if P=1. But what if P>=2?
I will now attempt to sketch an argument that my trivial inequality is not too far off being tight, for any FIXED P.
To prove this, consider selecting the matrix M at random, each bit chosen by an i.i.d. coin toss, where the coin used is biased with a particular well-chosen constant bias. Specifically, I choose the bias in such a way that the expected number of 1-bits in the R-vector got by ORing P rows of the matrix, comes out to be R/2.
I claim this argument will show that R = C * lg( binomial(W,P) ) rats suffice, for some positive constant C. To prove this (i.e. to supply the details of the proof, which I have not provided since I am lazy), you need to argue that with positive probability, all those P-row-ORs will indeed be distinct. This can be done by arguing the expected number of conflicting P-row-subset pairs is <1. Then since probability>0, that proves some set of such coin tosses must exist, which have this property, hence a suitable experimental-design exists. QED.
Hence, with P=2, since 2 is fixed, there is a solution with R=O(logW) rats and somebody less lazy than I could even work out a good bound on the constant C (1<C<5 sounds safe).
______________________________**_________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/**cgi-bin/mailman/listinfo/math-**fun< http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun>
______________________________**_________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/**cgi-bin/mailman/listinfo/math-**fun< http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun>
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
-- Forewarned is worth an octopus in the bush.
With the help of OEIS I discovered some history for this problem. First I came up with a better term for the name of the relevant codes. A "unique union" code on n symbols is a set of subsets of these n symbols with the property that all pairwise unions of the subsets is unique. Here is an example of a unique union code of size 4 for n=3: 0 0 0 1 0 0 0 1 0 0 0 1 The 1's indicate the inclusion of the symbol in the subset. There is no larger unique union code for n=3, so M(3)=4. With only n=3 rats to test for poison, we would partition the bottles into equal (or as near to equal as possible) sets and deliver samples to the 3 rats according to the unique union code. If the two poisoned bottles are in different sets of the partition we will know their identities from the union of their codewords. On the other hand, if both poisoned bottles are in the same partition, then we will see just one codeword. But if this codeword is not the zero codeword, then we must consider the possibility that we are seeing the union of the non-zero codeword and the zero codeword. In either case, to be safe we must eliminate 2 partitions of the bottles. The fraction we can eliminate is therefore 2/M(n). J.P. Grossman's post describes a product construction of unique union codes with sizes M'(3k) = 4 2^(k-1) M'(3k+1) = 5 2^(k-1) M'(3k+2) = 6 2^(k-1) For n<8 these are optimal. Today I wrote a short program to find M(8)=13, an increase by 1 over the product code. Fortunately I now had enough of the M(n) sequence 1, 2, 3, 4, 5, 6, 8, 10, 13 to get a manageable output from OEIS. The rat-puzzle numbers turn out to have been studied by Erdos and others, see A054961. The sequence has been computed only to M(8)=13, but since already the value for n=8 is larger than the product code value, we can be sure that M'(10)=20 is also not optimal and so a larger fraction of the bottles can be saved. -Veit optimal n=8 code: 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 0 1 0 1 0 0 0 0 1 1 0 0 1 0 0 0 0 0 1 1 1 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 0 0 0 1 1 0 0 1 0 1 0 0 0 1 0 0 0 0 1 0 0 1 1 0 0 0 0 1 0 1 0 0 1 0 0 0 1 1 On Nov 17, 2012, at 12:16 PM, Michael Kleber <michael.kleber@gmail.com> wrote:
Thanks, J.P. This is indeed the best approach I know of.
Well, except that if you're thinking worst-case, you've overlooked the rounding problems that come from the fact that 1000 is not a multiple of 4*4*5=80. You can certainly arrange the bottles into 4*4*5 sets of 12-or-13 each and discard at worst 104. Someone at work argued that you could do 102 if you're clever, though I didn't quite follow how. But I don't know of a way to get down to actually only discarding 100 bottles.
On the lower-bound side of the argument, there are 1000-choose-2 possible pairs of poisoned bottles, and you can at best separate them into 1024 classes based on rat mortality -- so each rat-outcome forces you to discard the bottles from at least 488 pairs, which means a bare minimum of 32 bottles (since discarding 32 bottles tosses 32-choose-2 = 496 pairs). That's obviously wildly too few, though: you'd need to partition the bottles into lots of size 31-or-32 and always have the two poisoned bottles in the same lot.
Can anyone tighten these bounds?
--Michael
On Sat, Nov 17, 2012 at 12:02 PM, J.P. Grossman <jpg@alum.mit.edu> wrote:
Another way to think of the P=1 case is that each additional rat allows us to reduce the number of dangerous (possibly poisoned) bottles by 1/2. When P=2 a single rat doesn't allow us to eliminate any dangerous bottles, but Mike observed that with k rats we can reduce the number by 2/(k+1).
With 4 rats we can directly reduce the number by 2/5, or we can divide the rats into 2+2 and reduce the number by 2/3 * 2/3 > 2/5. Better to use all four rats at once.
With 5 rats we can directly reduce the number by 2/6 = 1/3, or we can divide the rats into 2+3 and reduce the number by 2/3 * 2/4 = 2/3. It's a wash.
With > 5 rats it's always better to subdivide, and we find that the optimal division or rats within this approach depends on the number of rats mod 3:
- With 3k rats, use k * 3 to reduce the number of dangerous bottles by 1/2^k - With 3k+1 rats, use (k-1) * 3 + 4 to reduce the number by 1/2^(k-1) * 2/5 = 1/2^(k-2) * 1/5 - With 3k+2 rats, use k * 3 + 2 to reduce the number by 1/2^k * 2/3 = 1/2^(k-1) * 1/3
So in particular with 10 rats, we use the division 3 + 3 + 4 to reduce the number of dangerous bottles by 1/10, meaning that with 1000 bottles we can save 900 of them.
To make this approach concrete, write the bottle numbers with a mixed-radix representation using radix 4 for all digits but the last, and radix 3, 4 or 5 for the last digit depending on whether the number of rats is 2, 0 or 1 (respectively) mod 3. Each group of 2-4 rats in the partition gets assigned to one digit, and within that group a bottle is given to rat m if, for that bottle, the selected digit is m.
Note that there's no claim that this approach is optimal.
J.P. On Fri, Nov 16, 2012 at 9:00 AM, Mike Speciner <ms@alum.mit.edu> wrote:
You can trivially do better than 800 by dividing the 1000 into 11 (almost) equal subsets, and assign a rat to each but one of them. You end up guaranteeing 9/11 of the 1000 bottles safe. Presumably you can do still better on average with an algorithm that potentially has different numbers of rats dying.
--ms
On 2012-11-15 19:50, Warren Smith wrote:
From: Michael Kleber <michael.kleber@gmail.com>
You have 1000 bottles of wine, which you need for a party that starts in an hour. Two of the bottles are poisoned. You have ten rats, and you can cause each rat to drink from any subset of the bottles, which will cause it to die if the bottle is poisoned. But the poison takes most of an hour to act, so you only get one round of giving wine to rats, after which you must throw away any bottle which you do not know is safe. What do you do? I'd be happy to hear either worst-case or average-case results. Obviously if only one bottle were poisoned, you'd number the bottles and assign one rat to each bit of the binary representation, and the dead rats would spell out the binary of the unique poisoned bottle number. But two poisoned bottles seems to make this much more subtle.
--Interesting puzzle.
Let W be the number of wine bottles, P the number poisoned, 1<=P<=W, and R the number of rats. The task is to identify the poisoned wine bottles with a single round of feeding each rat a wine-mixture from a selected bottle-subset. Say rat j gets fed subset Sj.
Consider this WxR boolean matrix M: M_ij = 1 if wine bottle i is in set Sj. Otherwise M_ij=0.
The essential property demanded (for a worst-case result allowing us to identify the P poisoned bottles) is that for any P-element subset of the rows of the matrix, its logical-OR (as an R-bit binary word) is DISTINCT from that arising from any other P-element subset. (This logical OR's locations for its 1-bits give the names of the dead rats.)
So evidently for a solution to exist we need that binomial(W, P) <= 2^R.
It was proposed that W=1000, P=2. Hence the left hand side is 499500=1000*999/2. Hence a solution cannot exist unless there are at least 19 rats, R>=19.
Since Kleber had R=10, we conclude there is no solution guaranteed to identify the two poisoned bottles.
We can however adopt the weaker goal of trying to identify some subset of bottles we can guarantee are safe. In that case, just partition the bottles into R equal-size subsets each with W/R bottles. Then P rats die whereupon we've identified (R-P)*W/R safe wine bottles. With R=10, P=2, W=1000, we identify 800 safe ones this way.
Returning to the original problem of identifying the poisoned bottles, Kleber's binary method shows the inequality I gave is tight (up to integer roundoff) if P=1. But what if P>=2?
I will now attempt to sketch an argument that my trivial inequality is not too far off being tight, for any FIXED P.
To prove this, consider selecting the matrix M at random, each bit chosen by an i.i.d. coin toss, where the coin used is biased with a particular well-chosen constant bias. Specifically, I choose the bias in such a way that the expected number of 1-bits in the R-vector got by ORing P rows of the matrix, comes out to be R/2.
I claim this argument will show that R = C * lg( binomial(W,P) ) rats suffice, for some positive constant C. To prove this (i.e. to supply the details of the proof, which I have not provided since I am lazy), you need to argue that with positive probability, all those P-row-ORs will indeed be distinct. This can be done by arguing the expected number of conflicting P-row-subset pairs is <1. Then since probability>0, that proves some set of such coin tosses must exist, which have this property, hence a suitable experimental-design exists. QED.
Hence, with P=2, since 2 is fixed, there is a solution with R=O(logW) rats and somebody less lazy than I could even work out a good bound on the constant C (1<C<5 sounds safe).
______________________________**_________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/**cgi-bin/mailman/listinfo/math-**fun< http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun>
______________________________**_________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/**cgi-bin/mailman/listinfo/math-**fun< http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun>
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
-- Forewarned is worth an octopus in the bush. _______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
As a matter of curiosity a rather poor (but cute) lower bound can be obtained for the unique union code size by restricting oneself to two element subsets of the given n-set. One gets a classical graph theory problem, namely: If a(n) is the maximum number of edges in an n-vertex graph of girth 5. Then a(n) <= M(n). If G is a (simple) graph on n vertices with girth 5 then G has no 3-cycles or 4-cycles. This implies that the set of edges is a unique union code on n symbols. [Here an edge is a 2 element set of vertices.] Thus a(n) <= M(n). In particular, the value a(10) = 15 is given by the Petersen graph<http://en.wikipedia.org/wiki/Petersen_graph>. Values of a(n) for n to 30 are given in http://oeis.org/A006856. The extremal graphs can be found in the paper Extremal graphs without three-cycles or four-cycles<http://www.math.udel.edu/%7Elazebnik/papers/ex34a_v2.pdf>by Garnick, et al. --Edwin On Mon, Nov 19, 2012 at 8:42 PM, Veit Elser <ve10@cornell.edu> wrote:
With the help of OEIS I discovered some history for this problem. First I came up with a better term for the name of the relevant codes.
A "unique union" code on n symbols is a set of subsets of these n symbols with the property that all pairwise unions of the subsets is unique.
Here is an example of a unique union code of size 4 for n=3:
0 0 0 1 0 0 0 1 0 0 0 1
The 1's indicate the inclusion of the symbol in the subset. There is no larger unique union code for n=3, so M(3)=4.
With only n=3 rats to test for poison, we would partition the bottles into equal (or as near to equal as possible) sets and deliver samples to the 3 rats according to the unique union code. If the two poisoned bottles are in different sets of the partition we will know their identities from the union of their codewords. On the other hand, if both poisoned bottles are in the same partition, then we will see just one codeword. But if this codeword is not the zero codeword, then we must consider the possibility that we are seeing the union of the non-zero codeword and the zero codeword. In either case, to be safe we must eliminate 2 partitions of the bottles. The fraction we can eliminate is therefore 2/M(n).
J.P. Grossman's post describes a product construction of unique union codes with sizes
M'(3k) = 4 2^(k-1) M'(3k+1) = 5 2^(k-1) M'(3k+2) = 6 2^(k-1)
For n<8 these are optimal. Today I wrote a short program to find M(8)=13, an increase by 1 over the product code. Fortunately I now had enough of the M(n) sequence
1, 2, 3, 4, 5, 6, 8, 10, 13
to get a manageable output from OEIS. The rat-puzzle numbers turn out to have been studied by Erdos and others, see A054961. The sequence has been computed only to M(8)=13, but since already the value for n=8 is larger than the product code value, we can be sure that M'(10)=20 is also not optimal and so a larger fraction of the bottles can be saved.
-Veit
optimal n=8 code:
0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 0 1 0 1 0 0 0 0 1 1 0 0 1 0 0 0 0 0 1 1 1 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 0 0 0 1 1 0 0 1 0 1 0 0 0 1 0 0 0 0 1 0 0 1 1 0 0 0 0 1 0 1 0 0 1 0 0 0 1 1
On Nov 17, 2012, at 12:16 PM, Michael Kleber <michael.kleber@gmail.com> wrote:
Thanks, J.P. This is indeed the best approach I know of.
Well, except that if you're thinking worst-case, you've overlooked the rounding problems that come from the fact that 1000 is not a multiple of 4*4*5=80. You can certainly arrange the bottles into 4*4*5 sets of 12-or-13 each and discard at worst 104. Someone at work argued that you could do 102 if you're clever, though I didn't quite follow how. But I don't know of a way to get down to actually only discarding 100 bottles.
On the lower-bound side of the argument, there are 1000-choose-2 possible pairs of poisoned bottles, and you can at best separate them into 1024 classes based on rat mortality -- so each rat-outcome forces you to discard the bottles from at least 488 pairs, which means a bare minimum of 32 bottles (since discarding 32 bottles tosses 32-choose-2 = 496 pairs). That's obviously wildly too few, though: you'd need to partition the bottles into lots of size 31-or-32 and always have the two poisoned bottles in the same lot.
Can anyone tighten these bounds?
--Michael
On Sat, Nov 17, 2012 at 12:02 PM, J.P. Grossman <jpg@alum.mit.edu> wrote:
Another way to think of the P=1 case is that each additional rat allows us to reduce the number of dangerous (possibly poisoned) bottles by 1/2. When P=2 a single rat doesn't allow us to eliminate any dangerous bottles, but Mike observed that with k rats we can reduce the number by 2/(k+1).
With 4 rats we can directly reduce the number by 2/5, or we can divide the rats into 2+2 and reduce the number by 2/3 * 2/3 > 2/5. Better to use all four rats at once.
With 5 rats we can directly reduce the number by 2/6 = 1/3, or we can divide the rats into 2+3 and reduce the number by 2/3 * 2/4 = 2/3. It's a wash.
With > 5 rats it's always better to subdivide, and we find that the optimal division or rats within this approach depends on the number of rats mod 3:
- With 3k rats, use k * 3 to reduce the number of dangerous bottles by 1/2^k - With 3k+1 rats, use (k-1) * 3 + 4 to reduce the number by 1/2^(k-1) * 2/5 = 1/2^(k-2) * 1/5 - With 3k+2 rats, use k * 3 + 2 to reduce the number by 1/2^k * 2/3 = 1/2^(k-1) * 1/3
So in particular with 10 rats, we use the division 3 + 3 + 4 to reduce the number of dangerous bottles by 1/10, meaning that with 1000 bottles we can save 900 of them.
To make this approach concrete, write the bottle numbers with a mixed-radix representation using radix 4 for all digits but the last, and radix 3, 4 or 5 for the last digit depending on whether the number of rats is 2, 0 or 1 (respectively) mod 3. Each group of 2-4 rats in the partition gets assigned to one digit, and within that group a bottle is given to rat m if, for that bottle, the selected digit is m.
Note that there's no claim that this approach is optimal.
J.P. On Fri, Nov 16, 2012 at 9:00 AM, Mike Speciner <ms@alum.mit.edu> wrote:
You can trivially do better than 800 by dividing the 1000 into 11 (almost) equal subsets, and assign a rat to each but one of them. You end up guaranteeing 9/11 of the 1000 bottles safe. Presumably you can do still better on average with an algorithm that potentially has different numbers of rats dying.
--ms
On 2012-11-15 19:50, Warren Smith wrote:
From: Michael Kleber <michael.kleber@gmail.com>
You have 1000 bottles of wine, which you need for a party that starts in an hour. Two of the bottles are poisoned. You have ten rats, and you can cause each rat to drink from any subset of the bottles, which will cause it to die if the bottle is poisoned. But the poison takes most of an hour to act, so you only get one round of giving wine to rats, after which you must throw away any bottle which you do not know is safe. What do you do? I'd be happy to hear either worst-case or average-case results. Obviously if only one bottle were poisoned, you'd number the bottles and assign one rat to each bit of the binary representation, and the dead rats would spell out the binary of the unique poisoned bottle number. But two poisoned bottles seems to make this much more subtle.
--Interesting puzzle.
Let W be the number of wine bottles, P the number poisoned, 1<=P<=W, and R the number of rats. The task is to identify the poisoned wine bottles with a single round of feeding each rat a wine-mixture from a selected bottle-subset. Say rat j gets fed subset Sj.
Consider this WxR boolean matrix M: M_ij = 1 if wine bottle i is in set Sj. Otherwise M_ij=0.
The essential property demanded (for a worst-case result allowing us to identify the P poisoned bottles) is that for any P-element subset of the rows of the matrix, its logical-OR (as an R-bit binary word) is DISTINCT from that arising from any other P-element subset. (This logical OR's locations for its 1-bits give the names of the dead rats.)
So evidently for a solution to exist we need that binomial(W, P) <= 2^R.
It was proposed that W=1000, P=2. Hence the left hand side is 499500=1000*999/2. Hence a solution cannot exist unless there are at least 19 rats, R>=19.
Since Kleber had R=10, we conclude there is no solution guaranteed to identify the two poisoned bottles.
We can however adopt the weaker goal of trying to identify some subset of bottles we can guarantee are safe. In that case, just partition the bottles into R equal-size subsets each with W/R bottles. Then P rats die whereupon we've identified (R-P)*W/R safe wine bottles. With R=10, P=2, W=1000, we identify 800 safe ones this way.
Returning to the original problem of identifying the poisoned bottles, Kleber's binary method shows the inequality I gave is tight (up to integer roundoff) if P=1. But what if P>=2?
I will now attempt to sketch an argument that my trivial inequality is not too far off being tight, for any FIXED P.
To prove this, consider selecting the matrix M at random, each bit chosen by an i.i.d. coin toss, where the coin used is biased with a particular well-chosen constant bias. Specifically, I choose the bias in such a way that the expected number of 1-bits in the R-vector got by ORing P rows of the matrix, comes out to be R/2.
I claim this argument will show that R = C * lg( binomial(W,P) ) rats suffice, for some positive constant C. To prove this (i.e. to supply the details of the proof, which I have not provided since I am lazy), you need to argue that with positive probability, all those P-row-ORs will indeed be distinct. This can be done by arguing the expected number of conflicting P-row-subset pairs is <1. Then since probability>0, that proves some set of such coin tosses must exist, which have this property, hence a suitable experimental-design exists. QED.
Hence, with P=2, since 2 is fixed, there is a solution with R=O(logW) rats and somebody less lazy than I could even work out a good bound on the constant C (1<C<5 sounds safe).
______________________________**_________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/**cgi-bin/mailman/listinfo/math-**fun< http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun>
______________________________**_________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/**cgi-bin/mailman/listinfo/math-**fun< http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun>
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
-- Forewarned is worth an octopus in the bush. _______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
I have a solution that loses 104 bottles that Michael asked me to share with you all. Take the first 130 bottles and give each rat 13 of them. Then take the next 225 bottles and give each pair of rats 5 of them. Then take the next 600 bottles and give each triplet of rats 5 of them. There are 45 bottles left over that no rats drink. Now look at the cases: 0 rats die => lose 45 bottles 1 rat dies => lose 13 + 45 = 58 bottles 2 rats die => lose 2*13 + 5 + 45 = 76 bottles 3 rats die => lose 3*13 + 3*5 + 5 + 45 = 104 bottles 4 rats die => lose 4*13 + 6*5 + 4*5 = 102 bottles 5 rats die => lose 10*5 + 10*5 = 100 bottles 6 rats die => lose 20*5 = 100 bottles -Thomas C On Mon, Nov 19, 2012 at 10:46 PM, W. Edwin Clark <wclark@mail.usf.edu>wrote:
As a matter of curiosity a rather poor (but cute) lower bound can be obtained for the unique union code size by restricting oneself to two element subsets of the given n-set. One gets a classical graph theory problem, namely:
If a(n) is the maximum number of edges in an n-vertex graph of girth 5. Then a(n) <= M(n).
If G is a (simple) graph on n vertices with girth 5 then G has no 3-cycles or 4-cycles. This implies that the set of edges is a unique union code on n symbols. [Here an edge is a 2 element set of vertices.] Thus a(n) <= M(n). In particular, the value a(10) = 15 is given by the Petersen graph<http://en.wikipedia.org/wiki/Petersen_graph>.
Values of a(n) for n to 30 are given in http://oeis.org/A006856. The extremal graphs can be found in the paper Extremal graphs without three-cycles or four-cycles<http://www.math.udel.edu/%7Elazebnik/papers/ex34a_v2.pdf>by Garnick, et al.
--Edwin
On Mon, Nov 19, 2012 at 8:42 PM, Veit Elser <ve10@cornell.edu> wrote:
With the help of OEIS I discovered some history for this problem. First I came up with a better term for the name of the relevant codes.
A "unique union" code on n symbols is a set of subsets of these n symbols with the property that all pairwise unions of the subsets is unique.
Here is an example of a unique union code of size 4 for n=3:
0 0 0 1 0 0 0 1 0 0 0 1
The 1's indicate the inclusion of the symbol in the subset. There is no larger unique union code for n=3, so M(3)=4.
With only n=3 rats to test for poison, we would partition the bottles into equal (or as near to equal as possible) sets and deliver samples to the 3 rats according to the unique union code. If the two poisoned bottles are in different sets of the partition we will know their identities from the union of their codewords. On the other hand, if both poisoned bottles are in the same partition, then we will see just one codeword. But if this codeword is not the zero codeword, then we must consider the possibility that we are seeing the union of the non-zero codeword and the zero codeword. In either case, to be safe we must eliminate 2 partitions of the bottles. The fraction we can eliminate is therefore 2/M(n).
J.P. Grossman's post describes a product construction of unique union codes with sizes
M'(3k) = 4 2^(k-1) M'(3k+1) = 5 2^(k-1) M'(3k+2) = 6 2^(k-1)
For n<8 these are optimal. Today I wrote a short program to find M(8)=13, an increase by 1 over the product code. Fortunately I now had enough of the M(n) sequence
1, 2, 3, 4, 5, 6, 8, 10, 13
to get a manageable output from OEIS. The rat-puzzle numbers turn out to have been studied by Erdos and others, see A054961. The sequence has been computed only to M(8)=13, but since already the value for n=8 is larger than the product code value, we can be sure that M'(10)=20 is also not optimal and so a larger fraction of the bottles can be saved.
-Veit
optimal n=8 code:
0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 0 1 0 1 0 0 0 0 1 1 0 0 1 0 0 0 0 0 1 1 1 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 0 0 0 1 1 0 0 1 0 1 0 0 0 1 0 0 0 0 1 0 0 1 1 0 0 0 0 1 0 1 0 0 1 0 0 0 1 1
On Nov 17, 2012, at 12:16 PM, Michael Kleber <michael.kleber@gmail.com> wrote:
Thanks, J.P. This is indeed the best approach I know of.
Well, except that if you're thinking worst-case, you've overlooked the rounding problems that come from the fact that 1000 is not a multiple of 4*4*5=80. You can certainly arrange the bottles into 4*4*5 sets of 12-or-13 each and discard at worst 104. Someone at work argued that you could do 102 if you're clever, though I didn't quite follow how. But I don't know of a way to get down to actually only discarding 100 bottles.
On the lower-bound side of the argument, there are 1000-choose-2 possible pairs of poisoned bottles, and you can at best separate them into 1024 classes based on rat mortality -- so each rat-outcome forces you to discard the bottles from at least 488 pairs, which means a bare minimum of 32 bottles (since discarding 32 bottles tosses 32-choose-2 = 496 pairs). That's obviously wildly too few, though: you'd need to partition the bottles into lots of size 31-or-32 and always have the two poisoned bottles in the same lot.
Can anyone tighten these bounds?
--Michael
On Sat, Nov 17, 2012 at 12:02 PM, J.P. Grossman <jpg@alum.mit.edu> wrote:
Another way to think of the P=1 case is that each additional rat allows us to reduce the number of dangerous (possibly poisoned) bottles by 1/2. When P=2 a single rat doesn't allow us to eliminate any dangerous bottles, but Mike observed that with k rats we can reduce the number by 2/(k+1).
With 4 rats we can directly reduce the number by 2/5, or we can divide the rats into 2+2 and reduce the number by 2/3 * 2/3 > 2/5. Better to use all four rats at once.
With 5 rats we can directly reduce the number by 2/6 = 1/3, or we can divide the rats into 2+3 and reduce the number by 2/3 * 2/4 = 2/3. It's a wash.
With > 5 rats it's always better to subdivide, and we find that the optimal division or rats within this approach depends on the number of rats mod 3:
- With 3k rats, use k * 3 to reduce the number of dangerous bottles by 1/2^k - With 3k+1 rats, use (k-1) * 3 + 4 to reduce the number by 1/2^(k-1)
2/5
= 1/2^(k-2) * 1/5 - With 3k+2 rats, use k * 3 + 2 to reduce the number by 1/2^k * 2/3 = 1/2^(k-1) * 1/3
So in particular with 10 rats, we use the division 3 + 3 + 4 to reduce the number of dangerous bottles by 1/10, meaning that with 1000 bottles we can save 900 of them.
To make this approach concrete, write the bottle numbers with a mixed-radix representation using radix 4 for all digits but the last, and radix 3, 4 or 5 for the last digit depending on whether the number of rats is 2, 0 or 1 (respectively) mod 3. Each group of 2-4 rats in the partition gets assigned to one digit, and within that group a bottle is given to rat m if, for that bottle, the selected digit is m.
Note that there's no claim that this approach is optimal.
J.P. On Fri, Nov 16, 2012 at 9:00 AM, Mike Speciner <ms@alum.mit.edu> wrote:
You can trivially do better than 800 by dividing the 1000 into 11 (almost) equal subsets, and assign a rat to each but one of them. You end up guaranteeing 9/11 of the 1000 bottles safe. Presumably you can do still better on average with an algorithm that potentially has different numbers of rats dying.
--ms
On 2012-11-15 19:50, Warren Smith wrote:
From: Michael Kleber <michael.kleber@gmail.com> > You have 1000 bottles of wine, which you need for a party that starts in > an > hour. Two of the bottles are poisoned. > You have ten rats, and you can cause each rat to drink from any > subset of > the bottles, which will cause it to die if the bottle is poisoned. But > the > poison takes most of an hour to act, so you only get one round of giving > wine to rats, after which you must throw away any bottle which you do not > know is safe. > What do you do? I'd be happy to hear either worst-case or > average-case > results. > Obviously if only one bottle were poisoned, you'd number the bottles > and > assign one rat to each bit of the binary representation, and the dead > rats > would spell out the binary of the unique poisoned bottle number. But two > poisoned bottles seems to make this much more subtle. > --Interesting puzzle.
Let W be the number of wine bottles, P the number poisoned, 1<=P<=W, and R the number of rats. The task is to identify the poisoned wine bottles with a single round of feeding each rat a wine-mixture from a selected bottle-subset. Say rat j gets fed subset Sj.
Consider this WxR boolean matrix M: M_ij = 1 if wine bottle i is in set Sj. Otherwise M_ij=0.
The essential property demanded (for a worst-case result allowing us to identify the P poisoned bottles) is that for any P-element subset of the rows of the matrix, its logical-OR (as an R-bit binary word) is DISTINCT from that arising from any other P-element subset. (This logical OR's locations for its 1-bits give the names of the dead rats.)
So evidently for a solution to exist we need that binomial(W, P) <= 2^R.
It was proposed that W=1000, P=2. Hence the left hand side is 499500=1000*999/2. Hence a solution cannot exist unless there are at least 19 rats, R>=19.
Since Kleber had R=10, we conclude there is no solution guaranteed to identify the two poisoned bottles.
We can however adopt the weaker goal of trying to identify some subset of bottles we can guarantee are safe. In that case, just partition the bottles into R equal-size subsets each with W/R bottles. Then P rats die whereupon we've identified (R-P)*W/R safe wine bottles. With R=10, P=2, W=1000, we identify 800 safe ones this way.
Returning to the original problem of identifying the poisoned bottles, Kleber's binary method shows the inequality I gave is tight (up to integer roundoff) if P=1. But what if P>=2?
I will now attempt to sketch an argument that my trivial inequality is not too far off being tight, for any FIXED P.
To prove this, consider selecting the matrix M at random, each bit chosen by an i.i.d. coin toss, where the coin used is biased with a particular well-chosen constant bias. Specifically, I choose the bias in such a way that the expected number of 1-bits in the R-vector got by ORing P rows of the matrix, comes out to be R/2.
I claim this argument will show that R = C * lg( binomial(W,P) ) rats suffice, for some positive constant C. To prove this (i.e. to supply the details of the proof, which I have not provided since I am lazy), you need to argue that with positive probability, all those P-row-ORs will indeed be distinct. This can be done by arguing the expected number of conflicting P-row-subset pairs is <1. Then since probability>0, that proves some set of such coin tosses must exist, which have this property, hence a suitable experimental-design exists. QED.
Hence, with P=2, since 2 is fixed, there is a solution with R=O(logW) rats and somebody less lazy than I could even work out a good bound on the constant C (1<C<5 sounds safe).
______________________________**_________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/**cgi-bin/mailman/listinfo/math-**fun< http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun>
______________________________**_________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/**cgi-bin/mailman/listinfo/math-**fun< http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun>
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
-- Forewarned is worth an octopus in the bush. _______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
Does anyone know who first thought of the following proof that sqrt(2) is irrational? It's definitely my favorite. Suppose that the isosceles right triangle whose legs are 1 has hypotenuse = p/q with p,q in Z+. By magnifying it by a factor of q, we get the right triangle with sides q, q, p. Call this triangle ABC, with A at the right angle. Now draw arc of the circle O with center at B and radius q, from side AB to the hypotenuse. Assume it hits the hypotenuse at point X. Draw a perpendicular to the hypotenuse at X, extended to hit side AC, say at point Y. Then the segments CX and XY are of equal length p-q. And since the lines AY and XY are tangents to the circle O from an external point (namely Y), they are equal length, namely p-q, an integer. This means that the segment CY must be of length q-(p-q) = 2q-p, also an integer. Hence the triangle CXY is another, smaller, isosceles right triangle with integer sides. Repeating this process leads to an infinite regress of integer right triangles each with smaller sides than the last one: contradiction. --Dan
Dan Asimov:
Does anyone know who first thought of the following proof that sqrt(2) is irrational?
It looks like Tom M. Apostol's proof, published as "Irrationality of The Square Root of Two - A Geometric Proof" in the American Mathematical Monthly, November 2000, pp. 841–842.
Thanks, Hans! --Dan On 2012-11-20, at 5:05 PM, Hans Havermann wrote:
Dan Asimov:
Does anyone know who first thought of the following proof that sqrt(2) is irrational?
It looks like Tom M. Apostol's proof, published as "Irrationality of The Square Root of Two - A Geometric Proof" in the American Mathematical Monthly, November 2000, pp. 841–842. _______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
I found an "unique union" code for n=10 with size 21, just by taking 0000000000 along with all of the circular shifts of 0000001011 and 0001100101. Since 1000 / 21 = 47.6 ..., this gives a rats solution that loses at most 2*48 = 96 bottles. -Thomas C On Tue, Nov 20, 2012 at 1:36 PM, Thomas Colthurst <thomaswc@gmail.com>wrote:
I have a solution that loses 104 bottles that Michael asked me to share with you all.
Take the first 130 bottles and give each rat 13 of them. Then take the next 225 bottles and give each pair of rats 5 of them. Then take the next 600 bottles and give each triplet of rats 5 of them. There are 45 bottles left over that no rats drink.
Now look at the cases: 0 rats die => lose 45 bottles 1 rat dies => lose 13 + 45 = 58 bottles 2 rats die => lose 2*13 + 5 + 45 = 76 bottles 3 rats die => lose 3*13 + 3*5 + 5 + 45 = 104 bottles 4 rats die => lose 4*13 + 6*5 + 4*5 = 102 bottles 5 rats die => lose 10*5 + 10*5 = 100 bottles 6 rats die => lose 20*5 = 100 bottles
-Thomas C
On Mon, Nov 19, 2012 at 10:46 PM, W. Edwin Clark <wclark@mail.usf.edu>wrote:
As a matter of curiosity a rather poor (but cute) lower bound can be obtained for the unique union code size by restricting oneself to two element subsets of the given n-set. One gets a classical graph theory problem, namely:
If a(n) is the maximum number of edges in an n-vertex graph of girth 5. Then a(n) <= M(n).
If G is a (simple) graph on n vertices with girth 5 then G has no 3-cycles or 4-cycles. This implies that the set of edges is a unique union code on n symbols. [Here an edge is a 2 element set of vertices.] Thus a(n) <= M(n). In particular, the value a(10) = 15 is given by the Petersen graph<http://en.wikipedia.org/wiki/Petersen_graph>.
Values of a(n) for n to 30 are given in http://oeis.org/A006856. The extremal graphs can be found in the paper Extremal graphs without three-cycles or four-cycles<http://www.math.udel.edu/%7Elazebnik/papers/ex34a_v2.pdf>by Garnick, et al.
--Edwin
On Mon, Nov 19, 2012 at 8:42 PM, Veit Elser <ve10@cornell.edu> wrote:
With the help of OEIS I discovered some history for this problem. First I came up with a better term for the name of the relevant codes.
A "unique union" code on n symbols is a set of subsets of these n symbols with the property that all pairwise unions of the subsets is unique.
Here is an example of a unique union code of size 4 for n=3:
0 0 0 1 0 0 0 1 0 0 0 1
The 1's indicate the inclusion of the symbol in the subset. There is no larger unique union code for n=3, so M(3)=4.
With only n=3 rats to test for poison, we would partition the bottles into equal (or as near to equal as possible) sets and deliver samples to the 3 rats according to the unique union code. If the two poisoned bottles are in different sets of the partition we will know their identities from the union of their codewords. On the other hand, if both poisoned bottles are in the same partition, then we will see just one codeword. But if this codeword is not the zero codeword, then we must consider the possibility that we are seeing the union of the non-zero codeword and the zero codeword. In either case, to be safe we must eliminate 2 partitions of the bottles. The fraction we can eliminate is therefore 2/M(n).
J.P. Grossman's post describes a product construction of unique union codes with sizes
M'(3k) = 4 2^(k-1) M'(3k+1) = 5 2^(k-1) M'(3k+2) = 6 2^(k-1)
For n<8 these are optimal. Today I wrote a short program to find M(8)=13, an increase by 1 over the product code. Fortunately I now had enough of the M(n) sequence
1, 2, 3, 4, 5, 6, 8, 10, 13
to get a manageable output from OEIS. The rat-puzzle numbers turn out to have been studied by Erdos and others, see A054961. The sequence has been computed only to M(8)=13, but since already the value for n=8 is larger than the product code value, we can be sure that M'(10)=20 is also not optimal and so a larger fraction of the bottles can be saved.
-Veit
optimal n=8 code:
0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 0 1 0 1 0 0 0 0 1 1 0 0 1 0 0 0 0 0 1 1 1 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 0 0 0 1 1 0 0 1 0 1 0 0 0 1 0 0 0 0 1 0 0 1 1 0 0 0 0 1 0 1 0 0 1 0 0 0 1 1
On Nov 17, 2012, at 12:16 PM, Michael Kleber <michael.kleber@gmail.com> wrote:
Thanks, J.P. This is indeed the best approach I know of.
Well, except that if you're thinking worst-case, you've overlooked the rounding problems that come from the fact that 1000 is not a multiple of 4*4*5=80. You can certainly arrange the bottles into 4*4*5 sets of 12-or-13 each and discard at worst 104. Someone at work argued that you could do 102 if you're clever, though I didn't quite follow how. But I don't know of a way to get down to actually only discarding 100 bottles.
On the lower-bound side of the argument, there are 1000-choose-2 possible pairs of poisoned bottles, and you can at best separate them into 1024 classes based on rat mortality -- so each rat-outcome forces you to discard the bottles from at least 488 pairs, which means a bare minimum of 32 bottles (since discarding 32 bottles tosses 32-choose-2 = 496 pairs). That's obviously wildly too few, though: you'd need to partition the bottles into lots of size 31-or-32 and always have the two poisoned bottles in the same lot.
Can anyone tighten these bounds?
--Michael
On Sat, Nov 17, 2012 at 12:02 PM, J.P. Grossman <jpg@alum.mit.edu> wrote:
Another way to think of the P=1 case is that each additional rat allows us to reduce the number of dangerous (possibly poisoned) bottles by 1/2. When P=2 a single rat doesn't allow us to eliminate any dangerous bottles, but Mike observed that with k rats we can reduce the number by 2/(k+1).
With 4 rats we can directly reduce the number by 2/5, or we can divide the rats into 2+2 and reduce the number by 2/3 * 2/3 > 2/5. Better to use all four rats at once.
With 5 rats we can directly reduce the number by 2/6 = 1/3, or we can divide the rats into 2+3 and reduce the number by 2/3 * 2/4 = 2/3. It's a wash.
With > 5 rats it's always better to subdivide, and we find that the optimal division or rats within this approach depends on the number of rats mod 3:
- With 3k rats, use k * 3 to reduce the number of dangerous bottles by 1/2^k - With 3k+1 rats, use (k-1) * 3 + 4 to reduce the number by 1/2^(k-1) * 2/5 = 1/2^(k-2) * 1/5 - With 3k+2 rats, use k * 3 + 2 to reduce the number by 1/2^k * 2/3 = 1/2^(k-1) * 1/3
So in particular with 10 rats, we use the division 3 + 3 + 4 to reduce the number of dangerous bottles by 1/10, meaning that with 1000 bottles we can save 900 of them.
To make this approach concrete, write the bottle numbers with a mixed-radix representation using radix 4 for all digits but the last, and radix 3, 4 or 5 for the last digit depending on whether the number of rats is 2, 0 or 1 (respectively) mod 3. Each group of 2-4 rats in the partition gets assigned to one digit, and within that group a bottle is given to rat m if, for that bottle, the selected digit is m.
Note that there's no claim that this approach is optimal.
J.P. On Fri, Nov 16, 2012 at 9:00 AM, Mike Speciner <ms@alum.mit.edu> wrote:
You can trivially do better than 800 by dividing the 1000 into 11 (almost) equal subsets, and assign a rat to each but one of them. You end up guaranteeing 9/11 of the 1000 bottles safe. Presumably you can do still better on average with an algorithm that potentially has different numbers of rats dying.
--ms
On 2012-11-15 19:50, Warren Smith wrote:
> From: Michael Kleber <michael.kleber@gmail.com> >> You have 1000 bottles of wine, which you need for a party that starts in >> an >> hour. Two of the bottles are poisoned. >> You have ten rats, and you can cause each rat to drink from any >> subset of >> the bottles, which will cause it to die if the bottle is poisoned. But >> the >> poison takes most of an hour to act, so you only get one round of giving >> wine to rats, after which you must throw away any bottle which you do not >> know is safe. >> What do you do? I'd be happy to hear either worst-case or >> average-case >> results. >> Obviously if only one bottle were poisoned, you'd number the bottles >> and >> assign one rat to each bit of the binary representation, and the dead >> rats >> would spell out the binary of the unique poisoned bottle number. But two >> poisoned bottles seems to make this much more subtle. >> > --Interesting puzzle. > > Let W be the number of wine bottles, P the number poisoned, 1<=P<=W, > and R the number of rats. The task is to identify the poisoned wine > bottles > with a single round of feeding each rat a wine-mixture from a selected > bottle-subset. > Say rat j gets fed subset Sj. > > Consider this WxR boolean matrix M: > M_ij = 1 if wine bottle i is in set Sj. Otherwise M_ij=0. > > The essential property demanded (for a worst-case result allowing us > to identify the P poisoned bottles) is that for any P-element subset > of the rows of the matrix, its logical-OR (as an R-bit binary word) is > DISTINCT from that arising from any other P-element subset. (This > logical OR's locations for its 1-bits give the names of the dead > rats.) > > So evidently for a solution to exist we need that > binomial(W, P) <= 2^R. > > It was proposed that W=1000, P=2. Hence the left hand side is > 499500=1000*999/2. Hence a solution cannot exist unless there > are at least 19 rats, R>=19. > > Since Kleber had R=10, we conclude there is no solution guaranteed to > identify the > two poisoned bottles. > > We can however adopt the weaker goal of trying to identify some subset of > bottles we can guarantee are safe. > In that case, just partition the bottles into R equal-size subsets each > with W/R > bottles. Then P rats die whereupon we've identified (R-P)*W/R safe > wine bottles. > With R=10, P=2, W=1000, we identify 800 safe ones this way. > > Returning to the original problem of identifying the poisoned bottles, > Kleber's binary method shows the inequality I gave is tight (up to > integer roundoff) > if P=1. But what if P>=2? > > I will now attempt to sketch an argument that my trivial inequality is > not too far off being tight, for any FIXED P. > > To prove this, consider selecting the matrix M at random, each bit chosen > by > an i.i.d. coin toss, where the coin used is biased with a particular > well-chosen > constant bias. Specifically, I choose the bias in such a way that the > expected > number of 1-bits in the R-vector got by ORing P rows of the matrix, comes > out to be R/2. > > I claim this argument will show that > R = C * lg( binomial(W,P) ) > rats suffice, for some positive constant C. > To prove this (i.e. to supply the details of the proof, which I have > not provided since I am lazy), you need to argue that with positive > probability, all those P-row-ORs will > indeed be distinct. This can be done by arguing the expected number of > conflicting P-row-subset pairs is <1. > Then since probability>0, that proves some set of such coin tosses > must exist, which have this property, hence a suitable > experimental-design exists. > QED. > > Hence, with P=2, since 2 is fixed, there is a solution with R=O(logW) rats > and somebody less lazy than I could even work out a good bound on the > constant C > (1<C<5 sounds safe). > > ______________________________**_________________ > math-fun mailing list > math-fun@mailman.xmission.com > http://mailman.xmission.com/**cgi-bin/mailman/listinfo/math-**fun< http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun> > >
______________________________**_________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/**cgi-bin/mailman/listinfo/math-**fun< http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun>
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
-- Forewarned is worth an octopus in the bush. _______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
On Nov 20, 2012, at 10:35 PM, Thomas Colthurst <thomaswc@gmail.com> wrote:
I found an "unique union" code for n=10 with size 21, just by taking 0000000000 along with all of the circular shifts of 0000001011 and 0001100101.
Since 1000 / 21 = 47.6 ..., this gives a rats solution that loses at most 2*48 = 96 bottles.
-Thomas C
Wow! I was looking for exactly this object. How did you succeed? I noticed a pattern among the optimal codes for n=4,6,8 : 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 1 0 0 1 0 1 0 1 0 0 1 0 1 1 0 0 1 1 0 0 1 1 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 0 1 0 1 0 0 0 0 1 1 0 0 1 0 0 0 0 0 1 1 1 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 0 0 0 1 1 0 0 1 0 1 0 0 0 1 0 0 0 0 1 0 0 1 1 0 0 0 0 1 0 1 0 0 1 0 0 0 1 1 If we think of the columns as codewords (of codes of length 4, 6, 8), these are all constant weight, constant distance: (w,d)=(1,2), (3,4), (4,6). I have no idea why the columns should have this symmetrical structure, it is just an observation. My next step was going to be to look for a constant weight, constant distance code of size 10, with codewords of length 21. But you beat me to it! Your code (transpose) has constant weight 7 and constant distance 10. BTW, for the poisoned bottle-pair application one of the codewords of the "unique union" code must be the zero codeword. But all the optimal codes so far have this property. -Veit
On Wed, Nov 21, 2012 at 9:14 AM, Veit Elser <ve10@cornell.edu> wrote:
On Nov 20, 2012, at 10:35 PM, Thomas Colthurst <thomaswc@gmail.com> wrote:
I found an "unique union" code for n=10 with size 21, just by taking 0000000000 along with all of the circular shifts of 0000001011 and 0001100101.
Since 1000 / 21 = 47.6 ..., this gives a rats solution that loses at most 2*48 = 96 bottles.
-Thomas C
Wow! I was looking for exactly this object. How did you succeed?
#!/usr/bin/python def circle_shift(n, s): a = n * (2 ** s) return (a % 1024) + a/1024 for i in xrange(1024): for j in xrange(1024): nums = [circle_shift(i, k) for k in xrange(10)] + [circle_shift(j, k) for k in xrange(10)] ors = [a | b for a in nums for b in nums if a < b] if len(ors) == len(set(ors)): print bin(i), bin(j)
I noticed a pattern among the optimal codes for n=4,6,8 :
0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1
0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 1 0 0 1 0 1 0 1 0 0 1 0 1 1 0 0 1 1 0 0 1 1 0 0 1 0 1 0 0 0 0 1 1
The n=6 code reminds me that I had a question about the product construction of unique union codes: how does it work exactly? For example, how do you put two copies of the n=3 code (000, 001, 010, 100) together to get an n=6 code? -Thomas C
0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 0 1 0 1 0 0 0 0 1 1 0 0 1 0 0 0 0 0 1 1 1 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 0 0 0 1 1 0 0 1 0 1 0 0 0 1 0 0 0 0 1 0 0 1 1 0 0 0 0 1 0 1 0 0 1 0 0 0 1 1
If we think of the columns as codewords (of codes of length 4, 6, 8), these are all constant weight, constant distance: (w,d)=(1,2), (3,4), (4,6). I have no idea why the columns should have this symmetrical structure, it is just an observation. My next step was going to be to look for a constant weight, constant distance code of size 10, with codewords of length 21. But you beat me to it! Your code (transpose) has constant weight 7 and constant distance 10.
BTW, for the poisoned bottle-pair application one of the codewords of the "unique union" code must be the zero codeword. But all the optimal codes so far have this property.
-Veit
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
On Nov 22, 2012, at 9:44 AM, Thomas Colthurst <thomaswc@gmail.com> wrote:
The n=6 code reminds me that I had a question about the product construction of unique union codes: how does it work exactly? For example, how do you put two copies of the n=3 code (000, 001, 010, 100) together to get an n=6 code?
-Thomas C
I was wrong about Grossman's construction being a product code in the original sense of a "unique union code", but it works in Michael's generalization, where there is a parameter k which bounds the equivocation, i.e. number of codewords that can produce a given pairwise union. Say we have code 1 with parameters (n1,k1) and code 2 with parameters (n2,k2), then the product code has parameters (n1+n2,k1*k2). Let M(n,k) be the maximum size code with parameters (n,k); we then have the product-code bounds M(10,4) >= M(2,2)M(8,2) = 3*13 = 39 M(10,4) >= M(3,2)M(7,2) = 4*10 = 40 M(10,4) >= M(4,2)M(6,2) = 5*8 = 40 M(10,4) >= M(5,2)M(5,2) = 6*6 = 36 The best gives a bottle-elimination ratio 4/40 = 1/10, which is beat by your size 21 (10,2)-code (ratio 2/21) and Michael's size 46 (10,4)-code (ratio 4/46). -Veit
On Wed, Nov 21, 2012 at 9:14 AM, Veit Elser <ve10@cornell.edu> wrote:
On Nov 20, 2012, at 10:35 PM, Thomas Colthurst <thomaswc@gmail.com> wrote:
I found an "unique union" code for n=10 with size 21, just by taking 0000000000 along with all of the circular shifts of 0000001011 and 0001100101.
Since 1000 / 21 = 47.6 ..., this gives a rats solution that loses at most 2*48 = 96 bottles.
-Thomas C
Wow! I was looking for exactly this object. How did you succeed?
I noticed a pattern among the optimal codes for n=4,6,8 :
0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1
0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 1 0 0 1 0 1 0 1 0 0 1 0 1 1 0 0 1 1 0 0 1 1 0 0 1 0 1 0 0 0 0 1 1
0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 0 1 0 1 0 0 0 0 1 1 0 0 1 0 0 0 0 0 1 1 1 0 0 1 0 0 1 0 0 1 0 0 1 0 0 1 0 1 0 0 0 0 0 0 1 1 0 0 1 0 1 0 0 0 1 0 0 0 0 1 0 0 1 1 0 0 0 0 1 0 1 0 0 1 0 0 0 1 1
The n=6 and n=8 patterns can be re-arranged (per permuting rows and columns) into n=6: 000 000 001 001 010 010 100 100 001 110 010 101 100 011 111 000 n=8: 0000 0000 0001 0001 0010 0010 0100 0100 1000 1000 0001 1010 0010 1100 0100 1001 1010 0001 1100 0010 1001 0100 0000 0111 0111 0000 I've been playing with these because I really want to find an unique union code for n=15 of size >1000 (i.e., show that you can uniquely identify the two poisoned wine bottles with only 15 bottles). -Thomas C
If we think of the columns as codewords (of codes of length 4, 6, 8), these are all constant weight, constant distance: (w,d)=(1,2), (3,4), (4,6). I have no idea why the columns should have this symmetrical structure, it is just an observation. My next step was going to be to look for a constant weight, constant distance code of size 10, with codewords of length 21. But you beat me to it! Your code (transpose) has constant weight 7 and constant distance 10.
BTW, for the poisoned bottle-pair application one of the codewords of the "unique union" code must be the zero codeword. But all the optimal codes so far have this property.
-Veit
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
On Thu, Nov 29, 2012 at 11:53 AM, Thomas Colthurst <thomaswc@gmail.com>wrote:
I've been playing with these because I really want to find an unique union code for n=15 of size >1000 (i.e., show that you can uniquely identify the two poisoned wine bottles with only 15 bottles).
That's below the information-theoretic minimum. There are 1000-choose-2 = 499500 pairs of wine bottles; only with 19 rats could the 2^19 = 524288 outcomes possibly pick out a unique one. --Michael -- Forewarned is worth an octopus in the bush.
I think Thomas meant that he hoped to waste only 13 non-poisoned bottles. How many 15-sets does it take to cover all the 2-sets out of 1000? On Thu, Nov 29, 2012 at 12:46 PM, Michael Kleber <michael.kleber@gmail.com>wrote:
On Thu, Nov 29, 2012 at 11:53 AM, Thomas Colthurst <thomaswc@gmail.com
wrote:
I've been playing with these because I really want to find an unique
union
code for n=15 of size >1000 (i.e., show that you can uniquely identify the two poisoned wine bottles with only 15 bottles).
That's below the information-theoretic minimum. There are 1000-choose-2 = 499500 pairs of wine bottles; only with 19 rats could the 2^19 = 524288 outcomes possibly pick out a unique one.
--Michael
-- Forewarned is worth an octopus in the bush. _______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
I've just recently become a fan of your cyclic codes, so I wasn't expecting this. What structure are you highlighting? With just a little messing around I found a 36 element n=9, k=4 code by cycling {0,0,0,0,0,0,0,0,1} {0,0,0,0,0,1,0,1,1} {0,0,0,0,1,0,1,0,1} {0,0,0,0,1,1,0,0,1} giving rate (1/9)log_2(36/4) = 0.352214 and missing Michael's record by .00014 -- rats! -Veit On Nov 29, 2012, at 11:53 AM, Thomas Colthurst <thomaswc@gmail.com> wrote:
The n=6 and n=8 patterns can be re-arranged (per permuting rows and columns) into
n=6: 000 000
001 001 010 010 100 100
001 110 010 101 100 011
111 000
n=8: 0000 0000
0001 0001 0010 0010 0100 0100 1000 1000
0001 1010 0010 1100 0100 1001
1010 0001 1100 0010 1001 0100
0000 0111 0111 0000
I've been playing with these because I really want to find an unique union code for n=15 of size >1000 (i.e., show that you can uniquely identify the two poisoned wine bottles with only 15 bottles).
-Thomas C
In case anyone else is still interested in rats and wine bottles: Last weekend I C++ified my search for "capped union codes" with symmetry. Top-line result: I haven't found anything that beats the 46/4 code that I posted here on 11/23 (and gets down to discarding 88 of the 1000 bottles of wine). I did find a unique-union code of size 22, beating Thomas's 21, but this only brings a 96-bottle solution down to 92 bottles. If we restrict our attention, as Thomas proposed, to codes that have full 10-fold cyclic symmetry, then with a cap of k=2/3/4/5/6/7/8/9 the maximal size of a capped union code is 21/31/46/56/66/76/86/96. Hooray for exhaustive search. (As before, a cap of k means that for any word w that is a pairwise union of two (not necessarily distinct) words in the code, there are at most k codewords that can participate in such a pairwise union to form w.) For rats and bottles, the 46/4 codes are far better than anything else. We could loosen our symmetry requirement, though, and allow just 5-fold symmetry instead. (Just to be clear, that means that if e.g. the word 111100000 is in our code, then so are the four other words you get by rotating it 2 bits at a time: 0011110000, 0000111100, 0000001111, 1100000011). Unfortunately, this makes the search space quite a bit larger. With 10-fold symmetry and a cap of 2/3/4, the number of locally maximal codes is 120 / 779 / 12489. If we only demand 5-fold symmetry, the corresponding sizes are 20215, ~1.7 million, ~220 million. These size estimates come from a clever observation I learned from Knuth at some point: the number of leaves in a tree is the average value of the "product of number of children" statistic for a random path down through the tree, where "random" means "start at the root and from each node pick uniformly at random from its children until you reach a leaf." I ran the exhaustive search for 5-fold cap=2 and didn't break the code size record of 21 already attained by 10-fold symmetric codes. For cap=3, though, with 5-fold symmetry you can inch up to a code of size 32, instead of 31! One of many such codes: the six words {0000000101, 0000001111, 0000100011, 0001010011, 0001100110, 0010001001} plus the symmetry-required four rotations of each, along with two singletons {0000000000} and {1010101010}. For cap=4 the search is still running, but it's nearly done and has not exceeded the 46/4 record. It has produced fully 494 codes of that size, though. I've tried higher caps and the search has become much too long to wait for it to finish, but for the bits of the space I have explored, not much to report: I haven't beaten the 10-fold rotation record in any case other than cap=3. Finally, we can go wholer hog and ask about just 2-fold rotational symmetry! For cap=2 the search space is around 36 billion and for cap=4 it seems to be size 10^18, so, um, not going to explore the whole thing. (That would be 80K years if we could explore 1 code per microsecond!) However, I have found some 22/2 codes this way! For example: {1, 32, 6, 192, 24, 768, 75, 354, 108, 387, 149, 676, 178, 581, 269, 424, 308, 649, 337, 554, 531, 624}. I've written codewords as decimal integers in [0,1023] this time, to save space; I'm sure you can all translate to binary yourselves. These 22/2 codes aren't particularly rare; I've seen hundreds of them. There! Aren't you glad you asked? --Michael
I also haven't found any improvements to the 46/4 code. But here are a couple of curiosities. First, I went the opposite direction with regard to symmetry. Since the optimal codes have sizes that grow exponentially with n, my speculation was that eventually they will outgrow the cyclic group. So I looked at low order transitive subgroups of S_n, whose orders must be multiples of n. For n=10 there is a subgroup of order 60 isomorphic to the icosahedral group (the 10 rats correspond to the axes of 3-fold symmetry). But the best I got for that group was 75/7. So n=10 may still be too small. For fun I looked at n=13 and found a remarkable 53/2 code. One of the code words is the zero codeword and the others are orbits under the dihedral group (order 26) of the codewords {0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1} {0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1} So this counts as an example of a slightly larger transitive subgroup. It turns out both codewords above are constant autocorrelation sequences. This code is the current record holder for the "rate": (1/13) log_2(53/2) = 0.363686 Michael's 46/4 code has rate 0.352356. The best upper bound I have for the rate is 1/2. (Note: my units for rate is the bit-per-rat, or BPR, rather than the NPR.) The 52/2 code you get by taking away the zero codeword might be the basis of a card trick where the magician amazingly determines the identities of two cards. I'm stuck on how the or/union operation gets implemented in an interesting way. Finally, for large optimal codes we expect the frequencies of the 2^n possible outcomes of their pairwise unions to be as uniform as possible (so as to maximize the entropy). But that means the frequency of 1's in the first (or any) position should be about 1-1/sqrt(2) = .293 in order for the pairwise-or distribution in that position to have equal probability of 0 and 1. This suggests the heuristic of tuning the weight of codewords near .293n in searches. -Veit On Dec 15, 2012, at 11:17 PM, Michael Kleber <michael.kleber@gmail.com> wrote:
In case anyone else is still interested in rats and wine bottles:
Last weekend I C++ified my search for "capped union codes" with symmetry. Top-line result: I haven't found anything that beats the 46/4 code that I posted here on 11/23 (and gets down to discarding 88 of the 1000 bottles of wine). I did find a unique-union code of size 22, beating Thomas's 21, but this only brings a 96-bottle solution down to 92 bottles.
If we restrict our attention, as Thomas proposed, to codes that have full 10-fold cyclic symmetry, then with a cap of k=2/3/4/5/6/7/8/9 the maximal size of a capped union code is 21/31/46/56/66/76/86/96. Hooray for exhaustive search. (As before, a cap of k means that for any word w that is a pairwise union of two (not necessarily distinct) words in the code, there are at most k codewords that can participate in such a pairwise union to form w.) For rats and bottles, the 46/4 codes are far better than anything else.
We could loosen our symmetry requirement, though, and allow just 5-fold symmetry instead. (Just to be clear, that means that if e.g. the word 111100000 is in our code, then so are the four other words you get by rotating it 2 bits at a time: 0011110000, 0000111100, 0000001111, 1100000011).
Unfortunately, this makes the search space quite a bit larger. With 10-fold symmetry and a cap of 2/3/4, the number of locally maximal codes is 120 / 779 / 12489. If we only demand 5-fold symmetry, the corresponding sizes are 20215, ~1.7 million, ~220 million. These size estimates come from a clever observation I learned from Knuth at some point: the number of leaves in a tree is the average value of the "product of number of children" statistic for a random path down through the tree, where "random" means "start at the root and from each node pick uniformly at random from its children until you reach a leaf."
I ran the exhaustive search for 5-fold cap=2 and didn't break the code size record of 21 already attained by 10-fold symmetric codes. For cap=3, though, with 5-fold symmetry you can inch up to a code of size 32, instead of 31! One of many such codes: the six words {0000000101, 0000001111, 0000100011, 0001010011, 0001100110, 0010001001} plus the symmetry-required four rotations of each, along with two singletons {0000000000} and {1010101010}.
For cap=4 the search is still running, but it's nearly done and has not exceeded the 46/4 record. It has produced fully 494 codes of that size, though. I've tried higher caps and the search has become much too long to wait for it to finish, but for the bits of the space I have explored, not much to report: I haven't beaten the 10-fold rotation record in any case other than cap=3.
Finally, we can go wholer hog and ask about just 2-fold rotational symmetry! For cap=2 the search space is around 36 billion and for cap=4 it seems to be size 10^18, so, um, not going to explore the whole thing. (That would be 80K years if we could explore 1 code per microsecond!) However, I have found some 22/2 codes this way! For example: {1, 32, 6, 192, 24, 768, 75, 354, 108, 387, 149, 676, 178, 581, 269, 424, 308, 649, 337, 554, 531, 624}. I've written codewords as decimal integers in [0,1023] this time, to save space; I'm sure you can all translate to binary yourselves. These 22/2 codes aren't particularly rare; I've seen hundreds of them.
There! Aren't you glad you asked?
--Michael _______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
On Tue, 18 Dec 2012, Veit Elser wrote:
I also haven't found any improvements to the 46/4 code. But here are a couple of curiosities.
...
For fun I looked at n=13 and found a remarkable 53/2 code. One of the code words is the zero codeword and the others are orbits under the dihedral group (order 26) of the codewords
{0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1} {0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1}
...
The 52/2 code you get by taking away the zero codeword might be the basis of a card trick where the magician amazingly determines the identities of two cards. I'm stuck on how the or/union operation gets implemented in an interesting way.
Well, you can put one joker back in to get 53, and assign the joker to the 0 word. You could ask 13 questions of the form, "Does either card...?" Each of the two patterns above and their flips can correspond to a suit, and the cyclic permutations of the answers correspond to the ranks, 1 (=A) to 13 (=K). Of course the questions will be pretty lame, like, "Is either card an ace, a two, a 4C, a 10C, a 5D, a 7D, a 6H, a QH, a 9S, or a JS?" and cyclic shifts of this. Maybe there's some more elegant way to do this. In fact, you could put both jokers in, if they both called just "joker", and identify the pair out of a full deck.
Finally, for large optimal codes we expect the frequencies of the 2^n possible outcomes of their pairwise unions to be as uniform as possible (so as to maximize the entropy). But that means the frequency of 1's in the first (or any) position should be about 1-1/sqrt(2) = .293 in order for the pairwise-or distribution in that position to have equal probability of 0 and 1. This suggests the heuristic of tuning the weight of codewords near .293n in searches.
Note that for n = 13, this gives 3.81, which is very close to the weight 4 that the codewords above have and even closer to the average of 3.92 you get when you include the 0 codeword. David P. Moulton
Merry Christmas! My present to you: four more bottles of wine. Last night my program turned up a 10-bit cap=4 union code with 48 codewords. The 1000 bottles of wine can therefore be assorted into 48 bins with at most 21 bottles in each bin, and we will need to throw away at most 84 bottles. The search through codes with 5-fold rotation symmetry did indeed finish as expected with lots of 46/4 codes but nothing better, and the search space for codes with 2-fold rotation symmetry is too large. In between (in size, not in the subgroup lattice) there's the space of codes with Z2 x Z2 symmetry: fixed under rotation by five positions and under reversing the whole string. That is, if the code contains word [abcde fghij], then it also contains [fghij abcde], [jihgf edcba], and [edcba jihgf]. (As usual, some of these might coincide, so the class might have only 1 or 2 distinct words instead of 4.) That search has turned up both a 47/4 code and a 48/4 code. The 48/4 consists of ten 4-word symmetry classes and four 2-word classes: 4-word: 0000000101, 0000000110, 0000001001, 0000100010, 0000101000, 0000111001, 0001011100, 0001100110, 0010110010, 0100101010 2-word: 0010001110, 0010010001, 0010010101, 0000100001 The fact that a code having this 4-fold symmetry, which singles out one pair of bits (c and h above), beats the best codes with the much more uniform 5-fold rotational symmetry, gives me the impression that symmetry is useful more because it cuts down the search space than because large codes are likely to be highly symmetric. --Michael -- Forewarned is worth an octopus in the bush.
Great results! ]The fact that a code having this 4-fold symmetry, which singles out one
pair of bits (c and h above), beats the best codes with the much more uniform 5-fold rotational symmetry, gives me the impression that symmetry is useful more because it cuts down the search space than because large codes are likely to be highly symmetric.
--Michael
I sort of half-agree. With the Rubik's cube, positions with symmetry tend to have more extremal properties because there are fewer "distinct" paths to start (for instance); a similar thing may be happening here. So not only does symmetry cut down the search, but symmetrical positions also tend to have more interesting properties. -tom -- -- http://cube20.org/ -- http://golly.sf.net/ --
Happy 2012.25 ! While I can't offer you any more bottles of wine, here is a 49/4 = 12.25 code that beats Michael's 48/4: 0 1 0 0 0 0 1 1 0 1 0 0 0 1 1 0 0 1 1 0 0 1 1 0 1 0 0 0 0 1 1 1 0 0 0 0 1 0 1 0 0 1 1 0 0 1 1 0 0 0 0 1 0 1 1 0 1 0 0 0 1 0 0 1 0 0 1 1 0 0 0 0 0 0 1 0 1 0 1 1 1 0 1 0 1 0 0 0 1 0 1 0 0 0 1 1 1 0 0 0 0 0 0 1 0 1 1 0 0 1 1 1 1 1 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 1 1 0 0 1 0 1 0 0 1 0 1 1 0 0 0 1 0 0 0 1 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 1 1 1 1 0 1 0 1 0 0 0 1 0 0 1 0 1 0 1 0 1 0 1 0 0 0 1 0 0 1 1 0 0 1 0 0 1 1 0 0 0 0 1 1 0 1 0 1 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 1 0 0 0 1 0 1 0 0 0 1 0 0 1 0 0 0 1 0 0 0 0 0 0 1 1 1 0 0 1 0 1 0 1 1 0 0 0 0 1 0 0 1 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 1 0 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 No symmetry was used in its construction, only a very simple heuristic. I'm calling the algorithm wax & wane. Details next year! -Veit On Dec 25, 2012, at 2:57 PM, Michael Kleber <michael.kleber@gmail.com> wrote:
Merry Christmas! My present to you: four more bottles of wine. Last night my program turned up a 10-bit cap=4 union code with 48 codewords. The 1000 bottles of wine can therefore be assorted into 48 bins with at most 21 bottles in each bin, and we will need to throw away at most 84 bottles.
The search through codes with 5-fold rotation symmetry did indeed finish as expected with lots of 46/4 codes but nothing better, and the search space for codes with 2-fold rotation symmetry is too large. In between (in size, not in the subgroup lattice) there's the space of codes with Z2 x Z2 symmetry: fixed under rotation by five positions and under reversing the whole string. That is, if the code contains word [abcde fghij], then it also contains [fghij abcde], [jihgf edcba], and [edcba jihgf]. (As usual, some of these might coincide, so the class might have only 1 or 2 distinct words instead of 4.)
That search has turned up both a 47/4 code and a 48/4 code. The 48/4 consists of ten 4-word symmetry classes and four 2-word classes: 4-word: 0000000101, 0000000110, 0000001001, 0000100010, 0000101000, 0000111001, 0001011100, 0001100110, 0010110010, 0100101010 2-word: 0010001110, 0010010001, 0010010101, 0000100001
The fact that a code having this 4-fold symmetry, which singles out one pair of bits (c and h above), beats the best codes with the much more uniform 5-fold rotational symmetry, gives me the impression that symmetry is useful more because it cuts down the search space than because large codes are likely to be highly symmetric.
--Michael
-- Forewarned is worth an octopus in the bush. _______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
Wonderful! I look forward to an explanation. Glad to see some approach that doesn't rely on my artificial symmetry constraint. This does actually get at least one extra bottle of wine, thanks to rounding. If you assign 20 bottles to each of the codewords 1, 6, 10, 17, 20, 30, 34, 37, 48, 66, 102, 131, 132, 136, 146, 160, 172, 192, 260, 264, 288, 390, 516, 544, 576, 674, 768, 778, 960 (writing decimal for their binary representations, for convenience) and 21 bottles to codewords 43, 72, 89, 197, 240, 269, 306, 323, 340, 360, 408, 417, 519, 568, 588, 594, 609, 649, 660, 785, then the worst cases involve throwing away only 83 bottles. (Not sure if 82 is obtainable; this 83 was just the first thing I tried.) --Michael On Mon, Dec 31, 2012 at 6:57 PM, Veit Elser <ve10@cornell.edu> wrote:
Happy 2012.25 !
While I can't offer you any more bottles of wine, here is a 49/4 = 12.25 code that beats Michael's 48/4:
0 1 0 0 0 0 1 1 0 1 0 0 0 1 1 0 0 1 1 0 0 1 1 0 1 0 0 0 0 1 1 1 0 0 0 0 1 0 1 0 0 1 1 0 0 1 1 0 0 0 0 1 0 1 1 0 1 0 0 0 1 0 0 1 0 0 1 1 0 0 0 0 0 0 1 0 1 0 1 1 1 0 1 0 1 0 0 0 1 0 1 0 0 0 1 1 1 0 0 0 0 0 0 1 0 1 1 0 0 1 1 1 1 1 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 1 1 0 0 1 0 1 0 0 1 0 1 1 0 0 0 1 0 0 0 1 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 1 1 1 1 0 1 0 1 0 0 0 1 0 0 1 0 1 0 1 0 1 0 1 0 0 0 1 0 0 1 1 0 0 1 0 0 1 1 0 0 0 0 1 1 0 1 0 1 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 1 0 0 0 1 0 1 0 0 0 1 0 0 1 0 0 0 1 0 0 0 0 0 0 1 1 1 0 0 1 0 1 0 1 1 0 0 0 0 1 0 0 1 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 1 0 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0
No symmetry was used in its construction, only a very simple heuristic. I'm calling the algorithm wax & wane. Details next year!
-Veit
On Dec 25, 2012, at 2:57 PM, Michael Kleber <michael.kleber@gmail.com> wrote:
Merry Christmas! My present to you: four more bottles of wine. Last night my program turned up a 10-bit cap=4 union code with 48 codewords. The 1000 bottles of wine can therefore be assorted into 48 bins with at most 21 bottles in each bin, and we will need to throw away at most 84 bottles.
The search through codes with 5-fold rotation symmetry did indeed finish as expected with lots of 46/4 codes but nothing better, and the search space for codes with 2-fold rotation symmetry is too large. In between (in size, not in the subgroup lattice) there's the space of codes with Z2 x Z2 symmetry: fixed under rotation by five positions and under reversing the whole string. That is, if the code contains word [abcde fghij], then it also contains [fghij abcde], [jihgf edcba], and [edcba jihgf]. (As usual, some of these might coincide, so the class might have only 1 or 2 distinct words instead of 4.)
That search has turned up both a 47/4 code and a 48/4 code. The 48/4 consists of ten 4-word symmetry classes and four 2-word classes: 4-word: 0000000101, 0000000110, 0000001001, 0000100010, 0000101000, 0000111001, 0001011100, 0001100110, 0010110010, 0100101010 2-word: 0010001110, 0010010001, 0010010101, 0000100001
The fact that a code having this 4-fold symmetry, which singles out one pair of bits (c and h above), beats the best codes with the much more uniform 5-fold rotational symmetry, gives me the impression that symmetry is useful more because it cuts down the search space than because large codes are likely to be highly symmetric.
--Michael
-- Forewarned is worth an octopus in the bush. _______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
-- Forewarned is worth an octopus in the bush.
Oh hey, my Z2xZ2 symmetry search just turned up a 49/4 also: 5,160,640,20,6,192,384,12,10,320,33,528,34,65,272,520,40,257,80,514,51,609,816,537,60,897,240,519,66,264,77,418,712,278,90,834,360,267,102,195,408,780,132,145,548,169,293,596,658 (Note the presence of 132 = 0010000100, a symmetry orbit of size 1, allowing the code to be of odd size.) Still eager to see Veit's approach, though; I remain afraid that this symmetry thing is ultimately a distraction. --Michael On Tue, Jan 1, 2013 at 6:24 PM, Michael Kleber <michael.kleber@gmail.com>wrote:
Wonderful! I look forward to an explanation. Glad to see some approach that doesn't rely on my artificial symmetry constraint.
This does actually get at least one extra bottle of wine, thanks to rounding. If you assign 20 bottles to each of the codewords 1, 6, 10, 17, 20, 30, 34, 37, 48, 66, 102, 131, 132, 136, 146, 160, 172, 192, 260, 264, 288, 390, 516, 544, 576, 674, 768, 778, 960 (writing decimal for their binary representations, for convenience) and 21 bottles to codewords 43, 72, 89, 197, 240, 269, 306, 323, 340, 360, 408, 417, 519, 568, 588, 594, 609, 649, 660, 785, then the worst cases involve throwing away only 83 bottles. (Not sure if 82 is obtainable; this 83 was just the first thing I tried.)
--Michael
On Mon, Dec 31, 2012 at 6:57 PM, Veit Elser <ve10@cornell.edu> wrote:
Happy 2012.25 !
While I can't offer you any more bottles of wine, here is a 49/4 = 12.25 code that beats Michael's 48/4:
0 1 0 0 0 0 1 1 0 1 0 0 0 1 1 0 0 1 1 0 0 1 1 0 1 0 0 0 0 1 1 1 0 0 0 0 1 0 1 0 0 1 1 0 0 1 1 0 0 0 0 1 0 1 1 0 1 0 0 0 1 0 0 1 0 0 1 1 0 0 0 0 0 0 1 0 1 0 1 1 1 0 1 0 1 0 0 0 1 0 1 0 0 0 1 1 1 0 0 0 0 0 0 1 0 1 1 0 0 1 1 1 1 1 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 1 1 0 0 1 0 1 0 0 1 0 1 1 0 0 0 1 0 0 0 1 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 1 1 1 1 0 1 0 1 0 0 0 1 0 0 1 0 1 0 1 0 1 0 1 0 0 0 1 0 0 1 1 0 0 1 0 0 1 1 0 0 0 0 1 1 0 1 0 1 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 1 0 0 0 1 0 1 0 0 0 1 0 0 1 0 0 0 1 0 0 0 0 0 0 1 1 1 0 0 1 0 1 0 1 1 0 0 0 0 1 0 0 1 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 1 0 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0
No symmetry was used in its construction, only a very simple heuristic. I'm calling the algorithm wax & wane. Details next year!
-Veit
On Dec 25, 2012, at 2:57 PM, Michael Kleber <michael.kleber@gmail.com> wrote:
Merry Christmas! My present to you: four more bottles of wine. Last night my program turned up a 10-bit cap=4 union code with 48 codewords. The 1000 bottles of wine can therefore be assorted into 48 bins with at most 21 bottles in each bin, and we will need to throw away at most 84 bottles.
The search through codes with 5-fold rotation symmetry did indeed finish as expected with lots of 46/4 codes but nothing better, and the search space for codes with 2-fold rotation symmetry is too large. In between (in size, not in the subgroup lattice) there's the space of codes with Z2 x Z2 symmetry: fixed under rotation by five positions and under reversing the whole string. That is, if the code contains word [abcde fghij], then it also contains [fghij abcde], [jihgf edcba], and [edcba jihgf]. (As usual, some of these might coincide, so the class might have only 1 or 2 distinct words instead of 4.)
That search has turned up both a 47/4 code and a 48/4 code. The 48/4 consists of ten 4-word symmetry classes and four 2-word classes: 4-word: 0000000101, 0000000110, 0000001001, 0000100010, 0000101000, 0000111001, 0001011100, 0001100110, 0010110010, 0100101010 2-word: 0010001110, 0010010001, 0010010101, 0000100001
The fact that a code having this 4-fold symmetry, which singles out one pair of bits (c and h above), beats the best codes with the much more uniform 5-fold rotational symmetry, gives me the impression that symmetry is useful more because it cuts down the search space than because large codes are likely to be highly symmetric.
--Michael
-- Forewarned is worth an octopus in the bush. _______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
-- Forewarned is worth an octopus in the bush.
-- Forewarned is worth an octopus in the bush.
Thanks for pointing out the rounding issue; it's gratifying to know I helped in a small way to salvage another bottle of wine! I spent a couple of hours turning my Mma program into C code, in case anyone wants to play with it. It generates 49/4 codes at a rate of about 1 per minute. Here's a brief description of the algorithm: There are two operations, wax and wane, in each cycle. As the names suggest, the code grows in size during wax and shrinks during wane. Wax is the simpler of the two: unused codewords are added as long as the "cap" on the code (4 in the case of 49/4) is not exceeded. This introduces randomness, because the algorithm tries the unused codewords in a random (not lexicographic) order. During wane, codewords are deleted that have an overabundant representation in the unions that are at their capped values. The rationale is that growth of the code is more likely when fewer unions are capped, and if the rule is to always delete the same number d of codewords, one should preferentially delete those that "un-cap" the greatest number of capped unions. So the algorithm assigns a cost to each codeword equal to the number of capped unions it contributes to and then deletes the d codewords having the highest cost. Starting from the empty code the algorithm performs many wax and wane cycles and outputs the code whenever an increased size is found. During wax there is no limit to the growth of the code, while the size always shrinks by d during wane. I get good results for the 49/4 code when d=5. -Veit On Jan 2, 2013, at 7:46 PM, Michael Kleber <michael.kleber@gmail.com> wrote:
Oh hey, my Z2xZ2 symmetry search just turned up a 49/4 also:
5,160,640,20,6,192,384,12,10,320,33,528,34,65,272,520,40,257,80,514,51,609,816,537,60,897,240,519,66,264,77,418,712,278,90,834,360,267,102,195,408,780,132,145,548,169,293,596,658
(Note the presence of 132 = 0010000100, a symmetry orbit of size 1, allowing the code to be of odd size.)
Still eager to see Veit's approach, though; I remain afraid that this symmetry thing is ultimately a distraction.
--Michael
On Tue, Nov 20, 2012 at 10:35 PM, Thomas Colthurst <thomaswc@gmail.com>wrote:
I found an "unique union" code for n=10 with size 21, just by taking 0000000000 along with all of the circular shifts of 0000001011 and 0001100101.
Since 1000 / 21 = 47.6 ..., this gives a rats solution that loses at most 2*48 = 96 bottles.
-Thomas C
Nice approach! I have a variant that gets down to throwing away 88 bottles. It looks Thomas's size-21 is the best you can do with circularly-symmetric unique union codes. His is one of 16 examples. We can generalize the unique-union idea, though, to codes in which a given pairwise union of codewords can arise from pairwise unions involving at most k different words. The unique-union case is where k=2, meaning that if we partition wine bottles into bins labeled by codewords, we can feed them to rats and throw away at most k=2 bins. If we relax this to k=4, there are circularly-symmetric codes containing 46 as many as words. I found ten examples, of which one is: 0000000000 0000000101 and its circular shifts (ten total) 0000011011 likewise 0001000111 likewise 0001001011 likewise 0000100001 and its circular shifts (five total, as this word has a rotational symmetry itself) Note that the k=4 condition is *not* the same as "there are at most two ways to form each union". This code, for example, has one type of codeword, A=0001000111, which is a union of two others, B=0000000101 and C=0001000010. So if the dead rats are codeword A, then there are a bunch of possibilities for where the poisoned bottles are: A+A, A+B, A+C, A+0, or B+C. But in any case, you throw away the wine in bins A, B, C, and 0. Since there are 46 codewords, you divide wine up into 46 bins with at most ceiling(1000/46) = 22 wine bottles in each, and throw away at most 4 bins, so 88 bottles. (I didn't see a way to arrange the 34 21-bottle bins vs the 12 22-bottle bins to pull this down to 87, but perhaps I missed it.) --Michael
Michael, I imagine there is something wrong with this argument, but what is it? Take for your sets (codewords) all of the binomial(10,2) = 45 2 elements sets {i,j} with i < j. Note that the union of any two sets has 4 or 3 elements so can be written as a union of two 2-sets in at most 3 different ways. This gives at most 3*ceil(1000/45) = 69 doubtful bottles. --Edwin On Fri, Nov 23, 2012 at 10:07 AM, Michael Kleber <michael.kleber@gmail.com>wrote:
On Tue, Nov 20, 2012 at 10:35 PM, Thomas Colthurst <thomaswc@gmail.com
wrote:
I found an "unique union" code for n=10 with size 21, just by taking 0000000000 along with all of the circular shifts of 0000001011 and 0001100101.
Since 1000 / 21 = 47.6 ..., this gives a rats solution that loses at most 2*48 = 96 bottles.
-Thomas C
Nice approach! I have a variant that gets down to throwing away 88 bottles.
It looks Thomas's size-21 is the best you can do with circularly-symmetric unique union codes. His is one of 16 examples.
We can generalize the unique-union idea, though, to codes in which a given pairwise union of codewords can arise from pairwise unions involving at most k different words. The unique-union case is where k=2, meaning that if we partition wine bottles into bins labeled by codewords, we can feed them to rats and throw away at most k=2 bins.
If we relax this to k=4, there are circularly-symmetric codes containing 46 as many as words. I found ten examples, of which one is:
0000000000 0000000101 and its circular shifts (ten total) 0000011011 likewise 0001000111 likewise 0001001011 likewise 0000100001 and its circular shifts (five total, as this word has a rotational symmetry itself)
Note that the k=4 condition is *not* the same as "there are at most two ways to form each union". This code, for example, has one type of codeword, A=0001000111, which is a union of two others, B=0000000101 and C=0001000010. So if the dead rats are codeword A, then there are a bunch of possibilities for where the poisoned bottles are: A+A, A+B, A+C, A+0, or B+C. But in any case, you throw away the wine in bins A, B, C, and 0.
Since there are 46 codewords, you divide wine up into 46 bins with at most ceiling(1000/46) = 22 wine bottles in each, and throw away at most 4 bins, so 88 bottles. (I didn't see a way to arrange the 34 21-bottle bins vs the 12 22-bottle bins to pull this down to 87, but perhaps I missed it.)
--Michael _______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
When your union has four 1s, it's certainly true that there are only three different ways to write it as union of two codewords. But as those three decompositions have no overlap, you'll need to throw away six bins. --Michael On Nov 24, 2012 2:33 PM, "W. Edwin Clark" <wclark@mail.usf.edu> wrote:
Michael,
I imagine there is something wrong with this argument, but what is it?
Take for your sets (codewords) all of the binomial(10,2) = 45 2 elements sets {i,j} with i < j. Note that the union of any two sets has 4 or 3 elements so can be written as a union of two 2-sets in at most 3 different ways. This gives at most 3*ceil(1000/45) = 69 doubtful bottles.
--Edwin
On Fri, Nov 23, 2012 at 10:07 AM, Michael Kleber <michael.kleber@gmail.com>wrote:
On Tue, Nov 20, 2012 at 10:35 PM, Thomas Colthurst <thomaswc@gmail.com
wrote:
I found an "unique union" code for n=10 with size 21, just by taking 0000000000 along with all of the circular shifts of 0000001011 and 0001100101.
Since 1000 / 21 = 47.6 ..., this gives a rats solution that loses at most 2*48 = 96 bottles.
-Thomas C
Nice approach! I have a variant that gets down to throwing away 88 bottles.
It looks Thomas's size-21 is the best you can do with circularly-symmetric unique union codes. His is one of 16 examples.
We can generalize the unique-union idea, though, to codes in which a given pairwise union of codewords can arise from pairwise unions involving at most k different words. The unique-union case is where k=2, meaning that if we partition wine bottles into bins labeled by codewords, we can feed them to rats and throw away at most k=2 bins.
If we relax this to k=4, there are circularly-symmetric codes containing 46 as many as words. I found ten examples, of which one is:
0000000000 0000000101 and its circular shifts (ten total) 0000011011 likewise 0001000111 likewise 0001001011 likewise 0000100001 and its circular shifts (five total, as this word has a rotational symmetry itself)
Note that the k=4 condition is *not* the same as "there are at most two ways to form each union". This code, for example, has one type of codeword, A=0001000111, which is a union of two others, B=0000000101 and C=0001000010. So if the dead rats are codeword A, then there are a bunch of possibilities for where the poisoned bottles are: A+A, A+B, A+C, A+0, or B+C. But in any case, you throw away the wine in bins A, B, C, and 0.
Since there are 46 codewords, you divide wine up into 46 bins with at most ceiling(1000/46) = 22 wine bottles in each, and throw away at most 4 bins, so 88 bottles. (I didn't see a way to arrange the 34 21-bottle bins vs the 12 22-bottle bins to pull this down to 87, but perhaps I missed it.)
--Michael _______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
I like Michael's generalization, because it also defines the original k=2 case in a cleaner way. So in case you missed it: A union code with parameters (n,k) is a set of codewords comprising subsets of n symbols with the property that any pairwise union of (not necessarily distinct) codewords can arise from pairwise unions of at most k codewords. Let M(n,k) be the maximum size code with parameters (n,k). For the poisoned bottle-pair application, in the limit of many bottles, we are interested in maximizing M(n,k)/k. Michael found a k=4 code for n=10 of size 46, with 46/4=11.5 beating Thomas' k=2 code of size 21 (21/2=10.5). I was curious when k>2 starts outperforming k=2 and wrote an exhaustive search program that gave the following results: n=2: k 2 3 4 M(2,k) 3 3 4 M(2,k)/k 1.5 1 1 n=3: k 2 3 4 5 6 M(3,k) 4 4 6 6 7 M(3,k)/k 2 1.33 1.5 1.2 1.17 n=4: k 2 3 4 5 6 M(4,k) 5 6 9 9 11 M(4,k)/k 2.5 2 2.25 1.8 1.83 n=5: k 2 3 4 5 6 M(5,k) 6 8 12 13 16 M(5,k)/k 3 2.67 3 2.6 2.67 n=6: k 2 3 4 5 6 M(6,k) 8 12 16 18* 22* M(6,k)/k 4 4 4 3.6 3.67 n=7: k 2 3 4 5 6 M(7,k) 10 14* 20* 24* 29* M(7,k)/k 5 4.67* 5* 4.8* 4.83* The numbers marked * are lower bounds. As you can see, the results up to n=7 are inconclusive: only ties for k>2 have been found. The best hope looks like n=7, k=4. I'll send my C-program to anyone wishing to push this further. One can make a loose analogy between log(M(n,k)/k) = log M(n,k) - log k and Shannon's mutual information for a noisy channel. In a sense we are designing a code that maximizes the information about the poisoned bottle-pair we get from the "union-signal" of rat mortality. In this sense, log k is the "equivocation" of the code. -Veit On Nov 24, 2012, at 2:40 PM, Michael Kleber <michael.kleber@gmail.com> wrote:
When your union has four 1s, it's certainly true that there are only three different ways to write it as union of two codewords. But as those three decompositions have no overlap, you'll need to throw away six bins.
--Michael On Nov 24, 2012 2:33 PM, "W. Edwin Clark" <wclark@mail.usf.edu> wrote:
Michael,
I imagine there is something wrong with this argument, but what is it?
Take for your sets (codewords) all of the binomial(10,2) = 45 2 elements sets {i,j} with i < j. Note that the union of any two sets has 4 or 3 elements so can be written as a union of two 2-sets in at most 3 different ways. This gives at most 3*ceil(1000/45) = 69 doubtful bottles.
--Edwin
On Fri, Nov 23, 2012 at 10:07 AM, Michael Kleber <michael.kleber@gmail.com>wrote:
On Tue, Nov 20, 2012 at 10:35 PM, Thomas Colthurst <thomaswc@gmail.com
wrote:
I found an "unique union" code for n=10 with size 21, just by taking 0000000000 along with all of the circular shifts of 0000001011 and 0001100101.
Since 1000 / 21 = 47.6 ..., this gives a rats solution that loses at most 2*48 = 96 bottles.
-Thomas C
Nice approach! I have a variant that gets down to throwing away 88 bottles.
It looks Thomas's size-21 is the best you can do with circularly-symmetric unique union codes. His is one of 16 examples.
We can generalize the unique-union idea, though, to codes in which a given pairwise union of codewords can arise from pairwise unions involving at most k different words. The unique-union case is where k=2, meaning that if we partition wine bottles into bins labeled by codewords, we can feed them to rats and throw away at most k=2 bins.
If we relax this to k=4, there are circularly-symmetric codes containing 46 as many as words. I found ten examples, of which one is:
0000000000 0000000101 and its circular shifts (ten total) 0000011011 likewise 0001000111 likewise 0001001011 likewise 0000100001 and its circular shifts (five total, as this word has a rotational symmetry itself)
Note that the k=4 condition is *not* the same as "there are at most two ways to form each union". This code, for example, has one type of codeword, A=0001000111, which is a union of two others, B=0000000101 and C=0001000010. So if the dead rats are codeword A, then there are a bunch of possibilities for where the poisoned bottles are: A+A, A+B, A+C, A+0, or B+C. But in any case, you throw away the wine in bins A, B, C, and 0.
Since there are 46 codewords, you divide wine up into 46 bins with at most ceiling(1000/46) = 22 wine bottles in each, and throw away at most 4 bins, so 88 bottles. (I didn't see a way to arrange the 34 21-bottle bins vs the 12 22-bottle bins to pull this down to 87, but perhaps I missed it.)
--Michael _______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
I frittered away some time (both mine and a computer's) on a useless attempt to solve the rats problem via linear programming: a solution is described by a choice of 2^10 integer values b_0, b_1,..., b_1023, where b_k is the number of bottles drunk by exactly rats all-the-1s-in-the-binary-expansion-of-k. The number of bottles you throw away isn't actually the max of a bunch of linear functions of the b's, though -- not until you decide which b's are 0 versus nonzero. At that point you can invoke linear programming, but picking which b's are nonzero is the hard part. I did just look back at this message of Veit's from a few days ago, though, and my "cyclic capped union code" code can answer one small open question: On Sun, Nov 25, 2012 at 2:26 PM, Veit Elser <ve10@cornell.edu> wrote:
A union code with parameters (n,k) is a set of codewords comprising subsets of n symbols with the property that any pairwise union of (not necessarily distinct) codewords can arise from pairwise unions of at most k codewords.
Let M(n,k) be the maximum size code with parameters (n,k). For the poisoned bottle-pair application, in the limit of many bottles, we are interested in maximizing M(n,k)/k.
Michael found a k=4 code for n=10 of size 46, with 46/4=11.5 beating Thomas' k=2 code of size 21 (21/2=10.5). I was curious when k>2 starts outperforming k=2 and wrote an exhaustive search program that gave the following results: [...] n=7: k 2 3 4 5 6 M(7,k) 10 14* 20* 24* 29* M(7,k)/k 5 4.67* 5* 4.8* 4.83*
The numbers marked * are lower bounds. As you can see, the results up to n=7 are inconclusive: only ties for k>2 have been found. The best hope looks like n=7, k=4.
Hey, my trivial Mathematica program can do that: for n=7, there is a 21-element cyclic capped union code with k=4, consisting of the circular shifts of {0000001, 0001011, 0001101}. No reason to think that forcing cyclic symmetry is optimal, but in this case it's good enough. So now we have at least one case where the more-general capped union code out-performs the unique union code. --Michael -- Forewarned is worth an octopus in the bush.
On Nov 28, 2012, at 8:55 PM, Michael Kleber <michael.kleber@gmail.com> wrote:
k 2 3 4 5 6 M(7,k) 10 14* 20* 24* 29* M(7,k)/k 5 4.67* 5* 4.8* 4.83*
The numbers marked * are lower bounds. As you can see, the results up to n=7 are inconclusive: only ties for k>2 have been found. The best hope looks like n=7, k=4.
Hey, my trivial Mathematica program can do that: for n=7, there is a 21-element cyclic capped union code with k=4, consisting of the circular shifts of {0000001, 0001011, 0001101}. No reason to think that forcing cyclic symmetry is optimal, but in this case it's good enough.
So now we have at least one case where the more-general capped union code out-performs the unique union code.
--Michael
Thanks for settling that question. I raised the upper bound for n=7, k=3 from 14 to 15, giving yet another tie. Your 21-element code achieves another milestone of sorts. The "rate" of these rat-codes is log_2( N(n,k)/k ) / n bits, where N(n,k) is the number of codewords. This reaches 1/3 already for the n=3 code, and in fact Grossman's product construction has this as the asymptotic rate. Your n=7 code gives the first improvement: 0.34176. The best rate so far is given by your 46 element n=10, k=4 code: 0.3523. Have you noticed that the transposes of your cyclic codes are always equal weight and equidistant? Your latest has 7 words of length 21, all of weight 7 and equal distance 10 to all the other words. This led me to try out "random" codes with this property and check them for the union property. While this works for n=8, k=2 (and other small cases), it fails for n=10, k=2 (a small fraction of the unions are generated from more than k words). Somehow your cyclic codes are capturing something else that's important. -Veit
Yes, this is a cute bound. And you can improve it by 1 by including the empty set, so a(n)+1 <= M(n). -Veit On Nov 19, 2012, at 10:46 PM, W. Edwin Clark <wclark@mail.usf.edu> wrote:
As a matter of curiosity a rather poor (but cute) lower bound can be obtained for the unique union code size by restricting oneself to two element subsets of the given n-set. One gets a classical graph theory problem, namely:
If a(n) is the maximum number of edges in an n-vertex graph of girth 5. Then a(n) <= M(n).
If G is a (simple) graph on n vertices with girth 5 then G has no 3-cycles or 4-cycles. This implies that the set of edges is a unique union code on n symbols. [Here an edge is a 2 element set of vertices.] Thus a(n) <= M(n). In particular, the value a(10) = 15 is given by the Petersen graph<http://en.wikipedia.org/wiki/Petersen_graph>.
Values of a(n) for n to 30 are given in http://oeis.org/A006856. The extremal graphs can be found in the paper Extremal graphs without three-cycles or four-cycles<http://www.math.udel.edu/%7Elazebnik/papers/ex34a_v2.pdf>by Garnick, et al.
--Edwin
participants (12)
-
Allan Wechsler -
Dan Asimov -
David P. Moulton -
Hans Havermann -
J.P. Grossman -
Michael Kleber -
Mike Speciner -
Thomas Colthurst -
Tom Rokicki -
Veit Elser -
W. Edwin Clark -
Warren Smith