[math-fun] Testing N people for viruses with log(N) tests
For some reason there's still an extreme shortage of covid-19 tests. If only everyone could be tested, preferably at least once a week, most of social isolation could end. Negative people could associate with negatives. Positive people could associate with positives. People who were formerly positive but are now negative could associate with everyone, especially if they test positive for antibodies. Fortunately, it's not necessary to have N tests to test N people unless the expected infection rate is close to 50%. If the expected infection rate is, say, one in a hundred, samples can be taken from 128 people, and each sample thoroughly stirred then divided into 7 equal parts. The first part from all 128 would be mixed together, thoroughly stirred, then tested with one test. If the test is negative, then all 128 people are negative, and we're done. If that test is positive, the second part from 64 of the 128 would be mixed together and tested, and the second part from the other 64 of the 128 would be mixed together and tested, requiring two more tests. Probably one of the tests would be negative, in which case those 64 would be negative. The process would then continue with the third, fourth, etc., parts, subdividing it further and further to find the probably just one or two people in the set who test positive. The identical approach could be used with the test for antibodies. The sets of 128 should be chosen so that no two family members or coworkers are in the same set, to avoid sets with a high likelihood of correlations. If some sets have half the people in them infected, that will result in a lot more tests being needed, more than would be compensated for by a similar number other sets having nobody in them infected. I'll leave it to others to calculate what the expected number of tests needed would be, but it's obviously O(log2(N)) where the expected infection rate is 1/N. We're accustomed to thinking in terms of terabytes and petabytes. But it's worth remembering that fractional bits of information are meaningful too. For instance if just one person in the world is expected to test positive for something (e.g. for being the one who left DNA at a specific crime scene), learning which person that is is only 33 bits of information. That means that testing any one person provides just 4 nanobits of information, hence that relatively few tests which provide one bit of information need to be used. Of course the above assumes that the tests are reliable. Testing in this way amplifies any unreliability in either direction. However, this can be compensated for by multiple tests in ways that are much more efficient than simply testing everyone multiple times. Just as unreliable data transmissions can be made more reliable in ways more efficient than simply repeating the transmission multiple times. Parity bits and CRC checks can reduce the error rate of a data transmission to as low as is desired (albeit never quite to zero). I wonder if the reason why doctors aren't doing testing in the way I suggest is because of specialization. Doctors need to talk to experts on data transmission and encoding, such as the inventors of the JT65 mode which made it possible for hams with battery-operated radios and hand-held antennas to communicate by bouncing signals of the moon. With larger antennas, this has been done with a transmitter of just 3 milliwatts.
The proposed Fast Fourier Blood Test sounds promising, especially in a test-limited scenario, but the other major assumption is that blood can be mixed without inducing false negatives. I don't know enough about blood chemistry to say for sure what would happen. What if person A has antibodies and person B has the virus? Can the antibodies kill the virus in the mixed blood solution? In any case, studies would need to be done to support the idea that mixing blood in a chemical solution is essentially the same thing as mixing inert glass beads in a Bernoulli urn.

—Brad
Additionally, what you choose to test in each "round" is contingent on the results of the previous rounds. Doesn't the current test take one to two days to run? Could you keep your samples stable over the one to two weeks that it would take to complete the entire batch?
The assumption here is that either one test can test an arbitrarily large quantity of blood, or that a test is equally accurate if the concentration of the virus in the blood is reduced by a factor of 100 or more. These assumptions, while they make for entertaining math problems, seem unlikely to be realistic.

Andy
This is the "non-adaptive combinatorial group testing" problem, which this mailing list discussed a bunch as the "rats and poisoned wine bottles problem" in 2012. Veit Elser and I eventually gave some talks about joint work, though nothing relevant to today's real-world situation.

Historically, the first mention of the problem was another case of finding sick people without enough tests: testing all WWII draftees for syphilis.

There are some good books on what's known, by Du & Hwang: "Combinatorial Group Testing and Its Applications" (1999) and "Pooling Designs and Nonadaptive Group Testing" (2006). My copies are locked away in my office, which I can't get to.

--Michael
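For reference, a sketch of the simplest non-adaptive design, covering only the special case of at most one positive sample and a perfectly reliable test: give each of n samples a binary index, run one pool per bit position plus one master pool, all in parallel, and the pattern of positive pools spells out the index of the positive sample. With two or more positives the bitwise OR of indices becomes ambiguous, which is exactly what the more elaborate designs in the group-testing literature are for.

from math import ceil, log2

def design_pools(n):
    """Non-adaptive pools for at most one positive among n samples."""
    bits = max(1, ceil(log2(n)))
    pools = [[i for i in range(n) if (i >> j) & 1] for j in range(bits)]
    pools.append(list(range(n)))      # master pool: is anyone positive at all?
    return pools

def decode(results):
    """results[j] = True iff pool j was positive; returns the index or None."""
    if not results[-1]:               # master pool negative: nobody is positive
        return None
    return sum(1 << j for j, hit in enumerate(results[:-1]) if hit)

n, culprit = 128, 77                  # example: sample 77 is the positive one
pools = design_pools(n)               # 8 pools for 128 samples
results = [culprit in pool for pool in pools]
assert decode(results) == culprit
print(len(pools), "parallel tests identify 1 positive among", n)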
Excellent idea, but acquiring & mixing the samples may be just as expensive as the test itself.

Better to sample and test *sewage* at strategic branching locations in the sewer system, where the "mixing" has already taken place. (Yes, fecal matter does contain the virus.)

Testing sewage has already been proposed, but for some reason, it still hasn't been done yet.

---

BTW, to avoid the problem of one test depending upon another, one can also do "oblivious" testing, using *sorting networks* to get an O((log n)^2) algorithm.
This has actually occurred to a lot of people. Yaniv Erlich (a molecular biologist) said on March 13 (https://twitter.com/erlichya/status/1238522385952899074):

"Guys, this was the main topic of my PhD (group testing) and it is a terrible idea. The current shortage of tests is due to kiting (swabs, vials, etc), not qPCR machines. So pools don't solve this problem."

He was quote-tweeting Eran Segal (@segal_eran, March 12):

"Is anyone doing COVID-19 testing on pools of samples from many people? If <0.1% are infected, most tests of 100 pooled samples will be negative. Only positive pools need further tests. Accuracy per test is lower but this allows rapid, cheap, large-scale testing. Thoughts?"
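Segal's numbers are easy to sanity-check with the simplest one-round (Dorfman-style) scheme, under the same idealizations as before (independent infections, perfect test at any dilution): pool k samples, and retest all k individually only if the pool is positive.

def tests_per_person(p, k):
    """Expected tests per person for one round of Dorfman pooling:
    one pool test, plus k individual retests whenever the pool is positive."""
    pool_positive = 1 - (1 - p) ** k
    return (1 + k * pool_positive) / k

for prevalence, pool_size in [(0.001, 100), (0.01, 100), (0.01, 10)]:
    print(f"p={prevalence:.1%}, k={pool_size}: "
          f"{tests_per_person(prevalence, pool_size):.3f} tests/person")
# At 0.1% prevalence with pools of 100 this is about 0.105 tests per
# person -- roughly the 10x saving Segal describes -- but at 1% the same
# pool size is already much less effective (about 0.64 tests per person).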
We (Ginkgo) are scaling up to doing 1M tests/day, and will likely rely on some amount of pooling to achieve this. Yaniv is correct that a major (perhaps the major) challenge is high-throughput sample collection, not analysis. The book suggestion on non-sequential pooling approaches looks as if it could be helpful, though frankly most of the practical approaches were pretty obvious.
Henry Baker wrote: "Testing sewage has already been proposed, but for some reason, it still hasn't been done yet."

It is being done: https://www.nature.com/articles/d41586-020-00973-x

Our local paper reported on how this will be done statewide, noting the obvious benefits if the project should "bear fruit." The local orchards are having trouble getting workers, surely.

Hilarie
In case no one has mentioned this, it seems that people have experimented with this approach. From non-technical newspaper accounts:

https://www.jpost.com/health-science/acceleration-in-multiple-coronavirus-te...
https://www.timesofisrael.com/to-ease-global-virus-test-bottleneck-israeli-s...

It sounds like they have verified that they can detect a single positive result in a pool of 64 samples. However, as someone else pointed out here, the shortage may be for swabs, kits, etc. rather than the tests themselves, so even if the test works, it may not relieve the real bottleneck.
participants (10)
- Andy Latto
- Brad Klee
- Henry Baker
- Hilarie Orman
- Keith F. Lynch
- Michael Greenwald
- Michael Kleber
- Tom Knight
- Victor Miller
- William R Somsky