We (Ginkgo) are scaling up to 1M tests/day, and will likely rely on some amount of pooling to achieve this. Yaniv is correct that a major (perhaps the major) challenge is high-throughput sample collection, not analysis. The book suggestion on non-sequential pooling approaches looks as if it could be helpful, though frankly most of the practical approaches were pretty obvious.
On Apr 18, 2020, at 11:30 AM, Victor Miller <victorsmiller@gmail.com> wrote:
This has actually occurred to a lot of people. Yaniv Erlich (a molecular biologist) said on March 13 (https://twitter.com/erlichya/status/1238522385952899074): "Guys, this was the main topic of my PhD (group testing) and it is a terrible idea. The current shortage of tests is due to kitting (swabs, vials, etc.), not qPCR machines. So pools don't solve this problem." He was quote-tweeting Eran Segal (@segal_eran, Mar 12): "Is anyone doing COVID-19 testing on pools of samples from many people? If <0.1% are infected, most tests of 100 pooled samples will be negative. Only positive pools need further tests. Accuracy per test is lower but this allows rapid, cheap, large-scale testing. Thoughts?"
On Sat, Apr 18, 2020 at 10:23 AM Henry Baker <hbaker1@pipeline.com> wrote:
Excellent idea, but acquiring & mixing the samples may be just as expensive as the test itself.
Better to sample and test *sewage* at strategic branching locations in the sewer system, where the "mixing" has already taken place. (Yes, fecal matter does contain the virus.)
Testing sewage has already been proposed, but for some reason it still hasn't been done.
---
BTW, to avoid the problem of one test depending upon another, one can also do "oblivious" testing, using *sorting networks* to get an O((log n)^2) algorithm.
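(I don't know the exact sorting-network construction Henry has in mind, but here is a minimal Python sketch of the "oblivious" idea, i.e. every pool is fixed in advance so no test waits on another's result. Assuming at most one positive sample per batch, pooling by the bits of each sample's index pins it down with ceil(log2(n)) + 1 tests, all of which can run in parallel.)

from math import ceil, log2

def oblivious_pools(n):
    # One pool containing everyone (to detect whether any positive exists),
    # plus one pool per bit position: sample i joins pool b iff bit b of i is 1.
    k = ceil(log2(n))
    return [list(range(n))] + [[i for i in range(n) if (i >> b) & 1]
                               for b in range(k)]

def decode(results):
    # Assumes at most one positive sample in the batch.
    if not results[0]:
        return None                                  # whole batch is clean
    return sum(1 << b for b, hit in enumerate(results[1:]) if hit)

n, positive = 128, 37
pools = oblivious_pools(n)                           # 8 fixed pools
results = [positive in pool for pool in pools]       # all 8 tests run at once
print(decode(results))                               # -> 37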
At 09:30 PM 4/17/2020, Keith F. Lynch wrote:
For some reason there's still an extreme shortage of covid-19 tests. If only everyone could be tested, preferably at least once a week, most social isolation could end. Negative people could associate with negatives. Positive people could associate with positives. People who were formerly positive but are now negative could associate with everyone, especially if they test positive for antibodies.
Fortunately, it's not necessary to have N tests to test N people unless the expected infection rate is close to 50%. If the expected infection rate is, say, one in a hundred, samples can be taken from 128 people, and each sample thoroughly stirred then divided into 7 equal parts. The first part from all 128 would be mixed together, thoroughly stirred, then tested with one test. If the test is negative, then all 128 people are negative, and we're done.
If that test is positive, the second part from 64 of the 128 would be mixed together and tested, and the second part from the other 64 would be mixed together and tested, requiring two more tests. Probably one of those tests would be negative, in which case those 64 people are all negative. The process would then continue with the third, fourth, etc., parts, subdividing further and further to find the (probably just one or two) people in the set who test positive.
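A minimal simulation of this halving scheme in Python, assuming perfectly reliable tests, independent 1% infections, and splitting all the way down to single samples (which takes log2(128) + 1 = 8 rounds in the worst case):

import random

def tests_needed(pool):
    # One test on this pool; if it is positive and holds more than one sample,
    # split it in half and recurse on each half.
    count = 1
    if any(pool) and len(pool) > 1:
        mid = len(pool) // 2
        count += tests_needed(pool[:mid]) + tests_needed(pool[mid:])
    return count

random.seed(0)
rate, batch, trials = 0.01, 128, 10_000
average = sum(tests_needed([random.random() < rate for _ in range(batch)])
              for _ in range(trials)) / trials
print(average)   # roughly 16-17 tests per batch of 128, vs. 128 individual tests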
The identical approach could be used with the test for antibodies.
The sets of 128 should be chosen so that no two family members or coworkers are in the same set, to avoid sets with a high likelihood of correlations. If some sets have half the people in them infected, that will result in a lot more tests being needed, more than would be compensated for by a similar number of other sets having nobody in them infected.
I'll leave it to others to calculate the expected number of tests needed, but it's obviously O(log2(N)) tests per batch when the expected infection rate is 1/N.
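Taking up that calculation: with a batch of N = 2^k people, each independently positive with probability p, a pool of size m is positive with probability 1 - (1-p)^m, and any pool containing a positive is certain to be tested (all of its ancestors are positive too), so each positive pool at one level triggers exactly two tests at the next. The expected total is therefore 1 + sum over l = 1..k of 2^l * (1 - (1-p)^(N/2^(l-1))). A small exact computation in Python, taking p = 1/N as suggested:

def expected_tests(N, p):
    # Exact expected test count for recursive halving on a batch of N = 2**k
    # people, each independently positive with probability p.
    k = N.bit_length() - 1
    total = 1.0                              # the initial whole-batch test
    for level in range(1, k + 1):
        pool_size = N >> (level - 1)         # size of the pools split at this level
        total += (2 ** level) * (1 - (1 - p) ** pool_size)
    return total

for N in (128, 1024, 8192):
    print(N, round(expected_tests(N, 1 / N), 1))
# prints roughly 13.4, 19.4, 25.4 -- about 2*log2(N) tests per batch, i.e. O(log N)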
We're accustomed to thinking in terms of terabytes and petabytes. But it's worth remembering that fractional bits of information are meaningful too. For instance, if just one person in the world is expected to test positive for something (e.g. for being the one who left DNA at a specific crime scene), learning which person that is amounts to only about 33 bits of information. That means that testing any one person provides just 4 nanobits of information, and hence that relatively few tests that each provide a full bit of information are needed.
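A quick check of that arithmetic in Python, assuming a world population of roughly 7.8 billion:

from math import log2

population = 7.8e9                     # assumed world population
p = 1 / population                     # prior that any given person is "the one"

bits_to_identify = log2(population)    # information needed to single out one person
# Shannon entropy of a yes/no test on one person drawn at that prior:
bits_per_test = -(p * log2(p) + (1 - p) * log2(1 - p))

print(bits_to_identify)    # about 32.9 bits
print(bits_per_test)       # about 4.4e-9 bits, i.e. a few nanobits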
Of course the above assumes that the tests are reliable. Testing in this way amplifies any unreliability in either direction. However, this can be compensated for by multiple tests, in ways that are much more efficient than simply testing everyone multiple times, just as unreliable data transmissions can be made more reliable in ways more efficient than simply repeating the transmission.
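I don't know which coding scheme Keith has in mind, but as a toy illustration of "more efficient than repeating": the classic Hamming(7,4) code, sketched below in Python, protects 4 data bits against any single flipped bit using only 3 check bits, where triple repetition would spend 8 extra bits on the same 4.

# Hamming(7,4): positions 1..7, parity bits at positions 1, 2 and 4.
# Each parity bit makes the XOR of the positions holding a 1 come out to zero,
# so a single flipped bit reveals its own position as the nonzero syndrome.

def encode(data):
    d3, d5, d6, d7 = data                      # the four data bits
    p1 = d3 ^ d5 ^ d7
    p2 = d3 ^ d6 ^ d7
    p4 = d5 ^ d6 ^ d7
    return [None, p1, p2, d3, p4, d5, d6, d7]  # index 0 unused

def correct(word):
    syndrome = 0
    for pos in range(1, 8):
        if word[pos]:
            syndrome ^= pos
    if syndrome:                               # nonzero syndrome = error position
        word[syndrome] ^= 1
    return word

sent = encode([1, 0, 1, 1])
received = sent[:]
received[6] ^= 1                               # flip one bit "in transit"
print(correct(received) == sent)               # -> True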
Parity bits and CRC checks can reduce the error rate of a data transmission to as low as is desired (albeit never quite to zero). I wonder if the reason doctors aren't doing testing in the way I suggest is specialization. Doctors need to talk to experts on data transmission and encoding, such as the inventors of the JT65 mode, which made it possible for hams with battery-operated radios and hand-held antennas to communicate by bouncing signals off the moon. With larger antennas, this has been done with a transmitter of just 3 milliwatts.
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com https://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun