[math-fun] Why R0 is a useless concept
I think that we are all in violent agreement that these epidemiological models are 'ill-conditioned', hence *any* noise in the input can be dramatically *amplified* in such a way that it often overwhelms any 'answer'. Analogy: the screeching noises often heard from audio public address systems with positive feedback; the screeches often overwhelm the person speaking.

Re network-simulation Monte Carlo models, e.g., the Imperial model: Monte Carlo models require enough iterations/runs in order to *average out* the sampling noise, *and* to fully "explore" the nether/tail regions of the particular probability density function.

The most trivial Monte Carlo model is that of estimating the *mean* of a distribution by computing statistics from N samples. How many samples are required in order to assure a reasonable estimate of the mean? Answer: for any fixed target accuracy, N ~ O(distribution variance).

OK. Let's take an oversimplified 'superspreader' model for R0: 99% of the time, R0=2, and 1% of the time, R0=98. The mathematical mean of this bimodal distribution is 2.96, and its mathematical variance is ~91. But I just ran this experimental model, and it takes at least 15,000 random samples of this distribution just to get a decent approximation to one number -- its mean! The reason so many samples are required is that the relatively rare R0=98 events have to occur often enough to average out against the vastly more probable R0=2 events.

But we're only getting started. R0 appears as the *base* of an exponential in various epidemic models -- e.g., (R0)^(a*t), for some constant a. The most elementary statistics classes don't deal with *products* of random variables, much less *exponentials* of random variables.

One simple way to understand such products and exponentials uses *lognormal* distributions. If X=L(m,v) is a lognormal random variable with parameters m,v (i.e., log X is normal with mean m and variance v), then the product of n independently chosen copies of X -- which I'll write X^n -- is distributed as L(n*m,n*v). The mean of L(n*m,n*v) is (exp(m+v/2))^n; the variance of L(n*m,n*v) is exp(2*m+v)^n*(exp(v)^n-1).

If we choose m,v to match the mean and variance of the bimodal distribution above, then m~-0.1322 and v~2.4348, so the mean of X^n is (2.96)^n and the variance of X^n is (2.96)^(2n)*(11.414^n-1) ~ 100^n.

So what if we have to sample, e.g., (R0)^10, i.e., a*t=10, to compute its mean? How many samples will we need to get a decent approximation? (Note that this is the 10-fold product of independently chosen R0's, so we can't simply average numbers like sample^(1/10).) The variance of (R0)^10 is ~100^10 = 10^20, while its mean is only (2.96)^10 ~ 5x10^4; the variance relative to the squared mean is ~11.414^10, i.e., tens of *billions*, so it could take O(10 billion) random samples to get a decent approximation to the mean of (R0)^10. I'd be willing to bet that the Imperial model was not run 10 billion times!

But this is merely one positive feedback loop in such a Monte Carlo network simulation. What happens when there are multiple feedback loops? How many runs might then be required?

The problem here is that our samples have to explore an incredibly wide and incredibly shallow distribution, and then accumulate enough weight from each sampled value to guarantee some reasonable accuracy for our result. But even if we performed such a computation, what would it mean? When the *variance* of the distribution is so wide -- hence the weight of any particular value is so tiny -- of what practical use is *any* particular value, e.g., the "mean"?
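To make the 15,000-sample observation concrete, here is a minimal sketch of this kind of experiment in Python/NumPy (the seed and exact sample sizes are arbitrary choices for illustration). It draws from the 99%/1% bimodal distribution and prints the sample mean for increasing N; the estimate typically wanders well away from 2.96 until the rare R0=98 events have occurred often enough.

import numpy as np

# Bimodal 'superspreader' model from the text:
# R0 = 2 with probability 0.99, R0 = 98 with probability 0.01.
# Exact mean = 0.99*2 + 0.01*98 = 2.96; exact variance ~ 91.24.
rng = np.random.default_rng(0)   # assumed seed, purely for reproducibility

def sample_r0(size):
    """Draw samples from the bimodal 'superspreader' R0 distribution."""
    return np.where(rng.random(size) < 0.99, 2.0, 98.0)

for n in (100, 1_000, 5_000, 15_000, 100_000):
    est = sample_r0(n).mean()
    print(f"n = {n:>7}: sample mean = {est:6.3f}   (exact mean = 2.96)")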
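The same kind of sketch for the 10-fold product of independently chosen R0's shows the tail problem directly: even with a million samples, the sample mean typically falls well short of the exact value (2.96)^10 -- and occasionally overshoots it wildly -- because the terms containing several 98-factors are sampled far too rarely. The last few lines just recompute the lognormal moment match (m ~ -0.1322, v ~ 2.4348). Again, the seed and sample sizes are arbitrary.

import numpy as np

rng = np.random.default_rng(1)   # assumed seed

def sample_r0(size):
    # 2 with probability 0.99, 98 with probability 0.01
    return np.where(rng.random(size) < 0.99, 2.0, 98.0)

# Mean of the 10-fold product of independent R0's: by independence it is
# exactly (2.96)**10, but the sample mean converges to it extremely slowly.
exact_mean = 2.96 ** 10          # ~ 5.2e4

for n in (10_000, 1_000_000):
    products = sample_r0((n, 10)).prod(axis=1)   # n products of 10 draws each
    print(f"n = {n:>9}: sample mean = {products.mean():.3e}   "
          f"(exact = {exact_mean:.3e})")

# Lognormal moment match from the text: pick m, v so that L(m, v) has the
# same mean (2.96) and variance (~91.24) as the bimodal distribution.
mean = 0.99 * 2 + 0.01 * 98
var = 0.99 * 2**2 + 0.01 * 98**2 - mean**2
v = np.log(1 + var / mean**2)    # ~ 2.4348
m = np.log(mean) - v / 2         # ~ -0.1322
print(f"m ~ {m:.4f}, v ~ {v:.4f}, exp(v) ~ {np.exp(v):.3f}")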
This is the reason why "R0" models make no sense in the presence of superspreaders -- there is no single 'R0' that captures any useful aspect of the behavior.
Henry Baker