I'm just going to wing it here, but I think this is the problem that was important in W.W. II, where occasionally a German tank would be captured and its serial number noted. Eventually this information was used to estimate how many tanks the Germans had altogether. (Here I use a well-known mathematician's trick dating back millennia: if you don't know the answer to a problem, solve a different one.)

Suppose we pick n random points {x_j} = x_1, ..., x_n uniformly from the interval [0,1] in R, and rename the points according to their order:

   x_(1) < x_(2) < ... < x_(n).

The probability that the maximum point x_(n) is <= t (for t in [0,1]) is t^n, the probability that all n points lie at or to the left of t. So the density of the maximum x_(n) is

   d_max(t) = n t^(n-1),

and its expected value is

   E(x_(n)) = Integral_{0<=t<=1} t * n t^(n-1) dt = n/(n+1).

Symmetrically, the expected value of the minimum x_(1) = min{x_j} is

   E(x_(1)) = 1/(n+1).

Indeed, the probability that x_(1) >= t is (1-t)^n (all x_j are at least t), so Prob(x_(1) <= t) = 1 - (1-t)^n, and the density of the minimum is

(*)  d_min(t) = n (1-t)^(n-1).

The n+1 points consisting of the n random points plus the point 0 (with 0 ~ 1 identified) may be thought of as uniformly distributed on the resulting circle C = R/Z. So the setup is the same as n+1 points uniformly distributed on C. Therefore, by symmetry, the length of each interval between successive points x_(1) < x_(2) < ... < x_(n) has the same density (*) as does x_(1) = x_(1) - 0.

Now suppose we don't know the length L of the original interval, which we assume to be [0,L].

((( We want the joint distribution of x_(1) and x_(n), in order to infer the maximum-likelihood value of ML(n) = L/(x_(n) - x_(1)). This can then be used to estimate L as L =approx= ML(n) * (x_(n) - x_(1)). )))

BUT: for now, back to the unit interval [0,1]. Wikipedia states that the joint density of u = x_(1) and v = x_(n) is

   f(u,v) = (n!/(n-2)!) (v-u)^(n-2),   0 <= u <= v <= 1.

At this point I have to go; maybe more later.

—Dan
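The order-statistic expectations above are easy to spot-check numerically. Here is a minimal Monte Carlo sketch (my own illustration, not part of Dan's derivation) verifying E(x_(1)) = 1/(n+1) and E(x_(n)) = n/(n+1) for uniform points on [0,1]:

```python
# Monte Carlo check: for n uniform points on [0,1],
# E(min) should approach 1/(n+1) and E(max) should approach n/(n+1).
import random

def order_stat_means(n, trials=100_000, seed=1):
    """Estimate the mean of the minimum and maximum of n uniforms."""
    random.seed(seed)
    sum_min = sum_max = 0.0
    for _ in range(trials):
        pts = [random.random() for _ in range(n)]
        sum_min += min(pts)
        sum_max += max(pts)
    return sum_min / trials, sum_max / trials

n = 9
mean_min, mean_max = order_stat_means(n)
print(mean_min, 1 / (n + 1))   # both close to 0.1
print(mean_max, n / (n + 1))   # both close to 0.9
```

With n = 9 and 100,000 trials, both estimates land within a few parts in a thousand of the predicted 1/10 and 9/10.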
On Feb 3, 2016, at 9:13 AM, Fred Lunnon <fred.lunnon@gmail.com> wrote:
. . .
I don't know in advance how many distinct coupons are available, but I have collected m, among which just k are distinct.
What is the probability that n distinct coupons are available?
What is the most likely value of n ?
What are asymptotic expressions for large m ?
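A sketch of the maximum-likelihood part of Fred's question (my own framing; the fully Bayesian "probability that n coupons are available" would additionally need a prior on n). If n distinct coupons exist and m draws are uniform and independent, the chance of seeing exactly k distinct kinds is P(k | n, m) = C(n,k) * Surj(m,k) / n^m, where Surj(m,k) = sum_j (-1)^j C(k,j) (k-j)^m counts the surjections of the m draws onto the k kinds seen. Scanning n for the largest likelihood gives the most likely n:

```python
# Likelihood of observing exactly k distinct coupons among m draws,
# given n distinct coupons available, and the maximum-likelihood n.
from math import comb

def surjections(m, k):
    # Inclusion-exclusion count of maps from m draws onto all k kinds.
    return sum((-1) ** j * comb(k, j) * (k - j) ** m for j in range(k + 1))

def likelihood(n, m, k):
    # P(exactly k distinct kinds | n available, m uniform draws)
    return comb(n, k) * surjections(m, k) / n ** m

def most_likely_n(m, k, n_max=1000):
    # The likelihood rises then falls in n, so a simple scan suffices.
    return max(range(k, n_max + 1), key=lambda n: likelihood(n, m, k))

# Example: 20 draws showing 10 distinct kinds.
print(most_likely_n(m=20, k=10))
```

As a sanity check, for fixed n and m the likelihoods sum to 1 over k, since sum_k C(n,k) * Surj(m,k) = n^m.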