https://en.wikipedia.org/wiki/German_tank_problem On Wed, Feb 3, 2016 at 3:46 PM, Fred Lunnon <fred.lunnon@gmail.com> wrote:
I had heard Dan's serial-number anecdote as well, though my informant claimed it related to rockets. [ I did often wonder how one retrieved such data from the exploded fragments, but supposed the missiles might often fail to detonate ... ] And I had that in mind when posing this problem. But here we are sampling with replacement; and no distance measure seems available for coupons!
Alan doubts whether the problem is well posed. Well, the expected number of trials to acquire all n coupons is asymptotically n log n --- a result which, while still a student, I derived at the cost of several pages of convoluted algebra, and later proudly showed to an older colleague, who proceeded to derive the same result in a single line. A similar fate has overtaken most of my forays into probability.
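The n log n claim is easy to check numerically. A minimal Python sketch (function name is mine, not from the thread); the one-line derivation sums the geometric waiting times n/(n-k), giving an exact mean of n * H_n:

```python
import random

def collector_draws(n, rng=random):
    """Count draws (with replacement) until all n coupon types are seen."""
    seen = set()
    draws = 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        draws += 1
    return draws

# Exact mean: n * H_n = n * (1 + 1/2 + ... + 1/n), asymptotically n log n.
```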
Ahem, I digress. Now I argue that if a trial reaches k distinct coupons in m > c k log k trials, where (say) c = 2, or 10, then it's a damned good bet that n = k . Anybody disagree?
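Fred's stopping rule can be put to a quick empirical test. A sketch, with names and parameter defaults of my own choosing: stop as soon as m exceeds c k log k (guarding k >= 2, since log 1 = 0), and record whether k = n at that moment:

```python
import math
import random

def stopping_rule_hits(n, c=10.0, trials=200, rng=random):
    """Draw coupons uniformly from n types; stop as soon as the draw count m
    exceeds c * k * log k, where k is the number of distinct coupons seen so
    far, and record whether k == n at that moment.  Returns the fraction of
    trials in which the bet n = k won."""
    hits = 0
    for _ in range(trials):
        seen = set()
        m = 0
        while True:
            seen.add(rng.randrange(n))
            m += 1
            k = len(seen)
            if k >= 2 and m > c * k * math.log(k):
                hits += k == n
                break
    return hits / trials
```

In quick experiments the bet wins essentially always for c = 10; with c = 2 the rule can fire early at small k (e.g. for n = 10, the third draw repeats one of the first two with probability 0.2 and triggers the k = 2 threshold), so the choice of constant matters.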
Fred Lunnon
On 2/3/16, Dan Asimov <dasimov@earthlink.net> wrote:
I'm just going to wing it here, but I think this is a problem that was important in W.W. II, where occasionally a German tank would be captured and its serial number noted. Eventually this information was used to estimate how many tanks the Germans had altogether.
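For reference, the standard frequentist estimator for that serial-number problem (sampling k serials without replacement from 1..N) takes the largest serial seen, m, and estimates N as m + m/k - 1. A one-function sketch, not anything proposed in the thread; the sample serials are hypothetical:

```python
def tank_estimate(serials):
    """Classic 'German tank' estimate of the total count N from k distinct
    serial numbers assumed drawn without replacement from 1..N:
    N_hat = m + m/k - 1, where m is the largest serial observed."""
    k = len(serials)
    m = max(serials)
    return m + m / k - 1

# e.g. tank_estimate([19, 40, 42, 60]) -> 60 + 60/4 - 1 = 74.0
```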
(Here I use a well-known mathematician's trick dating back millennia: If you don't know the answer to a problem, solve a different one.)
Suppose we pick n random points {x_j} = x_1,...,x_n from an interval [0,1] in R.
Also, rename the points according to their order:
x_(1) < x_(2) < ... < x_(n).
Then for t in [0,1], the probability that the maximum point x_(n) is <= t is t^n, the probability that all n points lie at or to the left of t.
So the density of the maximum x_(n) is
d_max(t) = n t^(n-1),
so its expected value is
E(x_(n)) = Integral_{0<=t<=1} t * n t^(n-1) dt = Integral_{0<=t<=1} n t^n dt
= n/(n+1).
Symmetrically, the expected value of x_(1) = min{x_j} is
E(x_(1)) = 1/(n+1).
The probability that x_(1) >= t is (1-t)^n (the probability that all x_j are at least t). So Prob(x_(1) <= t) = 1 - (1-t)^n, and its density is
(*) d_min(t) = n (1-t)^(n-1).
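Both expectations are easy to sanity-check by simulation. A small Monte Carlo sketch (function name is mine):

```python
import random

def minmax_means(n, samples=100_000, rng=random):
    """Monte Carlo estimates of E(min) and E(max) for n uniform points on [0,1]."""
    tot_min = tot_max = 0.0
    for _ in range(samples):
        pts = [rng.random() for _ in range(n)]
        tot_min += min(pts)
        tot_max += max(pts)
    return tot_min / samples, tot_max / samples

# For n = 4, the formulas above give E(x_(1)) = 1/5 and E(x_(4)) = 4/5.
```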
The n+1 points consisting of the n random points plus the point 0 (where we identify 0 ~ 1) may be thought of as uniformly distributed on the resulting circle C = R/Z.
So the setup is the same as n+1 points uniformly distributed on C.
Therefore by symmetry we can conclude that the length of each interval between successive points
x_(1) < x_(2) < ... < x_(n)
has the same density (*) as does x_(1) = x_(1) - 0.
Now suppose we don't know the length L of the original interval, which we assume to be [0,L].
((( We want the joint distribution of x_(1) and x_(n) to infer the maximum likelihood value of
ML(n) = L/(x_(n) - x_(1)).
This can be used to infer L as
L = approx. ML(n)* (x_(n) - x_(1)). )))
BUT: For now, back to the unit interval [0,1]:
Wikipedia states that the joint density of u = x_(1) and v = x_(n) is
f(u,v) = (n! / (n-2)!) (v-u)^(n-2) = n(n-1) (v-u)^(n-2), for 0 <= u <= v <= 1.
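Integrating that density gives E(x_(n) - x_(1)) = (n-1)/(n+1) on the unit interval, so one natural (method-of-moments, not maximum-likelihood) way to estimate an unknown L is to rescale the observed range by (n+1)/(n-1). A hedged sketch of that idea; the rescaling is my reading, not a formula from the thread:

```python
import random

def stretch_estimate(points):
    """Given n points assumed uniform on an unknown interval [0, L], rescale
    the observed range by (n+1)/(n-1), since E(max - min) = (n-1)/(n+1) * L."""
    n = len(points)
    return (max(points) - min(points)) * (n + 1) / (n - 1)

# Hypothetical usage: average the estimate over many samples drawn from [0, 7].
```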
At this point I have to go; maybe more later.
—Dan
On Feb 3, 2016, at 9:13 AM, Fred Lunnon <fred.lunnon@gmail.com> wrote:
. . .
I don't know in advance how many distinct coupons are available, but have collected m among which there are just k distinct.
What is the probability that n distinct coupons are available?
What is the most likely value of n ?
What are asymptotic expressions for large m ?
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com https://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
-- Mike Stay - metaweta@gmail.com http://www.cs.auckland.ac.nz/~mike http://reperiendi.wordpress.com