Andy Latto wrote:
Suppose we take n independent samples from a normal distribution (let's fix the distribution as having mean 0 and variance 1). As n increases, what happens to the expected value of (largest sample - 2nd largest sample)?
How sensitive is this answer to the normality of the distribution? What do we need to know about the distribution to conclude that as n increases, this difference will behave the way it does for a normal distribution?
Let {X[i], i = 1 to n} be a set of n independent, identically distributed random variables. For each i, the cumulative distribution function is F(x) = Prob[X[i] <= x], and (assuming F(x) is differentiable) the probability density function is f(x) = F'(x). Let X and Y be, respectively, the largest and second largest numbers in this set. Then the joint probability density function of X and Y is

    f(x,y) = n(n-1) f(x) f(y) F(y)^(n-2)   for x >= y
           = 0                             for x < y

and from this

    E[X-Y] = C n Integral(F(x)^(n-1) - F(x)^n, all x)

My calculus is rusty, so I'm pretty sure I lost a constant factor somewhere!

...

When the random variables are normally distributed, my Monte Carlo estimates agreed with this formula using C = Sqrt(2 Pi).

For the normal distribution, the formula

    E[X-Y] = K / log(n)^q

works pretty well for q just above 1/2, though E[X-Y] actually shrinks a bit more slowly than the right-hand side.

Paul
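For what it's worth, a quick numerical sketch (standard library only; function names are my own) suggests the constant may simply be C = 1: since X >= Y, one can write X - Y = Integral(1[Y <= t < X], all t), and P(Y <= t < X) = P(exactly one sample exceeds t) = n (1-F(t)) F(t)^(n-1), which is exactly the integrand above with no extra factor. The sketch below checks this against a seeded Monte Carlo run at n = 10, then tabulates the integral against 1/Sqrt(2 log n) for large n (the tail 1 - F is computed with erfc to avoid cancellation):

```python
import math
import random

def spacing_integral(n, lo=-12.0, hi=12.0, steps=4000):
    """Trapezoidal estimate of n * Integral(F^(n-1) - F^n, all x)
    = Integral(n * (1-F) * F^(n-1), all x) for the standard normal."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        tail = 0.5 * math.erfc(x / math.sqrt(2.0))  # 1 - F(x), accurate in the tail
        F = 1.0 - tail
        w = 0.5 if i in (0, steps) else 1.0         # trapezoid endpoint weights
        total += w * n * tail * F ** (n - 1)
    return total * h

# Monte Carlo estimate of E[largest - 2nd largest] for n = 10
random.seed(1)
n, trials = 10, 200_000
acc = 0.0
for _ in range(trials):
    xs = sorted(random.gauss(0.0, 1.0) for _ in range(n))
    acc += xs[-1] - xs[-2]
mc = acc / trials
exact = spacing_integral(n)
print(f"n={n}: Monte Carlo {mc:.4f}, integral {exact:.4f}, ratio {mc/exact:.3f}")

# How the expected spacing shrinks with n, against 1/sqrt(2 log n)
for m in (10**2, 10**3, 10**4, 10**5, 10**6):
    e = spacing_integral(m)
    print(f"n={m:>8}: E[X-Y] ~ {e:.4f},  1/sqrt(2 ln n) = {1.0/math.sqrt(2.0*math.log(m)):.4f}")
```

If the ratio in the first line comes out near 1 rather than near 1/Sqrt(2 Pi), that would point to the stray factor being in the evaluation of the integral (e.g. a substitution u = F(x), which introduces a 1/f(x) = Sqrt(2 Pi) e^(x^2/2) factor) rather than in the formula itself. The second loop also lets you see directly how much more slowly the spacing shrinks than the 1/Sqrt(2 log n) reference.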