3 Aug
2004
3 Aug
'04
2:15 p.m.
Michael Kleber writes:
On a related note, suppose I have a bin of N distinct things, but I don't know N. I'm allowed to choose-with-replacement k times, and I can recognize things I've seen before, so that I can build a histogram of how many things I saw t times, for t=1,2,3,... How do I best guess N?
This appears to be a variant of the coupon collector's problem, as discussed in http://www.math.uci.edu/~mfinkels/COUPON.PDF . They give a maximum likelihood estimate of N as the smallest integer j >= C_k satisfying j + 1 ( j )^k ----------- (-----) < 1 j + 1 - C_k ( j+1 ) where C_k is the number of distinct things you saw in your k samples. -Thomas C