[math-fun] Zipf, Harmonic series & Egyptian fractions
Zipf's Law says that the probability of the n'th most popular thingy is proportional to 1/n. Clearly this probability exists only if the universe of thingy's is finite. I'm interested in efficient encodings of thingy's. The Shannon-Fano code says to split the thingy's into two groups having approximately equal probabilities, assign one group an initial bit "0" and the other group an initial bit "1", and then recurse within the groups. This means to me that one must express the fraction 1/2 as a sum of probabilities, each of which is proportional to 1/k for some integer k. Is there a general way to do this for arbitrary m, where m is the number of thingy's? I.e., is there a general scheme for splitting the infinite set {1/k} into two pieces that gives an optimal encoding for every m? If such a general scheme isn't exactly optimal, what about a scheme that is within epsilon of being optimal? I presume that this question has already been studied, so I'm primarily looking for a reference. Thanks in advance for any help on this. Henry Baker hbaker1@pipeline.com
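A minimal sketch of the question (my own illustration, not part of the post above): for a fixed m, the first Shannon-Fano bit splits the sorted weights 1/1, 1/2, ..., 1/m at the prefix whose mass comes closest to half the harmonic number H_m.

```python
# Illustration only: the first Shannon-Fano split on the truncated
# Zipf weights 1/1, 1/2, ..., 1/m.  Because the weights are sorted in
# decreasing order, the split is a prefix {1, ..., j}; we pick the j
# whose prefix mass is closest to H_m / 2.
from fractions import Fraction

def shannon_fano_split(m):
    """Return (j, error): best prefix length and its distance from H_m / 2."""
    weights = [Fraction(1, k) for k in range(1, m + 1)]
    half = sum(weights) / 2          # H_m / 2, exactly
    best_j, best_err, running = 0, half, Fraction(0)
    for j in range(1, m + 1):
        running += weights[j - 1]
        err = abs(running - half)
        if err < best_err:
            best_j, best_err = j, err
    return best_j, best_err

# For m = 10: H_10 / 2 = 7381/5040, and the prefix {1/1, 1/2} with mass
# 3/2 is the closest split, so the first bit separates {1, 2} from {3..10}.
```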
On 6/15/06, Henry Baker <hbaker1@pipeline.com> wrote:
> Zipf's Law says that the probability of the n'th most popular thingy is proportional to 1/n. Clearly this probability exists only if the universe of thingy's is finite.
> I'm interested in efficient encodings of thingy's. The Shannon-Fano code says to split the thingy's into two groups having approximately equal probabilities, assign one group an initial bit "0" and the other group an initial bit "1", and then recurse within the groups. This means to me that one must express the fraction 1/2 as a sum of probabilities, each of which is proportional to 1/k for some integer k.
> Is there a general way to do this for arbitrary m, where m is the number of thingy's? I.e., is there a general scheme for splitting the infinite set {1/k} into two pieces that gives an optimal encoding for every m? If such a general scheme isn't exactly optimal, what about a scheme that is within epsilon of being optimal?
Don't you also want to optimise the assignments within each half, and so on recursively down to the leaves of the (search) tree; in which case an appropriate weight function is required to combine the discrepancies at each node? Dunno whether maybe Huffman coding might be at all relevant to this problem --- but you've presumably considered that possibility. Oh, and shouldn't it be "thingies"? Fred Lunnon
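Huffman coding does apply directly here: for any fixed finite m it builds an optimal prefix code from the raw weights 1/1, ..., 1/m (no normalization needed, since scaling all weights doesn't change the tree). A minimal sketch, with identifier names of my own choosing:

```python
# Illustration only: Huffman code lengths for the Zipf weights 1/k.
import heapq
from fractions import Fraction

def huffman_code_lengths(weights):
    """Return the Huffman code length of each weight (optimal prefix code)."""
    # Heap entries: (subtree weight, tiebreak id, leaf indices in subtree).
    heap = [(w, i, [i]) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    lengths = [0] * len(weights)
    counter = len(weights)
    while len(heap) > 1:
        w1, _, leaves1 = heapq.heappop(heap)
        w2, _, leaves2 = heapq.heappop(heap)
        for i in leaves1 + leaves2:
            lengths[i] += 1   # each merge adds one bit to every leaf below it
        heapq.heappush(heap, (w1 + w2, counter, leaves1 + leaves2))
        counter += 1
    return lengths

# Zipf weights 1/1 .. 1/8; the resulting lengths satisfy Kraft's
# inequality with equality, as any Huffman code must.
lengths = huffman_code_lengths([Fraction(1, k) for k in range(1, 9)])
```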
Here's a quick approach. Each block of 2^k consecutive numbers {2^k, ..., 2^(k+1)-1} has approximately the same reciprocal sum, about ln 2 (it's probably at least as good an approximation as Zipf's Law is to start with). So break the integers into groups based on floor(log_2(n)), and iterate up the binary digits of that value.

E.g., the first division will have groups:

1, 4, 5, 6, 7, 16, 17, 18, ...
2, 3, 8, 9, 10, 11, 12, 13, 14, 15, 32, 33, 34, ...

The second iteration divides the first group into:

1, 16, 17, 18, ...
4, 5, 6, 7, 64, 65, 66, ...

and the second into:

2, 3, 32, 33, 34, ...
8, 9, 10, 11, 12, 13, 14, 15, 128, 129, 130, ...

etc. When you finish this process (depending on how many thingies you have), you will have a group of 2^k thingies for some k, all of them approximately equally likely.

In practice, some optimizations may be called for. If there are fewer than about (3/2)·2^k thingies, you should probably put the excess into the smallest group instead of making a group by themselves.

Franklin T. Adams-Watters
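A sketch of this scheme as I read it (function and variable names are my own): the code prefix assigned to n is built from the low-order bits of floor(log_2(n)), least-significant bit first, so each block {2^e, ..., 2^(e+1)-1} lands in one group per iteration.

```python
# Illustration only: group integers by the binary digits of floor(log2(n)),
# least-significant bit first, as in the scheme described above.

def group_bits(n, depth):
    """Return the first `depth` code bits for n >= 1: the low `depth`
    bits of floor(log2(n)), least-significant bit first."""
    e = n.bit_length() - 1          # floor(log2(n)) for n >= 1
    return [(e >> b) & 1 for b in range(depth)]

# First division of 1..20 (matches the listing above):
group0 = [n for n in range(1, 21) if group_bits(n, 1) == [0]]
group1 = [n for n in range(1, 21) if group_bits(n, 1) == [1]]
# group0 -> [1, 4, 5, 6, 7, 16, 17, 18, 19, 20]
# group1 -> [2, 3, 8, 9, 10, 11, 12, 13, 14, 15]
```

Each block {2^e, ..., 2^(e+1)-1} has reciprocal sum close to ln 2, which is why groups holding the same number of blocks carry roughly equal probability mass.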
_______________________________________________
math-fun mailing list
math-fun@mailman.xmission.com
http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
participants (3)
- franktaw@netscape.net
- Fred Lunnon
- Henry Baker