Correct. There's an interesting paper that draws an analogy between prime numbers and software libraries, at least as far as their statistics are concerned: "Software Libraries & Their Reuse" by Todd Veldhuizen (use Google search to find it). His observations:

1. There are infinitely many primes, and infinitely many SW components.
2. The nth prime divides a fraction ~1/(n ln(n)) of the integers; the nth most frequently used library component accounts for ~1/(n log(n) log+(n)) of uses.
3. Erdős-Kac predicts that the number of prime factors tends to a normal distribution; similar measurements indicate the same for SW components.
4. By the Prime Number Theorem, the nth prime is ~log(n ln(n)) bits long; approximately ditto for SW components.

Do primes have an approximately Zipf-type distribution? Alternately, should Zipf's Law be reformulated to look more like the prime distribution? That is, perhaps word-frequency data would match the prime distribution better than the harmonic distribution?

At 07:19 PM 6/15/2006, dasimov@earthlink.net wrote:
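Observation 2 is easy to check numerically. A minimal Python sketch (the sieve helper and the cutoff of 10000 are my choices, not from the paper) compares the fraction of integers divisible by the nth prime, 1/p_n, against the estimate 1/(n ln n):

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

primes = primes_up_to(10000)

# The nth prime p_n divides exactly a fraction 1/p_n of all integers;
# by the prime number theorem p_n ~ n ln n, so 1/p_n ~ 1/(n ln n).
for n in (10, 100, 1000):
    p_n = primes[n - 1]
    print(n, 1 / p_n, 1 / (n * math.log(n)))
```

The two columns agree only roughly at small n (the n ln n approximation has large lower-order corrections), but the ratio tends toward 1 as n grows.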
<< Zipf's Law says that the probability of the nth most popular thingy is proportional to 1/n. Clearly this probability exists only if the universe of thingies is finite.
Actually Zipf's law applies to the commonest *words* used, and assuming we're looking at N of them gives the kth one a probability of
(1/k) / (sum_{j=1}^{N} 1/j),
so it's normalized to 1.
People have looked at a generalized Zipf's law, which allows a fixed exponent s: you replace all the 1/k 's with 1/k^s 's -- which gives a good fit for things other than words, with s <> 1.
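The generalized form can be sketched the same way; this helper (my own naming, with N and s as illustrative parameters) normalizes 1/k^s over the top N items, recovering ordinary Zipf at s = 1:

```python
def zipf_prob(k, N, s):
    """P(k) proportional to 1/k^s, normalized over ranks 1..N."""
    norm = sum(1 / j ** s for j in range(1, N + 1))
    return (1 / k ** s) / norm

# Larger s concentrates more probability on the top-ranked items.
for s in (1.0, 1.5, 2.0):
    print(s, zipf_prob(1, 100, s))
```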
--Dan