Correct. There's an interesting paper that draws an analogy between prime numbers and software libraries, at least as far as their statistics are concerned: "Software Libraries & Their Reuse" by Todd Veldhuizen (use Google search to find it). His observations:

1. There are infinitely many primes, and infinitely many SW components.
2. The nth prime divides a fraction ~1/(n ln(n)) of the integers; the nth most frequently used library component accounts for ~1/(n log(n) log+(n)) of uses.
3. Erdős-Kac predicts that the number of prime factors tends to a normal distribution; similar measurements indicate the same for SW components.
4. By the Prime Number Theorem, the nth prime is ~log(n ln(n)) bits long; approximately ditto for SW components.

Do primes have an approximately Zipf-type distribution? Alternately, should Zipf's Law be reformulated to look more like the prime distribution? That is, perhaps word-frequency data would match the prime distribution better than the harmonic distribution?

At 07:19 PM 6/15/2006, dasimov@earthlink.net wrote:
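Observation 2 is easy to check numerically. A minimal Python sketch (the sieve helper and the cutoff of 10000 are my choices, not from the paper) compares the fraction of integers divisible by the nth prime, 1/p_n, against the estimate 1/(n ln n):

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

primes = primes_up_to(10000)

# The nth prime p_n divides exactly a fraction 1/p_n of all integers;
# by the prime number theorem p_n ~ n ln n, so 1/p_n ~ 1/(n ln n).
for n in (10, 100, 1000):
    p_n = primes[n - 1]
    print(n, 1 / p_n, 1 / (n * math.log(n)))
```

The two columns agree only roughly at small n (the n ln n approximation has large lower-order corrections), but the ratio tends toward 1 as n grows.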
<< Zipf's Law says that the probability of the nth most popular thingy is proportional to 1/n. Clearly this probability exists only if the universe of thingies is finite.
Actually Zipf's law applies to the commonest *words* used, and assuming we're looking at N of them gives the kth one a probability of
(1/k) / (sum_{j=1}^{N} 1/j),
so it's normalized to 1.
People have looked at a generalized Zipf's law, which allows a fixed exponent s: you replace all the 1/k 's with 1/k^s 's -- which gives a good fit for things other than words, with s <> 1.
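The generalized form can be sketched the same way; this helper (my own naming, with N and s as illustrative parameters) normalizes 1/k^s over the top N items, recovering ordinary Zipf at s = 1:

```python
def zipf_prob(k, N, s):
    """P(k) proportional to 1/k^s, normalized over ranks 1..N."""
    norm = sum(1 / j ** s for j in range(1, N + 1))
    return (1 / k ** s) / norm

# Larger s concentrates more probability on the top-ranked items.
for s in (1.0, 1.5, 2.0):
    print(s, zipf_prob(1, 100, s))
```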
--Dan