Well, I do have a sieve program that outputs bitmaps, but these bitmaps represent a sequence of 30 integers with a single byte (the eight values mod 30 that are not divisible by 2, 3, or 5.) Running this to a 1GB data file (representing all primes less than 32,212,254,720) and compressing the resulting file, I see the following sizes: 1073741824 primes.dat 798639547 primes.dat.bz2 761814389 primes.dat.gz 761814523 primes.zip I'm a bit surprised the compressors don't do better; just using arithmetic coding and the probability of a zero or one bit should give about a 0.639 compression ratio, but the best we see above is 0.709. On Tue, May 22, 2012 at 11:00 AM, Henry Baker <hbaker1@pipeline.com> wrote:
Has anyone analyzed the expected size of bitmaps of the primes?
Specifically, consider the set of primes less than some number N, and then represent those primes as "1" bits in a bit vector of length N.
Then compress this bit vector using common techniques such as zip, gzip, 7zip, etc.
What is the growth rate of such compressed bitmaps as a function of N? Are these schemes any better than a simple run-length encoding?
Has anyone even performed any experiments?
Thanks in advance for any info.
P.S. Yes, I know, you can encode the entire infinite bitmap in a fixed/finite number of bits, where these bits encode a Turing Machine that spits out all of the primes. That's not what I'm asking here.
_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
-- -- http://cube20.org/ -- http://golly.sf.net/ --