Re: [math-fun] compressed bitmaps of the primes?
Very interesting! Thanks! I would imagine that the "zip"-type programs would do better with larger dictionaries -- especially on larger files -- so perhaps this would help to improve these numbers. Perhaps there is an additional argument to these programs to allow for larger dictionaries? (Many of these compression programs were developed in an earlier era, when PCs had main memories in the megabyte range instead of the gigabyte range; so perhaps their idea of a "large" dictionary needs to be revised.) I wonder if "zip"-type programs would do even as well as your 2-level scheme; i.e., could these programs notice the patterns of 0's when the number is divisible by 2, 3, or 5? At 11:35 AM 5/22/2012, Tom Rokicki wrote:
Well, I do have a sieve program that outputs bitmaps, but these bitmaps represent each sequence of 30 integers with a single byte (one bit for each of the eight values mod 30 that are not divisible by 2, 3, or 5).
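For concreteness, here is a small sketch (mine, not Tom's actual program) of the packing he describes; `WHEEL`, `simple_sieve`, and `pack_mod30` are illustrative names. Note that 2, 3, and 5 themselves get no bit in this scheme and must be assumed known.

```python
# The 8 residues mod 30 that are not divisible by 2, 3, or 5.
WHEEL = (1, 7, 11, 13, 17, 19, 23, 29)

def simple_sieve(limit):
    """Sieve of Eratosthenes: returns a bytearray with is_p[n] == 1 iff n is prime."""
    is_p = bytearray([1]) * limit
    is_p[0:2] = b'\x00\x00'
    for i in range(2, int(limit ** 0.5) + 1):
        if is_p[i]:
            is_p[i * i::i] = bytearray(len(is_p[i * i::i]))
    return is_p

def pack_mod30(limit):
    """One byte per block of 30 integers; bit j is set iff 30*k + WHEEL[j] is prime."""
    is_p = simple_sieve(limit)
    out = bytearray()
    for base in range(0, limit, 30):
        byte = 0
        for j, r in enumerate(WHEEL):
            if base + r < limit and is_p[base + r]:
                byte |= 1 << j
        out.append(byte)
    return bytes(out)

bitmap = pack_mod30(300)  # 10 bytes cover the integers below 300
```

In the first byte, only the bit for residue 1 is clear (7, 11, 13, 17, 19, 23, 29 are all prime); in the second, only the bit for 49 = 7^2 is clear.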
Running this out to a 1 GB data file (representing all primes less than 32,212,254,720) and compressing the resulting file, I see the following sizes:
1073741824 primes.dat
 798639547 primes.dat.bz2
 761814389 primes.dat.gz
 761814523 primes.zip
I'm a bit surprised the compressors don't do better; just using arithmetic coding and the probability of a zero or one bit should give about a 0.639 compression ratio, but the best we see above is 0.709.
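A figure like the 0.639 above can be reproduced from the zeroth-order entropy of the packed bit stream. A sketch (my own back-of-envelope, using the prime-count estimate pi(N) ~ N / (ln N - 1)):

```python
import math

def entropy_ratio(p):
    """Zeroth-order Shannon entropy (output bits per input bit) of a
    Bernoulli(p) bit stream; an ideal arithmetic coder approaches this."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

N = 32_212_254_720
# Only 8 of every 30 integers get a bit, so the fraction of one-bits is
# roughly pi(N) / (8 * N / 30) = 30 / (8 * (ln N - 1)).
p = 30 / (8 * (math.log(N) - 1))
ratio = entropy_ratio(p)  # close to the 0.639 figure quoted above
```

The exact one-bit probability from the real file would differ slightly, since prime density varies across the range.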
On Tue, May 22, 2012 at 11:00 AM, Henry Baker <hbaker1@pipeline.com> wrote:
Has anyone analyzed the expected size of bitmaps of the primes?
Specifically, consider the set of primes less than some number N, and then represent those primes as "1" bits in a bit vector of length N.
Then compress this bit vector using common techniques such as zip, gzip, 7zip, etc.
What is the growth rate of such compressed bitmaps as a function of N? Are these schemes any better than a simple run-length encoding?
Has anyone even performed any experiments?
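One way to run such an experiment at small scale (a sketch; the helper name is mine) is to build the one-bit-per-integer bitmap and feed it to zlib:

```python
import zlib

def prime_bitmap(n):
    """Bit vector for the integers below n, bit i set iff i is prime,
    packed 8 bits per byte, least-significant bit first."""
    is_p = bytearray([1]) * n                 # sieve of Eratosthenes
    is_p[0:2] = b'\x00\x00'
    for i in range(2, int(n ** 0.5) + 1):
        if is_p[i]:
            is_p[i * i::i] = bytearray(len(is_p[i * i::i]))
    out = bytearray((n + 7) // 8)
    for i, flag in enumerate(is_p):
        if flag:
            out[i >> 3] |= 1 << (i & 7)
    return bytes(out)

bitmap = prime_bitmap(1_000_000)   # 125000 bytes
packed = zlib.compress(bitmap, 9)  # DEFLATE at maximum effort
ratio = len(packed) / len(bitmap)  # comfortably below 1
```

Swapping in `bz2.compress` or `lzma.compress` gives the other families of compressors for comparison.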
Thanks in advance for any info.
P.S. Yes, I know, you can encode the entire infinite bitmap in a fixed/finite number of bits, where these bits encode a Turing Machine that spits out all of the primes. That's not what I'm asking here.
-- -- http://cube20.org/ -- http://golly.sf.net/ --
Right. Anything based on the zlib package (e.g. gzip) has a control that you can set from 1 to 9 (where 1 = faster, 9 = better). You can spend a _lot_ more time looking for the very best dictionary. - John On 05/22/2012 02:43 PM, Henry Baker wrote:
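The control John describes is exposed directly in Python's zlib bindings; a minimal sketch (the sample data is arbitrary):

```python
import zlib

# zlib-based tools take an effort level from 1 (fastest) to 9 (best
# compression); higher levels search harder for matches within the
# 32 KB DEFLATE window.
data = bytes(range(256)) * 1024   # 256 KB of repetitive sample data
fast = zlib.compress(data, 1)
best = zlib.compress(data, 9)
```

Both round-trip losslessly; level 9 simply spends more time hunting for longer back-references.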
I would imagine that the "zip"-type programs would do better with larger dictionaries -- especially on larger files -- so perhaps this would help to improve these numbers. Perhaps there is an additional argument to these programs to allow for larger dictionaries?
As near as I can tell, bzip2 defaults to -9 (and there's no higher setting, even though it is fast). gzip -9 is very slow. I also added 7z.

First, for the 30-integers-in-one-byte representation, they all fall far short of arithmetic coding, with not too much variation. The slightly improved compression of 7z is offset by its greatly increased CPU requirements in this case:

 745501238 primes.7z
 761333494 primes.dat.gz9
 761814389 primes.dat.gz
 761814523 primes.zip
 798639547 primes.dat.bz2
1073741824 primes.dat

When using a one-bit-per-integer representation, all the compressed files increase substantially in size. I was surprised by this; I had expected such a simple, almost symbol-for-symbol transformation not to affect the final sizes of the compressed files very much. But in this case the greater redundancy appears to let the different compression programs show their differences:

 886586498 primesb.7z
 975278536 primesb.bz2
1038286643 primesb.gz9
1130861176 primesb.gz
1130861312 primesb.zip
4026531840 primesb.dat

If it were indeed important to store a compressed bitmap of the primes, a simple technique of omitting multiples of 2, 3, 5, 7, 11, 13, 17, 19, 23 (or some such) would do better than any of these programs, showing once again that domain-specific compression that takes advantage of domain knowledge usually wins.

I often use this technique when looking for patterns in things like pruning tables for permutation puzzles and the like: try compressing different representations and see what happens. Almost always the results are obvious once you understand why, but gaining that understanding through empirical compression can help the process of enlightenment.

-tom
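The wheel idea Tom mentions can be sized up without building any tables: a wheel mod the product M of the chosen primes keeps one bit only for residues coprime to M, i.e. phi(M)/M bits per integer. A small sketch (mine, not from the thread):

```python
def wheel_density(primes):
    """Fraction of integers a wheel must keep a bit for:
    phi(M)/M = product over the wheel primes p of (1 - 1/p)."""
    d = 1.0
    for p in primes:
        d *= 1 - 1 / p
    return d

mod30 = wheel_density([2, 3, 5])                         # 8/30, about 0.267
mod23 = wheel_density([2, 3, 5, 7, 11, 13, 17, 19, 23])  # about 0.164
```

So a wheel over the primes through 23 stores roughly 0.164 bits per integer, already below the roughly 0.22 ratio the best general-purpose compressor achieved on the one-bit-per-integer file above, before any entropy coding is applied.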
participants (3)
- Henry Baker
- John Aspinall
- Tom Rokicki