Hi Victor: Yes, the Elias omega coding is interesting, but it's still going in the exactly wrong direction because a single-bit error can produce catastrophic results in the value decoded. Indeed, any of the usual "lossless compression" algorithms assume that *every individual bit* is accurate, with no errors possible. This is what is called the "digital cliff effect", whereby a cellphone or a digital TV which reaches the edge of its range suddenly finds *no signal at all*, rather than a smoothly degrading signal a la "analog TV" or "analog radio". The only way I can think of to encode integer N in a range [0,M^k) for some "large" M>0 is to utilize a sort of unary code for m bits, followed by a binary code for the remainder mod(N,M), in a manner very Golumb-like. The unary code has a fixed length dependent upon k, and decoding is easy: HammingWeight(unary code portion)*M+Binarycode(remainderbits) Thus a single-bit error in the unary part can cost at most M (out of M^k), and a single-bit error in the remainder costs at most M/2. Thus, no single bit can cause a catastrophic loss. At 09:27 AM 1/26/2019, Victor Miller wrote:
https://en.m.wikipedia.org/wiki/Elias_omega_coding
On Sat, Jan 26, 2019 at 10:32 Henry Baker <hbaker1@pipeline.com> wrote:
I just found that the canonical name for this type of coding is called "Golomb coding":
We choose a suitable a priori fixed M based upon the (fixed) statistics of the integer N being encoded.
Compute (q,r) = (floor(N/M),mod(N,M)).
Then the code for N is unary(q)|binary(r).
More accurately, we actually encode N-1:
(q,r) = (floor((N-1)/M),mod(N-1,M))
code is unary(q)|1|binary(r).
https://en.wikipedia.org/wiki/Golomb_coding
Here's a page with worked-out examples:
https://w3.ual.es/~vruiz/Docencia/Apuntes/Coding/Text/03-symbol_encoding/09-...
Once again, I claim that the Roman numeral requirement for unary coding of thousands, e.g., "MMMMMM" for 6,000, was an *advantage*, since the mere size of the representation would more quickly distinguish it from the representation for 3,000. The Roman numeral system *approximated Golumb coding* thousands of years before Golomb.
https://en.wikipedia.org/wiki/Roman_numerals
At 02:33 AM 1/26/2019, Dave Dyer wrote:
Assuming I had a large number of integers to protect, my thought would be to feed them through a good compressor, maybe LZW, and then apply ECC to the resulting high entropy strings. The initial representation of the integers wouldn't matter much.