The entertainment so far has been that I found a solution to a problem, despite being fuzzy about the problem, by working out how to measure solutions, and having a very good solution I didn't like (unary, egads, pfeh) fall out of that. I want to say a few things about how to solve the problem that I learned after having solved it.*

The length of a codeword is supposed to match the log of the probability of the corresponding event. That way, you minimize the *expected* length of messages. But this problem wasn't about the expected length, it was about minimizing the *maximum* length. So I felt outside my usual habits. But there are two ways to push the problem back into the probability model.

One is to realize that if all messages have the *same* length, then that's the maximum. More, if that's the least possible single length, then it's also the least possible maximum. But inventing holistic codes for arrays is tricky, so I was trying to code individual numbers, which seems reasonable because the problem seems...homogeneous across the array. But thinking about the *typical* distribution seemed unwise for this minimum-maximum problem.

Second, the right analog to probability is that every distinct array has the same probability, just the inverse of the number of distinct arrays. In this case it was the number of choices, duplicates allowed, of r things (in this case deltas between entries in a sorted list) taken from n. With my example where r = 1e6 and n = 2^32,

    ( r + n - 1 )!        13,511,284
    --------------   ~=  2                           (1)
     r! ( n - 1 )!

The probability of a value for one of the elements in the array is the fraction of possible arrays where that element has that value. If each element is encoded according to that probability, the message will have the right length.

From (1) above, you can calculate the probability of a delta d. It's the ratio of the number of arrays after removing d from the problem, to the number before.
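As a sanity check on (1), here's a quick sketch in Python. The exact binomial coefficient is a 13.5-million-bit integer, so this goes through log-gamma instead; the helper name log2_comb is mine.

```python
import math

def log2_comb(N, k):
    # log2 of C(N, k) via log-gamma, since the exact value
    # would be a multi-megabit integer.
    return (math.lgamma(N + 1) - math.lgamma(k + 1)
            - math.lgamma(N - k + 1)) / math.log(2)

r = 10**6   # number of deltas in the example
n = 2**32   # range the values are drawn from

bits = log2_comb(r + n - 1, r)   # the exponent in (1)
print(f"{bits:,.0f} bits")       # close to the 13,511,284 in (1)
```

The same two quantities the text derives below fall straight out of r and n: the minimum codeword -log2(r/(r+n-1)) ~= 12.07 bits, and the unary slope -log2(n/(r+n)) ~= 1/2977 bits per unit of delta.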
The number r decreases by one for each delta, and the number n decreases by the size of the delta.

    ( r - 1 + n - 1 - d )!  r!  ( n - 1 )!
    ------------------------------------------       (2)
    ( r - 1 )!  ( n - 1 - d )!  ( r + n - 1 )!

Canceling, factoring and rearranging a little...

          r         ( r + n - 2 - d )!  ( n - 1 )!
    -------------   ------------------------------   (3)
    ( r + n - 1 )   ( r + n - 2 )!  ( n - 1 - d )!

When d = 0, the first factor determines the minimum codeword size. For my example, about -log2( 1e6 / 2^32 ) ~= 12.07, where my code used a minimum of 13 bits.

Since the cost is the log of (3), the costs of factors just add. The second factor is roughly

                          d
      (  n - 1 - d/2    )
      ( --------------- )                            (4)
      ( r + n - 2 - d/2 )

So, very roughly, the cost of d is -log2( n/(r+n) ) * d. For the example that's -log2( 2^32 / (2^32 + 1e6) ) * d ~= d/2977 (vs. my code's d/4096). This is what dictates the unary part of the code.

(The two costs turn out pretty close for the total d = n = 2^32:

    1e6 * 12.07 + 2^32 / 2977 ~= 13,512,716 **
    1e6 * 13    + 2^32 / 4096  = 14,048,576          (5)

)

What I've been calculating is the cost of the first delta in the list. But the numbers r and n tend to shrink proportionally as you go through the list, leaving the results the same--mostly & usually! But you can compare the number in (5) to the exponent of (1) as a final check.

The code is a Golomb/Rice code, which implies that deltas in a sorted list have a geometric distribution, which I'm trying to rebend my intuition around.

--Steve

* "You can't fix a Lisp Machine by just power-cycling it with no understanding of what's going on!"

** Using a better estimate of the cost of the average first delta,

    -r*log2(r/(r+n-1)) - n*log2((n-1-(n/r/2))/(n+r-2-(n/r/2))) ~= 13,511,294

-- off by only 10 bits.
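For concreteness, a Rice code with divisor 2^12 = 4096 can be sketched in a few lines. The divisor matches the 13-bit minimum and the d/4096 unary slope quoted above; the exact bit layout (ones-terminated-by-zero quotient, then the remainder) and the names rice_encode/rice_decode are my assumptions, not necessarily the code from the earlier posts.

```python
def rice_encode(d, k=12):
    # Unary quotient (q ones then a zero), followed by the k-bit
    # remainder. Length = d // 2**k + 1 + k: a 13-bit minimum for
    # k=12, plus one extra bit per 4096 of delta.
    q, rem = d >> k, d & ((1 << k) - 1)
    return "1" * q + "0" + format(rem, f"0{k}b")

def rice_decode(bits, k=12):
    # Inverse: count leading ones, then read the k remainder bits.
    q = bits.index("0")
    return (q << k) + int(bits[q + 1:q + 1 + k], 2)

print(len(rice_encode(0)))        # 13 -- the minimum codeword
print(len(rice_encode(8192)))     # 15 -- 13 + 8192/4096
print(rice_decode(rice_encode(12345)))   # 12345
```

The geometric-distribution remark is visible right in the lengths: each codeword costs 13 bits plus d/4096, i.e. the code is optimal exactly when each extra 4096 of delta halves the probability.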