Re: [math-fun] "memoization" considered harmful
The overloading of the word "memorize" reminds me of an incident from about fifteen years ago, when I was having trouble setting up a new TV (the kind that supports a hundred channels even though only a fraction of them are in use in any given area) and I called the manufacturer's customer support line. The technician I spoke to insisted that before I could use the TV, I first needed to "memorize all the channels" that were broadcasting in my area, of which there were a dozen. I was incredulous and indignant ("Why should I have to memorize them? Shouldn't it be my TV's job to know what they are?") until it became clear that what she meant by "memorize" was program the channels into the TV. Jim Propp On Saturday, May 4, 2013, Bill Gosper <billgosper@gmail.com> wrote:
Cris Moore <moore@santafe.edu> wrote:
Of course, "memoization" is a silly neologism, redolent of 1950's office work ;-)
Only for the elderly.
when I teach this subject, I use the perfectly good word "memorization".
Indeed. "Memoization" has never achieved full cromulence, in my book. I've never had a problem with just "caching" results.
Phil [Carmody]
--------- I agree it's ugly. I disagree that it's superfluous. Foisting its meaning onto "memorizing" and "caching" overloads them and deprives them of connotation, so I fear we're stuck with "memoizing". Which when typed into Mathematica's Help dialog will soon find you
tutorial/FunctionsThatRememberValuesTheyHaveFound [...] "There is of course a trade-off involved in remembering values.
It is faster to find a particular value, but it takes more memory space to store all of them. You should usually define functions to remember values only if the total number of different values that will be produced is comparatively small, or the expense of recomputing them is very great."
This is an understatement on a par with
"Caution: Cigarette Smoking May be Hazardous to Your Health" (1966–1970).
The performance degradation increases steeply and linearly as the cache grows, rapidly becoming prohibitive.
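The complaint is consistent with a rule-list cost model: if memoized values are stored as rewrite rules scanned front to back, every lookup pays for every entry ahead of it. A minimal sketch in Python (a toy model of that cost, not a claim about Mathematica's actual internals):

```python
class LinearCache:
    """Toy cache stored as a rule list scanned front to back."""
    def __init__(self):
        self.rules = []    # (key, value) pairs, searched in order
        self.probes = 0    # total key comparisons performed

    def get(self, key):
        for k, v in self.rules:
            self.probes += 1
            if k == key:
                return v
        return None

    def put(self, key, value):
        self.rules.append((key, value))

cache = LinearCache()
for i in range(1000):
    cache.put(i, i * i)    # memoize 1000 distinct results
for i in range(1000):
    cache.get(i)           # key i sits at depth i: i + 1 comparisons

# Each individual lookup is linear in the cache size, so the cost of
# remembering everything grows without bound as the cache fills.
```

Under this model the total work for n distinct lookups is quadratic in n, which matches the "steeply and linearly" per-lookup degradation described above.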
E.g., the exact Hilbert spacefilling function

Clear[Hilbert];
Hilbert[t_, a1_: 1, a0_: 0] :=
 Hilbert[t, b1_: 1, b0_: 0] =
  (Hilbert[t, s1_: 1, s0_: 0] = (a0 - s0)/(s1 - a1);
   Module[{t4 = 4*t, n},
    n = Floor[t4]; t4 -= n;
    Switch[n,
     0, I*(1 - Hilbert[1 - t4, -I*a1/2, a0 + a1*I/2])/2,
     1, (I + Hilbert[t4, a1/2, a0 + a1*I/2])/2,
     2, (1 + I + Hilbert[t4, a1/2, a0 + a1*(1 + I)/2])/2,
     3, 1 + I*Hilbert[1 - t4, a1*I/2, a0 + a1]/2,
     4, 1]])
actually requires memoizing to find the fixed point of infinite recursions:
In[280]:= Hilbert/@Range[0, 1, 1/6]
Out[280]= {0, 1/2 + I/2, I, 1/2 + I/2, 1 + I, 1/2 + I/2, 1}
Note how 1/6, 1/2, and 5/6 all map to the center of the target square [0,1]×[0,i].
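The fixed-point trick can be sketched in Python (a hypothetical re-implementation following the structure of the Mathematica definition, not Gosper's code). The provisional cache entry resolves a self-referential call: if the recursion revisits the same t in a different affine frame (s1, s0), both frames must name the same base value w, so a0 + a1*w = s0 + s1*w, giving w = (a0 - s0)/(s1 - a1).

```python
from fractions import Fraction

cache = {}

def hilbert_aux(t, a1, a0):
    hit = cache.get(t)
    if hit is not None:
        return hit(a1, a0)
    # Provisional entry, consulted only if the recursion revisits t:
    # solve a0 + a1*w == s0 + s1*w for the base value w.
    cache[t] = lambda s1, s0: (a0 - s0) / (s1 - a1)
    t4 = 4 * t
    n = int(t4)          # floor, since t lies in [0, 1]
    t4 -= n
    I = 1j
    if n == 0:
        val = I * (1 - hilbert_aux(1 - t4, -I * a1 / 2, a0 + a1 * I / 2)) / 2
    elif n == 1:
        val = (I + hilbert_aux(t4, a1 / 2, a0 + a1 * I / 2)) / 2
    elif n == 2:
        val = (1 + I + hilbert_aux(t4, a1 / 2, a0 + a1 * (1 + I) / 2)) / 2
    elif n == 3:
        val = 1 + I * hilbert_aux(1 - t4, a1 * I / 2, a0 + a1) / 2
    else:                # n == 4, i.e. t == 1
        val = 1
    cache[t] = lambda b1, b0: val   # replace the provisional entry
    return val

def hilbert(t):
    cache.clear()        # fresh cache per point (see the timings below)
    return hilbert_aux(Fraction(t), 1, 0)

# Reproduces Out[280]: 1/6, 1/2, and 5/6 all map to the center.
values = [hilbert(Fraction(k, 6)) for k in range(7)]
```

Exact rationals (`Fraction`) are used for t so that the recursion lands on exactly the same key when it revisits a point; the values themselves are ordinary complex numbers.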
So we can draw an "nth order Hilbert curve" as a polygon joining the centers of the 4^n subsquares of [0,1]×[0,i], whose inverse images are {c, c+1, c+2, ..., c+4^n-1}/4^n, for c = 1/6, 1/2, or 5/6. (3^(4^n) choices.) For a pretty but traditional picture (choosing c=1/2), try
Timing[Graphics[ Polygon[{Re[#], Im[#]} & /@ Join[{1, 0}, Hilbert /@ Range[4^-3/2, 1, 4^-3]]]]]
{0.027261,...}
Immediately rerunning this gave a timing of 0.004007, thanks to memoizing. This choice c=1/2 restricts the calculation to dyadic rationals, permitting a much simpler, non-memoizing, non-fixed-pointing Hilbert with termination conditions just for 0 and 1. But what's not to like? Memoizing is working fine. (Digression, for a non-traditional, non-dyadic-rational plot that really needs memoizing, try, e.g.,
Graphics[Polygon[{Re[#],Im[#]}&/@Append[Hilbert/@Range[4^-6,1,5/2^11],1]]])
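The simpler non-memoizing dyadic variant alluded to above might look like this in Python (a hypothetical sketch, not Gosper's code): for dyadic t the recursion always bottoms out at 0 or 1 after stripping two bits per level, so there is no fixed point to solve and no cache needed.

```python
from fractions import Fraction

def hilbert_dyadic(t):
    # Terminates for dyadic rational t in [0, 1]: each level strips
    # two bits, so the recursion reaches 0 or 1 in finitely many steps.
    if t == 0:
        return 0
    if t == 1:
        return 1
    t4 = 4 * t
    n = int(t4)          # floor for t in [0, 1]
    t4 -= n
    I = 1j
    if n == 0:
        return I * (1 - hilbert_dyadic(1 - t4)) / 2
    if n == 1:
        return (I + hilbert_dyadic(t4)) / 2
    if n == 2:
        return (1 + I + hilbert_dyadic(t4)) / 2
    if n == 3:
        return 1 + I * hilbert_dyadic(1 - t4) / 2
    return 1             # n == 4, i.e. t == 1
```

For example, hilbert_dyadic(Fraction(1, 2)) gives the center 1/2 + i/2, in agreement with the memoizing version.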
But when Neil tried an 8th order dyadic curve, it would not finish. Even after supper. Clearing and redefining Hilbert, a mere 6th order curve took

In[25]:= Timing[Do[Hilbert[k], {k, 4^-6/2, 1, 4^-6}]]
Out[25]= {106.747152, Null}
That's the same number of points (4^6) as one quarter of a 7th order:
In[27]:= Timing[Do[Hilbert[k], {k, 4^-7/2, 1/4, 4^-7}]]
Out[27]= {336.662000, Null}
which takes > thrice as long! Since Neil only needed the c=1/2 dyadic case, he disgustedly wrote a finite state machine (two bits of t in, one bit each of x and y out). But let us grimly proceed. The 2nd quadrant of 7th order,
In[32]:= Timing[Do[Hilbert[k], {k, 1/4 - 4^-7/2, 1/2, 4^-7}]]
Out[32]= {394.267917, Null}
is even slower, but should have been faster due to cached values from the first quadrant!
Try something drastic--clear the cache and redefine the whole function for every point:
In[33]:= Timing[
 Do[Clear[Hilbert];
  Hilbert[t_, a1_: 1, a0_: 0] :=
   Hilbert[t, b1_: 1, b0_: 0] =
    (Hilbert[t, s1_: 1, s0_: 0] = (a0 - s0)/(s1 - a1);
     Module[{t4 = 4*t, n},
      n = Floor[t4]; t4 -= n;
      Switch[n,
       0, I*(1 - Hilbert[1 - t4, -I*a1/2, a0 + a1*I/2])/2,
       1, (I + Hilbert[t4, a1/2, a0 + a1*I/2])/2,
       2, (1 + I + Hilbert[t4, a1/2, a0 + a1*(1 + I)/2])/2,
       3, 1 + I*Hilbert[1 - t4, a1*I/2, a0 + a1]/2,
       4, 1]]);
  Hilbert[k], {k, 4^-7/2, 1, 4^-7}]]
Out[33]= {18.388266, Null} for all four quadrants!
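Under the same linear-scan assumption as above (a toy cost model, not a claim about Mathematica's internals), the win from clearing is expected: a persistent cache makes every miss scan all previously memoized entries, while clearing before each point bounds the scan by the recursion depth. A counting sketch in Python:

```python
def total_probes(n_points, depth, clear_each_time):
    """Key comparisons when each of n_points evaluations memoizes
    `depth` fresh entries into a linearly scanned rule list."""
    rules = 0        # entries currently in the cache
    probes = 0
    for _ in range(n_points):
        if clear_each_time:
            rules = 0
        for _ in range(depth):
            probes += rules   # a miss scans the whole cache...
            rules += 1        # ...then appends a new entry
    return probes

n, d = 4 ** 7, 7   # 7th-order curve: 4^7 points, recursion depth ~7
persistent = total_probes(n, d, clear_each_time=False)
cleared = total_probes(n, d, clear_each_time=True)
# The persistent cache does quadratically more work in total.
```

With these (assumed) parameters the persistent strategy performs thousands of times more comparisons than clear-per-point, which is the qualitative shape of the timings above.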
I can't believe WRI can't easily, dramatically remediate this. Neil made some damning running-time plots which I hope he will blog soon.
--rwg
(Saving and resuming this draft introduced many redundant line breaks. Apologies if GMail has undeleted them.)
_______________________________________________
math-fun mailing list
math-fun@mailman.xmission.com
http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
This is an example of "extended cognition" :-) Most of my memory now resides in my laptop.

Cris

On May 4, 2013, at 10:43 PM, James Propp wrote:
The technician I spoke to insisted that before I could use the TV, I first needed to "memorize all the channels" that were broadcasting in my area, of which there were a dozen. I was incredulous and indignant ("Why should I have to memorize them? Shouldn't it be my TV's job to know what they are?") until it became clear that what she meant by "memorize" was program the channels into the TV.
I was incredibly blown away by Gosper's weird use of memoization. I've never seen anything like it before. I coded an implementation in Lisp after spending a while trying to grok the Mathematica code. This code is not optimized whatsoever; not even an optimization declaration! Here are my benchmarks, assuming my code is correct. :) This is being run on a mangy MacBook Air (not a Pro!). I include two separate numbers: the first is the time when the cache is cleared before the benchmark; the second, denoted NO-CLEAR, uses a persistent cache across all benchmarks. (So I actually ran 8 benchmarks in 2 runs.)

FIRST EXAMPLE: 0 to 1 by 1/6

(time (mapcar #'hilbert (range 0 1 1/6)))
Timing the evaluation of (MAPCAR (FUNCTION HILBERT) (RANGE 0 1 1/6))
User time = 0.000 (NO-CLEAR: 0.00)
System time = 0.000
Elapsed time = 0.000
Allocation = 8736 bytes
0 Page faults
(0 #C(1/2 1/2) #C(0 1) #C(1/2 1/2) #C(1 1) #C(1/2 1/2) 1)

SECOND EXAMPLE: 4^-6/2 to 1 by 4^-6

(time (bench 6 1))
Timing the evaluation of (BENCH 6 1)
User time = 0.068 (NO-CLEAR: 0.013)
System time = 0.000
Elapsed time = 0.068
Allocation = 18878392 bytes
5 Page faults
4096

THIRD EXAMPLE: 4^-7/2 to 1/4 by 4^-7

(time (bench 7 1/4))
Timing the evaluation of (BENCH 7 1/4)
User time = 0.075 (NO-CLEAR: 0.014)
System time = 0.000
Elapsed time = 0.077
Allocation = 21321040 bytes
4 Page faults
4096

FOURTH EXAMPLE: 1/4 - 4^-7/2 to 1/2 by 4^-7

(time (bench2 7))
Timing the evaluation of (BENCH2 7)
User time = 0.086 (NO-CLEAR: 0.009)
System time = 0.000
Elapsed time = 0.087
Allocation = 20872456 bytes
6 Page faults
4097

Much better than Mathematica. :)

Cheers,
Robert Smith

You're free to do as you wish with the code pasted below. I also keep a permanent record of it in the following repository, which has the benefit of being updated:
https://bitbucket.org/tarballs_are_good/lisp-random/src/f2d5fa6dbb366f720ed7...

--- SOURCE CODE ---

(defconstant I #C(0 1))
(defconstant 1+I #C(1 1))

(declaim (inline half))
(defun half (x) (/ x 2))

(defparameter *cache* (make-hash-table :test 'equalp))

(defun clear-cache ()
  (setf *cache* (make-hash-table :test 'equalp)))

(defun hilbert-aux (tt a1 a0)
  (labels ((hilbert-core (n t4)
             (cond ((= n 0) (* I (half (- 1 (hilbert-aux (- 1 t4)
                                                         (- (* I (half a1)))
                                                         (+ a0 (half (* a1 I))))))))
                   ((= n 1) (half (+ I (hilbert-aux t4
                                                    (half a1)
                                                    (+ a0 (half (* a1 I)))))))
                   ((= n 2) (half (+ 1+I (hilbert-aux t4
                                                      (half a1)
                                                      (+ a0 (half (* a1 1+I)))))))
                   ((= n 3) (1+ (half (* I (hilbert-aux (- 1 t4)
                                                        (half (* a1 I))
                                                        (+ a0 a1))))))
                   ((= n 4) 1))))
    (let ((res (gethash tt *cache*)))
      (if res
          (funcall res a1 a0)
          (progn
            (setf (gethash tt *cache*)
                  (lambda (s1 s0) (/ (- a0 s0) (- s1 a1))))
            (let* ((t4 (* 4 tt))
                   (n (floor t4))
                   (t4-n (- t4 n))
                   (new (hilbert-core n t4-n)))
              (setf (gethash tt *cache*)
                    (lambda (b1 b0) (declare (ignore b1 b0)) new))
              new))))))

(defun hilbert (x)
  (clear-cache)
  (hilbert-aux x 1 0))

--- BENCHMARK CODE ---

(defun range (start end step)
  (loop :for x :from start :to end :by step :collect x))

(defun bench (n &optional (upper 1))
  (loop :with bound := (/ (expt 4 n))
        :for k :from (half bound) :to upper :by bound
        :count (hilbert k)))

(defun bench2 (n &optional (upper 1/2))
  (loop :with bound := (/ (expt 4 n))
        :for k :from (- 1/4 (half bound)) :to upper :by bound
        :count (hilbert k)))

On Sat, May 4, 2013 at 4:54 PM, Bill Gosper <billgosper@gmail.com> wrote:
participants (4)
- Bill Gosper
- Cris Moore
- James Propp
- quad