http://pdp-10.trailing-edge.com/decuslib10-10/01/43,50516/basxct.mac.html ;VERSION 17E 2-OCT-74/NA ;VERSION 17D 4-MAY-73/KK ;VERSION 17C 2-JAN-73/KK ;VERSION 17B 25-JUL-72/KK ;VERSION 17A 10-FEB-1972/KK ;VERSION 17 15-OCT-1971/KK ;VERSION 16 5-APR-1971/KK ;VERSION 15 17-AUG-1970/KK ;VERSION 14 16-JUL-1970/AL/KK ;VERSION 13 15-SEP-1969 ... ;FLOATING POINT SINGLE PRECISION SQUARE ROOT FUNCTION ;THE SQUARE ROOT OF THE ABSOLUTE VALUE OF THE ARGUMENT IS ;CALCULATED. THE ARGUMENT IS WRITTEN IN THE FORM ; X= F*(2**2B) WHERE 0.LT.F.LT.1 ;SQRT(X) IS THEN CALCULATED AS (SQRT(X))*(2**B) ;SQRT(F) IS CALCULATED BY N LINEAR APPROXIMATION, THE NATURE ;OF WHICH DEPENDS ON WHETHER 1/4 .LT. F .LT. 1/2 OR 1/2 .LT. F .LT. 1, ;FOLLOWED BY TWO ITERATIONS OF NEWTON'S METHOD. SQRTB: MOVE T, N ;PICK UP THE ARGUMENT IN T JUMPL T,SQRMIN ;SQRT OF NEGATIVE NUMBER? JUMPE T,SQRT1 ;CHECK FOR ARGUMENT OF ZERO SQRTB0: ASHC T, -33 ;PUT EXPONENT IN T, FRACTION IN T1 SUBI T, 201 ;SUBTRACT 201 FROM EXPONENT ROT T, -1 ;CUT EXP IN HALF, SAVE ODD BIT HRRM T,EX1 ;SAVE FOR FUTURE SCALING OF ANS ;IN FSC N,. INSTRUCTION LSH T, -43 ;GET BIT SAVED BY PREVIOUS INST. ASH T1, -10 ;PUT FRACTION IN PROPER POSITION FSC T1, 177(T) ;PUT EXPONENT OF FRACT TO -1 OR 0 MOVEM T1, N ;SAVE IT. 1/4 .LT. F .LT. 1 FMP T1, SQCON1(T) ;LINEAR FIRST APPROX,DEPENDS ON FAD T1, SQCON2(T) ;WHETHER 1/4.LT.F.LT.1/2 OR 1/2.LT.F.LT.1. MOVE T, N ;START NEWTONS METHOD WITH FRAC FDV T, T1 ;CALCULATE X(0)/X(1) FAD T1, T ;X(1) + X(0)/X(1) FSC T1, -1 ;1/2(X(1) + X(0)/X(1)) FDV N, T1 ;X(0)/X(2) FADR N, T1 ;X(2) + X(0)/X(2) XCT EX1 SQRT1: POPJ P, ;EXIT SQCON1: 0.8125 ;CONSTANT, USED IF 1/4.LT.FRAC.LT.1/2 0.578125 ;CONSTANT, USED IF 1/2.LT.FRAC.LT.1 SQCON2: 0.302734 ;CONSTANT, USED IF 1/4.LT.FRAC.LT.1/2 0.421875 ;CONSTANT, USED IF 1/2.LT.FRAC.LT.1 SQRMIN: PUSH P,T ;SAVE ARG NFERR (54,0) PUSHJ P,INLMES ASCIZ / % SQRT of negative number/ PUSHJ P,GOSR3 ;PRINT LINE NUMBER POP P,T ;GET ARG MOVMS T JRST SQRTB0 ;USE ABSOLUTE VALUE At 04:52 PM 4/15/2011, rcs@xmission.com wrote:
I think I remember seeing this in the PDP6 library function for sqrt. I believe the shift was followed by adding (integer add) a magic constant, probably to adjust the range and repair the exponent offset. Then a few rounds of Newton, unwound.
(note to folks under 50) Before IEEE floating point came along, the individual computer makers had their own formats for floating point. Both the 7090 series and PDP6/10 used 36-bit words. The PDP6 format had 8 bits of exponent, offset by 128, and 27 bits of mantissa. A normalized number included the high-order 1 bit, which is dropped in IEEE format. I don't recall if that bit had value .5 or 1.0. 0.0 was simply 0, and negatives were the (integer) two's complement of the positives. This had the virtue that the ordering of (normalized) floating numbers was the same as the ordering of integers, so the compare instructions didn't need to come in two flavors.
I think I'll register a disagreement with the comment that IEEE accuracy requirements for sqrt (exactly correct rounding) effectively preclude doing it in software: After a few rounds of Newton, the answer should be within one bit, likely on the high side. You can check which of two consecutive floating values is correct by comparing V * (V-epsilon) versus X.
Rich
---- Quoting Bill Gosper <billgosper@gmail.com>:
hgb>When making an initial guess of sqrt(x) for a Newton iteration, where x is a positive floating point number, a good initial guess is the bits of the entire number (both exponent & mantissa) shifted right by 1.
This is the sort of thing that would have been done on the 7090 or the PDP-10. Does anyone here recall seeing this trick in the 1950's or 1960's ?
I checked, and it isn't in HAKMEM.
I've used this trick myself (at least on machines where moving numbers between the integer & floating point units isn't prohibitive). ------ rwg>It's really old, and can probably be found in the 704 and pdp-6 library sqrts. I think Knuth told me he found it as an undergrad, but that basketball youtube shows him using a biquinary machine (650?). --rwg
Henry, Even without a list of PDP-10 opcodes, this clearly doesn't fulfill your original question because it is far too many instructions. Also, it seems to address a data (constant) area differently based on the low bit of the (pre-shifted) exponent. The whole point of your original question is that everything is done by shifting the exponent and mantissa together as a monolithic object. You wrote:
When making an initial guess of sqrt(x) for a Newton iteration, where x is a positive floating point number, a good initial guess is the bits of the entire number (both exponent & mantissa) shifted right by 1. [...]
Given the "0x5f3759df" a.k.a. "Fast InvSqrt()" example, it makes sense to allow one constant to both correct the exponent offset and bias the range of the linear function (to bring its average closer to the parabola), but this code has 4 constants! Similar comments would apply to that other PDP examples too: way too many constants. Relevant parts with Fortran ".LT." changed to < for clarity:
http://pdp-10.trailing-edge.com/decuslib10-10/01/43,50516/basxct.mac.html ;VERSION 17E 2-OCT-74/NA [...] ;VERSION 13 15-SEP-1969 [...] ;SQRT(F) IS CALCULATED BY N LINEAR APPROXIMATION, THE NATURE ;OF WHICH DEPENDS ON WHETHER 1/4 < F < 1/2 OR 1/2 < F < 1, ;FOLLOWED BY TWO ITERATIONS OF NEWTON'S METHOD. [...] SQRTB0: ASHC T, -33 ;PUT EXPONENT IN T, FRACTION IN T1 SUBI T, 201 ;SUBTRACT 201 FROM EXPONENT ROT T, -1 ;CUT EXP IN HALF, SAVE ODD BIT HRRM T,EX1 ;SAVE FOR FUTURE SCALING OF ANS ;IN FSC N,. INSTRUCTION LSH T, -43 ;GET BIT SAVED BY PREVIOUS INST. ASH T1, -10 ;PUT FRACTION IN PROPER POSITION FSC T1, 177(T) ;PUT EXPONENT OF FRACT TO -1 OR 0 MOVEM T1, N ;SAVE IT. 1/4 < F < 1 FMP T1, SQCON1(T) ;LINEAR FIRST APPROX,DEPENDS ON FAD T1, SQCON2(T) ;WHETHER 1/4 < F < 1/2 OR 1/2 < F < 1. MOVE T, N ;START NEWTONS METHOD WITH FRAC FDV T, T1 ;CALCULATE X(0)/X(1) FAD T1, T ;X(1) + X(0)/X(1) FSC T1, -1 ;1/2(X(1) + X(0)/X(1)) FDV N, T1 ;X(0)/X(2) FADR N, T1 ;X(2) + X(0)/X(2) XCT EX1 SQRT1: POPJ P, ;EXIT SQCON1: 0.8125 ;CONSTANT, USED IF 1/4 < FRAC < 1/2 0.578125 ;CONSTANT, USED IF 1/2 < FRAC < 1 SQCON2: 0.302734 ;CONSTANT, USED IF 1/4 < FRAC < 1/2 0.421875 ;CONSTANT, USED IF 1/2 < FRAC < 1
-- Robert Munafo -- mrob.com Follow me at: fb.com/mrob27 - twitter.com/mrob_27 - mrob27.wordpress.com- youtube.com/user/mrob143 - rilybot.blogspot.com
One has to keep in mind the _context_ of these various codes. The fast invsqrt hack was intended to _approximate_ the invsqrt very quickly without worrying about the last few bits of precision. Most library routines have a higher goal -- attempting to squeeze out every bit of precision at the expense of some speed. This is why the sqrt/invsqrt from the _vision/graphics_ library at MIT or Stanford or CMU or Utah would be more interesting: these customers of code were more interested in raw speed than in the last few bits of precision. It is also important to keep in mind that the tradeoffs of different machines are different. In today's "RISC" architectures, most operations attempt to complete within one cycle, but they still "stall" on a load that misses in the cache. So on modern architectures, _table lookups_ can be extremely expensive, unless you have good reason to believe that the table remains in the cache. In the architectures of the 60's & 70's, there could be great differences between the execution times of different instructions. On some machines, even a simple _shift_ could take time proportional to the amount of the shift. On some machines, a multiply might take 30 times longer than an ADD or an AND. So without having a timing card handy when analyzing the code, it would be difficult to assess whether it is faster to shift an entire word, or to pick it apart. In particular, given the slow speed on some machines of FP operations, one would be willing to spend quite a number of fixed point operations to save one Newton iteration. The acid test would be whether someone _today_, given all of the accumulated knowledge, could program one of these machines of yesterday to produce a faster sqrt/invsqrt than was done in 1970 (say). At 04:21 AM 4/17/2011, Robert Munafo wrote:
Henry,
Even without a list of PDP-10 opcodes, this clearly doesn't fulfill your original question because it is far too many instructions. Also, it seems to address a data (constant) area differently based on the low bit of the (pre-shifted) exponent.
The whole point of your original question is that everything is done by shifting the exponent and mantissa together as a monolithic object. You wrote:
When making an initial guess of sqrt(x) for a Newton iteration, where x is a positive floating point number, a good initial guess is the bits of the entire number (both exponent & mantissa) shifted right by 1. [...]
Given the "0x5f3759df" a.k.a. "Fast InvSqrt()" example, it makes sense to allow one constant to both correct the exponent offset and bias the range of the linear function (to bring its average closer to the parabola), but this code has 4 constants!
Similar comments would apply to that other PDP examples too: way too many constants.
Relevant parts with Fortran ".LT." changed to < for clarity:
http://pdp-10.trailing-edge.com/decuslib10-10/01/43,50516/basxct.mac.html ;VERSION 17E 2-OCT-74/NA [...] ;VERSION 13 15-SEP-1969 [...] ;SQRT(F) IS CALCULATED BY N LINEAR APPROXIMATION, THE NATURE ;OF WHICH DEPENDS ON WHETHER 1/4 < F < 1/2 OR 1/2 < F < 1, ;FOLLOWED BY TWO ITERATIONS OF NEWTON'S METHOD. [...] SQRTB0: ASHC T, -33 ;PUT EXPONENT IN T, FRACTION IN T1 SUBI T, 201 ;SUBTRACT 201 FROM EXPONENT ROT T, -1 ;CUT EXP IN HALF, SAVE ODD BIT HRRM T,EX1 ;SAVE FOR FUTURE SCALING OF ANS ;IN FSC N,. INSTRUCTION LSH T, -43 ;GET BIT SAVED BY PREVIOUS INST. ASH T1, -10 ;PUT FRACTION IN PROPER POSITION FSC T1, 177(T) ;PUT EXPONENT OF FRACT TO -1 OR 0 MOVEM T1, N ;SAVE IT. 1/4 < F < 1 FMP T1, SQCON1(T) ;LINEAR FIRST APPROX,DEPENDS ON FAD T1, SQCON2(T) ;WHETHER 1/4 < F < 1/2 OR 1/2 < F < 1. MOVE T, N ;START NEWTONS METHOD WITH FRAC FDV T, T1 ;CALCULATE X(0)/X(1) FAD T1, T ;X(1) + X(0)/X(1) FSC T1, -1 ;1/2(X(1) + X(0)/X(1)) FDV N, T1 ;X(0)/X(2) FADR N, T1 ;X(2) + X(0)/X(2) XCT EX1 SQRT1: POPJ P, ;EXIT SQCON1: 0.8125 ;CONSTANT, USED IF 1/4 < FRAC < 1/2 0.578125 ;CONSTANT, USED IF 1/2 < FRAC < 1 SQCON2: 0.302734 ;CONSTANT, USED IF 1/4 < FRAC < 1/2 0.421875 ;CONSTANT, USED IF 1/2 < FRAC < 1
-- Robert Munafo -- mrob.com Follow me at: fb.com/mrob27 - twitter.com/mrob_27 - mrob27.wordpress.com- youtube.com/user/mrob143 - rilybot.blogspot.com
participants (2)
-
Henry Baker -
Robert Munafo