* Fred lunnon <fred.lunnon@gmail.com> [May 20. 2006 08:46]:
[...]
It's a matrix variation of an old trick assembly-coders used in the days before hardware division: starting from an approximation b to 1/a, iterate b := 2 b - a b^2. In the matrix context this becomes simply B := 2 B - B A B. It's easy to show that if I - A B = D, then for the next iterate B' we have I - A B' = D^2;
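A minimal NumPy sketch of the iteration (the test matrix, the starting guess B0 = I for A near the identity, and the function name are my own illustration, not from the post):

```python
import numpy as np

def newton_inverse(A, B0, iters=10):
    """Iterate B := 2B - B A B; the residual D = I - A B is squared each step."""
    B = B0
    for _ in range(iters):
        B = 2 * B - B @ A @ B
    return B

# For A close to the identity, B0 = I already has residual spectral radius < 1.
rng = np.random.default_rng(0)
A = np.eye(4) + 0.1 * rng.standard_normal((4, 4))
B = newton_inverse(A, np.eye(4))
print(np.allclose(A @ B, np.eye(4)))  # True
```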
Write the step as B' := B( 1 + (1-AB) ) (second order); with Y := 1 - AB (where Y is small), the n-th order iteration is B' := B(1 + Y + Y^2 + ... + Y^(n-1)). At least with the scalar iteration, never expand the product, as you lose the small terms Y and thereby control of convergence. The iteration is a special case (a=1) of the division-free iteration for the inverse a-th root (section 15.3 "Iterations for the inverse a-th root", pp.408ff of the fxtbook gives the general form).
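A sketch of one n-th order step, accumulating 1 + Y + ... + Y^(n-1) Horner-style so the small factor Y is never expanded away (NumPy; matrix, start, and names are my own illustration):

```python
import numpy as np

def high_order_step(A, B, order=3):
    """One step B := B(1 + Y + ... + Y^(order-1)) with Y = I - A B,
    accumulated as 1 + Y(1 + Y(...)) so the small factor Y stays explicit."""
    I = np.eye(A.shape[0])
    Y = I - A @ B
    S = I + Y                   # order 2: 1 + Y
    for _ in range(order - 2):
        S = I + Y @ S           # Horner: 1 + Y(1 + Y(...))
    return B @ S

rng = np.random.default_rng(0)
A = np.eye(4) + 0.1 * rng.standard_normal((4, 4))
B = np.eye(4)
for _ in range(5):              # third order: the residual is cubed each step
    B = high_order_step(A, B, order=3)
print(np.allclose(A @ B, np.eye(4)))  # True
```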
so provided the spectral radius (maximum eigenvalue modulus) of D is d < 1, the algorithm converges and is second-order. If A is rectangular or singular, it converges instead to the Moore-Penrose pseudo-inverse.
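For the rectangular case, a NumPy check (the scaling of the start B0 = A^T/sigma_max^2 is my own assumption for making the residual a contraction; it is not from the post):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))          # rectangular, full column rank
B = A.T / np.linalg.norm(A, 2) ** 2      # B0 = A^T / sigma_max^2
for _ in range(40):
    B = 2 * B - B @ A @ B                # same iteration as in the square case
print(np.allclose(B, np.linalg.pinv(A)))  # True
```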
This is dead simple to program, and fast --- 2 iterations suffice for the application. Admittedly it is still slightly slower (maybe 25%) than solving the equations directly;
Using the third order iteration B(1+Y+Y^2) will likely give a speedup. Goldschmidt's algorithm (fxtbook sect.15.6, pp.416ff, especially p.418) may offer a way to parallelize the computations.
though if matrix multiplication were hardwired --- as already in some graphics accelerators --- it might well be faster. Yet I don't recall ever seeing it mentioned in print --- can anybody out there enlighten me?
search for "Nicholas J. Higham"  http://www.ma.man.ac.uk/~higham/
Some papers of interest are

\jjbibitem{apprmatlog}{Sheung Hun Cheng, Nicholas J.\ Higham, Charles S.\ Kenney, Alan J.\ Laub:
  \jjbibtitle{Approximating the logarithm of a matrix to specified accuracy},
  SIAM Journal on Matrix Analysis and Applications, vol.22, no.4, pp.1112-1125, 2001.}
% http://www.ee.ucla.edu/faculty/bios/laub.htm

\jjbibitem{middleage}{Sheung Hun Cheng, Nicholas J.\ Higham, Charles S.\ Kenney, Alan J.\ Laub:
  \jjbibtitle{Return to the Middle Ages: A Half-Angle Iteration for the Logarithm of a Unitary Matrix}, .}

\jjbibitem{matrixsign}{Nicholas J.\ Higham:
  \jjbibtitle{The Matrix Sign Decomposition and its Relation to the Polar Decomposition},
  Linear Algebra and Appl., 212/213, pp.3-20, 1994.
  Online at \url{http://citeseer.ist.psu.edu/higham94matrix.html}.}

\jjbibitem{matrixsqrt}{Nicholas J.\ Higham:
  \jjbibtitle{Stable Iterations for the Matrix Square Root},
  Numerical Algorithms, vol.15, no.2, pp.227-242, 1997.
  Online at \url{http://www.ma.man.ac.uk/~higham/}.}

\jjbibitem{matrixlog}{Nicholas J.\ Higham:
  \jjbibtitle{Evaluating Pad\'e Approximants of the Matrix Logarithm},
  SIAM J.\ Matrix Anal.\ Appl., vol.22, no.4, pp.1126-1135, 2001.
  % Numerical Analysis Report no.358, Manchester Center for Computational Mathematics, England, March-2000.
  Online at \url{http://www.ma.man.ac.uk/~higham/}.}

\jjbibitem{matrixcosine}{N.\ J.\ Higham, M.\ I.\ Smith:
  \jjbibtitle{Computing the Matrix Cosine},
  Numerical Analysis Report no.411, University of Manchester, September-2002.}
% http://www.ma.man.ac.uk/MCCM/
% http://www.ma.man.ac.uk/~nareports/

\jjbibitem{matrixpolsign}{Nicholas J.\ Higham, D.\ Steven Mackey, Niloufer Mackey, Fran\c{c}oise Tisseur:
  \jjbibtitle{Computing the polar decomposition and the matrix sign decomposition in matrix groups},
  Numerical Analysis Report no.426, Manchester Center for Computational Mathematics, England, April-2003.
  Revised November-2003.}

CiteSeer should also be useful.
To use it for matrix inversion in general, we'd need to construct an initial approximation B for which d < 1. This unfortunately looks nontrivial. [In the original real-number context, it was worked around using the fact that the binary representation gives an easy approximate base-2 logarithm; and for reduced a ~ 1 we can just choose b = 1.]
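One general-purpose start that I believe is known from the literature (an assumption on my part, not something in the thread) is B0 = A^T / (||A||_1 ||A||_inf): the eigenvalues of A B0 are sigma_i^2/(||A||_1 ||A||_inf), which lie in (0, 1] since sigma_max^2 <= ||A||_1 ||A||_inf, so d < 1 whenever A is nonsingular. A quick NumPy check:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))   # generic, no special structure assumed
B = A.T / (np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf))

# Residual spectral radius d should be below 1 for nonsingular A.
d = np.abs(np.linalg.eigvals(np.eye(4) - A @ B)).max()
print(d < 1)                      # True

for _ in range(40):
    B = 2 * B - B @ A @ B
print(np.allclose(A @ B, np.eye(4)))  # True
```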
The last section, "More on Iterative Improvement", of chapter 2.5 of "Numerical Recipes in C" (2nd ed.), pp.57-58, has some remarks. Note that the book has issues, especially with the code! Found nothing about the method in Golub, Van Loan: "Matrix Computations".