[math-fun] Gauss-Seidel considered a pile of wombat's do's
I've been looking at animating the motion of some (rather nonstandard) robots: this involves interpreting the kinematic constraints as a bunch of simultaneous polynomial equations in several (7) variables with nullity 1, then tracing the motion path via continuous solution of the equations as a function of the time parameter. If possible, I want to solve these in real time --- i.e. fast enough to keep pace with the graphical display (say 30 frames per second).

The classical Adams-Bashforth-Moulton (predictor-corrector) method for numerical solution of a differential equation seems a suitable vehicle for the task: the third order version costs only 2 iterations for acceptable accuracy. [For some reason I had thought first of Runge-Kutta-Merson, which however requires the differentials to be available as an explicit function of time.]

To cast the motion equations as an ODE system, they must be differentiated and the resulting linear equations for the derivatives solved --- in practice numerically, at each step. Now, we already have to hand an approximate solution to these linear equations --- the derivatives at the previous step --- so an iterative algorithm looks worth trying. The textbook method is Gauss-Seidel, which (with luck, and a diagonally dominant matrix) converges geometrically. I have to admit I was already prejudiced; and after spending an evening establishing how cumbersome and ineffective this idea proves to be [at any rate, in this application], in desperation I cooked up a home-grown alternative.

It's a matrix variation of an old trick assembly-coders used in the days before hardware division: starting from an approximation b to 1/a, iterate b := 2 b - a b^2. In the matrix context this becomes simply B := 2 B - B A B. It's easy to show that if I - A B = D, then for the next iterate B' we have I - A B' = D^2; so provided the spectral radius (maximum eigenvalue modulus) of D is d < 1, the algorithm converges and is second-order. If A is rectangular or singular, it converges instead to the Moore-Penrose pseudo-inverse.

This is dead simple to program, and fast --- 2 iterations suffice for the application. Admittedly it is still slightly slower (maybe 25%) than solving the equations directly; though if matrix multiplication were hardwired --- as already in some graphics accelerators --- it might well be faster. Yet I don't recall ever seeing it mentioned in print --- can anybody out there enlighten me?

To use it for matrix inversion in general, we'd need to construct an initial approximation B for which d < 1. This unfortunately looks nontrivial. [In the original real number context, it was worked around using the fact that binary representation gives an easy approx. base 2 log; and for reduced a ~ 1 we can just choose b = 1.]

Fred Lunnon
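As a concrete illustration of the update B := 2 B - B A B described above, here is a minimal numpy sketch; the function name and the toy warm start are my own additions, and in the robot application the previous frame's solution would play the role of the initial B.

import numpy as np

def refine_inverse(A, B, sweeps=2):
    # One sweep performs B := 2 B - B A B, which squares the residual
    # D = I - A B; two sweeps are enough when the starting B
    # (e.g. last frame's solution) is already close.
    for _ in range(sweeps):
        B = 2.0 * B - B @ (A @ B)
    return B

# Toy check: perturb an exact inverse and refine it.
rng = np.random.default_rng(0)
A = rng.standard_normal((7, 7)) + 7.0 * np.eye(7)
B0 = np.linalg.inv(A) + 1e-3 * rng.standard_normal((7, 7))
B2 = refine_inverse(A, B0)
print(np.linalg.norm(np.eye(7) - A @ B2))   # tiny residual after 2 sweeps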
* Fred lunnon <fred.lunnon@gmail.com> [May 20. 2006 08:46]:
[...]
It's a matrix variation of an old trick assembly-coders used in the days before hardware division: starting from an approximation b to 1/a, iterate b := 2 b - a b^2. In the matrix context this becomes simply B := 2 B - B A B. It's easy to show that if I - A B = D, then for the next iterate B' we have I - A B' = D^2;
Write B := B( 1 + (1-AB) ) (second order); rewrite as B(1+Y), where Y is small. The n-th order iteration is then B(1+Y+Y^2+...+Y^(n-1)). At least with the scalar iteration, never expand the product: you lose the small term Y, and with it control of convergence. The iteration is a special case (a=1) of the division-less iteration for the inverse a-th root (section 15.3 "Iterations for the inverse a-th root", pp.408ff of the fxtbook gives the general form).
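For what it's worth, a small numpy sketch of the n-th order step in the factored form above, evaluating the polynomial in Y by Horner's rule so that the small residual Y is never expanded away (the function name and the choice of Horner evaluation are mine):

import numpy as np

def inverse_step(A, B, order=3):
    # One step B := B (1 + Y + Y^2 + ... + Y^(order-1)), with Y = 1 - A B.
    # Horner's rule keeps Y explicit; the residual after the step
    # is 1 - A B' = Y^order.
    n = A.shape[0]
    I = np.eye(n)
    Y = I - A @ B
    P = I.copy()
    for _ in range(order - 1):
        P = I + Y @ P          # builds 1 + Y + ... + Y^(order-1)
    return B @ P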
so provided the spectral radius (maximum eigenvalue modulus) of D is d < 1, the algorithm converges and is second-order. If A is rectangular or singular, it converges instead to the Moore-Penrose pseudo-inverse.
This is dead simple to program, and fast --- 2 iterations suffice for the application. Admittedly it is still slightly slower (maybe 25%) than solving the equations directly;
Using the third order iteration B(1+Y+Y^2) will likely give a speedup. Goldschmidt's algorithm (fxtbook sect.15.6, pp.416ff, especially p.418) may offer a way to parallelize the computations.
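I am not sure exactly how the fxtbook formulates it, but here is one way a Goldschmidt-style scheme might be transcribed to matrices for solving A X = C; the two products per step are independent, which is where the scope for parallel hardware comes in. The sketch assumes A has already been scaled (or warm-started) so that the spectral radius of I - A is below 1.

import numpy as np

def goldschmidt_solve(A, C, steps=6):
    # Maintain N, D with D^{-1} N invariant; D -> I quadratically,
    # so N -> A^{-1} C.  Requires spectral radius of (I - A) < 1.
    n = A.shape[0]
    I = np.eye(n)
    N, D = C.astype(float), A.astype(float)
    for _ in range(steps):
        F = 2.0 * I - D
        N, D = F @ N, F @ D    # two independent products per step
    return N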
though if matrix multiplication were hardwired --- as already in some graphics accelerators --- it might well be faster. Yet I don't recall ever seeing it mentioned in print --- can anybody out there enlighten me?
search for "Nicholas J. Higham" http://www.ma.man.ac.uk/~higham/ Some papers of interest are \jjbibitem{apprmatlog}{Sheung Hun Cheng, Nicholas J.\ Higham, Charles S.\ Kenney, Alan J.\ Laub: \jjbibtitle{Approximating the logarithm of a matrix to specified accuracy}, SIAM Journal on Matrix Analysis and Applications, vol.22, no.4, pp.1112-1125, 2001.} % http://www.ee.ucla.edu/faculty/bios/laub.htm \jjbibitem{middleage}{Sheung Hun Cheng, Nicholas J.\ Higham, Charles S.\ Kenney, Alan J.\ Laub: \jjbibtitle{Return to the Middle Ages: A Half-Angle Iteration for the Logarithm of a Unitary Matrix}, .} \jjbibitem{matrixsign}{Nicholas J.\ Higham: \jjbibtitle{The Matrix Sign Decomposition and its Relation to the Polar Decomposition}, Linear Algebra and Appl., 212/213, pp.3-20, 1994. Online at \url{http://citeseer.ist.psu.edu/higham94matrix.html}.} \jjbibitem{matrixsqrt}{Nicholas J.\ Higham: \jjbibtitle{Stable Iterations for the Matrix Square Root}, Numerical Algorithms, vol.15, no.2, pp.227-242, 1997. Online at \url{http://www.ma.man.ac.uk/~higham/}.} \jjbibitem{matrixlog}{Nicholas J.\ Higham: \jjbibtitle{Evaluating Pad\'e Approximants of the Matrix Logarithm}, SIAM J.\ Matrix Anal.\ Appl.\, vol.22, no.4, pp.1126-1135, 2001. %Numerical Analysis Report no.358, Manchester Center for Computational Mathematics, England, March-2000.} Online at \url{http://www.ma.man.ac.uk/~higham/}.} \jjbibitem{matrixcosine}{N.\ J.\ Higham, M.\ I.\ Smith: \jjbibtitle{Computing the Matrix Cosine}, Numerical Analysis Report no.411, University of Manchester, September-2002.} % http://www.ma.man.ac.uk/MCCM/ % http://www.ma.man.ac.uk/~nareports/ \jjbibitem{matrixpolsign}{Nicholas J.\ Higham, D.\ Steven Mackey, Niloufer Mackey, Fran\c{c}oise Tisseur: \jjbibtitle{Computing the polar decomposition and the matrix sign decomposition in matrix groups}, Numerical Analysis Report no.426, Manchester Center for Computational Mathematics, England, April-2003. Revised November-2003.} CiteSeer should also be useful
To use it for matrix inversion in general, we'd need to construct an initial approximation B for which d < 1. This unfortunately looks nontrivial. [In the original real number context, it was worked around using the fact that binary representation gives an easy approx. base 2 log; and for reduced a ~ 1 we can just choose b = 1.]
The last section, "More on Iterative Improvement", of section 2.5 of "Numerical Recipes in C" (2nd ed.), pp.57-58, has some remarks. Note that the book has issues, especially with the code! I found nothing about the method in Golub, Van Loan: "Matrix Computations".
On 5/20/06, Joerg Arndt <arndt@jjj.de> wrote:
... Write B := B( 1 + (1-AB) ), (second order) rewrite as B(1+Y) (where Y is small), n-th order iteration is the B(1+Y+Y^2+...+Y^(n-1))
At least with the scalar iteration, never expand as you lose the small terms Y and control of convergence.
The iterations is a special case (a=1) of the division-less iteration for the inverse a-th root. (section 15.3 "Iterations for the inverse a-th root", pp.408ff of the fxtbook gives the general form).
Using the third order iteration B(1+Y+Y^2) will likely give a speedup. ...
Um... But of course, fourth-order would be no faster. Nice that the binomial theorem still works here!
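To spell out the bookkeeping behind that remark (my own count, charging one matrix product for forming Y = I - AB):

\begin{aligned}
\text{two 2nd-order steps:} &\quad Y,\ B(I+Y) \text{ each step} &&= 4 \text{ products}, &&\text{residual } D^{4},\\
\text{one 4th-order step:} &\quad Y,\ Y^{2},\ (I+Y)(I+Y^{2}),\ B(\cdot) &&= 4 \text{ products}, &&\text{residual } D^{4},\\
\text{one 3rd-order step:} &\quad Y,\ Y^{2},\ B(I+Y+Y^{2}) &&= 3 \text{ products}, &&\text{residual } D^{3}.
\end{aligned}

So fourth order exactly reproduces two second-order steps, while third order squeezes out a little more: an error exponent of 3^(1/3) ~ 1.44 per product against 2^(1/2) ~ 1.41.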
search for "Nicholas J. Higham" http://www.ma.man.ac.uk/~higham/
Some papers of interest are
\jjbibitem{apprmatlog}{Sheung Hun Cheng, Nicholas J.\ Higham, Charles S.\ Kenney, Alan J.\ Laub: \jjbibtitle{Approximating the logarithm of a matrix to specified accuracy}, SIAM Journal on Matrix Analysis and Applications, vol.22, no.4, pp.1112-1125, 2001.} % http://www.ee.ucla.edu/faculty/bios/laub.htm ...
Thanks for the references --- lots of interesting stuff there! Shame that they all posit that matrix inversion is on tap, though.

By the way, I hadn't come across the "matrix sign" function before. The iteration given for sign(A) is B := A; B := (B + B^{-1})/2, which raises a couple of questions in my mind. Firstly, is it not possible to involve A in the iteration, and so avoid the computation drifting as a result of rounding error or ill-conditioning? Secondly, as it stands the iteration is inapplicable to rectangular matrices [or singular matrices, besides failing for imaginary eigenvalues].

Suppose instead we modify it to the more amenable B := A; B := (B + B^{+T})/2 (average B with its transposed pseudo-inverse). For real nonsingular square A, the result appears to converge to an orthogonal matrix B such that A = B C with C symmetric: the "polar decomposition" of A. Presumably this must be well-known; but oddly it doesn't seem to be mentioned in Higham et al ...
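A quick numpy sketch of the modified iteration, averaging with the transposed pseudo-inverse as proposed above (the function name and the toy check are mine; for nonsingular square input the pseudo-inverse is just the inverse, so this reduces to the classical Newton iteration for the polar decomposition):

import numpy as np

def polar_orthogonal_factor(A, iters=20, tol=1e-12):
    # Iterate B := (B + pinv(B)^T) / 2 starting from B = A.
    # For real nonsingular square A this converges to the orthogonal
    # factor B of the polar decomposition A = B C, with C symmetric.
    B = A.astype(float).copy()
    for _ in range(iters):
        B_next = 0.5 * (B + np.linalg.pinv(B).T)
        if np.linalg.norm(B_next - B) <= tol * np.linalg.norm(B):
            return B_next
        B = B_next
    return B

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
Q = polar_orthogonal_factor(A)
C = Q.T @ A                        # A = Q C, with C symmetric
print(np.linalg.norm(Q.T @ Q - np.eye(5)), np.linalg.norm(C - C.T))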
To use it for matrix inversion in general, we'd need to construct an initial approximation B for which d < 1. This unfortunately looks nontrivial. [In the original real number context, it was worked around using the fact that binary representation gives an easy approx. base 2 log; and for reduced a ~ 1 we can just choose b = 1.]
It occurred to me afterwards that what I'm really asking for here is an initial B such that A B is diagonally dominant --- exactly the same as required to make Gauss-Seidel applicable! Now I wonder which box I packed "Numerical Recipes" in ...

Thanks again,  Fred Lunnon
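For the record, one starting guess that is easy to verify does the job for nonsingular A (a standard choice from the literature, not something from the thread) is B0 = A^T / (||A||_1 ||A||_inf): then A B0 = A A^T / (||A||_1 ||A||_inf) is symmetric with eigenvalues sigma_i^2 / (||A||_1 ||A||_inf) in (0, 1], since ||A||_2^2 <= ||A||_1 ||A||_inf, so the spectral radius of D = I - A B0 is below 1.

import numpy as np

def initial_guess(A):
    # B0 = A^T / (||A||_1 * ||A||_inf); for nonsingular A the
    # eigenvalues of A B0 lie in (0, 1], so rho(I - A B0) < 1 and
    # the B := 2 B - B A B iteration converges (slowly at first
    # when A is ill-conditioned).
    scale = np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf)
    return A.T / scale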
On 5/20/06, Fred lunnon <fred.lunnon@gmail.com> wrote:
... Suppose instead we modify it to the more amenable B := A; B := (B + B^{+T})/2 (average B with its transposed pseudo-inverse). For real nonsingular square A, the result appears to converge to an orthogonal matrix B such that A = B C with C symmetric: the "polar decomposition" of A. Presumably this must be well-known; but oddly it doesn't seem to be mentioned in Higham et al ...
Ah yes, it is --- (3.6) on page 6 of JA's 3rd reference: \jjbibitem{matrixsign}{Nicholas J.\ Higham: \jjbibtitle{The Matrix Sign Decomposition and its Relation to the Polar Decomposition}, Linear Algebra and Appl., 212/213, pp.3-20, 1994. Online at \url{http://citeseer.ist.psu.edu/higham94matrix.html}.} WFL
* Fred lunnon <fred.lunnon@gmail.com> [May 21. 2006 13:57]:
On 5/20/06, Joerg Arndt <arndt@jjj.de> wrote: [...]
By the way, I hadn't come across the "matrix sign" function before. The iteration given for sign(A) is B := A; B := (B + B^{-1})/2 which raises a couple of questions in my mind. [...]
There are a number of connections between matrix sign, sqrt, polar decomposition, and re-orthogonalization. I wrote something down in the fxtbook: Section 15.5 "Some applications of the matrix square root", pp.412ff. I'd be extremely grateful for feedback on that section. Most important (IMHO): you do not need the inversion at all. Note that I cowardly omitted the "where/when will it converge" issue.

P.S.: the server www.jjj.de will be down for the next few hours.
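Not knowing exactly what Section 15.5 does, here is one well-known inversion-free route to the same re-orthogonalization goal: the Newton-Schulz form of the polar iteration, which uses matrix products only and converges provided the singular values of the start lie in (0, sqrt(3)), e.g. a rotation matrix polluted by a little round-off (the function name is mine):

import numpy as np

def reorthogonalize(Q, steps=3):
    # X := X (3 I - X^T X) / 2, matrix products only.
    # Converges quadratically to the orthogonal polar factor of Q,
    # i.e. the nearest orthogonal matrix, when the singular values
    # of Q lie in (0, sqrt(3)).
    n = Q.shape[1]
    I = np.eye(n)
    X = Q.astype(float).copy()
    for _ in range(steps):
        X = X @ (3.0 * I - X.T @ X) / 2.0
    return X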