[math-fun] Lewis Carroll's determinant method and some (new?) uses
Lewis Carroll's determinant method ==== Warren D. Smith === March 2014 ====

http://www.maa.org/sites/default/files/pdf/Mathhorizons/pdfs/nov_2006_pp.12-...
http://mathtourist.blogspot.com/2007/03/lewis-carroll-and-his-telescoping.ht...

Lewis Carroll (Rev. Charles L. Dodgson) in 1866 gave the following method to evaluate det(M), where M is an nXn matrix. At each step, a kXk matrix is "condensed" to a (k-1)X(k-1) matrix. Finally a 1X1 matrix, i.e. a single number, is obtained, which is the desired determinant.

ALGORITHM:
We begin with M[0] = M.
Initial step: M[1] = the matrix of 2x2 determinants of adjacent entries in M[0]. Note M[1] is (n-1)X(n-1).
for(j = 2 .. n-1){
  M[j] = the matrix of 2x2 determinants of adjacent entries in M[j-1]. Note M[j] is (n-j)X(n-j). This is not yet the final form of M[j].
  for(a = 1 .. n-j){
    for(b = 1 .. n-j){
      divide entry (a,b) of M[j] by entry (a+1,b+1) of M[j-2];
    }
  }
  That yields the final form of M[j], the j-condensed version of M.
}
The sole entry of M[n-1] is now the desired determinant.
END OF ALGORITHM.

This process requires O(n^3) operations (each +, -, *, /) and fails if at some point it divides by zero. [However, by performing some random row operations before beginning, we can usually eliminate internal zeros so the process will not fail. Alternatively, you could replace all 0s by epsilons, continue on symbolically, and at the end take the limit epsilon-->0. Both approaches are a pain.] Of course we only need to store a constant number of the M[j] matrices, not all of them, hence the total storage needed is O(n^2) entries, not O(n^3).
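A minimal runnable sketch of the condensation in Python (an illustration, not code from any of the references; exact rational arithmetic via the fractions module keeps the divisions exact, and an interior zero surfaces as a ZeroDivisionError, the failure mode discussed above):

from fractions import Fraction

def dodgson_det(M):
    """Determinant of a square matrix by Carroll/Dodgson condensation."""
    n = len(M)
    # prev plays the role of M[j-2]; seeding it with all 1s makes the
    # first condensation step (which has no divisions) a uniform case.
    prev = [[Fraction(1)] * n for _ in range(n)]
    cur = [[Fraction(x) for x in row] for row in M]
    while len(cur) > 1:
        k = len(cur) - 1
        # 2x2 determinants of adjacent entries, each divided by the
        # corresponding interior entry of the matrix two steps back.
        nxt = [[(cur[a][b] * cur[a + 1][b + 1]
                 - cur[a][b + 1] * cur[a + 1][b]) / prev[a + 1][b + 1]
                for b in range(k)] for a in range(k)]
        prev, cur = cur, nxt
    return cur[0][0]

# Check on the 4x4 Vandermonde matrix with nodes 1,2,3,4, whose
# determinant is the product of (x_j - x_i) over i < j, namely 12.
A = [[x**j for j in range(4)] for x in (1, 2, 3, 4)]
print(dodgson_det(A))  # -> 12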
WHY SHOULD ANYBODY CARE?

1. I point out, a remark I have not seen before, that Carroll's condensation process preserves certain special matrix forms. If M is Toeplitz, Hankel, diagonal, or banded with bandwidth w, then so are all the condensed matrices. If the original matrix is all-integer, then so are all the condensed matrices. As a result, it seems to me that this yields an O(n^2)-step algorithm for computing the determinants of Toeplitz and Hankel matrices, an O(n^2 * w)-step algorithm for matrices with bandwidth w, and an O(n)-step algorithm for diagonal matrices. (A code sketch for the Toeplitz case appears after the summary below.) The Toeplitz result seems quite nice, although asymptotically faster methods are probably possible (?).

2. The RESULTANT of two polynomials is the determinant of the Sylvester matrix, http://en.wikipedia.org/wiki/Sylvester_matrix, which is made by gluing together two rectangular chunks of Toeplitz matrices. The Carroll j-condensed forms of Sylvester matrices also have Sylvester form, except for j extra rows inserted between the two Toeplitz rectangles... does this lead to a quadratic-time resultant algorithm? Unfortunately I am not seeing any obvious reason why this process can be made to run in only quadratic time for resultants.

3. David Bressoud and James Propp have remarked that Carroll's method is highly parallelizable. It also exhibits excellent memory locality. On n^2 processors, each condensation runs in O(1) time, so the whole determinant computation takes time O(n).

4. There are also other condensation processes similar to (but different from) Carroll's, including Chio's process (M.F. Chio 1853, http://mathworld.wolfram.com/ChioPivotalCondensation.html), and methods which take 3x3 rather than 2x2 determinants of adjacent entries, shrinking by 2 rather than 1 at each step. Chio's process seems better than Carroll's in that it is easier to avoid dividing by zero.

5. If you solve an nXn linear system using Cramer's rule, then each entry of the solution vector is obtained in O(n) time using n^2 processors, i.e. O(n^3) total work. The full solution vector would then naively require O(n^4) work. However, it is possible to share work and thus reduce the total back to O(n^3). This is described for a 3x3 version of Chio's condensation process in: Ken Habgood & Itamar Arel: A condensation-based application of Cramer's rule for solving large-scale linear systems, J. Discrete Algorithms 10 (January 2012) 98-109. http://web.eecs.utk.edu/~itamar/Papers/JDA2011.pdf Their work-sharing scheme is a bit of a pain, although they succeeded in programming and testing it. Its underlying idea is to regard the matrix as a binary tree (split by vertical cuts) and note that any two Cramer determinants share a lot of their computation trees, so by memoizing subtree results you avoid redoing those subtrees. There is a picture in their paper which makes it clearer how they do that. It seems to me a similar scheme would also work for Carroll's 2x2 process described here, although I have not carefully checked. In any case, it seems to me that the primary benefit of Habgood & Arel is one they did not mention (!!) -- that their approach is highly parallelizable. It will solve an nXn linear system in O(n) time with n^2 parallel processors. Habgood & Arel encountered some numerical problems. There is an analogue of "pivoting" which could be used to improve the numerics of the Chio-based determinant computation, but using it screws up the Habgood-Arel work-sharing scheme. But a different linear-system-solving method which also works in O(n) time with n^2 parallel processors is the conjugate gradient method. And that method is considerably simpler to program and uses only O(n) storage beyond the original matrix. Further, CG can be run less than all the way to completion and will still often compute good approximate answers.

6. That suggests seeking a task that condensation with work-sharing can do which CG cannot. How about computing the INVERSE of an nXn matrix, using its explicit formula in terms of the n^2 minors, each (n-1)X(n-1), of the matrix (as well as the determinant of the original matrix)? The naive implementation of this would compute the inverse in O(n^5) total work. However, if we can share enough work, this would be reduced to O(n^3) work in O(n) time with n^2 processors. (A new record?) I believe this can actually be done, using a treelike scheme resembling Habgood & Arel's, but based on a "quadtree" (both horizontal & vertical splitting) rather than a binary horizontal-splits-only tree.

7. Certain matrices whose (a,b) entries are given by formulas in terms of a and b yield condensed forms also given by formulas... which in some cases enables you to compute the determinant in closed form.

8. This also might be useful for thinking about determinants of "random" matrices.

SUMMARY: New(?) results are an O(n)-time n-processor parallel determinant algorithm for Toeplitz and Hankel matrices (but if there is a similar way to do resultants in this time bound, I am not seeing it), and an O(n)-time parallel algorithm on n^2 processors for matrix inversion.
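To make point 1 concrete, here is a sketch of the O(n^2)-work Toeplitz determinant (an illustration of the structure-preservation observation above: every condensed matrix is again Toeplitz, so it suffices to carry one value per diagonal offset, and each condensation step costs O(n)):

from fractions import Fraction

def toeplitz_det(col, row):
    """Determinant of the n x n Toeplitz matrix with first column col and
    first row row (col[0] == row[0]) via Carroll condensation.  Each
    condensed matrix is stored as one value per diagonal offset
    d = (row index - column index), so the total work is O(n^2).
    Interior zeros raise ZeroDivisionError, as in the dense version."""
    n = len(col)
    assert len(row) == n and col[0] == row[0]
    cur = {d: Fraction(col[d]) for d in range(n)}
    cur.update({-d: Fraction(row[d]) for d in range(1, n)})
    prev = {d: Fraction(1) for d in range(-(n - 1), n)}  # all-ones seed
    for m in range(n, 1, -1):  # m = order of the current condensed matrix
        # New value on diagonal d: the 2x2 minor formed from diagonals
        # d-1, d, d+1, divided by diagonal d of the matrix two steps back.
        nxt = {d: (cur[d] ** 2 - cur[d - 1] * cur[d + 1]) / prev[d]
               for d in range(-(m - 2), m - 1)}
        prev, cur = cur, nxt
    return cur[0]

print(toeplitz_det([4, 1, 2, 3], [4, 5, 6, 7]))  # -> 100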
* Warren D Smith <warren.wds@gmail.com> [Mar 10. 2014 07:22]:
[...]
For Toeplitz matrices, there should be FFT methods. For circulants generated by a vector v, we certainly have det = product of the elements of the Fourier transform of v.

The following should be useful: Victor Y. Pan: Structured matrices and polynomials: Unified Superfast Algorithms, Springer-Verlag, (2001).

Best, jj
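A quick numerical check of the circulant fact (an illustration; scipy.linalg.circulant builds the circulant with a given first column, and the DFT of that column gives the eigenvalues, whose product is the determinant):

import numpy as np
from scipy.linalg import circulant

v = np.array([3.0, 1.0, 4.0, 1.0, 5.0])
C = circulant(v)              # circulant matrix with first column v
eig = np.fft.fft(v)           # DFT of v = the eigenvalues of C
print(np.prod(eig).real)      # imaginary part is rounding noise
print(np.linalg.det(C))       # agrees up to floating-point error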
On 3/10/14, Joerg Arndt <arndt@jjj.de> wrote:
* Warren D Smith <warren.wds@gmail.com> [Mar 10. 2014 07:22]:
[...]
For Toeplitz matrices, there should be FFT methods.
Can you say a bit more about how this would work?

Hankel determinants (of Toeplitz matrices) can be evaluated via what Conway & Guy christened the "wall of numbers" in their "Book of Numbers". This algorithm copes with zeros in the table, making it applicable also to finite-field computations --- the |F_2 case is particularly simple.

Or at least it would cope, if it weren't for a couple of nasty misprints noticed by Jim Propp; for the correct version see https://cs.uwaterloo.ca/journals/JIS/VOL4/LUNNON/numbwall10.html

That paper also discusses parallel algorithms with time and space complexity of the same order n as the matrix, involving 1-D arrays of finite-state automata.

[ Incidentally, the proof given there is analytical, rather lengthy and delicate. It does at least possess the virtue of presumable correctness. Which is better than can be said for the argument appearing in an earlier publication by a pair of gentlemen who mustered ingenuity sufficient to recycle the preprint and claim attribution, but evidently remained in ignorance of its subsequent retraction by the original author --- who, under the circumstances, is maliciously content to remain anonymous. The result has since been buttressed by an elegant and more general algebraic argument, which (typically) remains unpublished. ]

Fred Lunnon
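For concreteness, the elementary recurrence behind the wall is the Desnanot-Jacobi identity specialized to Hankel matrices: writing H(k, n) for the k x k Hankel determinant det(s_{n+i+j}), 0 <= i,j < k, one gets

  H(k+1, n) * H(k-1, n+2)  =  H(k, n) * H(k, n+2)  -  H(k, n+1)^2 .

A naive sketch tabulating the wall this way follows (an illustration only: the whole point of the paper, crossing windows of zeros, is omitted here, so a zero divisor simply raises an exception):

from fractions import Fraction

def hankel_wall(s, kmax):
    """Rows H(0,.) .. H(kmax,.) of Hankel determinants of sequence s,
    via the Desnanot-Jacobi recurrence instead of an O(k^3)
    determinant evaluation per entry."""
    s = [Fraction(x) for x in s]
    wall = [[Fraction(1)] * len(s), list(s)]   # H(0,n) = 1, H(1,n) = s[n]
    for k in range(1, kmax):
        wall.append([(wall[k][n] * wall[k][n + 2] - wall[k][n + 1] ** 2)
                     / wall[k - 1][n + 2]
                     for n in range(len(wall[k]) - 2)])
    return wall

# Fibonacci satisfies a 2-term linear recurrence, so every 3x3 Hankel
# determinant vanishes; these are the "windows of zeros" the paper handles.
print(hankel_wall([1, 1, 2, 3, 5, 8, 13, 21], 3)[3])  # all zeros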
I should make it clear that Conway & Guy did indeed acknowledge me in their book. The meticulous if unimaginative adopters of my early draft of a dud proof (one of a monotonously increasing sequence of similar lemons) were by contrast francophones, whose identities deserve no further publicity.

On 3/10/14, Warren D Smith <warren.wds@gmail.com> wrote:
... --that paper is rather immensely long and annoying to read,...
Not half as long and annoying as it was to write! For some reason, this subject is replete with results which appear straightforward to establish, while their eventual proofs prove otherwise. A related example is Berlekamp's algorithm, of which I have yet to encounter a demonstration that I feel confident I fully understand.
but I suspect some of this "number wall" stuff must be equivalent to Carroll condensation applied to Toeplitz matrices, i.e. the latter was already known under other names. Incidentally there are still more names such as "lozenge algorithms" out there.
The elementary "number wall" recurrence goes back at least to Jacobi; I first encountered it in the form of Rutishauser's "QD" algorithm for matrix eigenvalues. The nontrivial challenge comes along when that demands division by zero, which must somehow be circumvented ... Fred Lunnon On 3/10/14, Fred Lunnon <fred.lunnon@gmail.com> wrote:
* Fred Lunnon <fred.lunnon@gmail.com> [Mar 10. 2014 19:46]:
On 3/10/14, Joerg Arndt <arndt@jjj.de> wrote:
* Warren D Smith <warren.wds@gmail.com> [Mar 10. 2014 07:22]:
[...]
For Toeplitz matrices, there should be FFT methods.
Can you say a bit more about how this would work?
I wish I could. It was many years ago that I looked into this, and I ended up with no notes, which indicates a lack of success on my side. I vaguely remember that one has to use a larger matrix that somehow makes the problem look like the one for circulants. No more than that, sorry.

Best, jj
[...]
* Joerg Arndt <arndt@jjj.de> [Mar 12. 2014 18:23]:
[...]
I vaguely remember that one has to use a larger matrix that somehow makes the problem look like the one for circulants. No more than that, sorry.
And that recollection was almost certainly wrong. After a few hours I can only see that there are FT methods for circulants, and that the weighted FT does the same for f-circulants, as defined on p.132 of: Dario Bini, Victor Y. Pan: Polynomial and matrix computations, vol.1: Fundamental algorithms, Birkhäuser, (1994). Richard Brent could surely tell us what the state of knowledge is.

Best, jj
[...]
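A numerical sanity check of the f-circulant claim (an illustration, assuming the Z_f-based definition as I read it: wrap-around entries carry the extra factor f, and f = 1 gives the ordinary circulant; the weighting uses psi = f^(1/n)):

import numpy as np

def f_circulant(v, f):
    """n x n f-circulant with first column v: cyclic down-shifts, with
    entries that wrap around the bottom weighted by the factor f."""
    n = len(v)
    return np.array([[v[(i - j) % n] * (f if i < j else 1.0)
                      for j in range(n)] for i in range(n)])

v = [3.0, 1.0, 4.0, 1.0, 5.0]
f = 2.0
n = len(v)
psi = f ** (1.0 / n)                        # an n-th root of f
weighted = np.array(v) * psi ** np.arange(n)
print(np.prod(np.fft.fft(weighted)).real)   # product of the weighted FT ...
print(np.linalg.det(f_circulant(v, f)))     # ... equals the determinant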