Re: [math-fun] Interesting matrix problem
Oops!! There still is a possibility of *worsening* the error if the noise causes a sign change in one of the non-principal rows or columns. A more robust solution would be to compute the correlation of each other row/column with the principal row/column, and choose the sign suggested by that correlation. If the absolute value of the correlation is too low, don't bother including that row/column in the sum, as it is 'obviously' all noise. The 'textbook' solution using SVDs is looking better all the time, since the SVD *automatically* finds the *closest* rank-1 decomposition of the input matrix, which doesn't even have to pretend to have been a rank-1 matrix.

At 09:34 AM 2/6/2021, Henry Baker wrote:
"Find the element in the matrix with the largest absolute value. Use the row and column containing that element, and ignore everything else."
The following idea might rescue another few bits from the noise:
Call the elt with the max abs value "maxelt", but maxelt keeps its sign.
Call the row and column containing "maxelt" the "principal row" and the "principal column". Let 'pi' be the row# of the principal row and 'pj' the col# of the principal column. Note that maxelt = M[pi,pj].
Sum the rows as follows:
rowsums : rowsums + row[M,i]*signum(M[i,pj])*signum(maxelt)
Note that M[i,pj]*signum(M[i,pj]) = abs(M[i,pj]).
This makes the sign of the elt of the pj'th column of the i'th row *match* the sign of maxelt, and thereby avoids cancellation.
Also, since we aren't scaling these other (smaller) rows up, we won't be amplifying the noise.
Ditto for the column sums:
colsums : colsums + col[M,j]*signum(M[pi,j])*signum(maxelt)
Now use the row sums and column sums (suitably scaled) as our answer.
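A sketch of the whole recipe in NumPy (my transcription, not code from the post; it assumes no exact zeros in the principal row or column, so every signum is +-1):

```python
import numpy as np

def rank1_signed_sums(M):
    """Estimate u, v with M ~ outer(u, v): find the principal element,
    sign-match every row/column against it, then sum as described above."""
    pi, pj = np.unravel_index(np.argmax(np.abs(M)), M.shape)
    s = np.sign(M[pi, pj])                                # signum(maxelt)
    # rowsums: each row i flipped by signum(M[i,pj])*signum(maxelt)
    rowsums = (M * (np.sign(M[:, pj]) * s)[:, None]).sum(axis=0)
    # colsums: each column j flipped by signum(M[pi,j])*signum(maxelt)
    colsums = (M * (np.sign(M[pi, :]) * s)[None, :]).sum(axis=1)
    # For an exact rank-1 M, outer(colsums, rowsums) = sum(|M|) * s * M,
    # so one consistent scaling is:
    return colsums, rowsums * s / np.abs(M).sum()

# Hypothetical noisy rank-1 input
rng = np.random.default_rng(1)
a, b = rng.normal(size=5), rng.normal(size=4)
M = np.outer(a, b) + 1e-9 * rng.normal(size=(5, 4))
u, v = rank1_signed_sums(M)
err = np.max(np.abs(np.outer(u, v) - M))
```

The scaling in the last line follows from the identity outer(colsums, rowsums) = (sum of |M|) * signum(maxelt) * M, which holds exactly when M is exactly rank-1.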
At 08:37 PM 2/4/2021, Andy Latto wrote:
Given that we know that the error is largest for the smallest elements, and all the columns are scalar multiples of any given row (and equivalently the same for the columns), is this algorithm significantly better than:
Find the element in the matrix with the largest absolute value.
Use the row and column containing that element, and ignore everything else.
Andy
On Thu, Feb 4, 2021 at 9:21 PM Henry Baker <hbaker1@pipeline.com> wrote:
We first show how to solve the problem when M is the outer product of *non-negative real* vectors.
Lemma 1.
Suppose we have a matrix M that purports to have the following structure:
[a b c d]' [p q r s] =
[ a p  a q  a r  a s ]
[ b p  b q  b r  b s ]
[ c p  c q  c r  c s ]
[ d p  d q  d r  d s ]
where >>> all of a,b,c,d,p,q,r,s >= 0 <<<
The row sums are
[a (s + r + q + p), b (s + r + q + p), c (s + r + q + p), d (s + r + q + p)]
The column sums are
[(d + c + b + a) p, (d + c + b + a) q, (d + c + b + a) r, (d + c + b + a) s]
The total matrix sum is
(d + c + b + a) (s + r + q + p)
If M is the zero matrix, then M is the outer product of 2 zero vectors.
If M isn't the zero matrix, then at least one row sum is > 0, at least one column sum is > 0, and the total sum is > 0.
So we construct U' V = M as follows.
U = the row sums:
[a (s + r + q + p), b (s + r + q + p), c (s + r + q + p), d (s + r + q + p)]
V = the column sums/(total sum):
[ p/(s + r + q + p), q/(s + r + q + p), r/(s + r + q + p), s/(s + r + q + p) ]
Or, we can split the scale factor (total sum) evenly:
U = row sums/sqrt(total sum):
[ a sqrt(s + r + q + p)/sqrt(d + c + b + a),
  b sqrt(s + r + q + p)/sqrt(d + c + b + a),
  c sqrt(s + r + q + p)/sqrt(d + c + b + a),
  d sqrt(s + r + q + p)/sqrt(d + c + b + a) ]
V = col sums/sqrt(total sum):
[ p sqrt(d + c + b + a)/sqrt(s + r + q + p),
  q sqrt(d + c + b + a)/sqrt(s + r + q + p),
  r sqrt(d + c + b + a)/sqrt(s + r + q + p),
  s sqrt(d + c + b + a)/sqrt(s + r + q + p) ]
So these scaled row-sums and scaled col-sums are the vectors we seek. QED
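Lemma 1 in executable form (a direct NumPy transcription, using the first of the two scalings above):

```python
import numpy as np

def rank1_nonneg(M):
    """Factor a nonnegative rank-1 matrix M as outer(u, v), with
    u = row sums and v = column sums / total sum."""
    total = M.sum()
    if total == 0:                       # zero matrix: two zero vectors
        return np.zeros(M.shape[0]), np.zeros(M.shape[1])
    return M.sum(axis=1), M.sum(axis=0) / total

# Hypothetical nonnegative example
a = np.array([1.0, 2.0, 0.0, 4.0])
p = np.array([3.0, 0.5, 2.5, 1.0])
M = np.outer(a, p)
u, v = rank1_nonneg(M)
```

Here u comes out as a*(p+q+r+s) and v as [p, q, r, s]/(p+q+r+s), exactly the vectors constructed in the lemma.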
NB: It should now be clear why the raw row-sums and col-sums won't work once we allow negative elements; e.g., the rank-1 matrix M = [1 -1]' [1 -1] has row-sums and col-sums that are all *zero*, so the procedure above fails completely.

---
Now we handle the general case, where M consists of both positive and negative entries.
We first find a,b,c,d>=0 and p,q,r,s>=0 s.t. [a b c d]' [p q r s] = abs(M), where abs(M) is M with all of its entries replaced by their absolute values.
Now we assign +- *signs* to the elements a,b,c,d,p,q,r,s of U,V using the following scheme.
Process the non-negative vectors U=[a b c d] and V=[p q r s] in descending order of absolute value.
The element of M with the largest absolute value is the product of the largest element of U and the largest element of V. Assume that this element is |b|*|r| = |M[i,j]|.
We need to assign *signs* to b and to r, s.t. (b*r) = M[i,j], i.e., so that the sign(b)*sign(r)=sign(M[i,j]).
Since this is the first assignment, we can arbitrarily choose sign(b)=+1, and then set sign(r)=sign(M[i,j]).
We then process each of the other elements of U and V in descending order of absolute values, choosing signs for the remaining elements of U and the remaining elements of V so they are *consistent* with the previous choices of signs.
If any M[i,j]=0, it has no effect on any choice of signs for elements of U,V.
Due to round-off or other errors, near the end of this process -- i.e., when the elements of U and V are very close to zero -- we may find inconsistencies. However, these inconsistencies are extremely small (assuming that the input matrix was indeed produced as an outer product), and by processing the elements of U,V in descending order of absolute value, we ensure that the signs are consistent for the *strongest signals*, so that any inconsistencies are confined to small round-off errors in the weakest ones.
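One way to code the sign-assignment pass (my reading of the scheme: each remaining element is paired with the principal element of the other factor, which is its strongest available signal; names are mine, not from the post):

```python
import numpy as np

def assign_signs(M, u_abs, v_abs):
    """Given nonnegative u_abs, v_abs with outer(u_abs, v_abs) = abs(M),
    choose signs so that outer(u, v) = M, processing elements in
    descending order of magnitude."""
    su, sv = np.zeros(len(u_abs)), np.zeros(len(v_abs))
    i0, j0 = int(np.argmax(u_abs)), int(np.argmax(v_abs))
    su[i0] = 1.0                                  # first choice is free
    s = np.sign(M[i0, j0])                        # sign(u)*sign(v) = sign(M)
    sv[j0] = s if s != 0 else 1.0
    rest = [('u', i, u_abs[i]) for i in range(len(u_abs)) if i != i0] \
         + [('v', j, v_abs[j]) for j in range(len(v_abs)) if j != j0]
    for which, k, _ in sorted(rest, key=lambda t: -t[2]):
        if which == 'u':                          # match against column j0
            s = np.sign(M[k, j0]) * sv[j0]
        else:                                     # match against row i0
            s = np.sign(M[i0, k]) * su[i0]
        (su if which == 'u' else sv)[k] = s if s != 0 else 1.0
    return u_abs * su, v_abs * sv

# Hypothetical mixed-sign rank-1 input
a, b = np.array([2.0, -1.0, 3.0]), np.array([1.0, -2.0])
M = np.outer(a, b)
A = np.abs(M)
u_abs, v_abs = A.sum(axis=1), A.sum(axis=0) / A.sum()  # Lemma 1 on abs(M)
u, v = assign_signs(M, u_abs, v_abs)
```

When M[i,j] = 0, np.sign returns 0 and the code falls back to an arbitrary +1, matching the remark that zero entries carry no sign information.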
At 05:02 PM 2/3/2021, Henry Baker wrote:
Dan asks: what is the question?
Find the (real) vectors U,V whose outer product makes the (real) matrix; i.e.,
U' V = M
Clearly, for any real alpha,
(U/alpha)' (alpha*V) = M
so the vectors are determined only up to a single constant factor.
Once again, the '=' wants to be as close as possible within the constraints of FP arithmetic.
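The scale ambiguity is easy to see numerically (a trivial NumPy check, mine):

```python
import numpy as np

u, v = np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0])
alpha = 7.0
# Scaling u down and v up by the same alpha leaves the outer product
# unchanged, up to floating point rounding.
same = np.allclose(np.outer(u, v), np.outer(u / alpha, alpha * v))
```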
At 03:23 PM 2/3/2021, Henry Baker wrote:
Input: an nxn(*) matrix M of floating point numbers (positive, negative or zero) which is *supposed* to have rank exactly one, i.e., it is the outer product of two n-element *unknown* vectors, and hence every row has a common factor and every column has a common factor, which factor can be positive, negative or zero.
(*) The matrix doesn't have to be square, but it was easier to describe the square version of the problem.
Unfortunately, due to floating point approximations and other issues, the values of M have small errors, so the columns and rows are not *exact* multiples of their common factors; and if a common factor is supposed to be zero, it may actually be a number exceedingly close to zero, near the bottom of the exponent range.
NB: Due to *underflow*, floating point algebra has *zero divisors*, i.e., there exist x/=0, y/=0, s.t. x*y=0.
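Concretely, in IEEE double precision (this holds in any language with 64-bit floats):

```python
# Two nonzero doubles whose product underflows to exactly zero:
# 1e-200 * 1e-200 = 1e-400, far below the smallest subnormal (~4.9e-324).
x = 1e-200
y = 1e-200
zero_divisors = (x != 0.0) and (y != 0.0) and (x * y == 0.0)
```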
Such a problem can trivially be posed as an SVD problem, but I wanted a more direct solution that gets the 'best' fit to the data within the limits of the floating point arithmetic.
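For comparison, the SVD route is only a few lines of NumPy: by Eckart-Young, the leading singular triplet gives the closest rank-1 matrix in the Frobenius norm (a sketch, mine, not from the thread):

```python
import numpy as np

def rank1_svd(M):
    """Vectors u, v such that outer(u, v) is the closest rank-1
    approximation of M in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(M)
    u = U[:, 0] * np.sqrt(s[0])   # split sigma_1 evenly between factors
    v = Vt[0, :] * np.sqrt(s[0])
    return u, v

# Hypothetical noisy rank-1 input
rng = np.random.default_rng(0)
M = np.outer(rng.normal(size=4), rng.normal(size=5)) \
    + 1e-8 * rng.normal(size=(4, 5))
u, v = rank1_svd(M)
err = np.max(np.abs(np.outer(u, v) - M))
```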
I have a quite elegant and reasonably efficient solution, but I wanted to see if anyone else had the same idea.

--
Andy.Latto@pobox.com
participants (1)
- Henry Baker