Re: The 'textbook' solution using SVD's is looking better
all the time, as the SVD *automatically* finds the *closest*
rank-1 decomposition of the input matrix, which doesn't
even have to pretend to have been a rank-1 matrix.
Indeed, I tried a Jacobi-like SVD algorithm on some of
these purported rank-1 matrices, and it converges extremely
quickly because -- unlike in the non-rank-1 case -- each
Jacobi/Givens rotation zeros out an entire row and column,
and furthermore, these rotations don't screw up the
previously zeroed rows and columns.
A major downside to the SVD algorithm on these rank-1
matrices is that the row and column it produces have been
*normalized* to length 1, and the *scale factor* -- the
first (the largest, and hopefully only non-zero) singular
value -- is generally *irrational*.
The other (non-'principal') singular values characterize
the 'noise', which might be helpful if you're trying
to reduce this noise.
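
For concreteness, here's a small numpy sketch of that 'textbook'
SVD route; the function name and the test matrix are mine, purely
for illustration:

    import numpy as np

    def svd_rank1(M):
        """Closest rank-1 factorization M ~= outer(u, v), via the SVD."""
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        # s[0] is the scale factor (generally irrational); s[1:] measure
        # the residual 'noise' left over after the rank-1 fit.
        u = U[:, 0] * np.sqrt(s[0])     # split the scale factor evenly
        v = Vt[0, :] * np.sqrt(s[0])
        return u, v, s

    M = np.outer([1.0, -2.0, 3.0], [4.0, 0.5, -1.0])
    M = M + 1e-9 * np.random.randn(*M.shape)        # simulated noise
    u, v, s = svd_rank1(M)
    print(np.abs(np.outer(u, v) - M).max(), s[1:])  # tiny residual, tiny s[1:]
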
At 12:39 PM 2/6/2021, Henry Baker wrote:
>Oops!!
>
>There still is a possibility of *worsening* the error if the noise
>causes a sign change in one of the non-principal rows or columns.
>
>A more robust solution would be to compute the correlation of
>the other rows/columns with the principal rows/columns, and
>choose the sign suggested by the correlation.
>
>If the abs value of the correlation is too low, don't bother
>including it in the sum, as it is 'obviously' all noise.
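>
>A rough Python sketch of that correlation test (the names and the
>cutoff value are just illustrative):
>
>    import numpy as np
>
>    def correlation_sign(row, principal_row, cutoff=0.5):
>        """Sign to apply to 'row' before adding it into the sum,
>        or 0 if its correlation with the principal row is too weak."""
>        denom = np.linalg.norm(row) * np.linalg.norm(principal_row)
>        if denom == 0.0:
>            return 0.0
>        c = np.dot(row, principal_row) / denom   # normalized correlation
>        if abs(c) < cutoff:
>            return 0.0      # 'obviously' all noise: leave this row out
>        return np.sign(c)   # sign suggested by the correlation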
>
>The 'textbook' solution using SVD's is looking better all
>the time, as the SVD *automatically* finds the *closest*
>rank-1 decomposition of the input matrix, which doesn't
>even have to pretend to have been a rank-1 matrix.
>
>At 09:34 AM 2/6/2021, Henry Baker wrote:
>>"Find the element in the matrix with the largest absolute value.
>>Use the row and column containing that element, and ignore
>>everything else."
>>
>>The following idea might rescue another few bits from the noise:
>>
>>Call the elt with the max abs value "maxelt", but maxelt keeps its
>>sign.
>>
>>Call the row and column with "maxelt" the "principal row" and the
>>"principal" column. Let 'pi' be the row# for the principal row and
>>'pj' be the col# for the principal col. Note that maxelt = M[pi,pj].
>>
>>Sum the rows as follows:
>>
>>rowsums : rowsums + row[M,i]*signum(M[i,pj])*signum(maxelt)
>>
>>Note that M[i,pj]*signum(M[i,pj]) = abs(M[i,pj]).
>>
>>This makes the sign of the elt of the pj'th column of the i'th
>>row *match* the sign of maxelt, and thereby avoids cancellation.
>>
>>Also, since we aren't scaling these other (smaller) rows up, we
>>won't be amplifying the noise.
>>
>>Ditto for the column sums:
>>
>>colsums : colsums + col[M,j]*signum(M[pi,j])*signum(maxelt)
>>
>>Now use the row sums and column sums (suitably scaled) as our
>>answer.
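>>
>>A direct numpy transcription of the above (just a sketch; 'pi', 'pj'
>>and 'maxelt' are as described, and the final rescaling is one
>>possible choice, not the only one):
>>
>>    import numpy as np
>>
>>    def principal_sums(M):
>>        pi, pj = np.unravel_index(np.argmax(np.abs(M)), M.shape)
>>        maxelt = M[pi, pj]
>>        # Flip each row (column) so its entry in the principal column
>>        # (row) agrees in sign with maxelt, then sum.  The smaller
>>        # rows/columns are not scaled up, so their noise isn't amplified.
>>        rowsums = np.sign(maxelt) * (np.sign(M[:, pj])[:, None] * M).sum(axis=0)
>>        colsums = np.sign(maxelt) * (np.sign(M[pi, :])[None, :] * M).sum(axis=1)
>>        # One way to scale them: outer(colsums, rowsums/scale) ~= M.
>>        scale = np.sign(maxelt) * np.abs(M).sum()
>>        return colsums, rowsums / scale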
>>
>>At 08:37 PM 2/4/2021, Andy Latto wrote:
>>>Given that we know that the error is largest for the smallest elements, and
>>>all the rows are scalar multiples of any given row (and equivalently the
>>>same for the columns), is this algorithm significantly better than:
>>>
>>>Find the element in the matrix with the largest absolute value.
>>>
>>>Use the row and column containing that element, and ignore everything else.
>>>
>>>Andy
>>>
>>>On Thu, Feb 4, 2021 at 9:21 PM Henry Baker <hbaker1(a)pipeline.com> wrote:
>>>> We first show how to solve the problem when M is the outer product of
>>>> *non-negative real* vectors.
>>>>
>>>> Lemma 1.
>>>>
>>>> Suppose we have a matrix M that purports to have the following structure:
>>>>
>>>> [a b c d]' [p q r s] =
>>>>
>>>> [ a p a q a r a s ]
>>>> [ ]
>>>> [ b p b q b r b s ]
>>>> [ ]
>>>> [ c p c q c r c s ]
>>>> [ ]
>>>> [ d p d q d r d s ]
>>>>
>>>> where >>> all of a,b,c,d,p,q,r,s >= 0 <<<
>>>>
>>>> The row sums are
>>>>
>>>> [a (s + r + q + p), b (s + r + q + p), c (s + r + q + p), d (s + r + q + p)]
>>>>
>>>> The column sums are
>>>>
>>>> [(d + c + b + a) p, (d + c + b + a) q, (d + c + b + a) r, (d + c + b + a) s]
>>>>
>>>> The total matrix sum is
>>>>
>>>> (d + c + b + a) (s + r + q + p)
>>>>
>>>> If M is the zero matrix, then M is the outer product of 2 zero vectors.
>>>>
>>>> If M isn't the zero matrix, then at least one row sum > 0, at least one
>>>> col sum > 0, and the total sum > 0.
>>>>
>>>> So we construct U' V = M as follows.
>>>>
>>>> U = the row sums:
>>>>
>>>> [a (s + r + q + p), b (s + r + q + p), c (s + r + q + p), d (s + r + q + p)]
>>>>
>>>> V = the column sums/(total sum):
>>>>
>>>> p q r s
>>>> [-------------, -------------, -------------, -------------]
>>>> s + r + q + p s + r + q + p s + r + q + p s + r + q + p
>>>>
>>>> Or, we can split the scale factor (total sum) evenly:
>>>>
>>>> U = row sums/sqrt(total sum):
>>>>
>>>>  a sqrt(s + r + q + p)  b sqrt(s + r + q + p)  c sqrt(s + r + q + p)  d sqrt(s + r + q + p)
>>>> [---------------------, ---------------------, ---------------------, ---------------------]
>>>>   sqrt(d + c + b + a)    sqrt(d + c + b + a)    sqrt(d + c + b + a)    sqrt(d + c + b + a)
>>>>
>>>> V = col sums/sqrt(total sum):
>>>>
>>>>  sqrt(d + c + b + a) p  sqrt(d + c + b + a) q  sqrt(d + c + b + a) r  sqrt(d + c + b + a) s
>>>> [---------------------, ---------------------, ---------------------, ---------------------]
>>>>   sqrt(s + r + q + p)    sqrt(s + r + q + p)    sqrt(s + r + q + p)    sqrt(s + r + q + p)
>>>>
>>>> So these scaled row-sums and scaled col-sums are the vectors we seek. QED
>>>>
>>>> NB: It should now be clear why the row-sums and col-sums won't work when
>>>> we include negative elements; e.g., there exist matrices M whose row-sums
>>>> and col-sums are both *zero*, so the procedure above fails completely.
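>>>>
>>>> A quick numpy sketch of the Lemma 1 construction (the function name
>>>> and the zero-matrix convention are just illustrative):
>>>>
>>>>     import numpy as np
>>>>
>>>>     def nonneg_rank1(M):
>>>>         """For non-negative M ~= outer(U, V), return such U, V >= 0."""
>>>>         total = M.sum()
>>>>         if total == 0.0:       # M is the zero matrix
>>>>             return np.zeros(M.shape[0]), np.zeros(M.shape[1])
>>>>         # U = the row sums, V = the column sums / (total sum)
>>>>         # (or split the factor evenly with sqrt(total), as above).
>>>>         return M.sum(axis=1), M.sum(axis=0) / total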
>>>> ---
>>>>
>>>> Now we handle the general case, where M consists of both positive and
>>>> negative entries.
>>>>
>>>> We first find a,b,c,d>=0 and p,q,r,s>=0 s.t. [a b c d]' [p q r s] = abs(M),
>>>> where abs(M) is M with all of its entries replaced by their absolute
>>>> values.
>>>>
>>>> Now we assign +- *signs* to the elements a,b,c,d,p,q,r,s of U,V using the
>>>> following scheme.
>>>>
>>>> Process the non-negative vectors U=[a b c d] and V=[p q r s] in descending
>>>> order of absolute value.
>>>>
>>>> The element of M with the largest absolute value is the product of the
>>>> largest element of U and the largest element of V. Assume that this
>>>> element is |b|*|r| = |M[i,j]|.
>>>>
>>>> We need to assign *signs* to b and to r s.t. b*r = M[i,j], i.e., so that
>>>> sign(b)*sign(r) = sign(M[i,j]).
>>>>
>>>> Since this is the first assignment, we can arbitrarily choose sign(b)=+1,
>>>> and then set sign(r)=sign(M[i,j]).
>>>>
>>>> We then process each of the other elements of U and V in descending order
>>>> of absolute values, choosing signs for the remaining elements of U and the
>>>> remaining elements of V so they are *consistent* with the previous choices
>>>> of signs.
>>>>
>>>> If any M[i,j]=0, it has no effect on any choice of signs for elements of
>>>> U,V.
>>>>
>>>> Due to round-off or other errors, the assignments near the end of this
>>>> process -- i.e., when the elements of U and V are very close to zero --
>>>> may turn up inconsistencies. However, these inconsistencies are extremely
>>>> small (assuming that the input matrix was indeed produced as an outer
>>>> product), and by processing the elements of U,V in descending order of
>>>> absolute value, we ensure that the signs are consistent for the *strongest
>>>> signals*, so that any remaining inconsistencies are due to small round-off
>>>> errors.
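>>>>
>>>> A simplified Python sketch of this sign pass (it settles each
>>>> remaining sign against the strongest row and column, which is the
>>>> gist of the descending-order rule; 'nonneg_rank1' is the Lemma 1
>>>> construction above):
>>>>
>>>>     import numpy as np
>>>>
>>>>     def signed_rank1(M):
>>>>         U, V = nonneg_rank1(np.abs(M))       # magnitudes first (Lemma 1)
>>>>         i0, j0 = np.argmax(U), np.argmax(V)  # strongest elements of U, V
>>>>         # Choose sign(U[i0]) := +1, so sign(V[j]) = sign(M[i0, j]), and
>>>>         # sign(U[i]) must satisfy sign(U[i])*sign(V[j0]) = sign(M[i, j0]).
>>>>         sv = np.sign(M[i0, :])
>>>>         su = np.sign(M[:, j0]) * np.sign(M[i0, j0])
>>>>         su[i0] = 1.0
>>>>         return U * su, V * sv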
>>>>
>>>> At 05:02 PM 2/3/2021, Henry Baker wrote:
>>>> >Dan asks: what is the question?
>>>> >
>>>> >Find the (real) vectors U,V whose outer product
>>>> >makes the (real) matrix; i.e.,
>>>> >
>>>> >U' V = M
>>>> >
>>>> >Clearly, for any real alpha,
>>>> >
>>>> >(U/alpha)' (alpha*V) = M
>>>> >
>>>> >so the vectors are determined only up to
>>>> >a single constant factor.
>>>> >
>>>> >Once again, the '=' wants to be as close
>>>> >as possible within the constraints of FP
>>>> >arithmetic.
>>>> >
>>>> >At 03:23 PM 2/3/2021, Henry Baker wrote:
>>>> >>Input: an nxn(*) matrix M of floating point
>>>> >>numbers (positive, negative or zero) which
>>>> >>is *supposed* to have rank exactly one, i.e.,
>>>> >>it is the outer product of two n-element
>>>> >>*unknown* vectors, and hence every row has
>>>> >>a common factor and every column has a common
>>>> >>factor, which factor can be positive, negative
>>>> >>or zero.
>>>> >>
>>>> >>(*) The matrix doesn't have to be square,
>>>> >>but it was easier to describe the square
>>>> >>version of the problem.
>>>> >>
>>>> >>Unfortunately, due to floating point
>>>> >>approximations and other issues,
>>>> >>the values of M have small errors, so
>>>> >>the columns and rows are not *exact*
>>>> >>multiples of their common factors, and
>>>> >>if the common factor is supposed to be
>>>> >>zero, it may actually be a number
>>>> >>exceedingly close to zero but at the
>>>> >>minimum exponent range.
>>>> >>
>>>> >>NB: Due to *underflow*, floating
>>>> >>point algebra has *zero divisors*,
>>>> >>i.e., there exist x/=0, y/=0, s.t.
>>>> >>x*y=0.
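>>>> >>
>>>> >>For instance, in IEEE double
>>>> >>precision (a quick Python check):
>>>> >>
>>>> >>    x = 1e-200; y = 1e-200
>>>> >>    print(x != 0.0 and y != 0.0 and x*y == 0.0)   # True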
>>>> >>
>>>> >>Such a problem can trivially be posed
>>>> >>as an SVD problem, but I wanted a
>>>> >>more direct solution, because I wanted
>>>> >>to get the 'best' fit to the data within
>>>> >>the limits of the floating point
>>>> >>arithmetic.
>>>> >>
>>>> >>I have a quite elegant and
>>>> >>reasonably efficient solution, but
>>>> >>I wanted to see if anyone else had
>>>> >>the same idea.
>>>--
>>> Andy.Latto(a)pobox.com