Re: [math-fun] Interesting matrix problem
Re: The 'textbook' solution using SVD's is looking better all the time, as the SVD *automatically* finds the *closest* rank-1 decomposition of the input matrix, which doesn't even have to pretend to have been a rank-1 matrix.

Indeed, I tried a Jacobi-like SVD algorithm on some of these purported rank-1 matrices, and it converges extremely quickly because -- unlike with non-rank-1 matrices -- each of these Jacobi/Givens rotations zeros out an entire row and column, and furthermore, these rotations don't screw up the previously zeroed rows and columns.

A major downside to the SVD algorithm on these rank-1 matrices is that the row and column produced have been *normalized* to 1, and the *scale factor* -- the first (the largest, and hopefully only non-zero) singular value -- is generally *irrational*. The other (non-'principal') singular values characterize the 'noise', which might be helpful if you're trying to reduce this noise.

At 12:39 PM 2/6/2021, Henry Baker wrote:
Oops!!
There still is a possibility of *worsening* the error if the noise causes a sign change in one of the non-principal rows or columns.
A more robust solution would be to compute the correlation of the other rows/columns with the principal rows/columns, and choose the sign suggested by the correlation.
If the abs value of the correlation is too low, don't bother including it in the sum, as it is 'obviously' all noise.
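For concreteness, here is a minimal NumPy sketch of this correlation test; the helper name signed_row_sum and the cutoff tau are arbitrary illustrative choices (not from the thread), and columns would be handled symmetrically:

import numpy as np

def signed_row_sum(M, tau=0.1):
    """Accumulate rows with signs chosen by correlation against the
    principal row; rows whose |correlation| falls below tau are
    treated as pure noise and skipped."""
    pi, pj = np.unravel_index(np.argmax(np.abs(M)), M.shape)
    principal = M[pi, :]
    total = np.zeros(M.shape[1])
    for row in M:
        denom = np.linalg.norm(row) * np.linalg.norm(principal)
        corr = row @ principal / denom if denom > 0 else 0.0
        if abs(corr) >= tau:              # 'obviously' all noise otherwise
            total += np.sign(corr) * row  # flip so the rows reinforce
    return total                          # estimates V's direction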
The 'textbook' solution using SVD's is looking better all the time, as the SVD *automatically* finds the *closest* rank-1 decomposition of the input matrix, which doesn't even have to pretend to have been a rank-1 matrix.
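In NumPy, that textbook route is only a few lines; this sketch splits the first singular value evenly between the two factors, which is just one choice of the free scale (the function name is illustrative):

import numpy as np

def rank1_factors_svd(M):
    """Closest rank-1 factorization: outer(U, V) minimizes the
    Frobenius distance to M (Eckart-Young)."""
    u, s, vt = np.linalg.svd(M)
    scale = np.sqrt(s[0])   # the 'irrational' scale factor, split evenly
    return scale * u[:, 0], scale * vt[0, :]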
At 09:34 AM 2/6/2021, Henry Baker wrote:
"Find the element in the matrix with the largest absolute value. Use the row and column containing that element, and ignore everything else."
The following idea might rescue another few bits from the noise:
Call the elt with the max abs value "maxelt"; note that maxelt keeps its sign.
Call the row and column with "maxelt" the "principal row" and the "principal" column. Let 'pi' be the row# for the principal row and 'pj' be the col# for the principal col. Note that maxelt = M[pi,pj].
Sum the rows as follows:
rowsums : rowsums + row(M,i)*signum(M[i,pj])*signum(maxelt)
Note that M[i,pj]*signum(M[i,pj]) = abs(M[i,pj]).
This makes the sign of the elt of the pj'th column of the i'th row *match* the sign of maxelt, and thereby avoids cancellation.
Also, since we aren't scaling these other (smaller) rows up, we won't be amplifying the noise.
Ditto for the column sums:
colsums : colsums + col(M,j)*signum(M[pi,j])*signum(maxelt)
Now use the row sums and column sums (suitably scaled) as our answer.
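A minimal NumPy sketch of this whole scheme (the name rank1_by_signed_sums is illustrative; the final rescaling is one choice of the free scale, made so the reconstruction is exact at the maxelt position):

import numpy as np

def rank1_by_signed_sums(M):
    """Sum sign-matched rows and columns, then rescale so that the
    reconstructed (pi, pj) entry equals maxelt."""
    pi, pj = np.unravel_index(np.argmax(np.abs(M)), M.shape)
    s = np.sign(M[pi, pj])                  # signum(maxelt)
    rowsums = (np.sign(M[:, pj]) * s) @ M   # estimates V's direction
    colsums = M @ (np.sign(M[pi, :]) * s)   # estimates U's direction
    colsums *= M[pi, pj] / (colsums[pi] * rowsums[pj])  # want / have
    return colsums, rowsums                 # U, V with outer(U, V) ~ M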
At 08:37 PM 2/4/2021, Andy Latto wrote:
Given that we know that the error is largest for the smallest elements, and all the rows are scalar multiples of any given row (and equivalently for the columns), is this algorithm significantly better than:
Find the element in the matrix with the largest absolute value.
Use the row and column containing that element, and ignore everything else.
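(As a baseline for such comparisons, here is this algorithm as a short NumPy sketch; the function name is illustrative, and Maximilian gives a PARI/GP version of the same idea later in the thread:)

import numpy as np

def rank1_by_largest_element(M):
    """Use the row and column through the entry of largest |value|;
    dividing the column by that entry makes outer(U, V) match M
    exactly on the principal row and column."""
    pi, pj = np.unravel_index(np.argmax(np.abs(M)), M.shape)
    return M[:, pj] / M[pi, pj], M[pi, :]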
Andy
On Thu, Feb 4, 2021 at 9:21 PM Henry Baker <hbaker1@pipeline.com> wrote:
We first show how to solve the problem when M is the outer product of *non-negative real* vectors.
Lemma 1.
Suppose we have a matrix M that purports to have the following structure:
[a b c d]' [p q r s] =
[ a p  a q  a r  a s ]
[ b p  b q  b r  b s ]
[ c p  c q  c r  c s ]
[ d p  d q  d r  d s ]
where >>> all of a,b,c,d,p,q,r,s >= 0 <<<
The row sums are
[a (s + r + q + p), b (s + r + q + p), c (s + r + q + p), d (s + r + q + p)]
The column sums are
[(d + c + b + a) p, (d + c + b + a) q, (d + c + b + a) r, (d + c + b + a) s]
The total matrix sum is
(d + c + b + a) (s + r + q + p)
If M is the zero matrix, then M is the outer product of 2 zero vectors.
If M isn't the zero matrix, then at least one row sum is >0, at least one col sum is >0, and the total sum is >0.
So we construct U' V = M as follows.
U = the row sums:
[a (s + r + q + p), b (s + r + q + p), c (s + r + q + p), d (s + r + q + p)]
V = the column sums/(total sum):
[ p/(s + r + q + p), q/(s + r + q + p), r/(s + r + q + p), s/(s + r + q + p) ]
Or, we can split the scale factor (total sum) evenly:
U = row sums/sqrt(total sum):
[ a sqrt(s + r + q + p)/sqrt(d + c + b + a),
  b sqrt(s + r + q + p)/sqrt(d + c + b + a),
  c sqrt(s + r + q + p)/sqrt(d + c + b + a),
  d sqrt(s + r + q + p)/sqrt(d + c + b + a) ]
V = col sums/sqrt(total sum):
[ sqrt(d + c + b + a) p/sqrt(s + r + q + p),
  sqrt(d + c + b + a) q/sqrt(s + r + q + p),
  sqrt(d + c + b + a) r/sqrt(s + r + q + p),
  sqrt(d + c + b + a) s/sqrt(s + r + q + p) ]
So these scaled row-sums and scaled col-sums are the vectors we seek. QED
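A minimal NumPy sketch of this construction for nonnegative M (the name rank1_nonneg is illustrative; it uses the unsymmetric split U = row sums, V = column sums / total sum):

import numpy as np

def rank1_nonneg(M):
    """Lemma 1: for entrywise nonnegative M = u' v, the row sums give
    u (up to scale) and the column sums over the total sum give v."""
    total = M.sum()
    if total == 0:                # M is the zero matrix
        return np.zeros(M.shape[0]), np.zeros(M.shape[1])
    return M.sum(axis=1), M.sum(axis=0) / total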
NB: It should now be clear why the row-sums and col-sums won't work when we include negative elements; e.g., there exist matrices M whose row-sums and col-sums are all *zero* (take M = [1 -1]' [1 -1]), so the procedure above fails completely.

---
Now we handle the general case, where M consists of both positive and negative entries.
We first find a,b,c,d>=0 and p,q,r,s>=0 s.t. [a b c d]' [p q r s] = abs(M), where abs(M) is M with all of its entries replaced by their absolute values.
Now we assign +- *signs* to the elements a,b,c,d,p,q,r,s of U,V using the following scheme.
Process the non-negative vectors U=[a b c d] and V=[p q r s] in descending order of absolute value.
The largest element of M in absolute value is the product of the largest elements of U and V. Assume that this element is |b|*|r| = |M[i,j]|.
We need to assign *signs* to b and to r, s.t. (b*r) = M[i,j], i.e., so that the sign(b)*sign(r)=sign(M[i,j]).
Since this is the first assignment, we can arbitrarily choose sign(b)=+1, and then set sign(r)=sign(M[i,j]).
We then process each of the other elements of U and V in descending order of absolute values, choosing signs for the remaining elements of U and the remaining elements of V so they are *consistent* with the previous choices of signs.
If any M[i,j]=0, it has no effect on any choice of signs for elements of U,V.
Due to round-off or other errors, near the end of this process -- i.e., when the elements of U and V are very close to zero -- we may find inconsistencies. However, these inconsistencies are extremely small (assuming that the input matrix was indeed produced as an outer product), and by processing the elements of U,V in descending order of absolute value, we ensure that the signs are consistent for the *strongest signals*, so that any inconsistencies are due only to small round-off errors.
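A sketch of this sign-recovery pass in NumPy (assuming nonnegative factors U, V of abs(M), e.g. from the row-sum/col-sum construction above; the helper name assign_signs is illustrative):

import numpy as np

def assign_signs(U, V, M):
    """Sign the largest elements of U and V first (one sign is a free
    choice), then walk the rest in descending order of absolute value,
    making each new sign consistent with sign(M[i, j]) against the
    already-signed partner carrying the strongest signal.  Any
    inconsistencies are thereby pushed into the near-zero elements."""
    su, sv = np.zeros(len(U)), np.zeros(len(V))
    bi, bj = np.argmax(np.abs(U)), np.argmax(np.abs(V))
    su[bi] = 1.0                                       # arbitrary choice
    sv[bj] = np.sign(M[bi, bj]) if M[bi, bj] != 0 else 1.0
    items = sorted([('U', i, U[i]) for i in range(len(U)) if i != bi] +
                   [('V', j, V[j]) for j in range(len(V)) if j != bj],
                   key=lambda t: -abs(t[2]))           # descending |value|
    for which, k, _ in items:
        if which == 'U':
            js = np.nonzero(sv)[0]                     # signed V indices
            j = js[np.argmax(np.abs(M[k, js]))]
            s = np.sign(M[k, j]) * sv[j]
        else:
            iis = np.nonzero(su)[0]                    # signed U indices
            i = iis[np.argmax(np.abs(M[iis, k]))]
            s = np.sign(M[i, k]) * su[i]
        (su if which == 'U' else sv)[k] = s if s != 0 else 1.0  # M entry 0: no info
    return su * U, sv * V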
At 05:02 PM 2/3/2021, Henry Baker wrote:
Dan asks: what is the question?
Find the (real) vectors U,V whose outer product makes the (real) matrix; i.e.,
U' V = M
Clearly, for any real alpha,
(U/alpha)' (alpha*V) = M
so the vectors are determined only up to a single constant factor.
Once again, the '=' wants to be as close as possible within the constraints of FP arithmetic.
At 03:23 PM 2/3/2021, Henry Baker wrote:
Input: an nxn(*) matrix M of floating point numbers (positive, negative or zero) which is *supposed* to have rank exactly one, i.e., it is the outer product of two n-element *unknown* vectors, and hence every row has a common factor and every column has a common factor, which factor can be positive, negative or zero.
(*) The matrix doesn't have to be square, but it was easier to describe the square version of the problem.
Unfortunately, due to floating point approximations and other issues, the values of M have small errors, so the columns and rows are not *exact* multiples of their common factors; and if a common factor is supposed to be zero, it may actually be a number exceedingly close to zero, down at the bottom of the exponent range.
NB: Due to *underflow*, floating point algebra has *zero divisors*, i.e., there exist x/=0, y/=0, s.t. x*y=0.
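A two-line Python demonstration of such a zero divisor in IEEE double precision (1e-200 * 1e-200 = 1e-400 lies far below the smallest subnormal, about 4.9e-324, so the product rounds to zero):

x = y = 1e-200
print(x != 0 and y != 0 and x * y == 0)   # prints True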
Such a problem can trivially be posed as an SVD problem, but I wanted a more direct solution, because I wanted to get the 'best' fit to the data within the limits of the floating point arithmetic.
I have a quite elegant and reasonably efficient solution, but I wanted to see if anyone else had the same idea.

--
Andy.Latto@pobox.com
On Sun, Feb 7, 2021 at 16:57, Henry Baker <hbaker1@pipeline.com> wrote:
Re: The 'textbook' solution using SVD's is looking better all the time,
Better than what? Do you have a "test suite", and if so (or otherwise), what do you get comparing that method with the "find the largest element..." solution or the "add all rows/cols modulo sign flip" proposal? I would like to see such data/results.

I would be glad to provide my algorithm(s) in any current programming language. Here they are in PARI/GP:

/* Simple variant: find the index (r,c) of the maximal element
   m = max{abs(M[i,j])}, and return U = col. c and V' = row r of M,
   suitably normalized: both (col. c and row r) have M[r,c] as a
   component, so we divide one of them by M[r,c] to get indeed
   U V' = M.  One might also divide both by sqrt(m), one of them
   with the sign of M[r,c] if it is negative. */
getUVsimple(M) = {
  my(i);
  vecmax(abs(M), &i);
  [ M[,i[2]] / M[i[1],i[2]], M[i[1],] ]
}

/* "Average" variant: sum over rows & cols with possible sign flips
   to avoid cancellation.  A linear combination of the rows is
   equivalent to left multiplication by a row vector; similarly for
   the columns, with right multiplication by a column vector. */
getUVsum(M) = {
  my(i, U, V);
  vecmax(abs(M), &i);  \\ use largest element as reference for sign
  U = M * apply(sign, M[i[1],])~;
  V = apply(sign, M[,i[2]])~ * M;
  U *= M[i[1],i[2]] / (U[i[1]] * V[i[2]]);  \\ normalize: want / have
  [U, V]
}

Both of these seem to work well for ill-conditioned matrices. I tried M = U V' with U and V having large and small elements of different signs, but I did not test with ("explicitly") perturbed M, and I also don't have the "textbook/SVD solution(s)" ready for use.

- Maximilian