Re: The 'textbook' solution using SVD's is looking better
all the time, as the SVD *automatically* finds the *closest*
rank-1 decomposition of the input matrix, which doesn't
even have to pretend to have been a rank-1 matrix.
Indeed, I tried a Jacobi-like SVD algorithm on some of
these purported rank-1 matrices, and it converges extremely
quickly because -- unlike in the non-rank-1 case -- each
Jacobi/Givens rotation zeros out an entire row and column,
and furthermore, these rotations don't screw up the
previously zeroed rows and columns.
A major downside to the SVD algorithm on these rank-1
matrices is that the row and column it produces have been
*normalized* to length 1, and the *scale factor* -- the
first (the largest, and hopefully only non-zero) singular
value -- is generally *irrational*.
The other (non-'principal') singular values characterize
the 'noise', which might be helpful if you're trying
to reduce this noise.
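
For concreteness, here's a small numpy sketch of that 'textbook'
SVD route; the function name and the test matrix are mine, purely
for illustration:

    import numpy as np

    def svd_rank1(M):
        """Closest rank-1 factorization M ~= outer(u, v), via the SVD."""
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        # s[0] is the scale factor (generally irrational); s[1:] measure
        # the residual 'noise' left over after the rank-1 fit.
        u = U[:, 0] * np.sqrt(s[0])     # split the scale factor evenly
        v = Vt[0, :] * np.sqrt(s[0])
        return u, v, s

    M = np.outer([1.0, -2.0, 3.0], [4.0, 0.5, -1.0])
    M = M + 1e-9 * np.random.randn(*M.shape)        # simulated noise
    u, v, s = svd_rank1(M)
    print(np.abs(np.outer(u, v) - M).max(), s[1:])  # tiny residual, tiny s[1:]
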
At 12:39 PM 2/6/2021, Henry Baker wrote:
>Oops!!
>
>There still is a possibility of *worsening* the error if the noise
>causes a sign change in one of the non-principal rows or columns.
>
>A more robust solution would be to compute the correlation of
>the other rows/columns with the principal rows/columns, and
>choose the sign suggested by the correlation.
>
>If the abs value of the correlation is too low, don't bother
>including it in the sum, as it is 'obviously' all noise.
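>
>A rough Python sketch of that correlation test (the names and the
>cutoff value are just illustrative):
>
>    import numpy as np
>
>    def correlation_sign(row, principal_row, cutoff=0.5):
>        """Sign to apply to 'row' before adding it into the sum,
>        or 0 if its correlation with the principal row is too weak."""
>        denom = np.linalg.norm(row) * np.linalg.norm(principal_row)
>        if denom == 0.0:
>            return 0.0
>        c = np.dot(row, principal_row) / denom   # normalized correlation
>        if abs(c) < cutoff:
>            return 0.0      # 'obviously' all noise: leave this row out
>        return np.sign(c)   # sign suggested by the correlation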
>
>The 'textbook' solution using SVD's is looking better all
>the time, as the SVD *automatically* finds the *closest*
>rank-1 decomposition of the input matrix, which doesn't
>even have to pretend to have been a rank-1 matrix.
>
>At 09:34 AM 2/6/2021, Henry Baker wrote:
>>"Find the element in the matrix with the largest absolute value.
>>Use the row and column containing that element, and ignore
>>everything else."
>>
>>The following idea might rescue another few bits from the noise:
>>
>>Call the elt with the max abs value "maxelt", but maxelt keeps its
>>sign.
>>
>>Call the row and column with "maxelt" the "principal row" and the
>>"principal" column. Let 'pi' be the row# for the principal row and
>>'pj' be the col# for the principal col. Note that maxelt = M[pi,pj].
>>
>>Sum the rows as follows:
>>
>>rowsums : rowsums + row[M,i]*signum(M[i,pj])*signum(maxelt)
>>
>>Note that M[i,pj]*signum(M[i,pj]) = abs(M[i,pj]).
>>
>>This makes the sign of the elt of the pj'th column of the i'th
>>row *match* the sign of maxelt, and thereby avoids cancellation.
>>
>>Also, since we aren't scaling these other (smaller) rows up, we
>>won't be amplifying the noise.
>>
>>Ditto for the column sums:
>>
>>colsums : colsums + col[M,j]*signum(M[pi,j])*signum(maxelt)
>>
>>Now use the row sums and column sums (suitably scaled) as our
>>answer.
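>>
>>A direct numpy transcription of the above (just a sketch; 'pi', 'pj'
>>and 'maxelt' are as described, and the final rescaling is one
>>possible choice, not the only one):
>>
>>    import numpy as np
>>
>>    def principal_sums(M):
>>        pi, pj = np.unravel_index(np.argmax(np.abs(M)), M.shape)
>>        maxelt = M[pi, pj]
>>        # Flip each row (column) so its entry in the principal column
>>        # (row) agrees in sign with maxelt, then sum.  The smaller
>>        # rows/columns are not scaled up, so their noise isn't amplified.
>>        rowsums = np.sign(maxelt) * (np.sign(M[:, pj])[:, None] * M).sum(axis=0)
>>        colsums = np.sign(maxelt) * (np.sign(M[pi, :])[None, :] * M).sum(axis=1)
>>        # One way to scale them: outer(colsums, rowsums/scale) ~= M.
>>        scale = np.sign(maxelt) * np.abs(M).sum()
>>        return colsums, rowsums / scale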
>>
>>At 08:37 PM 2/4/2021, Andy Latto wrote:
>>>Given that we know that the error is largest for the smallest elements, and
>>>all the rows are scalar multiples of any given row (and equivalently the
>>>same for the columns), is this algorithm significantly better than:
>>>
>>>Find the element in the matrix with the largest absolute value.
>>>
>>>Use the row and column containing that element, and ignore everything else.
>>>
>>>Andy
>>>
>>>On Thu, Feb 4, 2021 at 9:21 PM Henry Baker <hbaker1(a)pipeline.com> wrote:
>>>> We first show how to solve the problem when M is the outer product of
>>>> *non-negative real* vectors.
>>>>
>>>> Lemma 1.
>>>>
>>>> Suppose we have a matrix M that purports to have the following structure:
>>>>
>>>> [a b c d]' [p q r s] =
>>>>
>>>> [ a p a q a r a s ]
>>>> [ ]
>>>> [ b p b q b r b s ]
>>>> [ ]
>>>> [ c p c q c r c s ]
>>>> [ ]
>>>> [ d p d q d r d s ]
>>>>
>>>> where >>> all of a,b,c,d,p,q,r,s >= 0 <<<
>>>>
>>>> The row sums are
>>>>
>>>> [a (s + r + q + p), b (s + r + q + p), c (s + r + q + p), d (s + r + q + p)]
>>>>
>>>> The column sums are
>>>>
>>>> [(d + c + b + a) p, (d + c + b + a) q, (d + c + b + a) r, (d + c + b + a) s]
>>>>
>>>> The total matrix sum is
>>>>
>>>> (d + c + b + a) (s + r + q + p)
>>>>
>>>> If M is the zero matrix, then M is the outer product of 2 zero vectors.
>>>>
>>>> If M isn't the zero matrix, then at least one row sum > 0, at least one
>>>> col sum > 0, and the total sum > 0.
>>>>
>>>> So we construct U' V = M as follows.
>>>>
>>>> U = the row sums:
>>>>
>>>> [a (s + r + q + p), b (s + r + q + p), c (s + r + q + p), d (s + r + q + p)]
>>>>
>>>> V = the column sums/(total sum):
>>>>
>>>> p q r s
>>>> [-------------, -------------, -------------, -------------]
>>>> s + r + q + p s + r + q + p s + r + q + p s + r + q + p
>>>>
>>>> Or, we can split the scale factor (total sum) evenly:
>>>>
>>>> U = row sums/sqrt(total sum):
>>>>
>>>>  a sqrt(s + r + q + p)  b sqrt(s + r + q + p)  c sqrt(s + r + q + p)  d sqrt(s + r + q + p)
>>>> [---------------------, ---------------------, ---------------------, ---------------------]
>>>>   sqrt(d + c + b + a)    sqrt(d + c + b + a)    sqrt(d + c + b + a)    sqrt(d + c + b + a)
>>>>
>>>> V = col sums/sqrt(total sum):
>>>>
>>>>  sqrt(d + c + b + a) p  sqrt(d + c + b + a) q  sqrt(d + c + b + a) r  sqrt(d + c + b + a) s
>>>> [---------------------, ---------------------, ---------------------, ---------------------]
>>>>   sqrt(s + r + q + p)    sqrt(s + r + q + p)    sqrt(s + r + q + p)    sqrt(s + r + q + p)
>>>>
>>>> So these scaled row-sums and scaled col-sums are the vectors we seek. QED
>>>>
>>>> NB: It should now be clear why the row-sums and col-sums won't work when
>>>> we include negative elements; e.g., there exist matrices M whose row-sums
>>>> and col-sums are both *zero*, so the procedure above fails completely.
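>>>>
>>>> A quick numpy sketch of the Lemma 1 construction (the function name
>>>> and the zero-matrix convention are just illustrative):
>>>>
>>>>     import numpy as np
>>>>
>>>>     def nonneg_rank1(M):
>>>>         """For non-negative M ~= outer(U, V), return such U, V >= 0."""
>>>>         total = M.sum()
>>>>         if total == 0.0:       # M is the zero matrix
>>>>             return np.zeros(M.shape[0]), np.zeros(M.shape[1])
>>>>         # U = the row sums, V = the column sums / (total sum)
>>>>         # (or split the factor evenly with sqrt(total), as above).
>>>>         return M.sum(axis=1), M.sum(axis=0) / total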
>>>> ---
>>>>
>>>> Now we handle the general case, where M consists of both positive and
>>>> negative entries.
>>>>
>>>> We first find a,b,c,d>=0 and p,q,r,s>=0 s.t. [a b c d]' [p q r s] = abs(M),
>>>> where abs(M) is M with all of its entries replaced by their absolute
>>>> values.
>>>>
>>>> Now we assign +- *signs* to the elements a,b,c,d,p,q,r,s of U,V using the
>>>> following scheme.
>>>>
>>>> Process the non-negative vectors U=[a b c d] and V=[p q r s] in descending
>>>> order of absolute value.
>>>>
>>>> The element of M with the largest absolute value is the product of the
>>>> largest element of U and the largest element of V. Assume that this
>>>> element is |b|*|r| = |M[i,j]|.
>>>>
>>>> We need to assign *signs* to b and to r s.t. b*r = M[i,j], i.e., so that
>>>> sign(b)*sign(r) = sign(M[i,j]).
>>>>
>>>> Since this is the first assignment, we can arbitrarily choose sign(b)=+1,
>>>> and then set sign(r)=sign(M[i,j]).
>>>>
>>>> We then process each of the other elements of U and V in descending order
>>>> of absolute values, choosing signs for the remaining elements of U and the
>>>> remaining elements of V so they are *consistent* with the previous choices
>>>> of signs.
>>>>
>>>> If any M[i,j]=0, it has no effect on any choice of signs for elements of
>>>> U,V.
>>>>
>>>> Due to round-off or other errors, the assignments near the end of this
>>>> process -- i.e., when the elements of U and V are very close to zero --
>>>> may turn up inconsistencies. However, these inconsistencies are extremely
>>>> small (assuming that the input matrix was indeed produced as an outer
>>>> product), and by processing the elements of U,V in descending order of
>>>> absolute value, we ensure that the signs are consistent for the *strongest
>>>> signals*, so that any remaining inconsistencies are due to small round-off
>>>> errors.
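>>>>
>>>> A simplified Python sketch of this sign pass (it settles each
>>>> remaining sign against the strongest row and column, which is the
>>>> gist of the descending-order rule; 'nonneg_rank1' is the Lemma 1
>>>> construction above):
>>>>
>>>>     import numpy as np
>>>>
>>>>     def signed_rank1(M):
>>>>         U, V = nonneg_rank1(np.abs(M))       # magnitudes first (Lemma 1)
>>>>         i0, j0 = np.argmax(U), np.argmax(V)  # strongest elements of U, V
>>>>         # Choose sign(U[i0]) := +1, so sign(V[j]) = sign(M[i0, j]), and
>>>>         # sign(U[i]) must satisfy sign(U[i])*sign(V[j0]) = sign(M[i, j0]).
>>>>         sv = np.sign(M[i0, :])
>>>>         su = np.sign(M[:, j0]) * np.sign(M[i0, j0])
>>>>         su[i0] = 1.0
>>>>         return U * su, V * sv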
>>>>
>>>> At 05:02 PM 2/3/2021, Henry Baker wrote:
>>>> >Dan asks: what is the question?
>>>> >
>>>> >Find the (real) vectors U,V whose outer product
>>>> >makes the (real) matrix; i.e.,
>>>> >
>>>> >U' V = M
>>>> >
>>>> >Clearly, for any real alpha,
>>>> >
>>>> >(U/alpha)' (alpha*V) = M
>>>> >
>>>> >so the vectors are determined only up to
>>>> >a single constant factor.
>>>> >
>>>> >Once again, the '=' wants to be as close
>>>> >as possible within the constraints of FP
>>>> >arithmetic.
>>>> >
>>>> >At 03:23 PM 2/3/2021, Henry Baker wrote:
>>>> >>Input: an nxn(*) matrix M of floating point
>>>> >>numbers (positive, negative or zero) which
>>>> >>is *supposed* to have rank exactly one, i.e.,
>>>> >>it is the outer product of two n-element
>>>> >>*unknown* vectors, and hence every row has
>>>> >>a common factor and every column has a common
>>>> >>factor, which factor can be positive, negative
>>>> >>or zero.
>>>> >>
>>>> >>(*) The matrix doesn't have to be square,
>>>> >>but it was easier to describe the square
>>>> >>version of the problem.
>>>> >>
>>>> >>Unfortunately, due to floating point
>>>> >>approximations and other issues,
>>>> >>the values of M have small errors, so
>>>> >>the columns and rows are not *exact*
>>>> >>multiples of their common factors, and
>>>> >>if the common factor is supposed to be
>>>> >>zero, it may actually be a number
>>>> >>exceedingly close to zero but at the
>>>> >>minimum exponent range.
>>>> >>
>>>> >>NB: Due to *underflow*, floating
>>>> >>point algebra has *zero divisors*,
>>>> >>i.e., there exist x/=0, y/=0, s.t.
>>>> >>x*y=0.
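>>>> >>
>>>> >>For instance, in IEEE double
>>>> >>precision (a quick Python check):
>>>> >>
>>>> >>    x = 1e-200; y = 1e-200
>>>> >>    print(x != 0.0 and y != 0.0 and x*y == 0.0)   # True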
>>>> >>
>>>> >>Such a problem can trivially be posed
>>>> >>as an SVD problem, but I wanted a
>>>> >>more direct solution, because I wanted
>>>> >>to get the 'best' fit to the data within
>>>> >>the limits of the floating point
>>>> >>arithmetic.
>>>> >>
>>>> >>I have a quite elegant and
>>>> >>reasonably efficient solution, but
>>>> >>I wanted to see if anyone else had
>>>> >>the same idea.
>>>--
>>> Andy.Latto(a)pobox.com