If you really want to do it right, you need to weight the error in the x-direction and the error in the y-direction inversely as the precision of the data point in those directions. There's a book by Wolberg, "Prediction Analysis", which goes into all this in detail.

Brent

On 10/4/2018 9:41 AM, Lucas, Stephen K - lucassk wrote:
And let's not forget that there are good statistical reasons for minimizing the sums of squares of vertical distances, since this gives you the best estimate if y is actually a linear function of x plus Gaussian noise. If you assumed the noise was in the x data, not the y, then looking at minimizing horizontal distance would be appropriate, but almost always it is the y variable that is considered noisy, and x is known exactly.
Finding the line of best fit that minimizes perpendicular distance is indeed hairier, and is something that comes up when fitting curves to data: higher-order polynomials or circular arcs, for example. This moves away from the original statistical reason the least-squares line is so useful, but sometimes you just want to fit a curve. However, I vaguely remember reading somewhere that minimizing perpendicular distance tends to give best-fit curves that underestimate the curvature, and that more sophisticated methods are required for arbitrary curves. Unfortunately, I can't find the reference at this time.
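For what it's worth, the weighted errors-in-variables fit Brent describes has a standard closed form for a straight line, usually called Deming regression. A minimal sketch in plain Python (the function name is mine; `delta` is assumed to be the ratio of the y-error variance to the x-error variance, and `sxy` is assumed nonzero):

```python
import math

def deming_fit(xs, ys, delta=1.0):
    """Fit y = m*x + b minimizing squared errors in BOTH x and y,
    weighted by delta = var(y-errors) / var(x-errors).
    delta = 1 reduces to orthogonal (perpendicular-distance) regression."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))  # assumed nonzero
    m = (syy - delta * sxx
         + math.sqrt((syy - delta * sxx) ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    return m, ybar - m * xbar

# Data lying exactly on y = 2x + 1 recovers the line for any delta:
print(deming_fit([0, 1, 2, 3], [1, 3, 5, 7]))  # → (2.0, 1.0)
```

Note that when the data are exact the choice of `delta` doesn't matter; it only changes the answer when there is scatter to apportion between the two directions.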
On Oct 4, 2018, at 12:28 PM, Tom Duff <td@pixar.com> wrote:
Standard least squares measures the *vertical* distance to the line from each point and minimizes the sum of the squares. Interchanging x and y is equivalent to switching to measuring the *horizontal* distance. You can instead measure the perpendicular distance, which is invariant under interchanging x and y (or any other isometry for that matter) but the solution is correspondingly hairier.
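The perpendicular-distance fit also has a closed-form slope for a line, and its x/y symmetry is easy to check numerically. A sketch in plain Python (my own naming; `sxy` assumed nonzero, example data mine):

```python
import math

def orthogonal_slope(xs, ys):
    """Slope of the line minimizing total squared PERPENDICULAR distance."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))  # assumed nonzero
    return (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)

xs, ys = [0, 1, 2, 3], [0, 1, 1, 2]
m = orthogonal_slope(xs, ys)
m_swapped = orthogonal_slope(ys, xs)  # interchange x and y
# The two slopes are exact reciprocals (m * m_swapped = 1), confirming
# the fit is invariant under interchanging the axes.
```

For these particular points the two slopes work out to (√45 − 3)/6 and (√45 + 3)/6, whose product is (45 − 9)/36 = 1.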
On Thu, Oct 4, 2018 at 9:24 AM Henry Baker <hbaker1@pipeline.com> wrote:
In common discussions of least squares, the parameters (m,b) are estimated for the equation y = m*x+b using as data various data points [x1,y1], [x2,y2], [x3,y3], etc.
For example, in Wikipedia (where m=beta2 and b=beta1):
https://urldefense.proofpoint.com/v2/url?u=https-3A__en.wikipedia.org_wiki_L...
So far, so good.
Now, if I merely exchange x and y, then my equation is x = m'*y+b', where we should have m' = 1/m and b' = -b/m. (Let's ignore the case where the best-fit m = 0.)
However, if I then estimate (m',b') using the same least squares method, I don't get (1/m,-b/m) !
So either I'm doing something wrong, or perhaps there is a least squares method that treats x and y symmetrically??
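The discrepancy Henry observes is easy to reproduce, and in fact the two ordinary-least-squares slopes multiply to the squared correlation r², not to 1. A sketch in plain Python (example data mine):

```python
def ols_slope_intercept(xs, ys):
    """Ordinary least squares for y = m*x + b (vertical residuals only)."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    m = sxy / sxx
    return m, ybar - m * xbar

xs, ys = [0, 1, 2, 3], [0, 1, 1, 2]
m, b = ols_slope_intercept(xs, ys)    # regress y on x: m = 0.6
mp, bp = ols_slope_intercept(ys, xs)  # regress x on y: m' = 1.5, not 1/m ≈ 1.667
# m * m' equals r^2 (here 3^2 / (5 * 2) = 0.9), so the two regressions
# only agree when the data lie exactly on a line (r^2 = 1).
```

So it isn't an arithmetic mistake: minimizing vertical residuals and minimizing horizontal residuals are genuinely different problems whenever the data have any scatter.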
math-fun mailing list math-fun@mailman.xmission.com https://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun