Quoting Roland Silver <rollos@starband.net>:
Bernie Cosell, All you need to do is compute on the fly the sum of squares S = x[1]^2 + ... + x[n]^2 of the empirical values, as well as the sum of values T = x[1] + ... + x[n]. The mean is M = T/n, and the variance is V = ((x[1] - M)^2 + ... + (x[n] - M)^2)/(n-1). The SD is the square root of V. If you expand the terms in the equation for V, you get V = (S - 2*M*T + n*M^2)/(n-1) = (S - n*M^2)/(n-1), since T = n*M, which you can compute in one pass.
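In Python, the one-pass recipe above might look like the sketch below (the function name and variable names are mine, not from the thread; it uses the (n-1) denominator as Silver writes it):

```python
import math

def one_pass_stats(xs):
    # Accumulate n, the sum T, and the sum of squares S in a single pass.
    n = 0
    T = 0.0
    S = 0.0
    for x in xs:
        n += 1
        T += x
        S += x * x
    M = T / n                      # mean: M = T/n
    V = (S - n * M * M) / (n - 1)  # variance: V = (S - n*M^2)/(n-1)
    return M, V, math.sqrt(V)      # SD is the square root of V
```

Note that the subtraction S - n*M^2 can lose precision when the mean is large relative to the spread of the data, which is a known caveat of this one-pass formula.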
On Thursday, Jul 10, 2003, at 09:20 America/Denver, Bernie Cosell wrote:
... [an inquiry] ...
I don't know why people insist on putting that (n-1) in the denominator of the variance formula. Both mean and variance (as well as the second moment) are >averages< of certain quantities, which means dividing by the number of data points, which is n.

At first sight you think you can't accumulate a variance, since it is the second moment about the mean, which you don't know yet. But one of Newton's identities saves the day: the total variance satisfies a quadratic identity with the total mean and the total second moment, both of which can be accumulated term by term. That is more or less the formula cited in these messages, but with n and not (n-1). To check: the variance of >one< datum is >zero<.

That >(n-1)< is a classical, canonical source of confusion among statisticians, who often don't bother with mathematics, and it was enshrined in the folklore by a publication of the National Bureau of Standards from some years ago, when one turned to the government for help on these mysteries.

The reason for the confusion is this: the Variance of the Mean is NOT the Mean of the Variance, and if the two quantities are both computed and then compared, the roles of n and (n-1) are readily apparent. This is why talking of "estimators" does not exactly answer the question first proposed.

- hvm
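The n-versus-(n-1) distinction hvm describes can be made concrete with two small Python functions (my own illustration, not code from the thread): dividing by n treats the variance as an average, while dividing by (n-1) gives the estimator statisticians favor. Note hvm's check: with the n denominator, the variance of a single datum is zero; with (n-1), it is undefined (division by zero).

```python
def population_variance(xs):
    # Divide by n: the variance as an average, per hvm's argument.
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / n

def sample_variance(xs):
    # Divide by n-1: the conventional "unbiased" estimator.
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / (n - 1)
```

For example, population_variance([7.0]) is 0.0, while sample_variance([7.0]) raises ZeroDivisionError.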