[math-fun] Combining results of scientific experiments -- CODATA claimed to be wrong
Combining results of scientific experiments

CODATA tells us the values of physical constants, such as the proton radius, based on combining the results of experiments by different people. I will argue here that they are doing it wrong, and that consequently all the scientific "fundamental constants" from CODATA are screwed up.

1. The way they do it (naive description) is: each experiment gives a value V and an error bar +-B. They then weighted-average all the values, using weights = 1/B^2, to obtain the final value Vcombined. The combined error bar comes from the same weights: 1/Bcombined^2 = sum of the 1/B^2, i.e. Bcombined = 1/sqrt(sum 1/B^2), which is smaller than any individual B.

2. The way they do it (less naive): many of the physical constants and measured quantities are theoretically related -- specifically, the logs of these quantities obey certain known linear relations. So there is a high-dimensional space of all these log-quantities which, thanks to the linear relations, is really a lower-dimensional subspace (still pretty high-dimensional). They work in this space using "error bars" which really are ellipsoids. There are also additional issues related to handling "systematic" versus "statistical" errors.

3. The less-naive version reduces to the naive description when no linear relations are involved.

4. What is wrong with that? The underlying justification for the weighted-average combining method in (1) is that each measurement is assumed to be normally distributed with the claimed mean and standard deviation. Under that assumption the weighted average is the maximum-likelihood estimate, which for normals also happens to be the Bayesian estimate. But in fact, that assumption is JUST FALSE. Example: the CODATA 2006 value of 1/alpha was 137.035 999 679 (94); the CODATA 2010 value, based mostly on Rb spectroscopy, is 137.035 999 049 (90). These disagree at 4.84 sigma -- roughly 99.9999% confidence that those were NOT normal errors. There are a ton of such examples.
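As a sanity check on (1) and on the 4.84-sigma figure in (4), here is a short Python sketch. The two 1/alpha values and their error bars are the ones quoted above; the combination function is simply the standard inverse-variance rule, not CODATA's actual procedure:

```python
import math

def inv_var_combine(vals, sigmas):
    """Standard Gaussian-assumption combination: weights = 1/sigma^2,
    combined error bar = 1/sqrt(sum of weights)."""
    ws = [1.0 / s**2 for s in sigmas]
    v = sum(w * x for w, x in zip(ws, vals)) / sum(ws)
    b = 1.0 / math.sqrt(sum(ws))
    return v, b

# CODATA values of 1/alpha quoted above; "(94)" means +-0.000 000 094
v2006, b2006 = 137.035999679, 94e-9
v2010, b2010 = 137.035999049, 90e-9

# tension between the two values, in combined standard deviations
n_sigma = abs(v2006 - v2010) / math.hypot(b2006, b2010)  # about 4.84

# what the naive rule would report as the combined value and error bar
v, b = inv_var_combine([v2006, v2010], [b2006, b2010])
```

Note that the combined error bar b comes out near 65e-9, smaller than either input -- the naive rule happily reports increased precision even when the inputs flatly contradict each other.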
Experimenters are constantly getting their error estimates wrong, or the statistics are non-Gaussian -- which amounts to the same thing. The number of times experimenters have been off by more than 10 sigma is large, massively refuting the Gaussianity assumption.

5. So the correct thing to do is to combine results NOT under an assumption of Gaussianity, but under the CORRECT assumption, based on the historical record, of the true non-Gaussian error distribution. (Frankly, the correct distribution looks more like a Cauchy than a Gaussian...) I think a decent model would have power-law tails, where the correct power would need to be determined by a fit to historical data. I would suggest a density that matches the Gaussian out to about 1.5 sigma, with power-law tails surgically grafted on beyond that. The class of "stable densities" also seems theoretically motivated.

6. CODATA implicitly admits this already -- they say they discard data as "outliers that they do not believe," based on their inner prejudices and judgments. If they were using the right non-Gaussian assumption, they would not need to do that discarding.

7. So... how should data be combined if we instead assume power-law tails? One caution: a density with power-law tails is not log-concave, so the log-likelihood is no longer concave and can have multiple local maxima; the maximization therefore needs more care than in the Gaussian case, though for a one-dimensional location parameter a simple grid or bracketing search suffices. The maximum-likelihood estimate can still be expressed as a weighted average of the measurements, but with different, data-dependent weights that automatically downweight outliers.

8. And how should errors be combined? One answer: use whatever weights you used to combine the data to combine the variances as well. A different answer: if we are using "stable densities," then linear combinations of independent stable variables are again stable, so you just state the parameters of the combined density. With (non-Gaussian) stables there would actually be no variance (it would be infinite), and the meaning of the "error bar" would be something else, such as the scale parameter.
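To illustrate points 7 and 8 concretely, here is a toy Python sketch using a pure Cauchy error model as a stand-in for the proposed heavy-tailed density. The four measurement values are invented for illustration; the point is that the Cauchy maximum-likelihood location ignores the wild outlier, while the Gaussian-assumption weighted mean is dragged toward it. The stable-closure property of point 8 is shown for the Cauchy case, where it also demonstrates that averaging does not shrink the "error bar" at all:

```python
import math

def inv_var_mean(vals, sigmas):
    # Gaussian assumption: inverse-variance weighted average
    ws = [1.0 / s**2 for s in sigmas]
    return sum(w * x for w, x in zip(ws, vals)) / sum(ws)

def cauchy_nll(mu, vals, scales):
    # negative log-likelihood under independent Cauchy(mu, s_i) errors,
    # dropping mu-independent constants
    return sum(math.log(1.0 + ((x - mu) / s)**2) for x, s in zip(vals, scales))

def cauchy_mle(vals, scales, lo, hi, steps=20000):
    # The Cauchy log-likelihood is NOT concave, so rather than hill-climb
    # (and risk a local optimum) we brute-force a grid over the location.
    best_mu, best = lo, float("inf")
    for i in range(steps + 1):
        mu = lo + (hi - lo) * i / steps
        nll = cauchy_nll(mu, vals, scales)
        if nll < best:
            best, best_mu = nll, mu
    return best_mu

def combine_cauchy(params):
    # Point 8: the equal-weight average of independent Cauchy(l_i, s_i)
    # variables is exactly Cauchy(mean of l_i, mean of s_i) -- stables
    # combine into stables, and averaging does not shrink the scale.
    n = len(params)
    return (sum(l for l, s in params) / n, sum(s for l, s in params) / n)

# three consistent measurements plus one wild >100-sigma outlier (made up)
vals   = [10.0, 10.1, 9.9, 25.0]
sigmas = [0.1, 0.1, 0.1, 0.1]

g = inv_var_mean(vals, sigmas)           # dragged to 13.75 by the outlier
c = cauchy_mle(vals, sigmas, 5.0, 30.0)  # stays near 10
```

The heavy-tailed estimate automatically does what CODATA does by hand in point 6 -- it discounts the outlier -- but by likelihood arithmetic rather than by judgment call.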
What I am suggesting is, in effect, a whole new way to do statistics, but it also clearly seems needed.

-- Warren D. Smith
http://RangeVoting.org <-- add your endorsement (by clicking "endorse" as 1st step)
How much difference would your suggestion make to the currently agreed values? WFL

On 11/1/13, Warren D Smith <warren.wds@gmail.com> wrote:
_______________________________________________
math-fun mailing list
math-fun@mailman.xmission.com
http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun
participants (2)
- Fred Lunnon
- Warren D Smith