There is a whole class of "centers," which differ depending upon which "moment" (or combination of moments) is being considered. Of course, for a symmetrical distribution all of these centers coincide, but highly asymmetrical distributions will have different centers (assuming that such a center even exists).

The major problem with using classical means, medians, standard deviations, etc., for distributions with heavy tails (e.g., Zipf's Law) is that they don't do a very good job of characterizing the distribution. You need an entirely new language to describe them.

At 08:49 AM 9/29/2005, Mike Speciner wrote:
So, if y(x) is the histogram, the median is the m such that
integral(x<m) y(x) = integral(x>m) y(x)
while the mean is the m such that
integral(x<m) |x-m|*y(x) = integral(x>m) |x-m|*y(x)
This suggests a whole family of averages (using various functions of (x-m) for the weighting), though what use they might have escapes me.
--ms
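The family of averages described above can be sketched numerically. The following is only an illustration under assumptions not in the original message: the weighting is taken to be |x-m|^p (p=0 gives the median condition, p=1 the mean condition), the histogram is the two-rectangle example from elsewhere in this thread, and the names imbalance and center are hypothetical.

```python
# Histogram as (left edge, right edge, height) bars.
# Assumed example: the two rectangles described elsewhere in this thread.
bars = [(0.5, 1.5, 2.0), (1.5, 2.5, 1.0)]


def imbalance(m, bars, p):
    """Integral of |x-m|**p * y(x) over x<m, minus the same over x>m."""
    left = right = 0.0
    for lo, hi, h in bars:
        a, b = lo, min(hi, m)  # part of this bar lying left of m
        if b > a:
            left += h * ((m - a) ** (p + 1) - (m - b) ** (p + 1)) / (p + 1)
        a, b = max(lo, m), hi  # part of this bar lying right of m
        if b > a:
            right += h * ((b - m) ** (p + 1) - (a - m) ** (p + 1)) / (p + 1)
    return left - right


def center(bars, p, tol=1e-12):
    """Bisect for the m that balances the two sides (imbalance rises with m)."""
    lo = min(bar[0] for bar in bars)
    hi = max(bar[1] for bar in bars)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if imbalance(mid, bars, p) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for p in (0, 1, 2):
    print(p, center(bars, p))
```

With this histogram, p=0 lands on 5/4 and p=1 on 4/3 (the ordinary mean); larger p gives centers that are pulled further toward the tail.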
David Gale wrote:
Jim, what is the Propp median if there are m zeros and m fives (and zero everything else)?
Dan, if you're going to bring in averages at all then why not go all the way and use THE average? But maybe the CDC was using some sort of hybrid like the one you suggest.
D
At 09:14 PM 9/28/2005, you wrote:
The picture was supposed to show a rectangle of width 1 and height 2 whose bottom is centered at x=1, and to the right of it, a rectangle of width 1 and height 1 whose bottom is centered at x=2.
The base of the first rectangle goes from x=1/2 to x=3/2, and the base of the second rectangle goes from x=3/2 to x=5/2.
The total area under the histogram is (1)(2)+(1)(1) = 3.
The area to the left of the line x=5/4 is (5/4-1/2)(2) = 3/2, which is half of the total area. So x=5/4 is the "median".
Jim
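The arithmetic above can be checked with exact fractions. This is a minimal sketch, assuming the "median" meant here is the vertical line that bisects the area under the histogram; the function name area_median and the (left, right, height) bar representation are illustrative choices, not from the original message.

```python
from fractions import Fraction as F

# Bars as (left edge, right edge, height), matching the picture described above.
bars = [(F(1, 2), F(3, 2), F(2)), (F(3, 2), F(5, 2), F(1))]


def area_median(bars):
    """Leftmost x such that half of the total area lies to its left."""
    total = sum((hi - lo) * h for lo, hi, h in bars)
    if total == 0:
        raise ValueError("histogram has no area")
    need = total / 2  # area still to be accumulated
    for lo, hi, h in bars:
        area = (hi - lo) * h
        if area >= need:
            return lo + need / h  # half-area point falls inside this bar
        need -= area
    raise ValueError("unreachable for a histogram with positive area")


print(area_median(bars))  # 5/4
```

Under the same assumption, m zeros and m fives drawn as unit-width bars centered at 0 and 5 give a balancing line that is not unique: this function returns the leftmost one, x=1/2, but any x between 1/2 and 9/2 splits the area equally.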