Two or three points about this:

First, the standard median depends only on the topology of the line. In that spirit, it might be best to leave the answer as either a point or an interval. There are many contexts where the topology is clear but the linear structure is not. In 2 dimensions, any finite set is homeomorphic to any other finite set of the same cardinality, so there can't be an analogous notion of median without more structure. Dan's definition assumes the notion of convexity, which is equivalent to knowing what straight lines are. The only homeomorphisms of the plane that preserve straightness of lines are affine, so the mean would also be preserved. You can't be so agnostic about linearity and get an answer! But more often, 2-dimensional data has natural coordinates. In that case, you may as well look at the median in the two axes. The ordering for 2 projections is a lot less structure than the affine structure, and may be preferable for most applications.

Second: the convex peeling definition is interesting, but how much can it jump around? I suspect it can jump too far to be a very good statistical measure. For example, imagine you have 50 points that are nearly along the x-axis, and a bunch of other points scattered around in the upper half plane. In one arrangement, the 50 could all be on the boundary of the convex hull, but in a nearby arrangement (on an arc curved downward) it could take 25 levels of peeling to remove them all.

Third: if you really have data where the affine structure makes sense, how about looking at the median for every linear projection, and just taking the convex hull of all answers you get as the "median"?

Bill

On Sep 29, 2005, at 3:51 PM, "" <dasimov@earthlink.net> wrote:
Joshua writes:
<< . . . Here is another question about the median: is there a median that makes sense in two or more dimensions?
Suppose (X,Y) ~ f(x,y) where f(x,y) is the continuous joint pdf of the random variables X and Y. Is there a reasonable quantity to call the median? >>
For a *finite uniform* distribution, a reasonable way to generalize the 1D median (on R) to R^n is to use "convex peeling":
do
    Take the convex hull of the remaining data, then remove the data points on its boundary;
while data remains.
When eventually a removal leaves no remaining data, put these last points back and let their vector mean be the median.
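The loop above can be sketched in Python for points in the plane. This is only an illustration under stated assumptions, not a definitive implementation: the names `hull_vertices` and `peeling_median` are made up here, the hull is computed with the standard monotone-chain method, and "data on the boundary" is read as the hull's vertices only, so a point lying in the interior of a hull edge is peeled one layer later. With that reading, collinear data reduces to the ordinary 1D median.

```python
def cross(o, a, b):
    """Twice the signed area of triangle o, a, b (positive if counterclockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def hull_vertices(points):
    """Vertices of the convex hull, via Andrew's monotone chain.

    Assumption: only true corners are returned; points lying strictly
    inside a hull edge are not counted as boundary points.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def peeling_median(points):
    """Peel hull layers until nothing would remain; return the mean of the last layer."""
    remaining = [tuple(p) for p in points]
    if not remaining:
        raise ValueError("need at least one point")
    while True:
        boundary = set(hull_vertices(remaining))
        inner = [p for p in remaining if p not in boundary]
        if not inner:
            # This removal would leave no data: "put these last points back"
            # and take their vector mean.
            n = len(remaining)
            return (sum(x for x, _ in remaining) / n,
                    sum(y for _, y in remaining) / n)
        remaining = inner
```

For the four corners of a square plus a single interior point, the first peel removes the corners and the interior point is returned as the median. Repeated copies of a point are all removed together when that point is a hull vertex.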
(For a continuous distribution, I suspect that by taking an arbitrarily large finite sample, the result of convex peeling should converge, with probability 1, to a point depending only on the original distribution.
Better yet, there's probably some differential equation that implements this limit without resorting to samples. WPT ?)
--Dan
_______________________________________________
math-fun mailing list
math-fun@mailman.xmission.com
http://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun