Closed expression and physical interpretation of the median


Unlike the arithmetic mean, the median of a distribution $n(x)$ of a variable $x\in\mathbb{N}$ over a population of $N$ items has no immediately obvious closed expression, at least not in the way the median is usually introduced to novices.

[Figure: the $N$ items arranged by their value $x$]

The arithmetic mean is just

$$\bar{x} = \frac{1}{N}\sum_{x=0}^{\infty}x\ n(x)$$

with $N = \sum_{x=0}^{\infty}n(x)$. Interpreted physically, the arithmetic mean is the position of the point of mechanical equilibrium $\blacktriangle$ of the above arrangement of items (each having the same weight), shifted by $0.5$:

[Figure: the arrangement of items balanced at the point of equilibrium $\blacktriangle$]
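As a small numerical check of the formula for $\bar{x}$, here is a sketch in Python. The dictionary `counts` (mapping each value $x$ to $n(x)$) and its numbers are made up for illustration; they are not taken from the figures in the question.

```python
from fractions import Fraction

# Hypothetical distribution n(x): value x -> number of items with that value.
counts = {0: 1, 1: 3, 2: 2, 3: 1}

def mean_from_counts(counts):
    """Arithmetic mean (1/N) * sum_x x * n(x), returned as an exact fraction."""
    total_items = sum(counts.values())                    # N = sum_x n(x)
    weighted_sum = sum(x * n for x, n in counts.items())  # sum_x x * n(x)
    return Fraction(weighted_sum, total_items)

print(mean_from_counts(counts))  # 10/7
```

Using `Fraction` keeps the result exact, which makes it easy to compare with a hand calculation.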

Now let $N$ be odd. Then the median is usually introduced like this:

Sort all items by their value $x$. The median is the value of the $\frac{N+1}{2}$th item.

Interpreted physically, the median is the value of the item at the point of mechanical equilibrium of this arrangement of items:

[Figure: the sorted items balanced at the median item]
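The procedural definition can be sketched directly. The list `items` below is hypothetical example data with $N = 7$; the function name is likewise only illustrative.

```python
def median_by_sorting(items):
    """Value of the ((N+1)/2)-th item after sorting; N is assumed to be odd."""
    n_items = len(items)
    if n_items % 2 == 0:
        raise ValueError("this definition assumes an odd number of items")
    ordered = sorted(items)
    # The ((N+1)/2)-th item in 1-based counting has 0-based index (N-1)/2.
    return ordered[(n_items - 1) // 2]

items = [1, 3, 0, 2, 1, 2, 1]    # hypothetical data, N = 7
print(median_by_sorting(items))  # 1
```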


It turns out that there is a concise closed expression for the median that follows directly from this definition and interpretation.

First define the distribution function $n_\leq(X)$, which gives the number of items with value $x \leq X$ (rather than $x = X$, as in the original distribution):

$$n_\leq(X) = \sum_{x=0}^X n(x)$$

The plot of $n_\leq(x)$ looks like this:

[Figure: plot of $n_\leq(x)$]
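The cumulative count $n_\leq(X)$ is a running sum of $n(x)$. A minimal sketch, assuming the distribution is stored as a hypothetical dictionary `counts` from value to count:

```python
from itertools import accumulate

# Hypothetical distribution n(x): value x -> number of items with that value.
counts = {0: 1, 1: 3, 2: 2, 3: 1}

def cumulative_counts(counts):
    """Return {X: n_<=(X)} for X = 0 .. largest value present."""
    max_value = max(counts)
    # Values that do not occur contribute n(x) = 0.
    per_value = [counts.get(x, 0) for x in range(max_value + 1)]
    return dict(enumerate(accumulate(per_value)))

print(cumulative_counts(counts))  # {0: 1, 1: 4, 2: 6, 3: 7}
```

The last entry equals $N$, and the sequence never decreases, which is the monotonicity used in the next step.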

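Since this function is monotonically non-decreasing, there is a well-defined inverse $n_\leq^{-1}$ (taking $n_\leq^{-1}(n)$ to be the smallest $x$ with $n_\leq(x) \geq n$, because $n_\leq$ is a step function), which looks like this (reflection across the diagonal):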

[Figure: plot of $n_\leq^{-1}$]

or like this:

[Figure: alternative plot of $n_\leq^{-1}$]

or like this:

[Figure: alternative plot of $n_\leq^{-1}$]

Note that this inverse function has a simple interpretation: it gives the value of the $n$th item when the items are sorted by $x$ (with ties broken arbitrarily).

Now the median is just the value of the inverse function $n_\leq^{-1}$ (for which we have a closed expression) for the argument $\frac{N+1}{2}$:

[Figure: the median read off $n_\leq^{-1}$ at the argument $\frac{N+1}{2}$]
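The construction can be sketched in Python as follows. This assumes the inverse is taken as the smallest $x$ with $n_\leq(x) \geq n$; the names `value_of_nth_item` and `median_from_counts` and the data in `counts` are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate

def value_of_nth_item(counts, rank):
    """Inverse of n_<=: smallest x with n_<=(x) >= rank (rank is 1-based)."""
    max_value = max(counts)
    cumulative = list(accumulate(counts.get(x, 0) for x in range(max_value + 1)))
    if not 1 <= rank <= cumulative[-1]:
        raise ValueError("rank must lie between 1 and N")
    # cumulative is non-decreasing, so a binary search finds the first value x
    # whose cumulative count reaches the requested rank.
    return bisect_left(cumulative, rank)

def median_from_counts(counts):
    """Median as the inverse cumulative count evaluated at (N+1)/2; N assumed odd."""
    total_items = sum(counts.values())
    if total_items % 2 == 0:
        raise ValueError("N is assumed to be odd")
    return value_of_nth_item(counts, (total_items + 1) // 2)

counts = {0: 1, 1: 3, 2: 2, 3: 1}  # hypothetical data, N = 7
print(median_from_counts(counts))  # 1
```

Evaluating `value_of_nth_item` at the ranks 1 through 7 yields 0, 1, 1, 1, 2, 2, 3, which is exactly the sorted list of items, so no explicit sort is needed.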

[Side note: The number $n$ changes its character in the course of the transformations $n(x) \rightarrow n_\leq \rightarrow n_\leq^{-1} \rightarrow x(n)$. In $n(x)$, $n$ is a cardinal number; in $x(n)$, $n$ is an ordinal number.]

My questions are:

  1. Is there a standard reference where this way of gently introducing the median is chosen?

  2. Can it be understood physically that the arithmetic mean equals the median iff $n(x)$ is symmetric with respect to some $x_0$, i.e. $n(x_0 - x) = n(x_0 +x)$?

  3. What is the inverse function $p_\leq^{-1}$ called, and how is it interpreted, in the case of a continuous probability distribution $p(x)$ (when there are no longer any items to be sorted)?

Best answer

1) I'm not sure about standard references. But the usual definition

Sort all items by their value $x$. The median is the value of the $\frac{N+1}{2}$th item.

is excessively procedural and suggests a slower-than-necessary algorithm (a full sort, where a linear-time selection would do). A closely related and better definition is:

The median is an $x$ such that at least half the values are $\ge x$, and at least half the values are $\le x$.
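This definition can be checked by counting alone, without sorting. A minimal sketch, in which `is_median` is a hypothetical helper and `items` is made-up data:

```python
import statistics

def is_median(items, candidate):
    """True if at least half the items are >= candidate and at least half are <= candidate."""
    n_items = len(items)
    at_least = sum(1 for value in items if value >= candidate)
    at_most = sum(1 for value in items if value <= candidate)
    # Compare 2 * count with N to avoid fractional halves when N is odd.
    return 2 * at_least >= n_items and 2 * at_most >= n_items

items = [1, 3, 0, 2, 1, 2, 1]  # hypothetical data, N = 7
print([x for x in range(4) if is_median(items, x)])  # [1]
print(is_median(items, statistics.median(items)))    # True
```

For an even number of items the definition admits more than one median: with `[1, 2, 3, 4]`, every value from 2 to 3 passes the check.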

2) Yes. Here is one physical interpretation with the symmetry:

Suppose you have a 2-d object of uniform density, with a vertical line of symmetry through $p$. Then putting your finger directly underneath $p$ both balances the object [$p$ is in the same line as the mean location] and gets equal weight on both sides [$p$ is in the same line as the median location]. So the mean and median locations are the same.

By contrast, if you take an L shape, putting your finger in the median location will cause the L to fall over to the right.
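A discrete analogue of both cases can be checked numerically. The three distributions below are hypothetical examples chosen for illustration; the third one is added only to show that equal mean and median do not by themselves force symmetry.

```python
from fractions import Fraction
import statistics

def mean_and_median(counts):
    """Expand {value: count} into a list of items and return (mean, median)."""
    items = [x for x, n in counts.items() for _ in range(n)]
    return Fraction(sum(items), len(items)), statistics.median(items)

symmetric = {1: 1, 2: 3, 3: 1}         # symmetric about x0 = 2
l_shaped = {0: 4, 1: 1, 2: 1, 3: 1}    # tall column at 0 with a long foot to the right
asymmetric = {1: 1, 2: 1, 4: 2, 9: 1}  # not symmetric, yet mean equals median

for name, counts in [("symmetric", symmetric),
                     ("L-shaped", l_shaped),
                     ("asymmetric", asymmetric)]:
    mean, median = mean_and_median(counts)
    print(f"{name}: mean = {mean}, median = {median}")
# symmetric: mean = 2, median = 2
# L-shaped: mean = 6/7, median = 0
# asymmetric: mean = 4, median = 4
```

In the L-shaped case the balance point (the mean) lies to the right of the median, which matches the picture of the L tipping over to the right when supported at the median.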

3) The inverse of the cumulative probability function is called the quantile function. There's rarely a closed form for it, even when the cumulative probability function has a closed form, but even so I'm surprised it's so often neglected.
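As an illustration, here is a sketch with one distribution whose quantile function does have a closed form (the exponential distribution) and one where it is evaluated numerically (the normal distribution, via `statistics.NormalDist.inv_cdf` from the Python standard library). The function name `exponential_quantile` and the parameter values are assumptions made for the example.

```python
import math
from statistics import NormalDist

def exponential_quantile(p, rate=1.0):
    """Inverse of the exponential CDF F(x) = 1 - exp(-rate * x), for 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -math.log1p(-p) / rate

# The median is the quantile function evaluated at p = 1/2.
print(exponential_quantile(0.5))                   # ln(2), about 0.693
print(NormalDist(mu=3.0, sigma=2.0).inv_cdf(0.5))  # the median of a normal distribution is mu
```

In the continuous case the quantile function plays the role of $n_\leq^{-1}$: its argument is a probability between 0 and 1 rather than a rank between 1 and $N$.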