Variance of a sample median


Suppose $X_1,\ldots,X_n\sim\text{i.i.d.}\operatorname N(\mu,\sigma^2).$

I think I've only ever seen one way to prove that the sample mean of $X_1,\ldots,X_n$ has a smaller variance than does the sample median, and it uses some moderately hefty results and doesn't say what the variance of the sample median is.

Specifically, two theorems are used: the Lehmann–Scheffé theorem from the theory of estimation, and the one-to-one-ness of the two-sided Laplace transform:

$$ \left( \mathcal L g\right)(\theta) = \int_{-\infty}^{+\infty} g(x) e^{\theta x} \, dx. $$

That is, if $\mathcal L g = \mathcal L h$ then $g=h$ a.e.

Is there an elementary and efficient way to show that the sample median has a larger variance than the sample mean?

And (here's the question on a specific definite integral, justifying one of the tags) is there a closed form for the variance of the sample median?

(Here someone could possibly object that this whole thing is trivially reducible to the case where $\mu=0$ and $\sigma=1.$ The two theorems mentioned above both have hypotheses saying either that something doesn't change as $(\mu,\sigma^2)$ changes or that something is true of all values of $(\mu,\sigma^2).$ So I suppose you could construe this question like this: suppose the hypothesis $\mu=0,\sigma=1$ is assumed, since there is clearly no loss of generality. How do you prove the result then? Is this a case where it is better to forgo a simplifying assumption that discards no generality?)


There is 1 solution below


Some clues: The mean $A_n=\bar X_n$ of a normal sample of size $n$ is unbiased and based on the sufficient statistic for $\mu,$ so how could the sample median $H_n,$ which is also unbiased but is not based on the sufficient statistic, have a smaller variance?

Moreover, a basic theorem on asymptotic normality of order statistics (except for max and min) states that, taking $\sigma=1,$ the ratio $\frac{H_n - \mu}{c/\sqrt{n}}$ converges in distribution to standard normal as $n\rightarrow\infty,$ where $c^2 = 1/\left(4\phi(0)^2\right) = 2\pi/4=\pi/2$ and $\phi$ is the standard normal density. So the asymptotic variance is $c^2/n \approx 1.571/n,$ compared with $1/n$ for $\bar X_n.$
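For reference, here is how that constant comes out of the general form of the theorem. This is only a sketch: it assumes a population density $f$ that is continuous and positive at the population median $m$ (the symbols $f$ and $m$ are introduced here for illustration).

$$ \sqrt n\,(H_n - m) \xrightarrow{\;d\;} \operatorname N\!\left(0,\ \frac{1}{4 f(m)^2}\right). $$

For $\operatorname N(\mu,\sigma^2)$ we have $m=\mu$ and $f(\mu) = 1/\left(\sigma\sqrt{2\pi}\right),$ so for large $n$

$$ \operatorname{Var}(H_n) \approx \frac{1}{4 n f(\mu)^2} = \frac{\pi\sigma^2}{2n}, \qquad \frac{\operatorname{Var}(\bar X_n)}{\operatorname{Var}(H_n)} \to \frac{2}{\pi} \approx 0.637. $$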

The following simulation, based on a million standard normal samples of size $n=100,$ approximates the variance of the sample median (the simulated value should be accurate to about three significant digits).

    set.seed(2020)
    h = replicate(10^6, median(rnorm(100)))
    var(h)
    [1] 0.01547719  # approx. 1.571/100

Similarly for $n=1000$:

    set.seed(714)
    h = replicate(10^6, median(rnorm(1000)))
    var(h)
    [1] 0.001568231  # approx. 1.571/1000
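
Simulated values like these can also be checked against a numerical integral, at least for odd $n.$ The following is a minimal sketch, not a definitive method. It assumes $\mu=0,$ $\sigma=1$ and $n=2m+1,$ so that the median is the $(m+1)$th order statistic and $\Phi(H_n)\sim\operatorname{Beta}(m+1,m+1).$ Since $E(H_n)=0$ by symmetry,

$$ \operatorname{Var}(H_n) = \int_0^1 \left(\Phi^{-1}(u)\right)^2 \frac{(2m+1)!}{(m!)^2}\,u^m(1-u)^m\,du. $$

The function name `median_var_exact` is made up for illustration.

```r
# Variance of the sample median of n i.i.d. N(0,1) values, for odd n,
# by numerical integration. Hypothetical helper for illustration only.
median_var_exact <- function(n) {
  stopifnot(n >= 1, n %% 2 == 1)   # this formula assumes n = 2m + 1
  m <- (n - 1) / 2
  # Phi(H_n) ~ Beta(m + 1, m + 1), so Var(H_n) = E[qnorm(U)^2] for such a U
  integrand <- function(u) qnorm(u)^2 * dbeta(u, m + 1, m + 1)
  integrate(integrand, lower = 0, upper = 1)$value
}

n <- 101
c(numerical  = median_var_exact(n),  # integral above, evaluated numerically
  asymptotic = pi / (2 * n),         # large-sample approximation
  mean       = 1 / n)                # variance of the sample mean
```

Note that the simulations above use even $n,$ where `median` averages the two middle observations, so a comparison with this odd-$n$ integral is only approximate.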