You sample from a uniform distribution $[0, d]$ $n$ times. What is your best estimate of $d$, based only on the variance of the samples drawn?


Suppose I have $X_1, X_2, \ldots, X_n$ i.i.d. with $X_i \sim \text{Uniform}[0,d]$.

Since $E[X] = \frac{d}{2}$, an obvious estimator for $d$ is $2\bar{X}$, where $\bar{X}$ is the sample mean.
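A quick sanity check of this mean-based estimator (illustrative only; the sample size and seed here are arbitrary choices):

```r
# Mean-based estimator: 2 * sample mean should be close to d.
set.seed(1)
d <- 10
x <- runif(1000, 0, d)
est_mean <- 2 * mean(x)
est_mean  # close to d = 10
```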

How would you go about estimating $d$ using the sample variance instead? I know that $\mathrm{Var}[X] = \frac{d^2}{12}$. If you were to estimate $d$ using $\sqrt{12\cdot S^2}$, where $S^2$ is the sample variance of $X$, you would get a biased estimate of $d$ by Jensen's inequality, since the square root is concave. Is there some "correction" you can add to this estimate to remove the bias?
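A small simulation (illustrative only; the choices of $n$, $d$, and the number of replications are arbitrary) makes the Jensen's-inequality bias visible: $S^2$ is unbiased for $d^2/12$, but because $\sqrt{\cdot}$ is concave, $E[\sqrt{12\,S^2}] < d$, with the bias most pronounced at small $n$.

```r
# Simulate the downward bias of sqrt(12 * S^2) for small samples.
set.seed(1)
d <- 10; n <- 5; m <- 1e5
est <- replicate(m, sqrt(12 * var(runif(n, 0, d))))
mean(est)  # noticeably below d = 10
```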


With a random sample from $\mathsf{Unif}(0,\delta),$ if you insist on using the sample variance $S^2$ to estimate $\delta$ you can do it, but it isn't the best way to estimate $\delta.$

Notice that the variance of $\mathsf{Unif}(0,\delta)$ is $\sigma^2 = \delta^2/12,$ so the method of moments estimator is $\tilde \delta = 2\sqrt{3}\,S,$ where $S$ is the sample standard deviation.

Let's try it with a huge sample of size $n = 1000$ from $\mathsf{Unif}(0, 10)$ simulated in R. The estimate is $\tilde\delta = 10.007.$

set.seed(2020)
d = 10;  x = runif(1000, 0, d);  s = sd(x)
MME.d = sqrt(12)*s;  MME.d
[1] 10.00703

However, one can show that the maximum likelihood estimator of $\delta$ is the sample maximum $X_{(n)},$ which is biased low; multiplying by $\frac{n+1}{n}$ removes the bias, giving $\hat\delta = \frac{n+1}{n}X_{(n)}.$ For the large dataset above this is $\hat \delta = 10.003.$

(1001/1000)*max(x)
[1] 10.00316

For samples of small and moderate size the bias-corrected maximum is often noticeably better. Let's look at 10,000 samples of size $n = 20$ from $\mathsf{Unif}(0,\, \delta=10).$

set.seed(826)
m = 10^4;  n = 20;  x = runif(m*n, 0,10)
MAT = matrix(x, nrow=m)  # each row a sample of 20
mme = sqrt(12)*apply(MAT, 1, sd)
mle = ((n+1)/n)*apply(MAT, 1, max)
mean(mme); var(mme)
[1] 9.955564       
[1] 1.16755     # larger variance
mean(mle); var(mle)
[1] 10.00105
[1] 0.227096    # smaller variance

Both estimators are (nearly) unbiased, but the bias-corrected MLE has a much smaller variance, about $0.23,$ compared with $1.17$ for the MME.
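The simulated variance of the bias-corrected maximum agrees with theory: since $X_{(n)}/\delta \sim \mathsf{Beta}(n,1),$ one can show $\mathrm{Var}\!\left(\frac{n+1}{n}X_{(n)}\right) = \frac{\delta^2}{n(n+2)}.$ Checking for $n = 20,$ $\delta = 10$:

```r
# Theoretical variance of the bias-corrected MLE: d^2 / (n * (n + 2)).
d <- 10; n <- 20
v_mle <- d^2 / (n * (n + 2))
v_mle  # 0.2272727, close to the simulated 0.227
```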

[Figure: histograms of the simulated distributions of the two estimators; image omitted.]