Correctly calculating the bias of an estimator


I'm currently learning about method of moments and maximum-likelihood estimators and have confused myself with this issue:

First, let me estimate the parameter $\lambda$ from the exponential distribution using the method of moments (note: you get the same result with a ML estimate):
$M_1 = \dfrac{1}{n}\sum\limits_{i=1}^{n}X_i = \bar{X}$ (the first sample moment)
$m_1 = \frac{1}{\lambda}$ (the first moment of $X$)

setting these equal, we get:
$\bar{X} = \dfrac{1}{\tilde\Lambda}$
and thus:
$\tilde\Lambda = \dfrac{1}{\bar{X}}$

Now, if I want to calculate the bias of $\tilde\Lambda$, I'd use the definition for bias:
$B[\tilde\Lambda] = E{[\tilde\Lambda]} - \lambda$

However, since $\tilde\Lambda = \dfrac{1}{\bar{X}}$, this becomes $B[\tilde\Lambda] = E{\left[\dfrac{1}{\bar{X}}\right]} - \lambda$

But this is where I get stuck: in general $E{\left[\dfrac{1}{X}\right]} \ne \dfrac{1}{E{[X]}}$, so I can't simplify this expression any further.

Any help trying to figure out where my logic is wrong would be much appreciated!

Best answer

The mathematical derivation in the link is correct. The trick, avoiding evaluation of a messy integral by noticing the relationship of its integrand to a PDF, is very widely used and worth adding to your personal bag of tricks.

Part of your confusion might be in the wording: the distribution of $\bar X$ might be mistaken for the distribution of $\sum_{i=1}^n X_i.$ The sum has distribution $\text{Gamma}(\text{shape}=n, \text{rate}=\lambda)$, whereas the distribution you want is $\bar X \sim \text{Gamma}(\text{shape}=n, \text{rate}=n\lambda).$

In summary, $E(\bar X) = 1/\lambda,$ but $E(1/\bar X) = \frac{n}{n-1}\lambda > \lambda$ for $n \ge 2,$ where $\lambda$ is the exponential rate and $1/\lambda$ is the exponential mean. Thus, the MME $\hat \lambda = 1/\bar X$ is a biased estimator of $\lambda$, with bias $\frac{n}{n-1}\lambda - \lambda = \frac{\lambda}{n-1}.$ An unbiased estimator of $\lambda$ is $\frac{n-1}{n}\hat \lambda = \frac{n-1}{n\bar X} = \frac{n-1}{\sum_i X_i}.$
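
For reference, here is a sketch of where the value $\frac{n}{n-1}\lambda$ comes from, using the PDF trick mentioned above. The symbol $S = \sum_{i=1}^n X_i$ is notation introduced only for this sketch, and the calculation assumes $n \ge 2$ (for $n = 1$ the integral diverges). Since $S \sim \text{Gamma}(\text{shape}=n, \text{rate}=\lambda)$:

$$
\begin{aligned}
E\left[\frac{1}{S}\right]
&= \int_0^\infty \frac{1}{s}\,\frac{\lambda^n}{\Gamma(n)}\,s^{n-1}e^{-\lambda s}\,ds
 = \frac{\lambda^n}{\Gamma(n)}\int_0^\infty s^{(n-1)-1}e^{-\lambda s}\,ds \\
&= \frac{\lambda^n}{\Gamma(n)}\cdot\frac{\Gamma(n-1)}{\lambda^{n-1}}
 = \frac{\lambda}{n-1}.
\end{aligned}
$$

The remaining integrand is the kernel of a $\text{Gamma}(\text{shape}=n-1, \text{rate}=\lambda)$ density, so the integral must equal that density's normalizing constant $\Gamma(n-1)/\lambda^{n-1}$, and no explicit integration is needed. Because $1/\bar X = n/S$, it follows that

$$
E\left[\frac{1}{\bar X}\right] = n\,E\left[\frac{1}{S}\right] = \frac{n}{n-1}\lambda,
\qquad
E\left[\frac{1}{\bar X}\right] - \lambda = \frac{\lambda}{n-1}.
$$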

As a demonstration of the correct distribution of $\bar X$, the R code below simulates a million samples of size $n = 10$ from $Exp(\lambda = .2).$ Then a histogram of the simulated distribution of $\bar X$ is plotted along with the density function of the theoretical distribution.

 m = 10^6; n = 10;  lam = 1/5
 x = rexp(m*n, rate=lam)
 DTA = (matrix(x, nrow=m))  # each row a sample of n
 a = rowMeans(DTA)          # vector of m sample means
 mean(a)
 ## 4.998666                # close to 5
 mean(1/a)
 ## 0.2222986               # biased: too far from 1/5
 n*lam/(n-1)
 ## 0.2222222               # true expectation of est
 mean((n-1)/(n*a))
 ## 0.2000687               # unbiased 

 hist(a, prob=TRUE, col="wheat", xlab="Sample Mean", 
     main="Means of 10 Obs from EXP(rate=.2)")
 curve(dgamma(x, n, n*lam), col="blue", add=TRUE)  # note PDF
