Estimating the parameter lambda in exponential distribution

I'm reading a passage on page 60 of Information theory and Machine Learning. My questions are:

  1. Under the assumption that $\lambda \ll 20$, why is $\bar{x}-1$ a good estimator? Could someone add the details explaining it?
  2. What kind of ad hoc binning techniques would work for $\lambda \gg 20$?

I know the author only presents this line of thought so that the supervisor can eventually lead him to a Bayesian way of thinking, but I'm interested in why his logic works for these particular cases, even if the solution is not a unifying one. Thanks!

There is 1 best solution below

If $\lambda$ is small enough that $\mathbb P(X \ge 20) \approx 0$, you can say $$\mathbb E[X \mid 1 \lt X \lt 20] \approx \mathbb E[X \mid 1 \lt X ] = \lambda +1$$ by the memoryless property of the exponential distribution: given $X \gt 1$, the excess $X-1$ is again exponential with mean $\lambda$. That suggests $\hat{\lambda} = \overline{x}-1$ as an estimator using this truncated data.
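A quick simulation can make the small-$\lambda$ case concrete. This is only a sketch: the helper name `sample_truncated`, the true value $\lambda = 2$, the sample size and the seed are all choices made for illustration, and the data are produced by simple rejection of lengths outside $(1, 20)$.

```python
import random

def sample_truncated(lam, n, lo=1.0, hi=20.0, seed=0):
    """Draw exponential lengths with mean lam and keep the first n that fall in (lo, hi)."""
    rng = random.Random(seed)
    kept = []
    while len(kept) < n:
        x = rng.expovariate(1.0 / lam)  # expovariate takes the rate, i.e. 1 / mean
        if lo < x < hi:
            kept.append(x)
    return kept

lam_true = 2.0                       # illustrative value, small compared with 20
xs = sample_truncated(lam_true, 20000)
xbar = sum(xs) / len(xs)
lam_hat = xbar - 1.0                 # the estimator suggested by memorylessness
print(lam_true, lam_hat)
```

With $\lambda$ this small almost no lengths are lost at the upper cutoff, so the estimate should land close to the true value.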

In general $$\mathbb E[X \mid 1 \lt X \lt 20] = \dfrac{\int_1^{20} \frac{x}{\lambda} e^{-x / \lambda} dx}{\int_1^{20} \frac{1}{\lambda} e^{-x / \lambda} dx}= \lambda + 1 -\dfrac{19}{e^{19/\lambda}-1}$$
For very large $\lambda$, using $\dfrac{1}{e^{u}-1} \approx \dfrac{1}{u} - \dfrac{1}{2} + \dfrac{u}{12}$ with $u = 19/\lambda$, this is close to $\dfrac{21}{2} - \dfrac{361}{12\lambda}$. Solving for $\lambda$ suggests something like $\hat{\lambda} = \dfrac{361}{126 - 12\overline{x}}$ as a possible approximate estimator using this truncated data, though this will produce nonsense in cases where $\overline{x}\ge 10.5$.
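The closed form and its large-$\lambda$ approximation can be checked numerically. The sketch below is not part of the original answer; the function names are made up for illustration, the integrals are approximated with a plain midpoint rule, and `lambda_hat_large` returns `None` in the nonsense region $\overline{x}\ge 10.5$.

```python
import math

def truncated_mean(lam, lo=1.0, hi=20.0):
    """Closed form: lam + lo - (hi - lo) / (exp((hi - lo) / lam) - 1)."""
    width = hi - lo
    return lam + lo - width / math.expm1(width / lam)

def numeric_truncated_mean(lam, lo=1.0, hi=20.0, steps=20000):
    """Midpoint-rule ratio of the two integrals, as an independent check."""
    h = (hi - lo) / steps
    num = den = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        w = math.exp(-x / lam) / lam   # exponential density with mean lam
        num += x * w
        den += w
    return num / den

def large_lambda_approx(lam):
    """Approximation 21/2 - 361/(12 lam), intended for lam much larger than 20."""
    return 21.0 / 2.0 - 361.0 / (12.0 * lam)

def lambda_hat_large(xbar):
    """Invert the approximation; None where it gives nonsense (xbar >= 10.5)."""
    denom = 126.0 - 12.0 * xbar
    if denom <= 0:
        return None
    return 361.0 / denom

for lam in (2.0, 5.0, 200.0):
    print(lam, truncated_mean(lam), numeric_truncated_mean(lam), large_lambda_approx(lam))
```

For $\lambda = 2$ the closed form is very close to $\lambda + 1 = 3$, while for $\lambda = 200$ it is very close to the large-$\lambda$ approximation, matching the two regimes discussed above.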

As an illustration of binning: if you observe $Y$ lengths in a bin between $1$ and $10.5$ and $Z$ in a bin from $10.5$ to $20$, then $\mathbb{E}[Y] = e^{19/(2\lambda)}\mathbb{E}[Z]$, because the two bins have the same width $19/2$ and the density falls by a factor $e^{-19/(2\lambda)}$ from one to the other. Replacing the expectations by the observed counts gives a possible estimator $\hat{\lambda} = \dfrac{19}{2(\log_e(Y) - \log_e(Z))}$, though this will produce nonsense if $Y=0$ or $Z=0$ or $Y \le Z$.
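The two-bin estimator is easy to try on simulated data. Again this is a hedged sketch rather than a recommended procedure: `binned_lambda_hat` and `sample_truncated` are hypothetical helper names, the true value $\lambda = 10$, the sample size and the seed are arbitrary, and the function returns `None` in exactly the nonsense cases listed above.

```python
import math
import random

def sample_truncated(lam, n, lo=1.0, hi=20.0, seed=1):
    """Draw exponential lengths with mean lam and keep the first n that fall in (lo, hi)."""
    rng = random.Random(seed)
    kept = []
    while len(kept) < n:
        x = rng.expovariate(1.0 / lam)  # expovariate takes the rate, i.e. 1 / mean
        if lo < x < hi:
            kept.append(x)
    return kept

def binned_lambda_hat(xs):
    """Two equal-width bins, (1, 10.5] and (10.5, 20): lam_hat = 19 / (2 (ln Y - ln Z))."""
    y = sum(1 for x in xs if 1.0 < x <= 10.5)
    z = sum(1 for x in xs if 10.5 < x < 20.0)
    if y == 0 or z == 0 or y <= z:
        return None                     # estimator undefined or non-positive
    return 19.0 / (2.0 * (math.log(y) - math.log(z)))

lam_true = 10.0                         # illustrative value
data = sample_truncated(lam_true, 50000)
lam_hat_binned = binned_lambda_hat(data)
print(lam_true, lam_hat_binned)
```

Only the two counts enter the estimate, so it throws away information within each bin; that is the price of the ad hoc approach.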

You can avoid these risks of nonsense with a proper Bayesian prior, though observations which might otherwise lead to nonsense are likely to give posterior distributions that are constrained by that prior, and in particular by any prior upper limit on $\lambda$.