MLE for mixed distribution


Let $X\sim f(x)=\begin{cases}\lambda e^{-\lambda x},&x\in [0,5)\\e^{-5\lambda},&x=5\end{cases}$

How can I derive the MLE for this distribution? I know how to derive the MLE for $\lambda$ when $Y\sim\operatorname{Exp}(\lambda)$, but I don't know how to work with this mixed distribution.


There are 3 answers below.

---

If you maximize the log-likelihood function, then the idea is the same. Specifically, you want to maximize:

$$ \ell(\mathbf{x}, \lambda) = \sum_{x_i \in [0,5)}\ln\left[\lambda e^{-\lambda x_i}\right] + \sum_{x_i = 5}\ln\left[e^{-5\lambda }\right] $$

If we have $m$ data points such that $ x_i \in [0,5)$ and $n$ data points such that $x_i = 5$, then the above expression is the same as:

$$ \ell(\mathbf{x}, \lambda) = m\ln\left[\lambda\right] - \lambda \sum_{x_i \in [0,5)} x_i - 5n\lambda $$

which is the same as:

$$ \ell(\mathbf{x}, \lambda) = m\ln\left[\lambda\right] - \lambda \sum_{x_i \in [0,5)} \left(x_i + \frac{5n}{m} \right) $$

From here you can see that this essentially leads to the traditional MLE estimator for an exponential distribution:

$$ \hat{\lambda}_{\mathrm{MLE}} = \frac{m}{ \sum_{x_i \in [0,5)} \left(x_i + \frac{5n}{m} \right)} $$
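As a quick sanity check, the closed-form estimator should agree with a brute-force maximization of the log-likelihood. This is a sketch with a made-up toy sample (the data values below are arbitrary):

```python
import math

# Hypothetical sample: values in [0, 5) plus observations censored at 5.
data = [0.7, 2.3, 1.1, 5.0, 4.2, 5.0, 0.5]

uncensored = [x for x in data if x < 5]
m = len(uncensored)        # number of observations in [0, 5)
n_cens = len(data) - m     # number of observations equal to 5

# Closed-form MLE from the derivation above.
lam_hat = m / (sum(uncensored) + 5 * n_cens)

# Log-likelihood: m*ln(lambda) - lambda*(sum of uncensored + 5*n_cens).
def loglik(lam):
    return m * math.log(lam) - lam * (sum(uncensored) + 5 * n_cens)

# A crude grid search over lambda should peak at (approximately) lam_hat.
grid = [i / 10000 for i in range(1, 30000)]
lam_grid = max(grid, key=loglik)

print(lam_hat, lam_grid)  # the two agree to within the grid spacing
```

The agreement between the analytic maximizer and the grid maximizer is only as fine as the grid step, but it confirms the algebra.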

---

The likelihood function based on a sample of size $n$ can be written as

\begin{align} L(\lambda\mid x_1,\ldots,x_n)&=\prod_{i:0\le x_i<5} \lambda e^{-\lambda x_i}\prod_{i:x_i=5} e^{-5\lambda} \\&=\lambda^{\sum_{i=1}^n \mathbf1(0\le x_i< 5)}\exp\left\{-\lambda\sum_{i=1}^n x_i \mathbf1(0\le x_i< 5)-5\lambda\sum_{i=1}^n \mathbf1(x_i=5)\right\} \\&=\lambda^m\exp\left[-\lambda \left\{\sum_{i=1}^m x_i +5(n-m)\right\}\right]\qquad,\,\lambda>0 \end{align}

Here $m=\sum\limits_{i=1}^n \mathbf1(0\le x_i< 5)$ is the number of observations in the sample taking values in $[0,5)$.

To see why this works, take a look at this answer by @whuber on Cross Validated.

Log-likelihood is

$$\ell(\lambda\mid x_1,\ldots,x_n)=m\ln \lambda -\lambda \left\{\sum_{i=1}^m x_i +5(n-m)\right\}$$

Differentiating the log-likelihood then yields the stationary point

$$\hat\lambda=\frac{m}{\sum\limits_{i=1}^m x_i+5(n-m)}$$

Another way to justify the answer here is to note that the random variable $X$ in the question has the distribution of $\min(Y,5)$ where $Y$ has an exponential distribution with mean $1/\lambda$. The likelihood is then based on Type-I (right) censored data, where the censoring occurs at the point $5$. As such the MLE is different from what is obtained in a usual Exponential model.
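A small simulation sketch of this censoring view (the seed, rate, and sample size below are arbitrary choices): drawing $Y\sim\operatorname{Exp}(\lambda)$, censoring at $5$, and applying the MLE above recovers the true rate, while the usual exponential MLE $n/\sum x_i$ overestimates it because the censored values are capped at $5$:

```python
import random

random.seed(42)
true_lam = 0.4   # hypothetical true rate, for illustration
n = 100_000

# X = min(Y, 5) with Y ~ Exp(true_lam): Type-I right censoring at 5.
xs = [min(random.expovariate(true_lam), 5.0) for _ in range(n)]

m = sum(1 for x in xs if x < 5.0)   # uncensored count
s = sum(x for x in xs if x < 5.0)   # sum of uncensored values

lam_censored = m / (s + 5.0 * (n - m))  # MLE derived above
lam_naive = n / sum(xs)                 # usual exponential MLE, ignoring censoring

print(lam_censored, lam_naive)
```

Note that $s + 5(n-m) = \sum_i x_i$, so the two estimators share the same denominator; the censored MLE differs only by counting $m$ rather than $n$ in the numerator, which is exactly what corrects the bias.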

---

Maybe this is where knowing a bit of measure theory helps with down-to-earth concrete problems.

$f$ is presumably a sort of mixture of a probability mass function and a probability density function, and I take your way of stating the problem to mean that $\Pr(X=5) = e^{-5\lambda}$ and that for every (measurable) set $A\subseteq[0,5)$ you have $\Pr(X\in A) = \int_A f(x)\, dx.$

Imagine a measure according to which the measure of any set $A\subseteq[0,5)$ is just how much space in that interval the set $A$ takes up, e.g. the measure of the interval $(1,3)$ is $2$ and that of $(1,4)$ is $3,$ but according to which the measure of the set $\{5\}$ is $1.$

Call that measure $m.$ Then your function $f$ is the density of this probability distribution with respect to the measure $m,$ and that means that for every (measurable) set $A\subseteq[0,5],$ the following is true: $$ \Pr(X\in A) = \int_A f(x)\, dm(x) $$ where the integral is with respect to this measure $m.$
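One way to check this numerically (a sketch, with an arbitrarily chosen rate): the continuous part on $[0,5)$ plus the atom at $5$ should integrate to $1$ under the measure $m$:

```python
import math

lam = 0.7  # arbitrary rate, for illustration

# Continuous part: integral of lam*exp(-lam*x) over [0,5), via the midpoint rule.
N = 200_000
h = 5.0 / N
cont = sum(lam * math.exp(-lam * (i + 0.5) * h) * h for i in range(N))

# Atomic part: f(5) * m({5}) with m({5}) = 1.
atom = math.exp(-5 * lam)

total = cont + atom
print(total)  # ~1.0
```

Analytically this is just $\int_0^5 \lambda e^{-\lambda x}\,dx + e^{-5\lambda} = (1-e^{-5\lambda}) + e^{-5\lambda} = 1.$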

Now the joint density of the sample $(X_1,\ldots,X_n)$ with respect to the product measure is $$ \prod_{i=1}^n f(x_i). $$ The likelihood function is the value of this expression as a function of $\lambda.$

Here's a useful fact that I seldom see mentioned: suppose we had used a different measure $\nu,$ with $\nu(\{5\})=2\ne 1.$ In that case, the value of the density at $5$ would be $e^{-5\lambda}/2.$ This alters the likelihood function only by multiplying it by a constant (and “constant” means not depending on $\lambda$).

Therefore the MLE will come out the same either way. Likewise, the product of the prior and the likelihood, once normalized into a posterior distribution of $\lambda,$ will give the same results.
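This invariance is easy to verify numerically. A sketch with hypothetical counts ($m$ values in $[0,5)$ with sum $s$, and $k$ values equal to $5$): rescaling the atom's mass subtracts a $\lambda$-free constant from the log-likelihood, which leaves the maximizer unchanged:

```python
import math

m, s, k = 5, 8.8, 2  # hypothetical summary statistics of a sample

def loglik(lam):
    return m * math.log(lam) - lam * (s + 5 * k)

# With a measure giving {5} mass 2, the density at 5 becomes exp(-5*lam)/2,
# which subtracts the lambda-free constant k*ln(2) from the log-likelihood.
def loglik_rescaled(lam):
    return loglik(lam) - k * math.log(2)

grid = [i / 10000 for i in range(1, 20000)]
argmax_1 = max(grid, key=loglik)
argmax_2 = max(grid, key=loglik_rescaled)
print(argmax_1, argmax_2)  # identical maximizers
```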