Maximum likelihood when the parameter space is discrete


Suppose that the data consist of a single observation $X$ with probability mass function $f_X(x)=\mathbb{P}\left(X=x \right)$. $f_X(x)$ depends on a parameter $\theta$, which is to be estimated. $\theta$ can take only discrete values, for example $1$, $2$, and $3$.

The likelihood function is $\mathcal{L}\left( \theta; X \right)$, and the maximum likelihood estimate of $\theta$ is the value that maximizes $\mathcal{L}$. How is the estimate chosen if, for example, $\theta=1$ and $\theta=2$ both maximize the likelihood function?


There are 2 best solutions below


The MLE is not necessarily unique. If, for a given probability mass function, both $\theta=1$ and $\theta = 2$ maximize the likelihood function, then both are MLEs, and it does not matter which one you take.


Let's consider an example. Define $X$ as follows:

$$\begin{array}{c|c|c|c} x & \Pr[X = x \mid \theta = 1] & \Pr[X = x \mid \theta = 2] & \Pr[X = x \mid \theta = 3] \\ \hline 0 & 0.1 & 0.3 & 0.4 \\ 1 & 0.3 & 0.2 & 0.3 \\ 2 & 0 & 0.3 & 0 \\ 3 & 0.2 & 0 & 0.1 \\ 4 & 0 & 0.1 & 0 \\ 5 & 0.4 & 0.1 & 0.2 \\ \end{array}$$

Each column is a probability distribution for $X$ given a value of $\theta$, so each column sums to $1$. The rows are not required to sum to $1$, and in general they do not.

If we observe the outcome $X = 1$, the likelihood of $\theta = 1$ is equal to the likelihood of $\theta = 3$: $$\mathcal L(\theta = 1 \mid X = 1) = \mathcal L(\theta = 3 \mid X = 1) = 0.3,$$ while $\mathcal L(\theta = 2 \mid X = 1) = 0.2$ is smaller. In this case, the maximum likelihood estimate is not unique: the observation $X = 1$ is equally probable under $\theta = 1$ and under $\theta = 3$. But as we can see from the table, every other outcome for $X$ does admit a unique maximum likelihood estimate.
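The row-by-row comparison can be checked with a short script. This is only a minimal sketch: the names `THETAS`, `PMF`, and `mle_set` are made up for illustration, and the exact equality test on floats works here only because the table entries are typed in as identical literals (computed likelihoods would need a tolerance).

```python
# Likelihood table from the example above.
# Keys are the outcomes x = 0..5; each tuple holds Pr[X = x | theta]
# for theta = 1, 2, 3 (in that order).
THETAS = (1, 2, 3)
PMF = {
    0: (0.1, 0.3, 0.4),
    1: (0.3, 0.2, 0.3),
    2: (0.0, 0.3, 0.0),
    3: (0.2, 0.0, 0.1),
    4: (0.0, 0.1, 0.0),
    5: (0.4, 0.1, 0.2),
}


def mle_set(x):
    """Return every theta that maximizes the likelihood for observation x."""
    likelihoods = PMF[x]
    best = max(likelihoods)
    # Exact comparison is safe only because the entries are identical literals.
    return [theta for theta, lik in zip(THETAS, likelihoods) if lik == best]


if __name__ == "__main__":
    for x in sorted(PMF):
        print(f"x = {x}: maximizing theta value(s) = {mle_set(x)}")
```

Returning the whole set of maximizers, rather than a single value, makes the tie at $X = 1$ explicit instead of hiding it behind whichever value a plain argmax happens to return first.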

This notion of non-uniqueness of MLE is not limited to discrete-valued random variables or even discrete-valued parameters. It is not difficult to visualize and construct a likelihood function on a continuous parameter space that doesn't have a unique maximum.
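One standard illustration, chosen here as an example rather than taken from the question, is a single observation $X$ from a uniform distribution on $[\theta, \theta + 1]$ with $\theta \in \mathbb{R}$. The likelihood is flat wherever it is positive:

$$\mathcal L(\theta \mid X = x) = \begin{cases} 1, & x - 1 \le \theta \le x, \\ 0, & \text{otherwise,} \end{cases}$$

so every $\theta$ in the interval $[x - 1, x]$ attains the maximum value $1$ and is therefore an MLE.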