On likelihood functions


My professor said that, assuming independence, a likelihood function is constructed as the product of the individual probability density (or mass) functions, one for each observation in the sample.

Thus, for example, if we are talking about the exponential distribution, then the likelihood function will look something like \begin{align} L(\theta; x_1, \dots, x_n) & = \prod^n_{i = 1} f_X(x_i; \theta)\\ & = \prod^n_{i = 1} \frac 1 {\theta} e^{-\frac {x_i} {\theta}}. \end{align}

If I follow this logic, then why is it that, for a binomial distribution, the likelihood function is not \begin{align} L(p; x_1, \dots, x_n) & = \prod^n_{i = 1} f_X(x_i; p)\\ & = \prod^n_{i = 1} \binom n {x_i} p^{x_i} (1 - p)^{n - x_i} \end{align} but rather $$L(p; x_1, \dots, x_n) = \binom n x p^x (1 - p)^{n - x}?$$ Is the latter not simply the probability mass function of the binomial distribution? Why do we not need to take the product in this case? What am I missing here?

I am still finding the concept a little confusing, so any help will be greatly appreciated :)

Accepted answer:

This equation $$L(p; x_1, \dots, x_n) = \binom n x p^x (1 - p)^{n - x} \tag{1a}$$ that you wrote does not make sense. The LHS contains the observed data $(x_1, \ldots, x_n)$, but on the RHS, you have just $x$, which you do not define.

Moreover, in both of your equations, the $n$ on the LHS pertains to the sample size (the number of observations), whereas on the RHS, $n$ corresponds to the number of trials in the binomial distribution. So $$\begin{align} L(p; x_1, \dots, x_n) & = \prod^n_{i = 1} f_X(x_i; p)\\ & = \prod^n_{i = 1} \binom n {x_i} p^{x_i} (1 - p)^{n - x_i} \end{align} \tag{1b}$$ is also incorrect, unless you mean that the number of trials in your binomial distribution always equals your sample size for the likelihood.

That said, if $x_1, \ldots, x_n$ are independent and identically distributed observations from a Bernoulli distribution with $\Pr[X_i = 1] = p$, then the sum $$X = \sum_{i=1}^n X_i$$ is binomial with parameters $n$ and $p$, and, writing $x = \sum_{i=1}^n x_i$ for the observed value of this sum, the likelihood of $p$ is $$\mathcal L(p; x_1, \ldots, x_n) \propto \prod_{i=1}^n p^{x_i} (1-p)^{1 - x_i} = p^x (1-p)^{n-x}. \tag{2}$$ Note that the binomial coefficient $\binom{n}{x}$ is not necessary because it is constant with respect to $p$.
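To see that the full product of Bernoulli pmfs and the reduced form $p^x (1-p)^{n-x}$ in $(2)$ are the same function of $p$, here is a quick numerical sketch (the 0/1 sample is a made-up example; any Bernoulli data behaves the same way):

```python
import math

# Hypothetical Bernoulli sample of size n = 8.
x = [1, 0, 1, 1, 0, 1, 0, 1]
n, s = len(x), sum(x)  # s plays the role of x = sum of the x_i

def likelihood_product(p):
    # Product of Bernoulli pmfs: prod_i p^{x_i} (1-p)^{1-x_i}
    out = 1.0
    for xi in x:
        out *= p**xi * (1 - p)**(1 - xi)
    return out

def likelihood_sufficient(p):
    # Same likelihood via the sufficient statistic: p^x (1-p)^{n-x}
    return p**s * (1 - p)**(n - s)

# The two forms agree for every p, so maximizing either
# gives the same maximum-likelihood estimate of p.
for p in (0.2, 0.5, 0.8):
    assert math.isclose(likelihood_product(p), likelihood_sufficient(p))
```

Because the two expressions agree pointwise in $p$, any constant factor (such as a binomial coefficient) can be dropped without changing where the maximum occurs.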

In the general case, if $x_1, \ldots, x_n$ are independent binomial observations with parameters $m$ and $p$ (note the use of $m$, because $n$ already denotes the number of binomial observations), then the joint likelihood of $p$ is $$\mathcal L(m, p; x_1, \ldots, x_n) = \prod_{i=1}^n \binom{m}{x_i} p^{x_i} (1-p)^{m - x_i} = \left(\prod_{i=1}^n \binom{m}{x_i}\right) p^{n \bar x} (1-p)^{n(m - \bar x)},\tag{3}$$ where $$\bar x = \frac{1}{n} \sum_{i=1}^n x_i. \tag{4}$$ If $m$ is fixed and known, then the likelihood of $p$ can ignore the product of binomial coefficients, and we can write it as $$\mathcal L(p; m, x_1, \ldots, x_n) = p^{n \bar x} (1 - p)^{n(m-\bar x)}. \tag{5}$$ Then when $m = 1$, we recover $(2)$ from $(5)$.
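The same point can be checked numerically for $(3)$ versus $(5)$: the ratio of the full likelihood to the kernel $p^{n\bar x}(1-p)^{n(m-\bar x)}$ is the constant $\prod_i \binom{m}{x_i}$, independent of $p$. A small sketch with made-up data ($n = 5$ observations, each binomial with $m = 10$ trials):

```python
from math import comb, isclose

# Hypothetical data: n = 5 binomial observations, each with m = 10 trials.
m = 10
x = [3, 4, 2, 5, 3]
n, xbar = len(x), sum(x) / len(x)

def full_likelihood(p):
    # Equation (3): includes the product of binomial coefficients.
    out = 1.0
    for xi in x:
        out *= comb(m, xi) * p**xi * (1 - p)**(m - xi)
    return out

def kernel(p):
    # Equation (5): only the p-dependent part of the likelihood.
    return p**(n * xbar) * (1 - p)**(n * (m - xbar))

# The ratio is prod_i comb(m, x_i), the same for every p,
# so (3) and (5) are maximized at the same value of p.
const = full_likelihood(0.3) / kernel(0.3)
assert isclose(full_likelihood(0.7) / kernel(0.7), const)
```

Since the constant does not depend on $p$, both forms lead to the same maximizer, which is why the coefficients can be dropped when $m$ is known.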