"Likelihood" definition and probability


I can't understand the meaning of "likelihood." Also, I can't understand the difference between "likelihood" and "probability."

Best answer

Two examples may help clarify the distinction.

Binomial. Consider $n$ independent Bernoulli trials with success probability $p.$ The PDF (strictly, the probability mass function) of $\mathsf{Binom}(n, p)$ is $$f(x\,|\,p) = {n \choose x}p^x(1-p)^{n-x}, \text{ for }\; x = 0, 1, \dots, n.$$

If we know $p,$ the PDF makes it possible to find probabilities $P(X = x).$ For example, if $n = 3$ and we know $p = 1/2,$ then $P(X = 2) = {3 \choose 2}(1/2)^2(1/2) = 3/8.$ Here $p$ is a given constant and $x$ is a value that might be assumed by the random variable $X.$
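As a quick check of this calculation, here is a minimal Python sketch. The function name `binom_pmf` is only an illustrative choice; it evaluates the binomial formula directly using the standard library.

```python
from math import comb

def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binom(n, p): p is fixed, x varies."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Known p = 1/2, n = 3: probability of exactly 2 successes.
prob = binom_pmf(2, 3, 0.5)
print(prob)  # 0.375, i.e. 3/8

# As a function of x (with p fixed), the values sum to 1.
total = sum(binom_pmf(k, 3, 0.5) for k in range(4))
print(total)
```

With $p$ held fixed, summing over all possible $x$ gives 1, which is what makes this a probability distribution in $x.$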

In a different setting, suppose we don't know $p$ and we observe $X = 2.$ If we are trying to estimate $p,$ we can use the likelihood function. It is

$$f(x\,|\,p) = {n \choose x}p^x(1-p)^{n-x} \propto L(p) = p^x(1-p)^{n-x}, \text{ for }\; 0 < p < 1.$$

Often likelihood functions are defined "up to a constant." The symbol $\propto$ (read "proportional to") shows that the constant ${n\choose x}$ has been omitted. Here $x$ is an observed value of the random variable $X$ and $p$ might have any value in $(0,1).$ For example, knowing $x = 2,$ we might ask what value of $p$ is most likely.

Given $x=2,$ what value of $p$ maximizes the likelihood function? We can use calculus to find that the answer is the estimate $\hat p = x/n = 2/3$ of $p.$ Here is a plot of the likelihood function $L(p) = p^2(1-p).$ Remember that it is now considered as a function of $p,$ with a maximum at $\hat p = 2/3.$

[Figure: plot of the likelihood function $L(p) = p^2(1-p)$ for $0 < p < 1.$]
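The calculus step is short: $L(p) = p^2 - p^3,$ so $L'(p) = 2p - 3p^2 = p(2 - 3p),$ which is zero in $(0,1)$ only at $p = 2/3.$ The same maximum can be located numerically. The sketch below uses a simple grid search; the grid spacing of 0.001 is an arbitrary choice for illustration, not part of the method.

```python
def likelihood(p: float, x: int = 2, n: int = 3) -> float:
    """L(p) = p^x (1-p)^(n-x): x is the fixed observation, p varies."""
    return p**x * (1 - p)**(n - x)

# Evaluate L(p) on a grid of candidate values strictly inside (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=likelihood)

print(p_hat)  # close to 2/3, up to the grid spacing
```

Note that the same formula is being used as in the probability calculation, but now the data $x = 2$ are held fixed and $p$ is the variable.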

Exponential. Consider a population exponentially distributed with mean $\mu.$ Its PDF is $f(x\,|\,\mu) = \frac{1}{\mu}e^{-x/\mu},$ for $x > 0.$ If we have a random sample $X_1,X_2,\dots,X_n$ from this distribution, then the joint density function is

$$f(\mathbf{x}\,|\,\mu) = \prod_{i=1}^n \frac{1}{\mu}e^{-x_i/\mu} = (1/\mu)^n \exp\left(-\frac{1}{\mu}\sum_{i=1}^n x_i\right) = (1/\mu)^n e^{-n\bar x/\mu},\;\text{ for }\, x_i > 0.$$ This multivariate density function could be integrated over various regions to find probabilities.
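To see that the product of the individual densities really does collapse to the closed form involving only $n$ and $\bar x,$ here is a small numerical check. The sample values and the value of $\mu$ are made up purely for illustration.

```python
import math

def exp_pdf(x: float, mu: float) -> float:
    """Density of an exponential distribution with mean mu (zero for x <= 0)."""
    return math.exp(-x / mu) / mu if x > 0 else 0.0

def joint_density_product(xs, mu):
    """Joint density as the product of the n individual densities."""
    return math.prod(exp_pdf(x, mu) for x in xs)

def joint_density_closed_form(xs, mu):
    """Joint density written as (1/mu)^n * exp(-n * xbar / mu)."""
    n = len(xs)
    xbar = sum(xs) / n
    return mu ** (-n) * math.exp(-n * xbar / mu)

sample = [3.2, 0.7, 12.5, 8.1]  # hypothetical observations
mu = 5.0                        # hypothetical known mean

a = joint_density_product(sample, mu)
b = joint_density_closed_form(sample, mu)
print(math.isclose(a, b))  # True
```

Because the data enter only through $n$ and $\bar x,$ the sample mean is all that is needed to evaluate the likelihood.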

Now, suppose we have observed the sample mean $\bar X = \bar x$ of $n = 10$ observations. The likelihood function is $L(\mu) = (1/\mu)^n e^{-n\bar x/\mu}.$ We can ask what value $\hat \mu$ has the largest likelihood for a particular observed sample mean $\bar x.$ Suppose $\bar x = 10.5.$ Then one can use calculus to find that the likelihood function is maximized at $\mu = \hat \mu =\bar x = 10.5.$ [For the calculus and other details of exponential distributions, see your text or the Wikipedia article, where $\mu = 1/\lambda.$]

[Figure: plot of the likelihood function $L(\mu)$ for $n = 10$ and $\bar x = 10.5.$]
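One way to do the calculus is to work with the log-likelihood $\ell(\mu) = \ln L(\mu) = -n\ln\mu - n\bar x/\mu.$ Its derivative is $\ell'(\mu) = -n/\mu + n\bar x/\mu^2,$ which is zero exactly when $\mu = \bar x.$ The sketch below confirms this numerically with a grid search; the grid range (1 to 30) and spacing (0.01) are arbitrary choices for illustration.

```python
import math

def log_likelihood(mu: float, xbar: float = 10.5, n: int = 10) -> float:
    """log L(mu) = -n*log(mu) - n*xbar/mu: xbar is fixed, mu varies."""
    return -n * math.log(mu) - n * xbar / mu

# Candidate values of mu from 1.00 to 30.00 in steps of 0.01.
grid = [i / 100 for i in range(100, 3001)]
mu_hat = max(grid, key=log_likelihood)

print(mu_hat)  # the grid point at the observed sample mean, 10.5
```

Maximizing the logarithm gives the same $\hat\mu$ as maximizing $L(\mu)$ itself, because the logarithm is an increasing function, and it avoids very small numbers when $n$ is large.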