Is the maximum-likelihood estimation notation formally correct?

Question

711 Views Asked by Bumbble Comm At 05 Apr 2026 - 11:16

I just saw, in Wikipedia's entry on maximum likelihood, http://en.wikipedia.org/wiki/Maximum_likelihood , the formula

$\mathcal{L}(\theta\,|\,x_1,\ldots,x_n) = f(x_1,x_2,\ldots,x_n\;|\;\theta) = \prod_{i=1}^n f(x_i|\theta).$

Could someone explain whether this is formally correct? I haven't seen a definition of the vertical bar inside a function's argument list, and it seems to me that the mapping $f$ suddenly changes from $\mathbb{R}^n\to \mathbb{R}$ (or maybe $\mathbb{R}^{n+1}\to \mathbb{R}$) to $\mathbb{R}\to\mathbb{R}$ or $\mathbb{R}^2\to\mathbb{R}$.

There are 2 best solutions below

Bumbble Comm On 19 Mar 2014 - 12:26

In the likelihood, the $x_i$ aren't free arguments of $f$; they're observations (data), held fixed, which are used to select the model parameters $\theta$. The vertical bar indicates conditioning: "$A|B$" means "the event $A$ occurs under the assumption that $B$ occurs" (or, to put it more plainly, $B$ is taken as given). This can be phrased "the event $A$ given $B$", or "the event $A$ after $B$" as long as "after" is not read as implying that $B$ precedes $A$ in time. See the English Wikipedia article on Conditional Probability for more on this.

The formula reads: the likelihood of the parameters $\theta$ given the observations $x_1$ through $x_n$ equals the probability (or density) that the model with parameters $\theta$ generates the data $x_1$ through $x_n$, which in turn equals the product of the probabilities (or densities) that the model with parameters $\theta$ generates each datum. The last equality holds when the observations are independent.

There's nothing wrong with the formal correctness.
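To make the overloading of $f$ concrete, here is a minimal Python sketch. It assumes an i.i.d. Normal($\theta$, 1) model, which is an illustrative choice and not something the Wikipedia formula fixes; the function names and data values are likewise made up for the example. It writes the $n$-argument joint density and the one-argument density as two separate functions and checks numerically that the joint density equals the product.

```python
import math

def normal_pdf(x, theta, sigma=1.0):
    """One-observation density f(x | theta) for a Normal(theta, sigma^2) model."""
    z = (x - theta) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def joint_pdf(xs, theta, sigma=1.0):
    """Joint density f(x_1, ..., x_n | theta) of n independent draws, in closed form."""
    n = len(xs)
    sum_sq = sum((x - theta) ** 2 for x in xs)
    norm_const = (2.0 * math.pi * sigma ** 2) ** (-n / 2.0)
    return norm_const * math.exp(-sum_sq / (2.0 * sigma ** 2))

def product_of_marginals(xs, theta, sigma=1.0):
    """Right-hand side of the formula: the product of f(x_i | theta) over i."""
    return math.prod(normal_pdf(x, theta, sigma) for x in xs)

# Hypothetical data and parameter value, chosen only for illustration.
data = [0.3, -1.2, 2.1]
theta = 0.5

lhs = joint_pdf(data, theta)             # f(x_1, x_2, x_3 | theta)
rhs = product_of_marginals(data, theta)  # prod_i f(x_i | theta)
print(math.isclose(lhs, rhs, rel_tol=1e-9))
```

The two functions named `f` in the formula are different mappings (`joint_pdf` takes $n$ data values, `normal_pdf` takes one); the notation reuses the letter and lets the number of arguments tell them apart.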

**Bumbble Comm** · Accepted Answer

It is just notation meaning that whatever comes after the vertical bar is treated as fixed. If $X_1,\ldots,X_n$ are random variables whose joint density (or pmf, in the discrete case) depends on a parameter, say $\theta$, living in some region $\Theta$, then we can define a function $$ \Theta\times\mathbb{R}^n\ni(\theta,x_1,\ldots,x_n)\mapsto f_\theta(x_1,\ldots,x_n) $$ that for each parameter value $\theta$ and each set of observations $x_1,\ldots,x_n$ returns the density (or pmf) corresponding to $\theta$ evaluated at $(x_1,\ldots,x_n)$.

Note that for fixed $\theta\in\Theta$, the function $$ \mathbb{R}^n\ni(x_1,\ldots,x_n)\mapsto f_\theta(x_1,\ldots,x_n) $$ is the joint density of $(X_1,\ldots,X_n)$ corresponding to that particular value of $\theta$. This is what is often denoted as $(x_1,\ldots,x_n)\mapsto f(x_1,\ldots,x_n\mid \theta)$ since $\theta$ is held fixed.

Now, if $x_1,\ldots,x_n$ are observations of the random variables $X_1,\ldots,X_n$, then the likelihood function is the function $$ \Theta\ni\theta\mapsto f_\theta(x_1,\ldots,x_n) $$ which is often denoted by $\mathcal{L}(\theta\mid x_1,\ldots,x_n)$ since we are varying $\theta$ for a fixed set of observations.
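The three mappings above can be sketched in Python as one two-argument function with either argument held fixed. This is only an illustration under assumptions: the model is taken to be i.i.d. Normal($\theta$, 1), and the names `f`, `density_for`, `likelihood_for`, the observations, and the search grid are all hypothetical.

```python
import math

def f(theta, xs):
    """Joint density f_theta(x_1, ..., x_n), assuming i.i.d. Normal(theta, 1) data."""
    return math.prod(
        math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2.0 * math.pi) for x in xs
    )

def density_for(theta):
    """Fix theta: returns the map (x_1, ..., x_n) -> f(x_1, ..., x_n | theta)."""
    return lambda xs: f(theta, xs)

def likelihood_for(xs):
    """Fix the data: returns the map theta -> L(theta | x_1, ..., x_n)."""
    return lambda theta: f(theta, xs)

# Hypothetical observations, chosen only for illustration.
observations = [1.0, 2.0, 3.0]
L = likelihood_for(observations)

# Both views evaluate the same underlying function at the same point.
same_value = math.isclose(L(0.5), density_for(0.5)(observations))

# Crude maximum-likelihood estimate: maximize L over a grid on [0, 4].
grid = [i / 100.0 for i in range(0, 401)]
theta_hat = max(grid, key=L)
print(same_value, theta_hat)
```

For this model the grid maximizer coincides with the sample mean of the observations, which is the known maximum-likelihood estimate of the mean of a normal distribution with known variance.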
