Why are maximum likelihood estimators used?

Question

7.4k Views Asked by Bumbble Comm At 30 Mar 2026 - 2:38

Is there a motivating reason for using maximum likelihood estimators? As far as I can tell, there is no reason why they should be unbiased estimators. (Can their expectation even be calculated in a general setting, given that they are defined by a global maximum?) So then why are they used?

There are 2 best solutions below

Bumbble Comm On 24 Feb 2012 - 5:42

Unbiasedness is overrated by non-statisticians. Sometimes unbiasedness is a very bad thing. Here's a paper I wrote showing an example in which use of an unbiased estimator is disastrous, whereas the MLE is merely bad, and a Bayesian estimator that's more biased than the MLE is good.

Direct link to the pdf file: https://arxiv.org/pdf/math/0206006.pdf

(Now I see Sasha already cited this paper.)
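To make the point about bias concrete, here is a minimal simulation sketch. It is not the example from the linked paper; it uses the textbook case of estimating the variance of normal data, where the MLE divides by $n$ (biased) and the usual unbiased estimator divides by $n-1$. The function name, sample size, and true variance are assumptions chosen for illustration.

```python
import numpy as np

def variance_estimator_mse(n=10, sigma2=4.0, reps=200_000, seed=0):
    """Monte Carlo bias and MSE of two variance estimators for N(0, sigma2) data."""
    rng = np.random.default_rng(seed)
    # Each row is one simulated sample of size n.
    x = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=(reps, n))
    estimators = {
        "mle": x.var(axis=1, ddof=0),       # divides by n (biased)
        "unbiased": x.var(axis=1, ddof=1),  # divides by n - 1
    }
    return {
        name: {
            "bias": est.mean() - sigma2,
            "mse": np.mean((est - sigma2) ** 2),
        }
        for name, est in estimators.items()
    }

result = variance_estimator_mse()
for name, stats in result.items():
    print(f"{name:9s} bias={stats['bias']:+.3f}  mse={stats['mse']:.3f}")
```

In this setting the biased MLE should show a negative bias of about $-\sigma^2/n$ yet a smaller mean squared error than the unbiased estimator, so unbiasedness alone does not decide which estimator is better.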

**Bumbble Comm** · Accepted Answer

The principle of maximum likelihood provides a unified approach to estimating parameters of the distribution given sample data. Although ML estimators $\hat{\theta}_n$ are not in general unbiased, they possess a number of desirable asymptotic properties:

- consistency: $\hat{\theta}_n \stackrel{n \to \infty}{\to} \theta$ (in probability)
- asymptotic normality: for large $n$, $\hat{\theta}_n$ is approximately distributed as $\mathcal{N}( \theta, \Sigma )$, where $\Sigma^{-1}$ is the Fisher information matrix of the sample
- efficiency: $\mathbb{Var}(\hat{\theta}_n)$ approaches the Cramér-Rao lower bound
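The three properties above can be checked numerically in a simple case. The sketch below assumes i.i.d. exponential data with rate $\lambda$, for which the MLE is $\hat{\lambda} = 1/\bar{x}$ and the Cramér-Rao bound for unbiased estimators is $\lambda^2/n$. The model, the function name, and the sample sizes are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def exponential_mle_summary(n, rate=2.0, reps=10_000, seed=0):
    """Simulate the MLE of an exponential rate and summarize its behaviour."""
    rng = np.random.default_rng(seed)
    x = rng.exponential(scale=1.0 / rate, size=(reps, n))
    rate_hat = 1.0 / x.mean(axis=1)  # MLE of the rate for each sample
    crlb = rate**2 / n               # Cramér-Rao lower bound (unbiased case)
    return {
        # Consistency: the bias should shrink as n grows.
        "bias": rate_hat.mean() - rate,
        # Efficiency: this ratio should approach 1 as n grows.
        "var_over_crlb": rate_hat.var() / crlb,
        # Normality: fraction of estimates within 1.96 * sqrt(crlb) of the
        # true rate, which should approach 0.95 as n grows.
        "within_1_96_sd": np.mean(np.abs(rate_hat - rate) <= 1.96 * np.sqrt(crlb)),
    }

summaries = {n: exponential_mle_summary(n) for n in (10, 50, 500)}
for n, s in summaries.items():
    print(n, {k: round(float(v), 3) for k, v in s.items()})
```

For small $n$ the estimator is visibly biased and its variance exceeds the bound; as $n$ grows, the bias shrinks and the variance ratio moves toward 1.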

Also see Michael Hardy's article "An illuminating counterexample" in AMM for examples where biased estimators prove superior to unbiased ones.

**Added**

The above asymptotic properties hold under certain regularity conditions. Consistency holds if

- the parameters identify the model (this ensures the existence of a unique global maximum of the log-likelihood function),
- the parameter space of the model is compact,
- the log-likelihood function is a continuous function of the parameters for almost all $x$,
- the log-likelihood is dominated by an integrable function for all values of the parameters.

Asymptotic normality holds if

- the estimated parameters are away from the boundary of the parameter domain,
- the support of the distribution does not depend on the distribution parameters $\theta$,
- the number of nuisance parameters does not depend on the sample size.
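A standard illustration of what happens when the condition on the distribution's domain fails is the uniform distribution on $(0, \theta)$, whose MLE is the sample maximum. The sketch below, with a hypothetical function name and arbitrarily chosen $\theta$ and $n$, simulates the scaled error $n(\theta - \hat{\theta}_n)$. In this model that quantity is known to converge to an exponential distribution with mean $\theta$ rather than to a normal one.

```python
import numpy as np

def uniform_mle_scaled_error(n=200, theta=3.0, reps=20_000, seed=0):
    """Return n * (theta - theta_hat) for the MLE of Uniform(0, theta)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, theta, size=(reps, n))
    theta_hat = x.max(axis=1)  # MLE: the largest observation in each sample
    return n * (theta - theta_hat)

err = uniform_mle_scaled_error()
# Sample skewness: about 0 for a normal limit, about 2 for an exponential one.
skew = np.mean((err - err.mean()) ** 3) / err.std() ** 3
print(f"min={err.min():.4f}  mean={err.mean():.3f}  skewness={skew:.2f}")
```

The error is one-sided (the MLE never exceeds $\theta$), strongly skewed, and shrinks at rate $1/n$ instead of $1/\sqrt{n}$, so the normal approximation from the regular case does not apply here.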
