Currently reading through E. T. Jaynes on Bayesian inference, and in a chapter dedicated to the Normal distribution he argues that Gauss proved the following:
> Given a sample from a population distribution with unknown mean, our MLE for the mean is the arithmetic mean of our sample if and only if the sampling distribution of the mean is Normal.
I’m really struggling to see the intuition behind this; can anyone help out? Thanks!
You're referring to an argument Jaynes attributes to Gauss in Sec. 7.3. The log-likelihood of $n+1$ observations $\{x_i \mid 0\le i\le n\}$ is of the form $\sum_i\ln f(x_i|\theta)$, and the question is when this is maximised at $\theta=\bar{x}$, i.e. when the MLE is the arithmetic mean.
Jaynes actually adds an assumption, which as @paulinho notes is inapplicable to the Bernoulli distribution: the likelihood depends on $\theta$ and the $x_i$ only through the differences $\theta-x_i$, i.e. $f$ is a location family. If this seems like sleight of hand, bear in mind that the point of his Chapter 7 is only to motivate why sampling *errors* would be Gaussian. (In particular, sample means being approximately Gaussian for large samples is a completely unrelated issue!) It is not an "everything is Gaussian" argument. (In fact, Sec. 7.12 explores why even errors are sometimes non-Gaussian.) The added assumption is reasonable for sampling errors.
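To see the theorem in action numerically (a sketch of my own, not from Jaynes — the sample, the grid search, and the function names are illustrative), compare two location families: for the Gaussian the MLE of the location parameter lands exactly on the arithmetic mean, while for the Cauchy, another family whose likelihood depends only on $\theta-x_i$, it does not:

```python
# Hypothetical check: for a location family f(x|theta) = h(theta - x), the MLE
# maximises sum_i ln h(theta - x_i). For the Gaussian this maximum sits exactly
# at the sample mean; for the Cauchy it generally does not.
import math

def mle_location(xs, log_h, lo=-10.0, hi=10.0, steps=20001):
    """Grid-search MLE of the location parameter theta (illustrative, not efficient)."""
    best_t, best_ll = None, -math.inf
    for k in range(steps):
        t = lo + (hi - lo) * k / (steps - 1)
        ll = sum(log_h(t - x) for x in xs)
        if ll > best_ll:
            best_t, best_ll = t, ll
    return best_t

xs = [0.0, 0.0, 0.0, 9.0]               # skewed sample; mean = 2.25
gauss = lambda u: -0.5 * u * u          # Gaussian log-density, up to a constant
cauchy = lambda u: -math.log1p(u * u)   # Cauchy log-density, up to a constant

xbar = sum(xs) / len(xs)
print(mle_location(xs, gauss))   # 2.25, the arithmetic mean
print(mle_location(xs, cauchy))  # ≈ 0.04, nowhere near the mean
```

The Cauchy MLE hugs the cluster of three points at zero and largely ignores the outlier at 9; only the Gaussian's quadratic log-likelihood weights the observations so that the mean balances exactly.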
Compare $\sum_i(\bar{x}-x_i)=0$ with the likelihood equation $\left.\sum_i\frac1f\frac{\partial f}{\partial\theta}\right|_{\theta=\bar{x}}=0$. For $\frac1f\frac{\partial f}{\partial\theta}$ to be proportional to $\theta-x$ (which is equivalent to $\ln f$ being quadratic in $\theta-x$ and peaked at $x=\theta$, making $\theta$ the mode of a Gaussian distribution) is clearly sufficient. For necessity, take $n\ge1$ and $x_i=-\tfrac1n x_n$ for all $i<n$: then $\bar{x}=0$, so, writing $g$ for $\frac1f\frac{\partial f}{\partial\theta}$ as a function of $\theta-x$, the likelihood equation requires $n\,g(x_n/n)+g(-x_n)=0$ for every $x_n$, which holds only if $g$ is linear.
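A quick check of that construction (the function names and the sample values are my own; the Cauchy stands in for an arbitrary non-Gaussian location family):

```python
def gauss_score(u):
    # (1/f) df/dtheta for a unit Gaussian, as a function of u = theta - x
    return -u

def cauchy_score(u):
    # the same quantity for the Cauchy location family
    return -2 * u / (1 + u * u)

def likelihood_eq_at_mean(score, n, xn):
    """Sum of scores at theta = xbar for the sample x_i = -xn/n (i < n), x_n = xn.
    That sample has xbar = 0, so the sum is n*score(xn/n) + score(-xn)."""
    return n * score(xn / n) + score(-xn)

print(likelihood_eq_at_mean(gauss_score, 3, 9.0))   # 0.0 for every xn: the mean is the MLE
print(likelihood_eq_at_mean(cauchy_score, 3, 9.0))  # nonzero: the mean is not even stationary
```

Only a linear score makes the first expression vanish identically in $x_n$, which is exactly the necessity claim.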
As for an intuition rather than a proof: if $\frac1f\frac{\partial f}{\partial\theta}$ is nonlinear in $\theta-x$, you should be able to tweak an initially constant sample $x_i$, without changing $\bar{x}$, so that $\sum_i\frac1f\frac{\partial f}{\partial\theta}$ moves away from zero at $\theta=\bar{x}$. The example above shows one such tweak; see Jaynes for the full proof.
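To see that tweak concretely (a toy sample of my own; note the tweak must be asymmetric, because the score is an odd function and a symmetric perturbation would cancel for any location family):

```python
def score_sum_at_mean(xs, score):
    # Evaluate sum_i (1/f)(df/dtheta) at theta = xbar, with score a function of theta - x
    xbar = sum(xs) / len(xs)
    return sum(score(xbar - x) for x in xs)

gauss = lambda u: -u                    # linear score: Gaussian
cauchy = lambda u: -2 * u / (1 + u * u)  # nonlinear score: Cauchy

flat = [2.0, 2.0, 2.0, 2.0]
tweaked = [4.0, 1.0, 1.0, 2.0]  # asymmetric tweak of flat; mean is still 2.0

print(score_sum_at_mean(flat, cauchy))     # 0.0 at a constant sample
print(score_sum_at_mean(tweaked, gauss))   # 0.0: a linear score always balances at the mean
print(score_sum_at_mean(tweaked, cauchy))  # nonzero: the mean is no longer the MLE
```

The Gaussian sum stays at zero under any mean-preserving tweak precisely because its score is linear; any curvature in the score lets some tweak push the sum off zero.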