How can I prove that the mean of a probability distribution is a value that minimizes the variance of the distribution?

144 Views Asked by Bumbble Comm At 07 Apr 2026 - 3:22

According to page 87 of John K. Kruschke's Doing Bayesian Data Analysis (2nd edition), the mean of a distribution is the value that minimizes the variance of a probability distribution, for example a normal distribution. This is what the page says:

"It turns out that the value of $ M $ that minimizes $ \int p(x)(x-M)^2 dx $ is $ E[X] $. In other words, the mean of the distribution is the value that minimizes the expected squared deviation. In this way, the mean is a central tendency of the distribution."

I have read the paragraph and kind of understood what the author is trying to say but I wonder how this can be written mathematically using the above equation. Hope to hear some explanations.

And why are we trying to use the mean to minimize the variance?

Bumbble Comm On 18 Apr 2021 - 10:15 BEST ANSWER

$$
\begin{aligned}
\frac{\partial}{\partial M}\int p(x)(x-M)^2dx &= 0,\\
\int p(x)\frac{\partial (x-M)^2}{\partial M}dx &= 0,\\
\int p(x)\,2(M-x)\,dx &= 0,\\
2\int p(x)M\,dx &= 2\int p(x)x\,dx,\\
M\int p(x)dx &= \int p(x)x\,dx,\\
M &= \int p(x)x\,dx.
\end{aligned}
$$
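As a numerical sanity check of the derivation above, here is a minimal sketch that approximates $\int p(x)(x-M)^2dx$ for a normal density and scans a grid of candidate values of $M$. The names `normal_pdf` and `expected_sq_dev`, the parameters `mu=1.5` and `sigma=2.0`, the integration range, and the candidate grid are all illustrative assumptions, not part of the original answer.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def expected_sq_dev(m, mu=1.5, sigma=2.0, lo=-20.0, hi=20.0, n=4000):
    """Midpoint-rule approximation of the integral of p(x) * (x - m)^2.

    The range [lo, hi] is assumed wide enough that the tails are negligible.
    """
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx
        total += normal_pdf(x, mu, sigma) * (x - m) ** 2 * dx
    return total

# Candidate values of M from -3.0 to 6.0 in steps of 0.1 (an arbitrary grid).
candidates = [i / 10 for i in range(-30, 61)]
best_m = min(candidates, key=expected_sq_dev)

print("minimizing M on the grid:", best_m)
print("expected squared deviation there:", expected_sq_dev(best_m))
```

Under these assumptions the grid search should land on the mean of the chosen density, and the minimum value should be close to its variance $\sigma^2$.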

Edit: Explanation for non-math people

Imagine a simple distribution: $x=0$ with $p=0.2$ and $x=1$ with $p=0.8$. Let's take $M=0.5$ first; then the variance is:

$$ \sum_x p(x)(x-M)^2 = 0.2\times 0.5^2 + 0.8\times 0.5^2 = 0.25 $$

What if we increase $M$ by a tiny amount, $0.001$? How much will that decrease the variance?

$$
\begin{aligned}
\sum_x p(x)(x-M-0.001)^2 &= \sum_x p(x)\left((x-M)^2-2(x-M)\times 0.001 + 0.001^2\right)\\
&= \sum_x p(x)(x-M)^2 + 2\times 0.001\times\sum_x p(x)(M-x) + 0.001^2\sum_x p(x)\\
&= 0.25 + 2\times0.001\times(0.2\times0.5 - 0.8\times 0.5) + 10^{-6}
\end{aligned}
$$

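Thus, neglecting $10^{-6}$, which is far smaller in magnitude than the second term, we can say that the variance changes by the second term, $-0.0006$; that is, it decreases. We can keep increasing $M$, and the variance will keep decreasing until the second term is no longer negative. This happens when the term is exactly zero, i.e. when $\sum_x p(x)(M-x) = 0$, the discrete counterpart of $\int p(x)(x-M)dx = 0$. Solving for $M$ gives $M = \sum_x p(x)x = 0.8$, which is the mean.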

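What we did here is called differentiation. We do it because, if the derivative (the slope) exists at a minimum, it is zero there. And since the expression is a quadratic in $M$ whose $M^2$ coefficient is $\sum_x p(x) = 1 > 0$, the point of zero slope is indeed a minimum rather than a maximum.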
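The arithmetic in this example can be reproduced with a few lines of Python. This is only a sketch: the two-point distribution and the step of $0.001$ come from the example above, while the helper names `expected_sq_dev` and `slope` are made up for illustration.

```python
# Two-point distribution from the example: x = 0 with p = 0.2, x = 1 with p = 0.8.
xs = [0, 1]
ps = [0.2, 0.8]

def expected_sq_dev(m):
    """Sum over x of p(x) * (x - m)^2, the quantity being minimized."""
    return sum(p * (x - m) ** 2 for x, p in zip(xs, ps))

def slope(m):
    """Derivative of expected_sq_dev with respect to m: 2 * sum of p(x) * (m - x)."""
    return 2 * sum(p * (m - x) for x, p in zip(xs, ps))

mean = sum(p * x for x, p in zip(xs, ps))

# Effect of nudging M from 0.5 to 0.501, including the tiny 1e-6 term.
change = expected_sq_dev(0.501) - expected_sq_dev(0.5)

print("value at M = 0.5:", expected_sq_dev(0.5))
print("change after the 0.001 step:", change)
print("slope at M = 0.5:", slope(0.5))
print("mean:", mean)
print("slope at the mean:", slope(mean))
print("value at the mean:", expected_sq_dev(mean))
```

The slope is negative at $M=0.5$, so moving $M$ to the right lowers the expected squared deviation, and the slope vanishes (up to floating-point rounding) once $M$ reaches the mean.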
