How can I prove that the mean of squared data points is greater than or equal to the square of the mean of the data points?

The inequality I'm trying to prove is

$$\frac{1}{n}\sum_{i=1}^nx_i^2 \geq \frac{1}{n^2}\left(\sum_{i=1}^nx_i\right)^2.$$

I tried simply expanding out the RHS but I can't seem to count the terms properly. I can reduce the inequality to

$$(n-1)\sum_{i=1}^nx_i^2 - 2\sum_{i\neq j}x_ix_j \geq 0$$

but there appear to be $2(n-1)!$ terms in this $i\neq j$ sum and only $n(n-1)$ terms in the sum of the squares, which doesn't seem right to me.

I've also "assumed" that $x_1 \leq\dots\leq x_n$ which allows me to set up the inequalities $x_1^2 \leq x_1x_2 \leq x_2^2$ etc. but my counting is letting me down.

Is there an easier way to do this?

Best answer

This is Jensen's inequality applied to the points $x_1, \ldots, x_n$, each given weight $\frac{1}{n}$, using the convex function $f(x) = x^2$.
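To spell that out (a sketch; the weights $\lambda_i$ are notation introduced here, not in the question): for a convex function $f$ and weights $\lambda_i \ge 0$ with $\sum_{i=1}^n \lambda_i = 1$, the finite form of Jensen's inequality reads

$$f\left(\sum_{i=1}^n \lambda_i x_i\right) \le \sum_{i=1}^n \lambda_i f(x_i).$$

Taking $f(x) = x^2$ and $\lambda_i = \frac{1}{n}$ for every $i$ gives

$$\left(\frac{1}{n}\sum_{i=1}^n x_i\right)^2 \le \frac{1}{n}\sum_{i=1}^n x_i^2,$$

which is the inequality in the question.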

It is also directly provable using the following elementary argument: define $\bar x = \frac{1}{n} \sum_{i=1}^n x_i$ to be the sample mean. Now we have

$$
\begin{align*}
\sum_{i=1}^n x_i^2 &= \sum_{i=1}^n (x_i - \bar x + \bar x)^2 \\
&= \sum_{i=1}^n \left( (x_i - \bar x)^2 + 2\bar x (x_i - \bar x) + \bar x^2 \right) \\
&= \sum_{i=1}^n (x_i - \bar x)^2 + 2 \bar x \left( -n \bar x + \sum_{i=1}^n x_i \right) + n \bar x^2 \\
&= n \bar x^2 + \sum_{i=1}^n (x_i - \bar x)^2 \\
&\ge n \bar x^2.
\end{align*}
$$

The second equality comes from expanding the square; the third comes from distributing the sum over the three terms in the summand; the fourth equality follows from the fact that $n \bar x = \sum_{i=1}^n x_i$, so the middle term equals zero; and the final inequality holds because no real square is negative, so the remaining sum is bounded below by $0$.

Now all that remains is to divide by $n$ to obtain

$$\frac{1}{n} \sum_{i=1}^n x_i^2 \ge \bar x^2 = \left(\frac{1}{n} \sum_{i=1}^n x_i \right)^2,$$

as claimed.
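If you want a quick sanity check of the decomposition above, here is a minimal numerical sketch in Python. The function name `mean_square_gap` and the floating-point tolerance `1e-9` are choices made for this illustration only; the check supports the identity on random data but is of course not a proof.

```python
import random


def mean_square_gap(xs):
    """Return (mean of squares, square of mean, mean squared deviation)."""
    n = len(xs)
    mean = sum(xs) / n
    mean_of_squares = sum(x * x for x in xs) / n
    mean_sq_dev = sum((x - mean) ** 2 for x in xs) / n
    return mean_of_squares, mean * mean, mean_sq_dev


# Check the identity and the inequality on random samples.
random.seed(0)
for _ in range(1000):
    n = random.randint(1, 20)
    xs = [random.uniform(-10, 10) for _ in range(n)]
    msq, sqm, dev = mean_square_gap(xs)
    # Identity: mean of squares = (square of mean) + (mean squared deviation).
    assert abs(msq - (sqm + dev)) < 1e-9
    # Inequality, up to floating-point rounding.
    assert msq >= sqm - 1e-9
```

The third returned value is exactly the term $\frac{1}{n}\sum_{i=1}^n (x_i - \bar x)^2$ that is dropped in the last step of the proof, so it measures how far the inequality is from being an equality.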