What is the intuition behind the Cauchy-Schwarz inequality in the real numbers?


The Cauchy-Schwarz inequality states that

$$\left(\sum_{i=1}^n x_i y_i\right)^2\leq \left(\sum_{i=1}^n x_i^2\right) \left(\sum_{i=1}^n y_i^2\right).$$

The proof, with the discriminant argument, is easy to understand; however, it does not really (in my opinion) provide any intuitive justification as to why the inequality should be true.

Note that similar questions have been posted here and here; however, they do not help me because I have not yet studied linear algebra. For the same reason, an ideal answer would use only (high school) algebra and, if necessary, calculus.

3

There are 3 best solutions below

7
On BEST ANSWER

Here is the geometric intuition behind this for $n = 3$. You need not know much linear algebra, but you do need to know about vectors.

Given two vectors $x = (x_1,x_2,x_3)$ and $y = (y_1,y_2,y_3)$, we define their dot product $x \cdot y$ to be $|x||y|\cos \theta$, where $|x|$ and $|y|$ are the lengths of the vectors $x$ and $y$, and $\theta$ is the angle between them. (In theory, this definition may be slightly circular, since at an advanced level dot products are usually used to define angles. But if you accept the idea of an angle as intuitively meaningful, we needn't worry about this technicality.)

The dot product $x \cdot y$ is also given by the formula $x_1y_1 + x_2 y_2 + x_3 y_3$.

Then the Cauchy-Schwarz inequality is exactly equivalent to the statement that $|\cos \theta| \leq 1$.
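To make the equivalence concrete, here is a minimal numerical sketch for $n = 3$. The sample vectors and the helper names `dot`, `length` and `cos_angle` are my own choices for illustration, not part of the original argument. It computes $\cos\theta$ from the two formulas for the dot product and checks that $(x \cdot y)^2 = |x|^2|y|^2\cos^2\theta \leq |x|^2|y|^2$.

```python
import math

def dot(x, y):
    """Component formula for the dot product: x1*y1 + x2*y2 + x3*y3."""
    return sum(a * b for a, b in zip(x, y))

def length(x):
    """Euclidean length of a vector, |x| = sqrt(x . x)."""
    return math.sqrt(dot(x, x))

def cos_angle(x, y):
    """cos(theta) solved from x . y = |x||y|cos(theta); assumes x and y are nonzero."""
    return dot(x, y) / (length(x) * length(y))

# Hypothetical sample vectors, chosen only for illustration.
x = (1.0, 2.0, 2.0)
y = (2.0, -1.0, 3.0)

c = cos_angle(x, y)
lhs = dot(x, y) ** 2          # left-hand side of Cauchy-Schwarz
rhs = dot(x, x) * dot(y, y)   # right-hand side of Cauchy-Schwarz

print("cos(theta) =", c)
print("lhs =", lhs, "rhs =", rhs)
```

The gap between the two sides is exactly the factor $\cos^2\theta$, so the inequality is tight only when the vectors are parallel.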

An alternative interpretation without angles in general, but using perpendicularity, is the one given in A.S.'s comment.

If you'd like to see the details of this, have a look either at Chapter 12 of Apostol's Calculus or at Chapter 1 of Lang's Introduction to Linear Algebra.

Edit I can try to give a very imperfect algebraic "interpretation" of the inequality. I'm not convinced this is the best one, so I'll keep thinking about it.

If you look at the inequality $$\left(\sum_{i=1}^n x_i y_i\right)\left(\sum_{i=1}^n x_i y_i\right)\leq \left(\sum_{i=1}^n x_i^2\right) \left(\sum_{i=1}^n y_i^2\right),$$ note first that the general inequality follows from the special case where all the $x_i$'s and $y_i$'s are nonnegative, since $|\sum x_i y_i| \leq \sum |x_i||y_i|$. Next think about how the $x_i$'s and $y_i$'s match up. The inequality says that to make a product like the LHS and RHS as large as possible, it's better to match up the numbers $x_i$ and $y_i$ with themselves than with each other. This sort of makes sense, because if you have $x_1 < y_1$, and you go from $x_1 y_1$ (twice) to $x_1^2$ and $y_1^2$, you're better off with the latter. This is because $y_1^2$ is relatively large, and this is usually more than enough to compensate for the smaller $x_1^2$. Obviously, this is not a proof in any way. But it does make the inequality plausible.

2
On

We can check that $$\left(\sum_{i=1}^na_i^2\right)\left(\sum_{i=1}^nb_i^2\right) - \left(\sum_{i=1}^na_ib_i\right)^2 =\ \sum_{1\leqslant i<j\leqslant n}(a_ib_j-a_jb_i)^2 \geqslant 0 .$$
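If you would rather test the identity than expand it by hand, the following is a small sketch that evaluates both sides exactly using rational arithmetic. The function name `lagrange_sides` and the sample data are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import combinations

def lagrange_sides(a, b):
    """Return (left, right) of the identity for two sequences of equal length.

    left  = (sum a_i^2)(sum b_i^2) - (sum a_i b_i)^2
    right = sum over i < j of (a_i b_j - a_j b_i)^2
    """
    left = (sum(s * s for s in a) * sum(t * t for t in b)
            - sum(s * t for s, t in zip(a, b)) ** 2)
    # combinations(range(n), 2) yields exactly the index pairs with i < j.
    right = sum((a[i] * b[j] - a[j] * b[i]) ** 2
                for i, j in combinations(range(len(a)), 2))
    return left, right

# Hypothetical sample data; Fractions keep the comparison exact.
a = [Fraction(1), Fraction(-2), Fraction(3, 2), Fraction(4)]
b = [Fraction(5), Fraction(1, 3), Fraction(-2), Fraction(7)]

left, right = lagrange_sides(a, b)
print(left, right)
```

Since the right-hand side is a sum of squares, it vanishes only when every $a_ib_j - a_jb_i$ is zero, that is, when the two sequences are proportional. This is the equality case of Cauchy-Schwarz.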

0
On

The following arguments are taken from the fantastic book "The Cauchy-Schwarz Master Class" by J. Michael Steele.

At first, we take a look at a rather similar inequality which we will show to be very close to Cauchy-Schwarz.

For the case of $n=1$, we consider

$$\tag{elementary} xy \leq \frac{x^2}{2} + \frac{y^2}{2}.$$

If we replace $x$ and $y$ with their square roots, as Steele does on page 19f, and multiply by $4$, we arrive at

$$ 4 \sqrt{xy} < 2 \, (x + y), \quad \text{for all nonnegative } x \neq y.$$

Equality would hold only when $(x-y)^2 = 0$, that is, when $x = y$, which is excluded here.

Let us fix $a$ and $b$ as the side lengths of a rectangle with area $A = ab$. Considering all side lengths $x$ and $y$ such that $A = ab = xy$, we see that the left-hand side is the perimeter of the square with side length $s = \sqrt{xy} = \sqrt{ab}$, while the right-hand side is the perimeter of any other rectangle with side lengths $x$ and $y$ and the same area. In other words, among all rectangles of a given area, the square has the smallest perimeter.
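A quick numerical sketch of this reading: fix an area, vary one side length, and compare each rectangle's perimeter with that of the square. The area value and the list of side lengths are hypothetical choices for illustration.

```python
import math

def perimeter(x, y):
    """Perimeter of a rectangle with side lengths x and y."""
    return 2 * (x + y)

area = 36.0  # hypothetical fixed area A = ab = xy
square_perimeter = 4 * math.sqrt(area)  # the left-hand side, 4*sqrt(xy)

# Each x determines y = area / x, so every rectangle below has the same area.
for x in (1.0, 2.0, 4.0, 9.0, 12.0):
    y = area / x
    print(f"x={x:5.1f}  y={y:5.1f}  perimeter={perimeter(x, y):6.1f}  "
          f"square={square_perimeter:.1f}")
```

Every non-square rectangle in the loop has a strictly larger perimeter than the square, and the two agree only when $x = y$.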

The generalisation from $xy$ to $x\cdot y$ is now very close. For $n=3$, we can interpret the statement as saying that among all boxes in $\mathbb{R}^3$, the cube is the one with the largest volume for a fixed surface area.

So, the elementary inequality given above is easy to interpret. But how to arrive at Cauchy-Schwarz?

First, Steele (page 5) applies the elementary inequality to each pair $x_i$, $y_i$ and sums over $i = 1, \dots, n$ to arrive at

$$ \sum_{i=1}^n x_i \, y_i \leq \frac12 \, \left( \sum_{i=1}^n x_i^2 \, + \, \sum_{i=1}^n y_i^2 \right). \tag{additive}$$

If neither $x$ nor $y$ is the zero vector, we can pass to the normalized vectors, namely $\tilde{x_i} = \frac{x_i}{\sqrt{\sum_{j=1}^n x_j^2}}$ and $\tilde{y_i} = \frac{y_i}{\sqrt{\sum_{j=1}^n y_j^2}}$, whose squares each sum to $1$. Applying the additive bound to them converts it into the Cauchy-Schwarz inequality:

$$\tag{CS} \frac{\sum_{i=1}^n x_i \, y_i}{\sqrt{\sum_{j=1}^n x_j^2}\,\sqrt{\sum_{j=1}^n y_j^2}}=\sum_{i=1}^n \tilde{x_i} \, \tilde{y_i} \leq \frac12 \, \left( \sum_{i=1}^n \tilde{x_i}^2 \, + \, \sum_{i=1}^n \tilde{y_i}^2 \right) = 1.$$

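So the Cauchy-Schwarz inequality is recovered from the elementary inequality applied to a very special choice of arguments, namely the normalized $\tilde{x_i}$ and $\tilde{y_i}$. (Running the same argument with $-x$ in place of $x$ bounds the quotient from below by $-1$, which gives the squared form stated in the question.) We now have a neat intuition at hand and can imagine why Cauchy-Schwarz appears so often: it is closely related to the fundamental perimeter inequality for boxes and cubes.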
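The normalization step can be sketched numerically as follows. The helper names `normalize` and `additive_bound_sides` and the sample vectors are assumptions for illustration, and floating-point arithmetic means the right-hand side equals $1$ only up to rounding.

```python
import math

def normalize(v):
    """Divide each entry by sqrt(sum of squares); assumes v is not all zeros."""
    norm = math.sqrt(sum(t * t for t in v))
    return [t / norm for t in v]

def additive_bound_sides(x, y):
    """Return (sum x_i y_i, (sum x_i^2 + sum y_i^2) / 2)."""
    left = sum(a * b for a, b in zip(x, y))
    right = 0.5 * (sum(a * a for a in x) + sum(b * b for b in y))
    return left, right

# Hypothetical sample vectors, chosen only for illustration.
x = [3.0, -1.0, 2.0, 0.5]
y = [1.0, 4.0, -2.0, 2.0]

raw_left, raw_right = additive_bound_sides(x, y)          # additive bound as is
left, right = additive_bound_sides(normalize(x), normalize(y))  # after normalizing

print("additive bound:  ", raw_left, "<=", raw_right)
print("after normalizing:", left, "<=", right)
```

Before normalizing, the additive bound can be very loose. After normalizing, its right-hand side is pinned at $1$, and that is exactly what turns it into the statement (CS).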