Proof of Cauchy-Schwarz inequality with dot product and Euclidean norm


I have some trouble understanding the proof of the Cauchy-Schwarz inequality from my textbook:

Given $\textbf{x,y} \in \mathbb{R}^n \Rightarrow \vert \textbf{x}^T \textbf{y} \vert \le \Vert \textbf{x} \Vert_2 \cdot \Vert \textbf{y} \Vert_2$

the proof:

given $\lambda \in \mathbb{R}$ we can observe that : $$\begin{align} 0 \le \Vert \textbf{x} + \lambda \textbf{y} \Vert_2^2 & = \sum_{i=1}^n (x_i + \lambda y_i)^2 \\ & = \sum_{i=1}^n (x_i^2 + 2 \lambda x_i y_i + \lambda^2 y_i^2) \\ & = \sum_{i=1}^n x_i^2 + 2\lambda \sum_{i=1}^n x_iy_i + \lambda^2 \sum_{i=1}^n y_i^2 \\ & = \Vert \textbf{x} \Vert^2_2 + 2 \lambda \textbf{x}^T \textbf{y} + \vert \lambda \vert^2 \Vert \textbf{y} \Vert^2_2 \end{align}$$ if $\textbf{x}^T\textbf{y} = 0$ then the thesis is surely true.
Instead, if $\textbf{x}^T\textbf{y} \ne 0$, then we can consider: $$\lambda = - \frac{\Vert \textbf{x} \Vert^2_2}{\textbf{x}^T\textbf{y}}$$ therefore we have: $$\begin{align} 0 \le \Vert \textbf{x} \Vert^2_2 - 2\Vert \textbf{x} \Vert^2_2 + \frac{\Vert \textbf{x} \Vert^4_2}{\vert \textbf{x}^T \textbf{y} \vert^2} \Vert \textbf{y} \Vert^2_2 & = -\Vert \textbf{x} \Vert^2_2 + \frac{\Vert \textbf{x} \Vert^4_2}{\vert \textbf{x}^T \textbf{y} \vert^2} \Vert \textbf{y} \Vert^2_2 \\ & = \Vert \textbf{x} \Vert^2_2 \left ( -1 + \frac{\Vert \textbf{x}\Vert^2_2 \Vert \textbf{y}\Vert^2_2 }{ \vert \textbf{x}^T \textbf{y} \vert^2 }\right ) \end{align}$$ we can deduce that : $$\Vert \textbf{x} \Vert^2_2 \Vert \textbf{y} \Vert^2_2 - \vert \textbf{x}^T \textbf{y} \vert^2 \ge 0 $$ and then the thesis follows easily.


There are some questions I want to ask here:

1) Why can we immediately observe that $0 \le \Vert \textbf{x} + \lambda \textbf{y} \Vert_2^2$? Where does that observation come from?

2) When it says "if $\textbf{x}^T\textbf{y} = 0$ then the thesis is surely true", does "thesis" mean the statement of the proposition above?

$$\vert \textbf{x}^T \textbf{y} \vert \le \Vert \textbf{x} \Vert_2 \Vert \textbf{y} \Vert_2$$

But if I substitute into the last step:

$$\begin{align}& = \Vert \textbf{x} \Vert^2_2 + 2 \lambda \textbf{x}^T \textbf{y} + \vert \lambda \vert^2 \Vert \textbf{y} \Vert^2_2 \\ & = \Vert \textbf{x} \Vert^2_2 + 2 \lambda 0 + \vert \lambda \vert^2 \Vert \textbf{y} \Vert^2_2 \\ & = \Vert \textbf{x} \Vert^2_2 + \vert \lambda \vert^2 \Vert \textbf{y} \Vert^2_2 \end{align}$$

I do not obtain the $\Vert \textbf{x} \Vert_2 \Vert \textbf{x} \Vert_2$ of the thesis of the proposition.

3) Why is $\lambda$ chosen in that way?

Instead, if $\textbf{x}^T\textbf{y} \ne 0$, then we can consider $$\lambda = - \frac{\Vert \textbf{x} \Vert^2_2}{\textbf{x}^T\textbf{y}}$$

Could you please help me understand this better? Many thanks!


There are 2 best solutions below

2
On BEST ANSWER

(1) This follows from the definition of a norm; namely, a norm is always non-negative, and so is its square. (2) The RHS is non-negative (again because norms are non-negative), so if the LHS is 0, then the inequality must be satisfied. (3) This choice of $\lambda$ makes the arithmetic come out the way you want. The author proves a general inequality that holds for every $\lambda$ and then picks a specific $\lambda$ so that the general inequality reduces to the desired one.
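To see point (3) concretely, here is a minimal Python sketch that plugs the textbook's $\lambda$ into both sides of the expansion. The vectors `x` and `y` are arbitrary example values chosen for illustration (with $\textbf{x}^T\textbf{y} \ne 0$), and the helper names are hypothetical; this is a numerical sanity check, not a proof.

```python
import math

def dot(x, y):
    """Dot product x^T y of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))

def norm_sq(x):
    """Squared Euclidean norm ||x||_2^2."""
    return dot(x, x)

def shifted_norm_sq(x, y, lam):
    """||x + lam*y||_2^2, computed directly from the components."""
    return norm_sq([a + lam * b for a, b in zip(x, y)])

def expanded(x, y, lam):
    """||x||^2 + 2*lam*x^T y + lam^2*||y||^2, the expanded form."""
    return norm_sq(x) + 2 * lam * dot(x, y) + lam ** 2 * norm_sq(y)

# Example vectors (an assumption of this sketch); here x^T y = 12 != 0.
x = [1.0, 2.0, 3.0]
y = [4.0, -5.0, 6.0]

# The textbook's choice of lambda.
lam = -norm_sq(x) / dot(x, y)

direct = shifted_norm_sq(x, y, lam)
# Factored form from the last line of the proof.
factored = norm_sq(x) * (-1 + norm_sq(x) * norm_sq(y) / dot(x, y) ** 2)

print(direct, expanded(x, y, lam), factored)
```

All three printed values should agree up to rounding. Since the first one is a squared norm, it is non-negative, and that forces the bracket in the factored form to be non-negative as well.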

0
On
  1. It follows from the fact that a norm is always greater than or equal to $0$.
  2. The statement of the Cauchy-Schwarz inequality is not what you wrote. It is $\lvert x^Ty\rvert\leqslant\lVert x\rVert_2\lVert y\rVert_2$. This obviously holds if $x^Ty=0$.
  3. I suppose that my previous remark explains this too.
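As a quick numerical illustration of point 2, the sketch below checks the inequality for an orthogonal pair (where $x^Ty=0$) and for a batch of random vectors. The helper `cauchy_schwarz_holds`, the tolerance, and the sample sizes are assumptions made for this example only; passing the check is evidence, not a proof.

```python
import math
import random

def dot(x, y):
    """Dot product x^T y of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Euclidean norm ||x||_2."""
    return math.sqrt(dot(x, x))

def cauchy_schwarz_holds(x, y, tol=1e-12):
    """Check |x^T y| <= ||x||_2 * ||y||_2, with a small floating-point tolerance."""
    return abs(dot(x, y)) <= norm(x) * norm(y) + tol

# Orthogonal pair: the left-hand side is 0, the right-hand side is a product of norms.
x = [1.0, 0.0]
y = [0.0, 3.0]
orthogonal_ok = cauchy_schwarz_holds(x, y)

# Random vectors in R^5, seeded so the run is reproducible.
rng = random.Random(0)
random_ok = all(
    cauchy_schwarz_holds(
        [rng.uniform(-1, 1) for _ in range(5)],
        [rng.uniform(-1, 1) for _ in range(5)],
    )
    for _ in range(1000)
)

print(orthogonal_ok, random_ok)
```

In the orthogonal case the check reduces to $0 \le \lVert x\rVert_2\lVert y\rVert_2$, which is exactly why that case needs no further argument.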