I'm studying multivariable calculus. Usually, when I study, I go through a book until I find a theorem, and then try to prove it. I was unable to, so I read the proof, which is the following:
Let $x, y \in \mathbb{R}^m, \alpha \in \mathbb{R}$. Then $(x+\alpha y)\cdot(x+\alpha y) = \vert \vert x+\alpha y\vert\vert^2 \geq0$. Using the properties of the inner product, we get:
$(x+\alpha y)\cdot(x+\alpha y) = x\cdot x+\alpha x\cdot y + \alpha y\cdot x + \alpha^2y\cdot y = \vert\vert x\vert\vert^2+2(x\cdot y)\alpha + \alpha^2\vert\vert y\vert\vert^2 \geq 0$.
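That last inequality holds for every real $\alpha$ iff the discriminant of the polynomial with respect to $\alpha$ is less than or equal to 0 (for $y \neq 0$; the case $y = 0$ is trivial). Therefore $4(x\cdot y)^2 - 4\vert \vert x\vert\vert^2\vert\vert y\vert\vert^2 \leq 0$, i.e. $\vert x\cdot y\vert^2 \leq \vert \vert x\vert\vert^2\vert\vert y\vert\vert^2$, from which comes the Cauchy-Schwarz inequality. Q.E.D.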
I can follow every step of the proof. I also get the intuition of why the inequality should be true. However, the proof seems "empty" to me. I don't understand what someone who wanted to prove this would do to find it. What's the intuition behind using $x+\alpha y$?
The reason I ask is that, after I read the proof, the method used was so far beyond everything I had tried that I am almost sure I would never have been able to prove this on my own. How should I deal with this kind of situation?
I don't know about anybody else, but I share your dissatisfaction with the standard slick proof, and I personally find it helpful to think instead of expressing $x$ as a sum of a multiple of $y$ and a vector orthogonal to $y$. This kind of resolution of a vector into two mutually orthogonal components is a common and natural operation.
If $\lambda$ is real, then $x - \lambda y$ is orthogonal to $y$ if and only if (in your notation) $(x - \lambda y) \cdot y = 0$, i.e., $$ \lambda \|y\|^2 = x \cdot y. $$
For any value of $\lambda$ satisfying that condition ($\lambda$ may be chosen arbitrarily if $y = 0$, and there is a unique solution for $\lambda$ if $y \ne 0$), write $u = x - \lambda y$ and $v = \lambda y$, so that $x = u + v$ and $u \cdot v = 0$. Then: \begin{align*} \|x\|^2 & = (u + v) \cdot (u + v) \\ & = u \cdot u + 2u \cdot v + v \cdot v \\ & = \|u\|^2 + \|v\|^2 \\ & \geqslant \|v\|^2. \end{align*} Therefore, using the definitions of $v$ and $\lambda$: $$ \|x\|^2\|y\|^2 \geqslant \|v\|^2\|y\|^2 = \lambda^2\|y\|^4 = (x \cdot y)^2 = |x \cdot y|^2, $$ and the result follows. So the selection of the value $-\lambda$ for $\alpha$ does make some intuitive sense (to me, at least).
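To make the decomposition concrete, here is a minimal numerical sketch in plain Python. The helper names `dot` and `decompose` and the sample vectors are my own choices for illustration; when $y = 0$ the code simply picks $\lambda = 0$, which is one of the arbitrary choices allowed above.

```python
def dot(a, b):
    """Standard inner product on R^m."""
    return sum(p * q for p, q in zip(a, b))


def decompose(x, y):
    """Split x as u + v, with v a multiple of y and u orthogonal to y.

    lambda is chosen so that lambda * ||y||^2 = x . y.
    If y = 0, any lambda works, so 0 is picked arbitrarily.
    """
    yy = dot(y, y)
    lam = dot(x, y) / yy if yy != 0 else 0.0
    v = [lam * c for c in y]
    u = [a - b for a, b in zip(x, v)]
    return u, v, lam


# Sample vectors (an assumption for illustration only).
x = [3.0, 1.0, 2.0]
y = [1.0, 0.0, 1.0]
u, v, lam = decompose(x, y)

print("lambda          =", lam)
print("u . v           =", dot(u, v))            # orthogonality check
print("||x||^2         =", dot(x, x))
print("||u||^2+||v||^2 =", dot(u, u) + dot(v, v))  # Pythagoras
print("||x||^2 ||y||^2 =", dot(x, x) * dot(y, y))
print("(x . y)^2       =", dot(x, y) ** 2)
```

The gap between the last two printed values is $\|u\|^2\|y\|^2$, so equality in Cauchy-Schwarz corresponds to $u = 0$ (or $y = 0$), i.e. to $x$ and $y$ being linearly dependent.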
You could arrive at this value of $\alpha$ less intuitively by "completing the square" in the expression you derived for $\|x + \alpha y\|^2$, after first multiplying by $\|y\|^2$ to avoid a possible division by zero: \begin{align*} \|x + \alpha y\|^2\|y\|^2 & = \|x\|^2\|y\|^2 + 2(x \cdot y)\alpha\|y\|^2 + \alpha^2\|y\|^4 \\ & = (\alpha\|y\|^2 + x \cdot y)^2 + \|x\|^2\|y\|^2 - (x \cdot y)^2 \\ & = \|x\|^2\|y\|^2 - (x \cdot y)^2, \end{align*} if $$\alpha\|y\|^2 + x \cdot y = 0. $$ Since the left-hand side is nonnegative, this gives $\|x\|^2\|y\|^2 \geqslant (x \cdot y)^2$ again. So the proof you quoted can be seen as the proof by resolution into orthogonal components in heavy disguise.
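If it helps, the completed-square identity can be spot-checked numerically. This is only a sketch: the function names `lhs` and `rhs`, the sample vectors, and the sample values of $\alpha$ are assumptions of mine, chosen so that one of them is $\alpha = -\lambda$.

```python
def dot(a, b):
    """Standard inner product on R^m."""
    return sum(p * q for p, q in zip(a, b))


def lhs(x, y, alpha):
    """||x + alpha*y||^2 * ||y||^2, computed directly from the vectors."""
    w = [a + alpha * b for a, b in zip(x, y)]
    return dot(w, w) * dot(y, y)


def rhs(x, y, alpha):
    """The same quantity in completed-square form."""
    xy, yy = dot(x, y), dot(y, y)
    return (alpha * yy + xy) ** 2 + dot(x, x) * yy - xy ** 2


# Sample vectors (an assumption for illustration only).
x = [3.0, 1.0, 2.0]
y = [1.0, 0.0, 1.0]

# alpha = -2.5 is -(x . y) / ||y||^2 for these vectors, i.e. alpha = -lambda.
for alpha in (-4.0, -2.5, 0.0, 1.5):
    print(alpha, lhs(x, y, alpha), rhs(x, y, alpha))
```

Both columns should agree for every $\alpha$, and the smallest value should occur at $\alpha = -\lambda$, where the squared term vanishes and only $\|x\|^2\|y\|^2 - (x \cdot y)^2$ remains.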