I'm reading the existence proof of singular value decomposition.
It considers $f:\mathbb{R}^n\to \mathbb{R}, f(x) = x^TMx$. It talks about the gradient of $f$ and make it equal to a multiple of the gradient of $x^tx$. I suppose that it's because the constraint is the unit sphere, so that's why it made $x^tx = x_1^2 + \cdots x_n^2$, right?
I'm trying to understand this so I took $f$ with a generic matrix $M$
$$f(x) =\begin{bmatrix} x_1 & \cdots & x_n \end{bmatrix}\begin{bmatrix} a_{11} & a_{12} & \dots \\ \vdots & \ddots & \\ a_{n1} & & a_{nn} \end{bmatrix}\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \\ x_1(a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n) + \\x_2 (a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n) + \\ \cdots + \\x_n(a_{n1}x_1+a_{n2}x_2 + \cdots + a_{nn}x_n)$$
Taking the partials to construct the gradient vector, I can see that I'll end up with:
$$\begin{bmatrix} 2a_{11}x_1 + a_{21} + \cdots a_{n1} \\ a_{12} + 2a_2x_2 + \cdots + a_{n2} \\ \vdots \\ a_{1n} + a_{2n}\cdots + 2a_{nn}x_n\\ \end{bmatrix} $$
Now, I need to equal this with $\lambda$ gradient of $x^tx$: $$\begin{bmatrix} 2x_1 \\ 2x_2 \\ \vdots \\ 2x_n\\ \end{bmatrix}$$
so:
$$\begin{bmatrix} 2a_{11}x_1 + a_{21} + \cdots a_{n1} \\ a_{12} + 2a_2x_2 + \cdots + a_{n2} \\ \vdots \\ a_{1n} + a_{2n}\cdots + 2a_{nn}x_n\\ \end{bmatrix} = \lambda \begin{bmatrix} 2x_1 \\ 2x_2 \\ \vdots \\ 2x_n\\ \end{bmatrix} $$
As an example, the first line becomes:
$2a_{11}x_1 + a_{21} + \cdots a_{n1} = \lambda 2x_1 \implies \lambda 2x_1 -2a_{11}x_1 = a_{21} + \cdots a_{n1}\implies x_1(2\lambda - 2a_{11}) = a_{21} + \cdots a_{n1}$
What should I do now? It says that I should end up with $Mu = \lambda u$
Also, is there a more elegant way of calculating the gradients or it's just all this mess?
You made a mistake computing the gradient of
$$\\ x_1(a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n) + \\x_2 (a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n) + \\ \vdots \\x_n(a_{n1}x_1+a_{n2}x_2 + \cdots + a_{nn}x_n)$$ For example, the partial derivative of this with respect to $x_1$ is $$ \underbrace{2a_{11}x_1+a_{12}x_2+\dots+a_{1n}x_n}_{\text{first row}}+\underbrace{a_{21}x_2+\dots+a_{n1}x_n}_{\text{first term of remaining rows}} $$ which you can recognize as the first entry of $(M+M^\top)x$. Therefore, the Lagrange multiplier equation becomes $$ (M+M^\top)x=2\lambda x $$ If you are further given that $M$ is symmetric, this implies $Mx=\lambda x$.