Let $A$ be an $m \times n$ matrix, $b$ an $m \times 1$ matrix, and $x$ an $n \times 1$ matrix. Consider $f: \mathbb{R}^n \to \mathbb{R}$ defined by $$f(x) = \Vert Ax - b \Vert^2.$$ Determine a condition that characterizes the critical points of $f$.
My attempt. Note that $$\Vert Ax - b \Vert^{2} = (Ax - b)^{T}(Ax - b) = (x^{T}A^{T} - b^{T})(Ax - b) = x^{T}A^{T}Ax - x^{T}A^{T}b - b^{T}Ax + b^{T}b.$$ Thus, $$\frac{\partial}{\partial x_{i}}f(x) = 2A^{T}Ax_{i} - A^{T}b - \color{red}{b^{T}A}.$$ But I know that $\nabla f(x) = 2A^{T}Ax - 2A^{T}b$, so "$b^{T}A$ should be $A^{T}b$". How do I correct this?
Assuming that $\nabla f(x) = 2A^{T}Ax - 2A^{T}b = 2A^{T}(Ax - b)$, the critical points of $f$ are the solutions of $Ax - b = 0$. I don't understand what it means to characterize the critical points, but I suppose it's about maxima and minima.
Let $H: \mathbb{R}^n \to \mathbb{R}$ be a quadratic form given by $Hv^{2} = \sum_{i,j} h_{ij}\alpha_{i}\alpha_{j}$, where $v = (\alpha_1, \dots, \alpha_n)$. The quadratic form $H$ is positive when $Hv^{2} > 0$ for all $v \neq 0$ in $\mathbb{R}^n$ and negative when $Hv^{2} < 0$ for all $v \neq 0$ in $\mathbb{R}^n$; we say that $H$ is indefinite when $Hv^{2}$ takes both positive and negative values.
Theorem. Let $f: U \to \mathbb{R}$ be a $C^{2}$ function, $p \in U$ a critical point of $f$, and $H$ the Hessian quadratic form of $f$ at $p$. Then
(i) If $H$ is positive, $p$ is a (non-degenerate) local minimum point.
(ii) If $H$ is negative, $p$ is a (non-degenerate) local maximum point.
(iii) If $H$ is indefinite, $p$ is neither a local maximum nor a local minimum point.
The Hessian of $f$ is $\nabla^{2} f(x) = 2A^{T}A = 2\Vert A \Vert^2$. The Hessian is independent of the point, and $2\Vert A \Vert^2$ is non-negative. So is the quadratic form always non-negative? This is my first exercise on Hessians and critical points, so I don't have much practice.
I appreciate any help!
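The partial derivative should be $$\frac{\partial}{\partial x_i} f(x) = 2 (A^\top A x)_i - (A^\top b)_i - (A^\top b)_i = 2 (A^\top A x)_i - 2 (A^\top b)_i,$$ where $(\cdot)_i$ denotes the $i$th component of a vector. The two middle terms agree because $x^\top A^\top b$ and $b^\top A x$ are the same scalar (each is the transpose of the other), so both contribute $(A^\top b)_i$ to the $i$th partial derivative.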
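If the matrix notation is what causes the confusion, it may help to redo the computation entry by entry. The sketch below writes $a_{kj}$ for the entries of $A$ and $b_k$ for the entries of $b$ (this indexing is my own notation, not from the question):

$$f(x) = \sum_{k=1}^{m} \Big( \sum_{j=1}^{n} a_{kj} x_j - b_k \Big)^{2},$$

$$\frac{\partial f}{\partial x_i}(x) = \sum_{k=1}^{m} 2 \Big( \sum_{j=1}^{n} a_{kj} x_j - b_k \Big) a_{ki} = 2 \sum_{k=1}^{m} a_{ki} \, (Ax - b)_k = 2 \big( A^\top (Ax - b) \big)_i .$$

Stacking these $n$ partial derivatives into a column gives $\nabla f(x) = 2A^\top(Ax - b)$, with no stray row vector $b^\top A$.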
"Determine a condition that characterizes the critical points" is simply asking for the condition $A^\top (Ax - b) = 0$. That's it! :) [That is, if someone asked you "how can you tell me if this $x$ is a critical point?" you can check whether $A^\top (Ax - b) = 0$ holds or not.]
More importantly, the above condition is not equivalent to $Ax-b=0$. If $Ax-b=0$ it is then true that $A^\top (Ax-b) = 0$, but the converse may not hold, in particular if $A^\top$ has a nontrivial nullspace.
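For a concrete instance (the matrices here are only an illustration, not taken from the question), take $m = 2$, $n = 1$,

$$A = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad b = \begin{pmatrix} 1 \\ 0 \end{pmatrix}.$$

Then $Ax - b = \begin{pmatrix} x - 1 \\ x \end{pmatrix}$ and

$$A^\top (Ax - b) = (x - 1) + x = 2x - 1,$$

so the only critical point is $x = 1/2$, where

$$Ax - b = \begin{pmatrix} -1/2 \\ 1/2 \end{pmatrix} \neq 0.$$

In this example $Ax = b$ has no solution at all, yet $f$ still has a critical point.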
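I would not write $A^\top A = \|A\|^2$, since $A$ is a matrix, not a vector. But indeed the Hessian is $H = 2 A^\top A$, which is a positive semi-definite matrix, i.e. it satisfies $v^\top H v = 2\|Av\|^2 \ge 0$ for any vector $v$. Semi-definiteness alone is not one of the cases covered by the theorem you quote, but here $f$ is quadratic, so $f(x + h) = f(x) + \nabla f(x)^\top h + \|Ah\|^2$ holds exactly. At a critical point this gives $f(x + h) = f(x) + \|Ah\|^2 \ge f(x)$ for every $h$. Thus any critical point is a minimum, and in fact a global one.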
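If you want to sanity-check all of this numerically, here is a minimal pure-Python sketch. The data `A` and `b` and the helper names (`residual`, `grad`, `grad_fd`, `solve_normal_equations`) are made up for illustration, and the solver uses Cramer's rule, which only works here because $n = 2$ and $A^\top A$ happens to be invertible.

```python
# Hypothetical example data (not from the question): A is 3x2, b has 3 entries.
A = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
b = [1.0, 2.0, 0.0]

def residual(x):
    """Return the vector Ax - b as a list."""
    return [sum(a * xj for a, xj in zip(row, x)) - bk for row, bk in zip(A, b)]

def f(x):
    """f(x) = ||Ax - b||^2."""
    return sum(r * r for r in residual(x))

def grad(x):
    """Gradient from the closed formula 2 A^T (Ax - b)."""
    r = residual(x)
    return [2.0 * sum(A[k][i] * r[k] for k in range(len(A)))
            for i in range(len(x))]

def grad_fd(x, eps=1e-6):
    """Central finite-difference approximation of the gradient."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        g.append((f(xp) - f(xm)) / (2 * eps))
    return g

def solve_normal_equations():
    """Solve A^T A x = A^T b by Cramer's rule (assumes n = 2, A^T A invertible)."""
    rows = range(len(A))
    M = [[sum(A[k][i] * A[k][j] for k in rows) for j in range(2)] for i in range(2)]
    c = [sum(A[k][i] * b[k] for k in rows) for i in range(2)]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(c[0] * M[1][1] - M[0][1] * c[1]) / det,
            (M[0][0] * c[1] - c[0] * M[1][0]) / det]

x_star = solve_normal_equations()
print("critical point:", x_star)
print("gradient there:", grad(x_star))
print("residual there:", residual(x_star))
```

For this data the critical point is $x = (0, 1)$: the gradient vanishes there while the residual $Ax - b = (-1, -1, 1)$ does not, which matches the remark that $A^\top(Ax - b) = 0$ is weaker than $Ax = b$. Comparing `grad` with `grad_fd` at any other point is a quick way to catch a sign or transpose mistake in the formula.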