Here is a problem on least-squares optimization, with its answer, from my machine learning class.
Consider the least mean squares problem: $$ \min_{x\in\mathbb{R}^n}\|Ax - b\|_2^2 $$
Suppose $A \in \mathbb{R}^{m×n}$ is a full rank matrix and $m \ge n$. Find the closed-form solution of the least mean squares problem.
Hint: If $A \in \mathbb{R}^{m×n}$ is a full rank matrix and $m \ge n$, then $A^\top A$ is a positive definite matrix.
Solution: Let us first expand the objective function: \begin{align*} \|Ax - b\|_2^2&=(Ax - b)^\top(Ax - b)\\&=x^\top A^\top Ax - x^\top A^\top b - b^\top Ax + b^\top b\\ &= x^\top A^\top Ax - 2x^\top A^\top b + b^\top b \end{align*} This is a convex function of $x$, so to find the minimum we take the gradient and set it equal to zero: $$ \nabla_x( x^\top A^\top Ax - 2x^\top A^\top b + b^\top b) = 2 A^\top Ax - 2 A^\top b = 0 $$ We know that $A^\top A$ is positive definite and hence invertible. Solving the last equation for $x$, we have $x = (A^\top A)^{-1}A^\top b$.
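As a quick numerical sanity check of the derivation, here is a minimal NumPy sketch (the matrix sizes and random seed are arbitrary choices of mine) comparing the closed-form solution $x = (A^\top A)^{-1}A^\top b$ against NumPy's built-in least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 8, 3                        # arbitrary sizes with m >= n
A = rng.standard_normal((m, n))    # a random A is full column rank with probability 1
b = rng.standard_normal(m)

# Closed-form solution via the normal equations; solving the system is
# preferred over explicitly forming (A^T A)^{-1} for numerical stability.
x_closed = np.linalg.solve(A.T @ A, A.T @ b)

# Reference solution from NumPy's LAPACK-based least-squares routine.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.allclose(x_closed, x_lstsq))  # True
```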
My questions:
- Given a matrix $A \in \mathbb{R}^{m×n}$ with $m \ge n$, shouldn't we say it has full column rank (and not simply full rank)?
- In the expression $\nabla_x x^\top A^\top Ax$, how do we know if the result is $2x^\top A^\top A$ or $2A^\top Ax$?
Here's a reference for the term full rank:
In your case, when they describe $A$ as full rank, they mean full column rank.
You can use either row convention or column convention as long as it is consistent.
If you write $2x^TA^TA$, you have to use $2x^TA^TA - 2b^TA = 0^T$.
If you write $2A^TAx$, you have to use $2A^TAx - 2A^Tb = 0$.
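To see that the two conventions are consistent, here is a small NumPy sketch (sizes and seed are my own arbitrary choices) checking that both forms of the gradient vanish at the closed-form solution, and that one is the transpose of the other:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((10, 4))
b = rng.standard_normal(10)

# Closed-form minimizer x* = (A^T A)^{-1} A^T b
x_star = np.linalg.solve(A.T @ A, A.T @ b)

# Column convention: gradient is a column vector 2 A^T A x - 2 A^T b.
grad_col = 2 * A.T @ A @ x_star - 2 * A.T @ b

# Row convention: gradient is a row vector 2 x^T A^T A - 2 b^T A
# (with 1-D arrays, left-multiplication plays the role of the transpose).
grad_row = 2 * x_star @ (A.T @ A) - 2 * b @ A

print(np.allclose(grad_col, 0), np.allclose(grad_row, 0))  # True True
```

Both gradients are numerically zero at $x^*$, and `grad_row` equals `grad_col` entrywise, which is exactly the "consistent convention" point made above.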