Proof of Equivalence of Least Squares Problem

46 Views Asked by At

Let $b \in \mathbb R^n $ and $A \in \mathbb R^{m\times n}$ with $m \geqslant n$ and $\operatorname{rank}(A)\le n$. Prove that the following statements are equivalent;

  1. $\hat{x} = \operatorname{argmin}_{x \in \mathbb R^n} \|Ax-b\|^2$

  2. $A^t(b-A \hat{x})=0$.


I know that; $$ \|Ax-b\|^2 = (Ax-b)^t(Ax-b) = x^tA^tAx-2x^tA^tb+b^2. $$ Which can simplify to; $$ A^t(Ax-b) =0 $$

However I am still really confused if this is right and provides a full detailed proof of the equivalent please!

2

There are 2 best solutions below

0
On

Hint: One approach is to calculate the derivative of $\| A x - b \|^2$, set it to zero and solve for $\hat{x}$.

0
On

\begin{align} & \|Ax-b\|^2 \\[8pt] = {} & \|(Ax-Ay) + (Ay - b)\|^2 \\[8pt] = {} & \|Ax-Ay\|^2 + \|Ay - b\|^2 + 2(Ax-Ay)^t (Ay - b) \\[8pt] = {} & \|Ax-Ay\|^2 + \|Ay - b\|^2 + 2(x-y)^t A^t (Ay - b) \end{align} If the vector $y$ is one for which $A^t(Ay-b)=0,$ then the last line above is $$ \|Ax-Ay\|^2 + \|Ay-b\|^2. $$ If $x=y$ then the first term above vanishes and thereby makes the sum as small as possible. If the columns of $A$ are linearly independent, then the only such value of $x$ is $y;$ otherwise $y$ is one value of the argmin and there are others.

In other words, any value of $y$ that makes $A^t(Ay-b)=0$ is a value of $\operatorname{argmin}_x \|Ax-b\|^2.$