Least Squares Solution Confusion


Say I have an overdetermined system $A\vec x=\vec b$. I can use the normal equations: $A^TA\vec x=A^T\vec b$.

If I solve for $\vec x$ I will get a "solution" with an error. It says in my notes that this is the 'least squares solution', i.e. it is the choice of $\vec x$ which minimises the error. But this is the only choice of $\vec x$ which solves the normal equations, so how can it be the least squares solution when there is only one solution?
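For concreteness, here is a small numerical sketch (using NumPy; the matrix and right-hand side are made-up illustration data) showing that solving the normal equations agrees with a direct least-squares solve:

```python
import numpy as np

# Overdetermined system: 3 equations, 2 unknowns (example data).
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

# Normal equations: A^T A x = A^T b.
x = np.linalg.solve(A.T @ A, A.T @ b)

# np.linalg.lstsq minimizes ||Ax - b||_2 directly and gives the same answer.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x, x_lstsq)
```

Here $A$ has full column rank, so the normal equations have exactly one solution, which is the situation described in the question.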


There are 3 answers below.


You're confusing two things: the set of solutions to the normal equations, and the set of vectors over which you are searching for the least squared error. The second set is just the set of all vectors $x$ in whatever vector space you're working on. The squared error varies as you choose different vectors from the second set. The goal of the least squares problem is to find the vector(s) that minimize the squared error.

The claim is that if you get a solution to the normal equations, then it also solves the least-squares problem, in the following sense: if you choose any other vector from the second set besides the solution to the normal equations, you will get a squared error at least as large as the squared error you get using the solution to the normal equations.
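This claim is easy to check numerically. A sketch (NumPy, random made-up data): perturb the normal-equation solution in random directions and verify that the squared error never decreases.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 3))  # random full-column-rank matrix
b = rng.standard_normal(10)

# Solution of the normal equations A^T A x = A^T b.
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
err_hat = np.sum((A @ x_hat - b) ** 2)

# Any other vector gives a squared error at least as large.
for _ in range(1000):
    x_other = x_hat + rng.standard_normal(3)
    err_other = np.sum((A @ x_other - b) ** 2)
    assert err_other >= err_hat
```

The underlying reason is that the gradient of $\lVert Ax-b\rVert^2$ is $2A^T(Ax-b)$, which vanishes exactly at solutions of the normal equations.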


For a given $b$, the equation $A^{T}Ax=A^{T}b$ always has a solution $x$, but this solution may not be unique. In fact, $x$ is unique iff $\mathcal{N}(A)=\{0\}$. If both $x_{1}$ and $x_{2}$ are solutions, then $Ax_{1}=Ax_{2}$ and $$ |Ax_{1}-b|^{2}=|Ax_{2}-b|^{2}, $$ which means that both $x_{1}$ and $x_{2}$ minimize the square distance $d(x)=|Ax-b|^{2}$.

You said that the original system $Ax=b$ was over-determined; I take that to mean that $b \notin\mathcal{R}(A)$, which is another way of saying that the "error" $|Ax-b|$ is going to be non-zero. It seems to me that nothing you stated in your problem precludes the possibility that $\mathcal{N}(A)\ne \{0\}$, which is a separate issue. If $\mathcal{N}(A)\ne \{0\}$, then $A^{T}Ax=A^{T}b$ will not have a unique solution, even though it has a solution.
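A small numerical illustration of the non-unique case (NumPy; the matrix here is made up so that $\mathcal{N}(A)\ne\{0\}$): two distinct vectors both satisfy the normal equations and give the same residual.

```python
import numpy as np

# A has a nontrivial null space: its two columns are identical,
# so N(A) is spanned by (1, -1).
A = np.array([[1.0, 1.0],
              [2.0, 2.0],
              [3.0, 3.0]])
b = np.array([1.0, 0.0, 1.0])

# One particular solution of A^T A x = A^T b (via the pseudoinverse),
# and a second one shifted along the null space.
x1 = np.linalg.pinv(A) @ b
x2 = x1 + np.array([1.0, -1.0])

print(np.allclose(A.T @ A @ x1, A.T @ b))  # True
print(np.allclose(A.T @ A @ x2, A.T @ b))  # True

# Same residual: A x1 = A x2 because x2 - x1 is in N(A).
print(np.isclose(np.sum((A @ x1 - b) ** 2),
                 np.sum((A @ x2 - b) ** 2)))  # True
```

Since $x_2-x_1\in\mathcal{N}(A)$, we have $Ax_1=Ax_2$, so both vectors minimize $d(x)=|Ax-b|^2$, exactly as argued above.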


The answer by @TrialAndError gives a nice discussion; here it is amplified with a toy example.

$$ \begin{align} A x &= b \\ \left[ \begin{array}{cc} 1 & 0 \\ 0 & 0 \end{array} \right] \left[ \begin{array}{c} x_{1} \\ x_{2} \end{array} \right] &= \left[ \begin{array}{c} b_{1} \\ b_{2} \end{array} \right] \end{align} $$

The least squares solution is $$ x_{LS} = \left[ \begin{array}{c} b_{1} \\ 0 \end{array} \right] + \alpha \left[ \begin{array}{c} 0 \\ 1 \end{array} \right] $$ where $\alpha$ is an arbitrary constant.

The sum of the squares of the residuals $$ r^{2}(x_{LS}) = \lVert A x_{LS} - b \rVert_{2}^{2} = \left| b_{2} \right|^{2}, $$ is a constant independent of $\alpha$ (since $Ax_{LS} = (b_{1}, 0)^{T}$, the residual is $(0, -b_{2})^{T}$). Therefore the set of least squares solutions is an affine space, a vertical line at $x_{1}=b_{1}$.

Which value of $\alpha$ describes the solution of minimum norm? By the Pythagorean theorem, the norm of the least squares minimizer $$ \lVert x_{LS} \rVert_{2}^{2} = \bigg\lVert \left[ \begin{array}{c} b_{1} \\ 0 \end{array} \right] + \alpha \left[ \begin{array}{c} 0 \\ 1 \end{array} \right] \bigg\rVert_{2}^{2} = \bigg\lVert \left[ \begin{array}{c} b_{1} \\ 0 \end{array} \right] \bigg\rVert_{2}^{2} + \bigg\lVert \alpha \left[ \begin{array}{c} 0 \\ 1 \end{array} \right] \bigg\rVert_{2}^{2} = \left| b_{1} \right|^{2} + \left| \alpha \right|^{2}. $$ How do we adjust $\alpha$ to find the particular solution, the solution of minimum norm? Set $\alpha=0$.
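The toy example can be verified numerically (NumPy; concrete values of $b_1$, $b_2$ are made up): the residual norm equals $|b_{2}|$ for every $\alpha$, and the pseudoinverse picks out the minimum-norm member, $\alpha = 0$.

```python
import numpy as np

b1, b2 = 3.0, 4.0
A = np.array([[1.0, 0.0],
              [0.0, 0.0]])
b = np.array([b1, b2])

# Every x_LS = (b1, alpha) gives the same residual norm, |b2| = 4.0.
for alpha in (-2.0, 0.0, 5.0):
    x = np.array([b1, alpha])
    print(np.linalg.norm(A @ x - b))  # 4.0 each time

# The pseudoinverse selects the minimum-norm solution, alpha = 0.
x_min = np.linalg.pinv(A) @ b
print(x_min)  # [3. 0.]
```

This is a general fact: `np.linalg.pinv(A) @ b` (equivalently `np.linalg.lstsq`) returns the minimum-norm element of the affine space of least squares solutions.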