(Sorry I had to post the images as links. I don't have enough cred to post pictures directly yet)
I'm trying to understand what the least squares solution to an overdetermined system means geometrically in the case of the following system:
$$ y = -2x-1\\ y = 3x -2\\ y = x+1\\ $$
rewritten in matrix form:
$$ \overbrace{\begin{bmatrix} 2 & 1\\ -3 & 1\\ -1 & 1 \end{bmatrix}}^A \underbrace{\begin{bmatrix} x\\ y \end{bmatrix}}_x = \overbrace{\begin{bmatrix} -1\\ -2\\ 1 \end{bmatrix}}^b $$
Using A\b in MATLAB, you get the solution $\begin{bmatrix}0.1316 & -0.5789\end{bmatrix}^T$. I know that MATLAB returns the lowest norm solution of a least squares problem. I have plotted the system here and the green dot in the middle is this least squares solution.
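For reference, here is a minimal sketch that reproduces this number and cross-checks it against the normal equations $A^TAx = A^Tb$. The variable names (`x_ls`, `x_ne`, `r`, `ssr`) are my own, chosen for illustration. Since $A$ has full column rank here, the least squares minimizer is unique, so the question of which minimizer has the lowest norm does not arise for this system.

```matlab
% System from the question, written as A*[x; y] = b
A = [ 2 1;
     -3 1;
     -1 1];
b = [-1; -2; 1];

x_ls = A \ b;                  % least squares solution via backslash
x_ne = (A.' * A) \ (A.' * b);  % same point from the normal equations A'A x = A'b

r   = A * x_ls - b;            % residual vector, one entry per equation
ssr = sum(r.^2);               % the quantity least squares minimizes, ||A*x - b||^2

fprintf('x_ls = (%.4f, %.4f)\n', x_ls(1), x_ls(2));
fprintf('x_ne = (%.4f, %.4f)\n', x_ne(1), x_ne(2));
fprintf('rank(A) = %d, sum of squared residuals = %.4f\n', rank(A), ssr);
```

Both `x_ls` and `x_ne` should agree with the $\begin{bmatrix}0.1316 & -0.5789\end{bmatrix}^T$ reported above.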
Now, correct me if I'm wrong, but (in this 2D case) the least squares solution minimizes the "distance" from the solution to each line. I can geometrically calculate the distance of a point $(x_0,y_0)$ from a line $ax + by + c = 0$ as follows:
$$\mathrm{distance} = \frac{|ax_0 + by_0 + c|}{\sqrt{a^2 + b^2}}$$
and doing that for each line produces the following sum of squared distances function
dfun = @(x,y) ((y+2*x+1).^2)/(1^2 + 2^2) + ((y+3*x+2).^2)/(1^2 + 3^2) + ((y+x-1).^2)/(1^2 + 1^2);
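As a quick worked check of the first term, take the first line $y = -2x - 1$, i.e. $2x + y + 1 = 0$ (so $a = 2$, $b = 1$, $c = 1$), and the green point $(x_0, y_0) = (0.1316, -0.5789)$:

$$\mathrm{distance} = \frac{|2(0.1316) + (-0.5789) + 1|}{\sqrt{2^2 + 1^2}} = \frac{0.6843}{\sqrt{5}} \approx 0.3060$$

Squaring this gives roughly $0.0937$, which is the value of the first term of the function at that point.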
If I generate a surface using this function over a range of $x$ and $y$ values, I get this surface with this top-down view (looking down the z-axis on the xy plane). You can download the MATLAB .fig file here if you want to zoom and pan (requires MATLAB, link expires in 30 days).
Here is an image showing the least squares solution together with its norm and its sum of squared distances. As can be seen, the norm is $0.5937$ and the sum of squared distances is $1.4704$. But clearly, there is a contour with a lower sum of squared distances in the image, as shown here for $(x_0, y_0) = (-0.3,0)$, where the norm and the sum of squared distances are both smaller. Shouldn't this (or another point) be a better least squares solution? Do I have the wrong intuition about what least squares is doing here?
After reading Claude Leibovici's answer above, I realized that my dfun had typos in it: I messed up a couple of minus signs in the function. Additionally, the norm typically used for least squares calculations (also used by MATLAB) is the $l^2$-norm (a.k.a. the Euclidean norm):
$$ x = \begin{bmatrix}x_1\\ x_2\\ \vdots\\ x_n\end{bmatrix}, ||x|| = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2}$$
Note that there is no scaling of the "distance" like there is in my dfun. Therefore, the correct function should be the plain sum of squared residuals, with the signs fixed and without the division by $a^2 + b^2$.

After fixing these mistakes, here is the surface that is generated, with this top-down view. As can be seen here, the solution of $\begin{bmatrix}0.1316 & -0.5789\end{bmatrix}^T$ is, in fact, correct, which confirms the intuition I was trying to check.
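A minimal sketch of the corrected function follows. It is reconstructed from the description above (signs fixed, scaling removed), so treat it as an assumption rather than the original code. The grid range and spacing are arbitrary choices of mine. Each term is the squared residual of one equation, so the sum equals $\|Ax - b\|^2$.

```matlab
% Assumed reconstruction of the corrected function: unscaled squared
% residuals of y = -2x-1, y = 3x-2, y = x+1 (no division by a^2 + b^2).
dfun = @(x,y) (y + 2*x + 1).^2 + (y - 3*x + 2).^2 + (y - x - 1).^2;

A = [2 1; -3 1; -1 1];
b = [-1; -2; 1];
x_ls = A \ b;                       % least squares solution from backslash

% Evaluate the surface on a grid and locate its smallest value
[X, Y] = meshgrid(-2:0.01:2, -2:0.01:2);
Z = dfun(X, Y);
[zmin, idx] = min(Z(:));

fprintf('grid minimum at (%.2f, %.2f), value %.4f\n', X(idx), Y(idx), zmin);
fprintf('A\\b solution at (%.4f, %.4f), value %.4f\n', ...
        x_ls(1), x_ls(2), dfun(x_ls(1), x_ls(2)));
fprintf('value at (-0.3, 0): %.4f\n', dfun(-0.3, 0));
```

With this function, the grid minimum should land next to the `A\b` point, and $(-0.3, 0)$ should no longer look like a better candidate.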