I found the least-squares solution of Ax=b, but I don't understand how to use it


This is probably a very obvious question, but I don't understand it.

I know how to use the least squares method, which I did, and verified using an online calculator. My system is

[image of the system of equations $Ax=b$]

(the "e" in the image means "and")

I solved it and got $x = [1, -1]^T$.

I don't understand how this solves the problem. I know that if I multiply $A$ by $x$ I get $b$. What I don't understand is this: I know this method is applied when you have a bunch of points and want to find a line or function that passes through them approximately. How do I get a line from $x$? I know $x$ has to do with the coefficients of a polynomial, and that $A$ and $b$ hold the $x$- and $y$-coordinates of the points, so why does $A$ have two columns? It's probably a really silly question, but I'm very confused.

Thanks.


There are 2 answers below.

Best answer

The least squares problem is $y=X\beta+\epsilon$, and the solution is given by $\beta=(X^TX)^{-1}X^Ty$.

Here, we want to use the least squares technique to find the solution of $b=Ax$. However, least squares is not actually needed, because the system is consistent, i.e. it has the exact solution $x=(1,-1)$. You can still pose it as least squares: $\operatorname{arg\,min}\limits_x\|b-Ax\|^2=(1,-1)$, and the error is $\|b-A(1,-1)\|^2=0$ because $(1,-1)$ solves the system exactly.

It becomes more interesting if the equation is inconsistent, say $b=\left(\begin{matrix}2\\-\sqrt 2\\-3\end{matrix}\right)$. Then the least squares solution is given by $x=(A^TA)^{-1}A^Tb=(\frac 3 2,-1)$. This minimizes the Euclidean distance between $b$ and $Ax$ in $\mathbb R^3$. Here, the difference is not $0$; in fact it is $\|b-A(\frac 3 2,-1)\|^2=\|(-1/2,0,-1/2)\|^2=\frac 1 2$.
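Both cases are easy to check numerically. A minimal NumPy sketch (mine, not part of the original answer), with $A$ read off from the two columns used throughout:

```python
import numpy as np

# The matrix from the question: columns (1, 0, -1) and (-1, sqrt(2), 1)
A = np.array([[ 1.0, -1.0],
              [ 0.0, np.sqrt(2)],
              [-1.0,  1.0]])

# Consistent case: b = A @ (1, -1), so least squares recovers (1, -1) exactly
b_consistent = A @ np.array([1.0, -1.0])
x, *_ = np.linalg.lstsq(A, b_consistent, rcond=None)
print(x)                            # approximately [1, -1], residual 0

# Inconsistent case from the answer: b = (2, -sqrt(2), -3)
b = np.array([2.0, -np.sqrt(2), -3.0])
x, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x)                            # approximately [1.5, -1]
print(np.sum((b - A @ x) ** 2))     # squared error, approximately 0.5
```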

There are two ways to think about this:

Way 1

Geometrically, the column space of $A$ is the plane $\left(\begin{matrix}1\\0\\-1\end{matrix}\right)x+\left(\begin{matrix}-1\\\sqrt 2\\1\end{matrix}\right)y$ in $\mathbb R^3$. You seek a point $(x, y)$ such that the perpendicular (Euclidean) distance between $b$ and $Ax$ is minimal. It's attained at $(3/2, -1)$, and the squared distance is $1/2$.
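This picture can also be verified numerically: at the least squares solution the residual $b-Ax$ is orthogonal to both columns of $A$, which is exactly what "perpendicular distance to the column space" means. A short sketch (mine, solving the normal equations $A^TAx=A^Tb$ directly):

```python
import numpy as np

A = np.array([[ 1.0, -1.0],
              [ 0.0, np.sqrt(2)],
              [-1.0,  1.0]])
b = np.array([2.0, -np.sqrt(2), -3.0])

# Normal equations: A^T A x = A^T b
x = np.linalg.solve(A.T @ A, A.T @ b)   # approximately [1.5, -1]

residual = b - A @ x                     # approximately (-1/2, 0, -1/2)
# The residual is perpendicular to both columns of A, i.e. to the plane:
print(A.T @ residual)                    # approximately [0, 0]
```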

Way 2

You've got three planes in $\mathbb R^3$: $z_1=x-y$, $z_2=\sqrt 2\, y$, $z_3=-x+y$. It looks like this, with the origin at the center:

[3D plot of the three planes]

The least squares solution is found at the $(x,y)$ coordinate (i.e. moving around in the $x$-$y$ plane) where the squared difference between $x-y$ and the plane $z=2$, plus the squared difference between $\sqrt 2\, y$ and the plane $z=-\sqrt 2$, plus the squared difference between $-x+y$ and the plane $z=-3$, is minimized. Imagine a "pin" perpendicular to the $xy$ plane; the position of the pin where the sum of these squared vertical distances is smallest is the least squares solution.

Conversely, if the system were consistent (as in your original problem, in which case you wouldn't need least squares), the pin would land precisely where $x-y=2$, $\sqrt 2\, y=-\sqrt 2$, and $-x+y=-2$, i.e. at $(1,-1)$.

Second answer

You don't have a line here, but a plane.

The data points are $$\begin{align} &(1,-1,2)\\ &(0,\sqrt2,-\sqrt2)\\ &(-1,1,-2) \end{align}$$ which we'll consider as being of the form $(x,y,z)$, and we want to fit an equation of the form $z=ax+by+\varepsilon$, where $\varepsilon$ represents some inaccuracy in measurement or other "noise". We stack these equations into matrix form: $$\begin{bmatrix}x_1&y_1\\x_2&y_2\\x_3&y_3\end{bmatrix} \begin{bmatrix}a\\b\end{bmatrix}+\begin{bmatrix}\varepsilon_1\\\varepsilon_2\\\varepsilon_3\end{bmatrix}=\begin{bmatrix}z_1\\z_2\\z_3\end{bmatrix} $$ This is the same as your equation, except that the left- and right-hand sides are interchanged, and $\varepsilon$ is suppressed. We just want to minimize the sum of the squares of the elements of $$ \begin{bmatrix}z_1\\z_2\\z_3\end{bmatrix}-\begin{bmatrix}x_1&y_1\\x_2&y_2\\x_3&y_3\end{bmatrix} \begin{bmatrix}a\\b\end{bmatrix},$$ so we really don't need to write $\varepsilon$ explicitly.
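To see this setup concretely, here is a short NumPy sketch (mine, not part of the answer) that stacks the three points into the matrix equation and solves for $(a,b)$:

```python
import numpy as np

# The three data points (x, y, z); the model is z = a*x + b*y
pts = np.array([[ 1.0, -1.0,         2.0],
                [ 0.0, np.sqrt(2), -np.sqrt(2)],
                [-1.0,  1.0,        -2.0]])

X = pts[:, :2]          # design matrix: rows (x_i, y_i)
z = pts[:, 2]           # targets z_i

coef, *_ = np.linalg.lstsq(X, z, rcond=None)
print(coef)             # approximately [1, -1], i.e. the plane z = x - y
```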

An equation of the form $z=ax+by$ represents a plane through the origin. Of course, we can always find a plane passing through $3$ points, but it won't necessarily pass through the origin. We may have a "model" in mind that says that, theoretically, the points will lie on a plane through the origin, and we take the point of view that the only reason they don't is unavoidable inaccuracy in measurement, or perhaps blunders. (If we had $100$ data points, and $99$ of them lay close to a plane through the origin, but the hundredth was far away, we might well believe that the deviation was due to an error in making the measurement, or in recording the result.)

If we had more data points, we would have more rows in the matrices, but the same number of columns, and we wouldn't expect to be able to pass a plane through all the points. Then minimizing the sum of the squares of the individual errors, or some other expression for the error, becomes necessary. We might have more than two independent variables, in which case the matrix would have more columns and there would be more unknown coefficients.

"Linear" doesn't necessarily mean "line-like" in linear algebra. Also, the plane doesn't have to pass through the origin. Linear regression would handle a model of the form $$z=ax+by+c.$$
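As an illustration (my own; it reuses the inconsistent right-hand side $(2,-\sqrt 2,-3)$ from the first answer), an intercept $c$ is handled by appending a column of ones to the design matrix:

```python
import numpy as np

# Fit z = a*x + b*y + c by adding a constant column of ones.
X = np.array([[ 1.0, -1.0],
              [ 0.0, np.sqrt(2)],
              [-1.0,  1.0]])
z = np.array([2.0, -np.sqrt(2), -3.0])

X1 = np.hstack([X, np.ones((3, 1))])    # columns: x, y, 1
coef, *_ = np.linalg.lstsq(X1, z, rcond=None)
a, b, c = coef

# With three points and three free parameters, this plane fits exactly,
# even though the through-the-origin model z = a*x + b*y could not:
print(np.allclose(X1 @ coef, z))        # True
```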

For more information, look at Wikipedia.