Linear least squares in $\mathbb{R}^{3}$ with three data points.


Given data points in the form $(x,y,f(x,y)) = (1,1,7),(1,3,0),(1,-1,8)$, find the least squares solution $\hat x$ to the system of equations $Ax=b$.

Is there enough information to use least squares? The given solution constructed $A$ from the $x$ and $y$ coordinates and $b$ from the values $f(x,y)$. I do not understand this problem. How can you solve for $\hat x$ when you do not know anything about the function? Isn't it presumptuous to assume a linear relationship?


There are 2 best solutions below

Solution 1:
  1. Yes, you have enough information to construct the projection matrix $H$, $$ H=X(X'X)^{-1}X', \qquad X=\begin{pmatrix} 1 & 1\\ 1 & 3\\ 1 & -1 \end{pmatrix}. $$ Then $\hat{y}=Hy$ with $y=(7,0,8)'$ gives the fitted values $\hat{y}=(5,1,9)'$; the least squares coefficient vector itself is $\hat{x}=(X'X)^{-1}X'y=(7,-2)'$. (A numerical sketch follows this list.)

  2. You do not necessarily assume a linear relationship between $y$ and $x$; you may view the least squares solution as a linear (affine) approximation of $f(x,y)$ based on the data. So your assumption is really of a more technical nature, namely the estimability of that linear approximation of $y$ from $x$.
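
As a numerical sketch of the first point (numpy and the variable names are my additions, following the answer's notation):

```python
import numpy as np

# Design matrix X and observations y, one row per data point.
X = np.array([[1.0,  1.0],
              [1.0,  3.0],
              [1.0, -1.0]])
y = np.array([7.0, 0.0, 8.0])

# Projection ("hat") matrix H = X (X'X)^{-1} X'.
H = X @ np.linalg.inv(X.T @ X) @ X.T

y_hat = H @ y                              # fitted values
beta  = np.linalg.solve(X.T @ X, X.T @ y)  # least squares coefficients

print(y_hat)  # [5. 1. 9.]
print(beta)   # [ 7. -2.]
```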

Solution 2:

We are given a sequence of measurements $\left\{ x_{k}, y_{k}, f_{k} \right\}_{k=1}^{3}$. With three data points we can hope to fit three parameters, using a trial function of the form $$ f(x,y) = c_{0}g_{0}(x,y) + c_{1}g_{1}(x,y) + c_{2}g_{2}(x,y), $$ where the basis functions $g_{j}$ are linearly independent.

You asked about the linear case where $$ f(x,y) = c_{0} + c_{1}x + c_{2}y. $$ That problem looks like this:
$$ \begin{align} \mathbf{A} c &= F \\[5pt] \left[ \begin{array}{ccc} 1 & x_{1} & y_{1} \\ 1 & x_{2} & y_{2} \\ 1 & x_{3} & y_{3} \end{array} \right] \left[ \begin{array}{c} c_{0} \\ c_{1} \\ c_{2} \end{array} \right] &= \left[ \begin{array}{c} f_{1} \\ f_{2} \\ f_{3} \end{array} \right] \\[5pt] \left[ \begin{array}{ccr} 1 & 1 & 1 \\ 1 & 1 & 3 \\ 1 & 1 & -1 \end{array} \right] \left[ \begin{array}{c} c_{0} \\ c_{1} \\ c_{2} \end{array} \right] &= \left[ \begin{array}{c} 7 \\ 0 \\ 8 \end{array} \right] \end{align} $$
There is no solution vector $c$ which satisfies this equation, so instead of demanding that $\mathbf{A}c-F = 0$, we ask that $\mathbf{A}c-F$ be as small as possible. To measure length, we must select a norm, and the choice here is the $2$-norm of least squares.
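
A quick numerical check of this setup (a numpy sketch of mine, not part of the original answer): the rank comparison confirms that $F$ lies outside the column space of $\mathbf{A}$, so the system really is inconsistent.

```python
import numpy as np

A = np.array([[1.0, 1.0,  1.0],
              [1.0, 1.0,  3.0],
              [1.0, 1.0, -1.0]])
F = np.array([7.0, 0.0, 8.0])

# rank(A) = 2 while rank([A | F]) = 3: F is not in the column space,
# so Ac = F has no exact solution.
print(np.linalg.matrix_rank(A))                        # 2
print(np.linalg.matrix_rank(np.column_stack((A, F))))  # 3

# lstsq minimizes ||Ac - F||_2 and returns the minimum-norm minimizer.
c, *_ = np.linalg.lstsq(A, F, rcond=None)
print(c)  # [ 3.5  3.5 -2. ]
```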

The least squares solution set is $$ c_{LS} = \mathbf{A}^{\dagger}F + \left(\mathbf{I}_{3} - \mathbf{A}^{\dagger}\mathbf{A} \right) w, \quad w\in\mathbb{C}^{3}, $$ where $\mathbf{A}^{\dagger}$ is the Moore-Penrose pseudoinverse and the second term sweeps out the null space of $\mathbf{A}$.

There is a complication with this model: the first two column vectors of $\mathbf{A}$ are identical, so the matrix has rank $\rho = 2$. Because $\mathbf{A}^{\mathrm{T}}\mathbf{A}$ is then singular, the normal equations no longer determine a unique solution and cannot be solved by inversion.

Using the SVD, $$ \mathbf{A}^{\dagger} = \frac{1}{24} \left[ \begin{array}{ccr} 4 & 1 & 7 \\ 4 & 1 & 7 \\ 0 & 6 & -6 \end{array} \right], $$ and the full solution is $$ c_{LS} = \frac{1}{2} \left[ \begin{array}{r} 7 \\ 7 \\ -4 \end{array} \right] + \alpha \left[ \begin{array}{r} -1 \\ 1 \\ 0 \end{array} \right], \qquad \alpha \in \mathbb{C}. $$
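
The pseudoinverse can be verified numerically. The sketch below (my own, assuming numpy; the tolerance is an assumption) builds $\mathbf{A}^{\dagger}$ from the SVD by inverting only the nonzero singular values, then compares it with the closed form above.

```python
import numpy as np

A = np.array([[1.0, 1.0,  1.0],
              [1.0, 1.0,  3.0],
              [1.0, 1.0, -1.0]])

# Pseudoinverse from the SVD: invert singular values above a tolerance,
# zero out the rest (A has rank 2, so one singular value vanishes).
U, s, Vt = np.linalg.svd(A)
tol = 1e-12
s_inv = np.where(s > tol, 1.0 / np.maximum(s, tol), 0.0)
A_dag = Vt.T @ np.diag(s_inv) @ U.T

print(np.allclose(A_dag, np.linalg.pinv(A)))  # True
print(np.round(24 * A_dag))  # approx. [[4, 1, 7], [4, 1, 7], [0, 6, -6]]

# The homogeneous direction: the identical columns give A @ (-1, 1, 0) = 0.
print(A @ np.array([-1.0, 1.0, 0.0]))  # [0. 0. 0.]
```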

The least squares fit is the best fit, but it may not be a good fit. That is, given the trial function, the method will find the best parameters. But the trial function may be bad.

The residual error vector is $$ r(c_{LS}) = \mathbf{A}c_{LS} - F = \left[ \begin{array}{r} -2 \\ 1 \\ 1 \end{array} \right], $$ independent of $\alpha$. The total error is $r\cdot r = 6$, consistent with the fitted values $(5,1,9)'$ found in the first solution.
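
As a final sanity check (again a numpy sketch of mine): the residual does not depend on $\alpha$, since the null space direction contributes nothing to $\mathbf{A}c$.

```python
import numpy as np

A = np.array([[1.0, 1.0,  1.0],
              [1.0, 1.0,  3.0],
              [1.0, 1.0, -1.0]])
F = np.array([7.0, 0.0, 8.0])

# Sweep a few values of alpha along the null space direction (-1, 1, 0);
# the residual and total error are unchanged.
for alpha in (0.0, 1.0, -2.5):
    c = np.array([3.5, 3.5, -2.0]) + alpha * np.array([-1.0, 1.0, 0.0])
    r = A @ c - F
    print(r, r @ r)  # always [-2.  1.  1.] 6.0
```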