Where am I going wrong in calculating the projection of a vector onto a subspace?


I am currently working my way through Poole's Linear Algebra, 4th Edition, and I am hitting a bit of a wall with a particular example in the chapter on least squares solutions. The line $y=a+bx$ that "best fits" the data points $(1,2)$, $(2,2)$, and $(3,4)$ can be related to the (inconsistent) system of linear equations $$a+b=2$$ $$a+2b=2$$ $$a+3b=4$$ with matrix representation $$A\mathbf{x}=\begin{bmatrix}1&1\\1&2\\1&3\\\end{bmatrix}\begin{bmatrix}a\\b\\\end{bmatrix}=\begin{bmatrix}2\\2\\4\\\end{bmatrix}=\mathbf{b}$$ Using the least squares theorem, Poole shows that the least squares solution of the system is $$\overline{\mathbf{x}}=\left(A^T A \right)^{-1} A^T \mathbf{b}=\left(\begin{bmatrix}3&6\\6&14\\\end{bmatrix}\right)^{-1}\begin{bmatrix}8\\18\\\end{bmatrix}=\begin{bmatrix}\frac{7}{3}&-1\\-1&\frac{1}{2}\\\end{bmatrix}\begin{bmatrix}8\\18\\\end{bmatrix}=\begin{bmatrix} \frac{2}{3}\\1\\\end{bmatrix}$$ so that the desired line has the equation $y=a+bx=\frac{2}{3} +x$. The components of the vector $\overline{\mathbf{x}}$ can also be interpreted as the coefficients of the columns of $A$ in the linear combination that produces the projection of $\mathbf{b}$ onto the column space of $A$ [which the Best Approximation Theorem identifies as the best approximation to $\mathbf{b}$ in the subspace $\mathrm{col}(A)$].
In other words, the projection of $\mathbf{b}$ onto $\mathrm{col}(A)$ can be found from the coefficients of $\overline{\mathbf{x}}$ by $$\mathrm{proj}_{\mathrm{col}(A)}(\mathbf{b})=\frac{2}{3}\begin{bmatrix}1\\1\\1\\\end{bmatrix}+1\begin{bmatrix}1\\2\\3\\\end{bmatrix}=\begin{bmatrix}\frac{5}{3}\\\frac{8}{3}\\\frac{11}{3}\\\end{bmatrix}$$ But when I try to calculate $\mathrm{proj}_{\mathrm{col}(A)}(\mathbf{b})$ directly [taking $\mathbf{a}_{1}$ and $\mathbf{a}_{2}$ to be the first and second columns of $A$, respectively], I get $$\mathrm{proj}_{\mathrm{col}(A)}(\mathbf{b})=\left(\frac{\mathbf{a}_{1}\cdot\mathbf{b}}{\mathbf{a}_{1}\cdot\mathbf{a}_{1}}\right)\mathbf{a}_{1}+\left(\frac{\mathbf{a}_{2}\cdot\mathbf{b}}{\mathbf{a}_{2}\cdot\mathbf{a}_{2}}\right)\mathbf{a}_{2}=\left(\frac{\begin{bmatrix}1\\1\\1\\\end{bmatrix}\cdot\begin{bmatrix}2\\2\\4\\\end{bmatrix}}{\begin{bmatrix}1\\1\\1\\\end{bmatrix}\cdot\begin{bmatrix}1\\1\\1\\\end{bmatrix}}\right)\begin{bmatrix}1\\1\\1\\\end{bmatrix}+\left(\frac{\begin{bmatrix}1\\2\\3\\\end{bmatrix}\cdot\begin{bmatrix}2\\2\\4\\\end{bmatrix}}{\begin{bmatrix}1\\2\\3\\\end{bmatrix}\cdot\begin{bmatrix}1\\2\\3\\\end{bmatrix}}\right)\begin{bmatrix}1\\2\\3\\\end{bmatrix}$$ $$=\frac{8}{3}\begin{bmatrix}1\\1\\1\\\end{bmatrix}+\frac{18}{14}\begin{bmatrix}1\\2\\3\\\end{bmatrix}=\begin{bmatrix}\frac{8}{3}\\\frac{8}{3}\\\frac{8}{3}\\\end{bmatrix}+\begin{bmatrix}\frac{9}{7}\\\frac{18}{7}\\\frac{27}{7}\\\end{bmatrix}=\begin{bmatrix}\frac{83}{21}\\\frac{110}{21}\\\frac{137}{21}\\\end{bmatrix}$$ I am quite confident that my calculation is incorrect, for a number of reasons. 
For example, when I take the component of $\mathbf{b}$ orthogonal to $\mathrm{col}(A)$ $$\mathrm{perp}_{\mathrm{col}(A)}(\mathbf{b})=\mathbf{b}-\mathrm{proj}_{\mathrm{col}(A)}(\mathbf{b})=\begin{bmatrix}2\\2\\4\\\end{bmatrix}-\begin{bmatrix}\frac{83}{21}\\\frac{110}{21}\\\frac{137}{21}\\\end{bmatrix}=\begin{bmatrix}-\frac{41}{21}\\-\frac{68}{21}\\-\frac{53}{21}\\\end{bmatrix}$$ I get a vector that is not perpendicular to either $\mathbf{a}_{1}$ or $\mathbf{a}_{2}$, indicating that this vector is not in the orthogonal complement of $\mathrm{col}(A)$. Can somebody help me identify where I'm going wrong in my attempt to calculate the projection of $\mathbf{b}$ onto $\mathrm{col}(A)$?


There are 2 best solutions below

Best answer

The column space of $A$, call it $U$, is the span of the vectors $\mathbf{a_1}:=(1,1,1)$ and $\mathbf{a_2}:=(1,2,3)$ in $\Bbb R ^3$. For $\mathbf{b}:=(2,2,4)$ you want to calculate the orthogonal projection of $\mathbf{b}$ onto $U$, which is given by $$ \operatorname{proj}_U \mathbf{b}=\langle \mathbf{b},\mathbf{e_1} \rangle \mathbf{e_1}+\langle \mathbf{b},\mathbf{e_2} \rangle \mathbf{e_2}\tag1 $$ where $\mathbf{e_1}$ and $\mathbf{e_2}$ form an orthonormal basis of $U$ and $\langle \mathbf{v},\mathbf{w} \rangle:=v_1w_1+v_2w_2+v_3 w_3$ is the Euclidean dot product in $\Bbb R ^3$, for any vectors $\mathbf{v}:=(v_1,v_2,v_3)$ and $\mathbf{w}:=(w_1,w_2,w_3)$ in $\Bbb R ^3$.

Then you only need to find an orthonormal basis of $U$; you can create one from $\mathbf{a_1}$ and $\mathbf{a_2}$ using the Gram-Schmidt procedure, that is $$ \mathbf{e_1}:=\frac{\mathbf{a_1}}{\|\mathbf{a_1}\|}\quad \text{ and }\quad \mathbf{e_2}:=\frac{\mathbf{a_2}-\langle \mathbf{a_2},\mathbf{e_1} \rangle \mathbf{e_1}}{\|\mathbf{a_2}-\langle \mathbf{a_2},\mathbf{e_1} \rangle \mathbf{e_1}\|}\tag2 $$ where $\|{\cdot}\|$ is the Euclidean norm in $\Bbb R ^3$, defined by $\|\mathbf{v}\|:=\sqrt{\langle \mathbf{v},\mathbf{v} \rangle}=\sqrt{v_1^2+v_2^2+v_3^2}$.
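
As a worked instance of (1) and (2) with the vectors above (the arithmetic here is added for illustration and is not spelled out in the original answer): $$ \mathbf{e_1}=\frac{1}{\sqrt3}(1,1,1),\qquad \mathbf{a_2}-\langle \mathbf{a_2},\mathbf{e_1} \rangle \mathbf{e_1}=(1,2,3)-2(1,1,1)=(-1,0,1),\qquad \mathbf{e_2}=\frac{1}{\sqrt2}(-1,0,1) $$ and then $$ \operatorname{proj}_U \mathbf{b}=\frac{8}{\sqrt3}\,\mathbf{e_1}+\frac{2}{\sqrt2}\,\mathbf{e_2}=\frac{8}{3}(1,1,1)+(-1,0,1)=\left(\frac{5}{3},\frac{8}{3},\frac{11}{3}\right) $$ which agrees with the linear combination $\frac{2}{3}\mathbf{a_1}+\mathbf{a_2}$ obtained from the least squares solution in the question.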

Your mistake is that you assumed that $$ \operatorname{proj}_U\mathbf{b}=\frac{\langle \mathbf{b},\mathbf{a_1} \rangle}{\|\mathbf{a_1}\|^2}\mathbf{a_1}+ \frac{\langle \mathbf{b},\mathbf{a_2} \rangle}{\|\mathbf{a_2}\|^2}\mathbf{a_2}\tag3 $$ However, this formula is valid only when $\mathbf{a_1}$ and $\mathbf{a_2}$ are orthogonal, and here they are not: $\langle \mathbf{a_1},\mathbf{a_2} \rangle=1+2+3=6\neq 0$.
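
The failure of formula (3) for this particular basis can be checked numerically. The following is a minimal sketch in Python using exact rational arithmetic; the helper names (`dot`, `scale`, `add`, `sub`) are made up for this illustration and are not from the original posts. It compares the residual of the sum of separate projections with the residual of the least squares combination $\frac{2}{3}\mathbf{a_1}+\mathbf{a_2}$ from the question.

```python
from fractions import Fraction

def dot(u, v):
    """Euclidean dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

def scale(c, v):
    return [c * x for x in v]

def add(u, v):
    return [x + y for x, y in zip(u, v)]

def sub(u, v):
    return [x - y for x, y in zip(u, v)]

a1 = [Fraction(1), Fraction(1), Fraction(1)]
a2 = [Fraction(1), Fraction(2), Fraction(3)]
b = [Fraction(2), Fraction(2), Fraction(4)]

# Formula (3): add the separate projections of b onto a1 and onto a2.
naive = add(scale(dot(b, a1) / dot(a1, a1), a1),
            scale(dot(b, a2) / dot(a2, a2), a2))
naive_residual = sub(b, naive)

# Linear combination given by the least squares solution (2/3, 1).
correct = add(scale(Fraction(2, 3), a1), a2)
correct_residual = sub(b, correct)

print("a1 . a2 =", dot(a1, a2))
print("naive residual dotted with a1, a2:",
      dot(naive_residual, a1), dot(naive_residual, a2))
print("correct residual dotted with a1, a2:",
      dot(correct_residual, a1), dot(correct_residual, a2))
```

Only the second residual has zero dot product with both columns, which is the defining property of the orthogonal projection onto $U$.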

Second answer

Aaand I wasn’t using an orthogonal basis for the subspace. The columns of $A$ are linearly independent, which means the least squares solution is unique, but they are not orthogonal, which explains why my calculation of the projection of the vector $\mathbf{b}$ onto the column space of $A$ yielded an incorrect result. Applying the Gram-Schmidt Method to the columns of $A$ produces an orthogonal basis for $\mathrm{col}(A)$, which can then be used to calculate the projection.
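
For anyone who wants to check this by machine, here is a rough sketch in Python with exact fractions. The function name `project_onto_span` is hypothetical, and the code assumes the input vectors are linearly independent (otherwise it would divide by zero). It orthogonalizes the columns without normalizing them, so each projection coefficient is divided by $\mathbf{u}\cdot\mathbf{u}$.

```python
from fractions import Fraction

def dot(u, v):
    """Euclidean dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

def project_onto_span(b, basis):
    """Return (orthogonal_basis, projection of b onto span(basis)).

    Assumes the vectors in `basis` are linearly independent.
    """
    orthogonal = []
    for a in basis:
        v = [Fraction(x) for x in a]
        # Gram-Schmidt step: remove the components along earlier vectors.
        for u in orthogonal:
            c = dot(a, u) / dot(u, u)
            v = [x - c * y for x, y in zip(v, u)]
        orthogonal.append(v)

    # With an orthogonal basis, the separate projections can be summed.
    proj = [Fraction(0)] * len(b)
    for u in orthogonal:
        c = Fraction(dot(b, u)) / dot(u, u)
        proj = [p + c * y for p, y in zip(proj, u)]
    return orthogonal, proj

a1 = [1, 1, 1]
a2 = [1, 2, 3]
b = [2, 2, 4]

orthogonal, proj = project_onto_span(b, [a1, a2])
print(orthogonal)  # orthogonal basis: (1, 1, 1) and (-1, 0, 1)
print(proj)        # projection: (5/3, 8/3, 11/3)
```

The result matches $\frac{2}{3}\mathbf{a}_{1}+\mathbf{a}_{2}$ from the least squares solution.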