How to connect the normal equation to the Gram-Schmidt method of finding an orthogonal vector?


From this lecture: given a matrix $Q$ whose columns are mutually orthogonal and a vector $b$ that we want to project onto the column space of $Q$, the projection matrix is $QQ^T$ (from the normal equations for $Qx = b$, the solution is $\hat x = (Q^TQ)^{-1}Q^Tb$, so the projection is $Q\hat x = Q(Q^TQ)^{-1}Q^Tb = QQ^Tb$), and the vector orthogonal to the column space of $Q$ is $b - QQ^Tb$.

On page 3 of these notes, given some $v$ and an orthogonal basis $v_1, v_2, \ldots, v_n$ for some vector space $W$, the author uses Gram-Schmidt to find that $v-\sum_{j=1}^n \frac{\langle v,v_j\rangle}{\|v_j\|^2}v_j$ is orthogonal to $W$.

If I define the columns of $Q$ to contain the orthogonal basis $\begin{bmatrix}v_1 & v_2 & \cdots & v_n\end{bmatrix}$, do the two methods give the same result? If so, is there a way to use the first method to arrive at the equation for the second?

Best answer:

You might be having trouble showing the equivalence because you need additional assumptions for the projection matrix to be $QQ^T$ in the first place: it’s actually $Q(Q^TQ)^{-1}Q^T$. With the additional assumption that the columns of $Q$ are unit vectors, this reduces to $QQ^T$. If we remove this restriction, however, we’re no longer guaranteed that $Q^TQ$ is invertible: we need to add the condition that the columns of $Q$ are linearly independent. (Pairwise orthogonality of the columns isn’t enough since the zero vector is orthogonal to everything.) Equivalently, $Q$ must have full column rank.
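A quick numerical check illustrates the point (a sketch in NumPy, with an illustrative choice of orthogonal but non-unit columns): $QQ^T$ fails to be a projector, while the general formula $Q(Q^TQ)^{-1}Q^T$ still works.

```python
import numpy as np

# Two mutually orthogonal columns in R^3, deliberately scaled so they
# are NOT unit vectors (each has squared norm 8).
Q = np.array([[2.0,  2.0],
              [2.0, -2.0],
              [0.0,  0.0]])

b = np.array([3.0, 1.0, 5.0])

# Naive formula: only a projector when the columns are orthonormal.
P_naive = Q @ Q.T

# General formula: valid whenever Q has full column rank.
P = Q @ np.linalg.inv(Q.T @ Q) @ Q.T

print(np.allclose(P_naive @ P_naive, P_naive))  # False: QQ^T is not idempotent here
print(np.allclose(P @ P, P))                    # True: a genuine projection matrix
print(np.allclose(Q.T @ (b - P @ b), 0.0))      # True: residual is orthogonal to the columns
```

With these particular columns, $(QQ^T)^2 = Q(Q^TQ)Q^T = 8\,QQ^T \neq QQ^T$, which is exactly the failure of idempotence the first print exposes.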

Setting $Q=\begin{bmatrix}v_1&v_2&\cdots&v_n\end{bmatrix}$, we have by orthogonality of the columns $$Q^TQ = \begin{bmatrix}\langle v_1,v_1\rangle&0&\cdots&0\\0&\langle v_2,v_2\rangle&\cdots&0\\\vdots&\vdots&\ddots&\vdots\\0&0&\cdots&\langle v_n,v_n\rangle\end{bmatrix}$$ with all of the diagonal entries nonzero. We can then expand directly: $$Q(Q^TQ)^{-1}Q^Tb = Q(Q^TQ)^{-1}\begin{bmatrix}\langle v_1,b\rangle \\ \langle v_2,b\rangle \\ \vdots \\ \langle v_n,b\rangle\end{bmatrix} = Q\begin{bmatrix}{\langle v_1,b\rangle \over \langle v_1,v_1\rangle} \\ {\langle v_2,b\rangle \over \langle v_2,v_2\rangle} \\ \vdots \\ {\langle v_n,b\rangle \over \langle v_n,v_n\rangle} \end{bmatrix} = \sum_{j=1}^n {\langle v_j,b\rangle \over \langle v_j,v_j\rangle} v_j.$$
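The expansion above can be confirmed numerically (a sketch with randomly generated vectors): build an orthogonal basis by Gram-Schmidt, then check that the normal-equation projection and the Gram-Schmidt sum agree, and that the residual is orthogonal to every basis vector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build an orthogonal (not orthonormal) basis of a 3-dimensional
# subspace of R^5 by running Gram-Schmidt on random vectors.
A = rng.standard_normal((5, 3))
vs = []
for a in A.T:
    for v in vs:
        a = a - (v @ a) / (v @ v) * v  # subtract component along each earlier v_j
    vs.append(a)
Q = np.column_stack(vs)

b = rng.standard_normal(5)

# Method 1: normal-equation projection Q (Q^T Q)^{-1} Q^T b
proj_matrix = Q @ np.linalg.solve(Q.T @ Q, Q.T @ b)

# Method 2: Gram-Schmidt-style sum  sum_j <v_j, b> / <v_j, v_j> * v_j
proj_sum = sum((v @ b) / (v @ v) * v for v in vs)

print(np.allclose(proj_matrix, proj_sum))      # the two projections agree
print(np.allclose(Q.T @ (b - proj_sum), 0.0))  # residual is orthogonal to each v_j
```

Since $Q^TQ$ is diagonal here, `np.linalg.solve` just divides each entry of $Q^Tb$ by the corresponding $\langle v_j, v_j\rangle$, which is precisely the coefficient vector in the sum formula.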