Why is $U\,{}^tU$ the orthogonal projection onto $\operatorname{Im}(U)$?


Let $U \in M_{n,k}(\mathbb{R})$ be such that ${}^tU U = I_k$. I would like to understand geometrically why $U\,{}^tU$ is the orthogonal projection onto $\operatorname{Im}(U)$.

When $n = k$ we are dealing with an orthogonal matrix, so $U\,{}^tU = I_n$ and it is easy to see why the result holds. But for $k < n$ I don't see how to interpret the transformation $U\,{}^tU$, knowing only that ${}^tU U = I_k$.

N.B.: I know how to prove it algebraically; I am really looking for an intuitive understanding of why it is the orthogonal projection onto $\operatorname{Im}(U)$.

Thank you !


There are 2 best solutions below

Best answer

Let $u_1,\dots,u_k\in\Bbb R^n$ be the columns of $U$ and denote by $\langle v,w\rangle$ the standard dot product on $\Bbb R^n$. The equation $({}^tU)U=I$ amounts to $$ \langle u_i, u_j \rangle = \begin{cases} 1 & i=j, \\ 0 & i\neq j.\end{cases} $$ In other words, the vectors $u_1,\dots,u_k$ form an orthonormal basis of $\operatorname{Im}(U)$. Given any vector $v\in\Bbb R^n$, you have $$ ({}^tU)v = \begin{pmatrix} \langle u_1,v\rangle \\ \vdots \\ \langle u_k, v\rangle \end{pmatrix}, $$ so it collects all the dot products with the vectors $u_i$.

Now all you need to know is that for $\|u\|=1$, the dot product $\langle u,v\rangle$ is the signed length of the orthogonal projection of $v$ onto the line spanned by $u$. So $v$ projected orthogonally onto that line is equal to $\langle u,v\rangle\, u$.

If you do this for each of the orthonormal vectors $u_1,\dots,u_k$ and add up the projections, you have projected $v$ orthogonally onto their span $\operatorname{Im}(U)$. This is exactly what happens when calculating $U({}^t U)v$: $$ U({}^t U)v = U \begin{pmatrix} \langle u_1,v\rangle \\ \vdots \\ \langle u_k, v\rangle \end{pmatrix} = \langle u_1, v\rangle\, u_1 +\cdots+ \langle u_k, v\rangle\, u_k. $$
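The sum of one-dimensional projections above can be checked on a small example. The following is only a sketch under stated assumptions: the vectors `u1`, `u2`, `v` and the helper names `dot` and `project` are made up for illustration, with $n = 3$ and $k = 2$, and exact fractions are used so that the orthogonality checks are not blurred by rounding.

```python
from fractions import Fraction as F

def dot(a, b):
    """Standard dot product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical example: two orthonormal columns of U in R^3 (n = 3, k = 2).
u1 = [F(1), F(0), F(0)]
u2 = [F(0), F(3, 5), F(4, 5)]
columns = [u1, u2]

def project(v):
    """Compute U (tU) v as the sum of the projections <u_i, v> u_i."""
    coeffs = [dot(u, v) for u in columns]  # the entries of (tU) v
    return [sum(c * u[i] for c, u in zip(coeffs, columns))
            for i in range(len(v))]

v = [F(2), F(1), F(1)]
p = project(v)                               # the component of v in Im(U)
residual = [a - b for a, b in zip(v, p)]     # what is left over

# The residual is orthogonal to every column, and projecting twice changes nothing.
print([dot(u, residual) for u in columns])
print(project(p) == p)
```

Here `p` comes out as $(2, 21/25, 28/25)$, and the residual $v - p$ has zero dot product with both `u1` and `u2`, which is what "orthogonal projection onto the span" means.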

Second answer

Probably not as intuitive as you'd want but this really helped me in understanding projections.
Let $Y$ be any vector in $\mathbb R^n$, and let $C(U)$ denote the column space of $U$.
Say we want to decompose $Y$ as $Y=\hat Y+E$, where $\hat Y\in C(U)$ and $E\perp C(U)$.
Then $\hat Y$ is, by definition, the orthogonal projection of $Y$ onto $C(U)$.
The condition $E\perp C(U)$ means that $v^TE=0$ for all $v\in C(U)$.
Since every such $v$ can be written as $v=Ux$ for some $x\in\mathbb R^k$, and $E=Y-\hat Y$ with $\hat Y= Ux_0$ for some $x_0\in\mathbb R^k$, we get $$x^TU^T(Y-Ux_0)=0\quad\forall x\in \mathbb R^k,$$ which holds iff $$U^T(Y-Ux_0)=0\implies x_0=(U^TU)^{-1}U^TY\implies \hat Y=U(U^TU)^{-1}U^TY.$$ Thus the orthogonal projection matrix is $U(U^TU)^{-1}U^T$; this is valid whenever the columns of $U$ are linearly independent, so that $U^TU$ is invertible. Under the hypothesis of the question, $U^TU=I_k$, and the formula reduces to $UU^T$.
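To see the general formula at work, here is a small sketch that evaluates $U(U^TU)^{-1}U^T$ twice: once for a matrix `U` with orthonormal columns, and once for a matrix `W` whose columns span the same plane but are not orthonormal. Both matrices and the helpers `matmul`, `transpose`, `inv2` and `projector` are assumptions made for this illustration (with $k = 2$, so a hand-written $2\times 2$ inverse is enough).

```python
from fractions import Fraction as F

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inv2(G):
    """Inverse of a 2x2 matrix (assumes a nonzero determinant)."""
    (a, b), (c, d) = G
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def projector(M):
    """Evaluate M (M^T M)^{-1} M^T for a matrix M with two independent columns."""
    Mt = transpose(M)
    return matmul(matmul(M, inv2(matmul(Mt, M))), Mt)

# Hypothetical U with orthonormal columns (U^T U = I_2).
U = [[F(1), F(0)],
     [F(0), F(3, 5)],
     [F(0), F(4, 5)]]

# Hypothetical W with the same column space; its second column is u1 + u2.
W = [[F(1), F(1)],
     [F(0), F(3, 5)],
     [F(0), F(4, 5)]]

UUt = matmul(U, transpose(U))
print(projector(U) == UUt)   # the inverse factor is the identity here
print(projector(W) == UUt)   # same subspace, same projection matrix
```

The second comparison illustrates the design point: the projection depends only on the subspace $C(U)$, while the factor $(U^TU)^{-1}$ compensates for a basis that is not orthonormal and disappears when it is.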