I'm trying to show the following: Let $A$ be an $m \times n$ matrix with $m\geq n$. Prove that the following statements are equivalent.
1. $A^{\top} A = I_n$.
2. $P = A A^{\top}$ is an orthogonal projection in $\mathbb{R}^m$ onto a subspace of dimension $n$.
3. $A$ is an isometry, i.e., $\Vert Ax\Vert_2 = \Vert x \Vert_2$ for all $x\in\mathbb{R}^n$.
4. All singular values of $A$ are equal to $1$.
Easily, we have (1)$\iff$(3)$\iff$(4) by elementary algebraic manipulation and the characterization of the extreme singular values of $A$ (see Eq. (4.5) in the book). We also have (1)$\implies$(2) by checking $P^2 = P$ and $P^\top = P$. However, I can't see how (2) implies any of the others.
For example, I tried to show (2)$\implies$(1): Since $P=A A^\top$ is an orthogonal projection, it satisfies $P^2 = P$ and $P^\top = P$. The latter is trivial from the definition of $P$, so only the former is useful. It relates to $A^\top A$ via $$ P^2 = (A A^\top)(A A^\top) = A (A^\top A) A^\top. $$ But I don't see an easy way to conclude $A^\top A = I_n$ from here.
Just figured this one out: the key is the information $\operatorname{rank}(P) = n$, which I had overlooked previously.
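To see why the rank condition cannot be dropped, here is a small toy example of my own (not from the book), with $m = 3$ and $n = 2$: $$ A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}, \qquad A A^{\top} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad A^{\top} A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \neq I_2. $$ Here $AA^{\top}$ is symmetric and idempotent, so it is an orthogonal projection, but onto a subspace of dimension $1 < n$. So $P^2 = P$ and $P^{\top} = P$ alone cannot force $A^{\top} A = I_n$.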
When proving (1)$\implies$(2), we also need to show that $\operatorname{rank} (P) = \operatorname{rank} (AA^{\top}) = n$. Since $A^{\top} A = I_n$, $$ n = \operatorname{rank}(I_n) = \operatorname{rank}(A^{\top} A) \leq \operatorname{rank} (A) \leq n, $$ because the rank of a product is at most the rank of each factor, and hence $\operatorname{rank}(A) = n$. This also gives $\operatorname{rank} (A^{\top} ) = n$, so it remains to check that $\operatorname{Im} A^{\top} \cap \ker A = \{0\}$. If this holds, then $A$ is injective on $\operatorname{Im} A^{\top}$, so $\operatorname{rank} (A A^{\top} ) = \dim A(\operatorname{Im} A^{\top}) = \dim \operatorname{Im} A^{\top} = n$ as well. But it's well known that $\operatorname{Im} A^{\top} = (\ker A)^{\perp}$, and a subspace meets its orthogonal complement only in $0$, so we're done.
We can now show (2)$\implies$(1): we want to show that if $P = A A^{\top} $ is an orthogonal projection onto a subspace of dimension $n$, then $A ^{\top} A = I_n$. Observe that since $P^2 = P$, $$ (A A^{\top} ) (A A^{\top} ) = A A^{\top} \iff A (A^{\top} A - I_n) A^{\top} = 0. $$ Now we use the fact that $\operatorname{rank} (P) = \operatorname{rank} (A A^{\top} ) = n$. Since the rank of a product is at most the rank of each factor, $$ n = \operatorname{rank} (A A^{\top}) \leq \operatorname{rank} (A) \leq n, $$ so $\operatorname{rank} (A) = \operatorname{rank} (A^{\top} ) = n$. In particular, $A^{\top} \colon \mathbb{R}^m \to \mathbb{R}^n$ is surjective, i.e., $\operatorname{Im} A^{\top} = \mathbb{R}^n$, and hence $$ A (A^{\top} A - I_n) A^{\top} = 0 \implies A (A^{\top} A - I_n) = 0. $$ Taking the transpose, we have $$ (A^{\top} A - I_n)^{\top} A^{\top} = 0 \implies (A^{\top} A - I_n)^{\top} = 0, $$ again because $A^{\top}$ is surjective. We hence have $A^{\top} A = I_n$, as desired.
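For anyone who wants a numerical sanity check of the equivalence, here is a minimal pure-Python sketch. The matrix `A` and the helper names (`transpose`, `matmul`, `matvec`, `close`) are my own illustrative choices, not anything from the book, and a check on one example is of course no substitute for the proof above.

```python
import math

def transpose(M):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def matvec(M, x):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def close(X, Y, tol=1e-12):
    """Entrywise comparison of two matrices of the same shape."""
    return all(abs(a - b) <= tol
               for rx, ry in zip(X, Y) for a, b in zip(rx, ry))

def norm(x):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(a * a for a in x))

# Hypothetical 3x2 example (m = 3, n = 2) with orthonormal columns.
s = 1 / math.sqrt(2)
A = [[s, 0.0],
     [s, 0.0],
     [0.0, 1.0]]
At = transpose(A)
I2 = [[1.0, 0.0], [0.0, 1.0]]

# Statement (1): A^T A = I_n.
gram_is_identity = close(matmul(At, A), I2)

# Statement (2): P = A A^T is symmetric and idempotent, with rank n.
# For a projection, the rank equals the trace.
P = matmul(A, At)
P_is_idempotent = close(matmul(P, P), P)
P_is_symmetric = close(P, transpose(P))
trace_P = sum(P[i][i] for i in range(len(P)))

# Statement (3): ||Ax|| = ||x|| for a sample vector x.
x = [3.0, 4.0]
is_isometry_on_x = math.isclose(norm(matvec(A, x)), norm(x))

print(gram_is_identity, P_is_idempotent, P_is_symmetric, trace_P,
      is_isometry_on_x)
```

The trace is used as a stand-in for the rank only because $P$ is a projection here; for a general matrix one would need an actual rank computation.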