I am currently studying randomized low-rank approximation of a matrix. In the problem statement, given an $m \times n$ matrix $A$, it is stated that we want to minimize
$\|A-Q_{k}Q^{T}_{k}A\|$
and for this reason we seek an $m \times l$ matrix $Q$ with orthonormal columns that approximates the range of $A$ well. The challenge is to keep $l$ as small as possible. What I don't get is how the number $l$ affects accuracy.
I think that if $l=n$, the above norm would be zero, because $Q$ would be orthogonal and the product $QQ^T$ would be the identity matrix. I saw that if $l < n$ the result is not the identity matrix, but I don't deeply understand why.
I ran a simple test in Matlab with `qr()` and saw that if the returned matrix $Q$ is not square, $QQ^{T}$ is not the identity matrix. How is this explained? I suppose it is quite fundamental, but I am stuck.
C = randn(5,4);
[q,~] = qr(C,0); % economy-size QR: q is 5x4, not square
q*q'
Thank you in advance
Note that the matrix $Q$ has $l$ orthonormal columns, so its rank is exactly $l$ (in particular, at most $l$). Since the rank of a product of matrices is at most the rank of each factor, it follows that $Q Q^T$ has rank at most $l$ (in fact, it has rank exactly $l$). As $Q Q^T$ is an $m \times m$ matrix of rank $l$, it follows that if $l < m$, then $Q Q^T \neq I_m$, since $I_m$ has rank $m$.
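To make the rank argument concrete, here is a small NumPy analogue (my own illustration, not part of your Matlab test) of an economy-size QR of a $5 \times 4$ matrix: $Q Q^T$ is a rank-$4$ matrix on $\mathbb{R}^5$, so it cannot equal $I_5$, even though $Q^T Q = I_4$.

```python
import numpy as np

rng = np.random.default_rng(0)
m, l = 5, 4
A = rng.standard_normal((m, l))
Q, _ = np.linalg.qr(A, mode="reduced")   # Q is m x l with orthonormal columns

P = Q @ Q.T                              # m x m, but rank only l
print(np.linalg.matrix_rank(P))          # 4, not m = 5
print(np.allclose(P, np.eye(m)))         # False: a rank-4 matrix cannot be I_5
print(np.allclose(Q.T @ Q, np.eye(l)))   # True: the columns are orthonormal
```

Note the asymmetry: $Q^T Q = I_l$ always holds for orthonormal columns, while $Q Q^T = I_m$ only when $l = m$.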
The matrix $Q Q^T$ represents the orthogonal projection onto the space spanned by the columns of $Q$. So $Q Q^T A$ projects each of $A$'s columns orthogonally onto that space. Thus, the quantity $\|A - Q Q^T A\|$ can be thought of as a measure of distance between the subspace spanned by $Q$'s columns and the space spanned by $A$'s columns. Note that as we increase the number of columns $l$ of $Q$, the space spanned by these $l$ columns becomes a better approximation of the space spanned by $A$'s columns. As you have observed, if we let $l = m$, then the space spanned by $Q$'s columns is all of $\mathbb{R}^m$, and the orthogonal projection onto $\mathbb{R}^m$ is simply the identity (at which point the approximation becomes exact).
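You can watch this happen numerically. In the sketch below (a toy example of my own; the specific matrix and sizes are arbitrary) I take $Q$ to be the leading $l$ left singular vectors of $A$, which is the best possible choice of an orthonormal basis: by the Eckart-Young theorem the spectral-norm error $\|A - Q Q^T A\|_2$ then equals $\sigma_{l+1}$, and it shrinks monotonically to zero as $l$ grows.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 8, 6
A = rng.standard_normal((m, n))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

errors = []
for l in range(1, n + 1):
    Q = U[:, :l]                                   # orthonormal basis, l columns
    errors.append(np.linalg.norm(A - Q @ Q.T @ A, 2))

print(errors)  # decreasing in l; the last entry is (numerically) zero
```

With a random basis instead of singular vectors the error is larger for each $l$, but the same qualitative picture holds: more columns, smaller error.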
The motivation for finding low-rank approximations is that they are easier to deal with, calculate, and manipulate. Furthermore, in many applications there is little extra benefit to be offered by working with the exact forms of the matrices. Indeed, low-rank approximations can often be quite good, even with rank $l \ll m$.
EDIT: To see why $QQ^T$ is the matrix applying an orthogonal projection onto the space spanned by $Q$'s columns, note that given an orthonormal basis $\{\vec{u}_1, \vec{u}_2, \cdots, \vec{u}_l\}$ of a linear subspace $\mathcal{U} \subset \mathbb{R}^m$, the orthogonal projection of some vector $\vec{v} \in \mathbb{R}^m$ onto $\mathcal{U}$ is given by $$\sum_{i = 1}^l (\vec{u}_i \cdot \vec{v}) \vec{u}_i$$ The expression above represents the sum of the individual projections of $\vec{v}$ onto each of the $\vec{u}_i$ (which is the standard way of calculating an orthogonal projection onto a multidimensional subspace). It can be directly verified that if the $\vec{u}_i$ are the columns of $Q$, then $QQ^T$ is just another way of writing the above expression. Indeed, $$Q Q^T \vec{v} = Q \begin{bmatrix} \vec{u}_1 \cdot \vec{v} \\ \vec{u}_2 \cdot \vec{v} \\ \vdots \\ \vec{u}_l \cdot \vec{v} \end{bmatrix} = \sum_{i = 1}^l (\vec{u}_i \cdot \vec{v}) \vec{u}_i$$
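The identity above is easy to sanity-check numerically. This small sketch (my own example) compares $Q Q^T \vec{v}$ against the explicit sum of projections $\sum_i (\vec{u}_i \cdot \vec{v})\,\vec{u}_i$ for a random $Q$ with orthonormal columns:

```python
import numpy as np

rng = np.random.default_rng(2)
m, l = 5, 3
# economy QR of a random matrix gives Q with l orthonormal columns
Q, _ = np.linalg.qr(rng.standard_normal((m, l)), mode="reduced")
v = rng.standard_normal(m)

# sum of the individual projections (u_i . v) u_i onto each column u_i
proj_sum = sum((Q[:, i] @ v) * Q[:, i] for i in range(l))

print(np.allclose(Q @ Q.T @ v, proj_sum))  # True
```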