I am attempting to prove a fact about matrix approximations using the singular value decomposition.
For a given matrix $M$, we denote the rank $r$ approximation of $M$ by $A(r, M)$. I am trying to show that if $r_{1} \leq r_{2}$, then $A(r_{1}, M) = A(r_{1}, A(r_{2}, M))$, provided the singular values of $M$ are distinct; i.e. that taking a lower rank approximation of a higher rank approximation is the same thing as taking a lower rank approximation of the original matrix.
I know that for a matrix $M$ with rank $r$, the singular value decomposition is of the form: $$ M = U \Sigma V^* = \sum_{j=1}^r u_j \sigma_j v_j^* $$
and that the rank $r_k$ approximation is given by:
$$ \sum_{j=1}^{r_k} u_j \sigma_j v_j^* $$
It feels obvious that $A(r_2, M) = \sum_{j=1}^{r_2} u_j \sigma_j v_j^* $ will have singular values $\sigma_1, \dots, \sigma_{r_2}$, so taking the rank $r_1$ approximation of that would give $\sum_{j=1}^{r_1} u_j \sigma_j v_j^* = A(r_{1}, M)$, as desired.
However, I haven't used the assumption that the singular values are distinct, so I suspect I must have done something wrong. Is there a step that I am skipping somewhere? Thanks in advance!
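The claim can at least be checked numerically. Here is a quick sketch using NumPy's `np.linalg.svd`; the helper `low_rank_approx` is just an illustrative implementation of $A(r, M)$, and a random Gaussian matrix is used because its singular values are distinct almost surely:

```python
import numpy as np

def low_rank_approx(M, r):
    """Rank-r SVD truncation A(r, M): keep the r largest singular values."""
    U, s, Vh = np.linalg.svd(M, full_matrices=False)
    return U[:, :r] * s[:r] @ Vh[:r, :]

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 5))  # generic matrix: distinct singular values

r1, r2 = 2, 4
direct = low_rank_approx(M, r1)                       # A(r1, M)
nested = low_rank_approx(low_rank_approx(M, r2), r1)  # A(r1, A(r2, M))
print(np.allclose(direct, nested))  # True
```

Agreement here is only evidence, of course, not a proof, but it does suggest the composition identity holds in the generic (distinct singular values) case.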
If some singular values are repeated, then the SVD is not unique, as certain rotation matrices will commute with $\Sigma$. Consequently, such a truncation gives a unique result only if complete blocks of repeated singular values are cut away, that is, if $\sigma_{r_k+1} < \sigma_{r_k}$. Otherwise the procedure does not produce a 'true' approximation, as the remainder is as large as some components of the approximation; usually the word 'approximation' implies that the remainder is much smaller than the smallest retained components.
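The non-uniqueness is easy to see concretely. A minimal sketch, taking $M = I_2$ (so $\sigma_1 = \sigma_2 = 1$): both the standard basis and any rotated orthonormal basis give valid SVDs, yet the rank-1 truncations they produce differ, while leaving remainders of the same spectral norm:

```python
import numpy as np

# A matrix with a repeated singular value: sigma_1 = sigma_2 = 1.
M = np.eye(2)

# Two equally valid sets of singular vectors for M:
U1 = np.eye(2)                                  # standard basis
theta = np.pi / 4
U2 = np.array([[np.cos(theta), -np.sin(theta)],  # rotated basis
               [np.sin(theta),  np.cos(theta)]])

# Rank-1 truncations built from each choice differ...
A1 = np.outer(U1[:, 0], U1[:, 0])
A2 = np.outer(U2[:, 0], U2[:, 0])
print(np.allclose(A1, A2))  # False: the rank-1 truncation is not unique

# ...yet both remainders have the same spectral norm as the kept component.
print(np.linalg.norm(M - A1, 2), np.linalg.norm(M - A2, 2))  # 1.0 1.0
```

Cutting at $r_k = 1$ here splits a block of equal singular values, which is exactly the situation the answer warns about; cutting at a gap ($\sigma_{r_k+1} < \sigma_{r_k}$) avoids it.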