To compute the SVD of a matrix: say you start with the matrix $A$ and you compute $v_1$. You can use $v_1$ to compute $u_1$ and $\sigma_1(A)$. Then you want to ensure you're ignoring all vectors in the span of $v_1$ for your next greedy optimization, and to do this you can simply subtract the rank-$1$ component of $A$ corresponding to $v_1$, i.e., set $A^{\prime}=A-\sigma_1(A) u_1 v_1^T$. Then it's easy to see that $\sigma_1\left(A^{\prime}\right)=\sigma_2(A)$, and essentially all the singular vectors shift indices by $1$ when going from $A$ to $A^{\prime}$. Then you repeat.
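This deflation step can be checked numerically. Below is a small sketch (variable names are mine, using a random matrix) verifying that after subtracting the top rank-$1$ component, the singular values of $A'$ are those of $A$ shifted down by one index:

```python
import numpy as np

# Sketch: deflate the top rank-1 component of a random 5x4 matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 4))

U, s, Vt = np.linalg.svd(A)          # s is sorted in decreasing order
u1, v1, sigma1 = U[:, 0], Vt[0, :], s[0]

# Subtract the rank-1 component corresponding to the top singular pair.
A_prime = A - sigma1 * np.outer(u1, v1)

s_prime = np.linalg.svd(A_prime, compute_uv=False)

# sigma_1(A') equals sigma_2(A) ...
assert np.isclose(s_prime[0], s[1])
# ... and the whole spectrum shifts by one; the last value of A' is ~0.
assert np.allclose(s_prime[:-1], s[1:])
```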
Question 1: I don't understand how subtracting the rank-$1$ component shifts the singular vectors, or why $\sigma_1\left(A^{\prime}\right)=\sigma_2(A)$.
To understand this I searched for details and came across the following statement:
Given a matrix $A_{m\times n}$, the SVD of $A$ is given by
$$ A = U \Sigma V^T $$
where $U_{m\times m}$ is an orthogonal matrix, $\Sigma_{m \times n}$ is a diagonal matrix with non-negative entries on the diagonal, and $V_{n\times n}$ is an orthogonal matrix.

For a specific singular value $\sigma_i$ and its corresponding left singular vector $u_i$ and right singular vector $v_i$, we have $Av_i = \sigma_i u_i$. This means that $Av_i$ is $\sigma_i$ times the $i$-th column of $U$; in other words, $Av_i$ is a vector in the column space of $U$.

Now, consider the expression $(Av_i) v_i^T$. This is an $m \times n$ matrix that can be written as
$$ (Av_i) v_i^T = (\sigma_i u_i) v_i^T = \sigma_i (u_i v_i^T), $$
where $u_i v_i^T$ is an $m \times n$ matrix of rank $1$. In other words, $u_i v_i^T$ is a matrix that projects any vector onto the direction of $v_i$. So $(Av_i) v_i^T$ can be seen as a matrix that scales the projection of any vector onto the direction of $v_i$ by $\sigma_i$. Therefore, the matrix $(Av_i) v_i^T$ is the outer product of the vector $Av_i$ with the vector $v_i$, scaled by the singular value $\sigma_i$.
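To make the statement concrete, here is a short numerical sketch (again with a random matrix and names of my own choosing) of what the rank-$1$ matrix $u_1 v_1^T$ actually does to an arbitrary vector $x$: it computes the scalar component of $x$ along $v_1$, namely $v_1^T x$, and returns that scalar times $u_1$:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 4))
U, s, Vt = np.linalg.svd(A)
u1, v1, sigma1 = U[:, 0], Vt[0, :], s[0]

# A v_1 = sigma_1 u_1, as stated above.
assert np.allclose(A @ v1, sigma1 * u1)

# Apply the rank-1 outer product u_1 v_1^T to an arbitrary x:
x = rng.standard_normal(4)
y = np.outer(u1, v1) @ x

# (u_1 v_1^T) x = (v_1^T x) u_1 : the v_1-component of x,
# emitted in the direction of u_1.
assert np.allclose(y, (v1 @ x) * u1)
```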
Question 2: I don't see why the matrix $u_i v_i^T$ projects any vector onto the direction of $v_i$. (This seems very different from the dot product $a^T b$, which gives the projection of $a$ in the direction of $b$ when everything is a unit vector.)
It would be a great help if anyone could help me figure this out. Thanks in advance.