Construction of the SVD explanation


I am reading the book Mathematics for Machine Learning.

Section 4.5.2 is titled "Construction of the SVD" and it explains how the SVD of a rectangular matrix $A \in \mathbb{R}^{m \times n}$ is constructed.

On pages 123-124, the left and right singular vectors are computed by exploiting the symmetric positive semidefinite matrices $A^TA$ and $AA^T$; the singular values are obtained within the same procedure.

In the subsequent part, the authors define the following procedure:

The last step is to link up all the parts we touched upon so far. We have an orthonormal set of right-singular vectors in $V$. To finish the construction of the SVD, we connect them with the orthonormal vectors $U$. To reach this goal, we use the fact that the images of the $v_i$ under $A$ have to be orthogonal, too. We can show this by using the results from Section 3.4. We require that the inner product between $Av_i$ and $Av_j$ must be $0$ for $i \neq j$. For any two orthogonal eigenvectors $v_i$, $v_j$, $i \neq j$, it holds that $$(Av_i)^T(Av_j) = v_i^T (A^TA) v_j = v_i^T(\lambda_j v_j) = \lambda_j v_i^T v_j = 0. \quad(4.77)$$ For the case $m \geq r$, it holds that $\{Av_1,\dots,Av_r\}$ is a basis of an $r$-dimensional subspace of $\mathbb R^m$. To complete the SVD construction, we need left-singular vectors that are orthonormal: We normalize the images of the right-singular vectors $Av_i$ and obtain $$ u_i := \frac{Av_i}{\|Av_i\|} = \frac{1}{\sqrt{\lambda_i}} Av_i = \frac{1}{\sigma_i}Av_i, \quad (4.78)$$ where the last equality was obtained from (4.75) and (4.76b), showing us that the eigenvalues of $A^TA$ are such that $\sigma^2_i = \lambda_i$.
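This construction can be checked numerically. Below is a small NumPy sketch (my own illustration, not from the book): it takes the eigenvectors of $A^TA$ as the right-singular vectors, builds the left-singular vectors via (4.78), and verifies that the result is a valid SVD.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n))  # full column rank almost surely, so r = n

# Eigendecomposition of the symmetric PSD matrix A^T A gives the
# right-singular vectors v_i and eigenvalues lambda_i = sigma_i^2.
lam, V = np.linalg.eigh(A.T @ A)   # eigh returns ascending eigenvalues
order = np.argsort(lam)[::-1]      # reorder to descending, as in the SVD
lam, V = lam[order], V[:, order]
sigma = np.sqrt(lam)               # singular values

# Equation (4.78): left-singular vectors are the normalized images A v_i.
U = (A @ V) / sigma                # column i is  u_i = A v_i / sigma_i

# The columns of U are orthonormal (this is what (4.77) guarantees),
# and A = U diag(sigma) V^T.
assert np.allclose(U.T @ U, np.eye(n), atol=1e-10)
assert np.allclose(U @ np.diag(sigma) @ V.T, A, atol=1e-10)
```

The final assertion holds because $U\Sigma V^T v_j = \sigma_j u_j = Av_j$ for every basis vector $v_j$, which is exactly the linkage the book is describing.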

Section 3.4, referenced in the excerpt above, defines orthogonality by means of the inner product and orthogonal matrices. Equations (4.75) and (4.76b) link the eigenvalues of the symmetric matrices $A^TA$ and $AA^T$ to the singular values of $A$.

What is not clear to me is how this part effectively connects the left and right singular vectors. How are the results (4.77) and (4.78) used to define the final left- and right-singular vectors?

1 Answer
You know that $$A^TAv_j=\sigma^2_jv_j.$$ Multiplying both sides on the left by $A$, $$AA^T(Av_j)=\sigma^2_j(Av_j),$$ which says that $Av_j$ is an eigenvector of $AA^T$ with the same eigenvalue $\sigma^2_j$. Also, $$\|Av_j\|^2=v^T_jA^TAv_j=\sigma^2_jv^T_jv_j,$$ so, since the $v_j$ are unit vectors, $\|Av_j\|^2=\sigma^2_j$. Normalizing therefore gives $$u_j=\frac{Av_j}{\sigma_j},$$ a unit eigenvector of $AA^T$. Doing this for all $j$ gives $AV=U\Sigma$, i.e. $$A=U\Sigma V^T.$$
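The answer's two identities are easy to verify numerically. A quick NumPy check (my own illustration) confirms that $Av_j$ is an eigenvector of $AA^T$ with the same eigenvalue, and that its squared norm equals that eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 2))

# v: unit eigenvector of A^T A with the largest eigenvalue lam.
lam, V = np.linalg.eigh(A.T @ A)
v = V[:, -1]
Av = A @ v

# A A^T (A v) = lam (A v): A v is an eigenvector of A A^T.
assert np.allclose(A @ A.T @ Av, lam[-1] * Av)

# ||A v||^2 = v^T A^T A v = lam, since v has unit norm.
assert np.isclose(Av @ Av, lam[-1])
```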