Intuitive way to interpret the applications of SVD


I was learning about the SVD recently, and my professor gave some applications of it (listed below; these are not his original words, so I might have made some mistakes while summarizing).

Let $A = U \Sigma V^T$ be an SVD of $A$; then:

  1. The first $r$ columns of $V$ form an orthonormal basis for $\operatorname{Row} A$. Thus $\dim(\operatorname{Row} A) = r$.
  2. The remaining $n - r$ columns of $V$ form an orthonormal basis for $\operatorname{Nul} A$. Thus $\dim(\operatorname{Nul} A) = n - r$.
  3. The first $r$ columns of $U$ form an orthonormal basis for $\operatorname{Col} A$. Thus $\dim(\operatorname{Col} A) = \operatorname{rank}(A) = r$.
  4. The remaining $m - r$ columns of $U$ form an orthonormal basis for $\operatorname{Nul} A^T$. Thus $\dim(\operatorname{Nul} A^T) = m - r$.
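These four facts are easy to check numerically. Here is a small sanity check (my own sketch, not from the lecture) using NumPy's `np.linalg.svd` on an arbitrary rank-$2$ matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 5, 4, 2
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank-2 matrix in R^{5x4}

U, s, Vt = np.linalg.svd(A)          # full SVD: U is 5x5, Vt is 4x4
rank = int(np.sum(s > 1e-10))        # count the non-zero singular values
V = Vt.T

assert rank == r
# Facts 1 & 2: the last n - r columns of V lie in Nul A
assert np.allclose(A @ V[:, rank:], 0)
# Facts 3 & 4: the last m - r columns of U lie in Nul A^T
assert np.allclose(A.T @ U[:, rank:], 0)
# Fact 3: projecting onto span(u_1, ..., u_r) leaves every column of A fixed,
# so those r columns span Col A
P = U[:, :rank] @ U[:, :rank].T
assert np.allclose(P @ A, A)
```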

However, I'm not able to intuitively understand why they are true. I can see that $\operatorname{Row} A = \operatorname{Col} A^T$, and I wanted to relate $\operatorname{Nul} A$ to $\operatorname{Nul} A^T$ in the same way (again, as @Omnomnomnom suggested, saying one subspace is "the transpose" of another doesn't make sense), so I feel like there's something going on here, but I don't know what it is.

Also, as suggested by @Omnomnomnom, $\operatorname{Row} A$ is orthogonal to $\operatorname{Nul} A$, and $\operatorname{Col} A$ is orthogonal to $\operatorname{Nul} A^T$.
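That orthogonality can be verified the same way (again my own NumPy sketch; the rank-deficient matrix is an arbitrary example):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 6))
A[2] = A[0] + A[1]                  # force a dependent row, so rank(A) = 3

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))

row_basis = Vt[:r]                  # rows spanning Row A
nul_basis = Vt[r:]                  # rows spanning Nul A
assert np.allclose(row_basis @ nul_basis.T, 0)   # Row A is orthogonal to Nul A

col_basis = U[:, :r]                # columns spanning Col A
left_nul = U[:, r:]                 # columns spanning Nul A^T
assert np.allclose(col_basis.T @ left_nul, 0)    # Col A is orthogonal to Nul A^T
```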

Best answer:

Here's a way to think about SVD to help understand what's going on. For simplicity, I'll focus on the case in which $m \leq n$, but a similar analysis can be applied when $m > n$.

Let $u_1,\dots,u_m$ denote the columns of $U$, let $v_1,\dots,v_n$ denote the columns of $V$, and let $e_1,\dots,e_m$ denote the standard basis of $\Bbb R^m$ (I'll also write $e_1,\dots,e_n$ for the standard basis of $\Bbb R^n$). Let $\sigma_1,\dots,\sigma_r$ denote the non-zero singular values of $A$. Recall that $A$ encodes the linear transformation from $\Bbb R^n$ to $\Bbb R^m$ given by $T_A(x) = Ax$. The SVD breaks that transformation into $3$ parts: the orthogonal transformation $V^T$, the scaling transformation $\Sigma$, and the orthogonal transformation $U$. That is, $T_A = T_U \circ T_\Sigma \circ T_{V^T}$.
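To see the composition $T_A = T_U \circ T_\Sigma \circ T_{V^T}$ concretely, here is a NumPy sketch (the matrix and vector are arbitrary examples of mine) that applies the three stages one at a time:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 3, 5                       # m <= n, as in the answer
A = rng.standard_normal((m, n))

U, s, Vt = np.linalg.svd(A)       # U: 3x3, s: 3 singular values, Vt: 5x5
Sigma = np.zeros((m, n))
Sigma[:len(s), :len(s)] = np.diag(s)

x = rng.standard_normal(n)
# Apply T_A = T_U o T_Sigma o T_{V^T} stage by stage:
y = Vt @ x            # change coordinates to the v_i axes
y = Sigma @ y         # scale the first r coordinates, drop the rest
y = U @ y             # change coordinates out to the u_i axes
assert np.allclose(y, A @ x)      # same result as applying A directly
```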

The transformation $V^T$ takes the vector $v_i$ to the vector $e_i$ for every $i$. In other words, it changes our orthogonal axes from the vectors $v_i$ to the standard vectors $e_i$. $\Sigma$ takes the vectors $e_1,\dots,e_r$ in $\Bbb R^n$ and scales them to yield $\sigma_1 e_1,\dots,\sigma_r e_r$ in $\Bbb R^m$; all other $e_i$ are mapped to zero. Finally, $U$ takes the vectors $e_1,\dots,e_m$ to $u_1,\dots,u_m$. That is, $U$ encodes another change of orthogonal coordinate system.
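A quick NumPy check of these two coordinate changes (an illustrative sketch with an arbitrary matrix; both identities hold because $U$ and $V$ are orthogonal):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 4))
U, s, Vt = np.linalg.svd(A)
V = Vt.T
m, n = U.shape[0], V.shape[0]

# V^T sends each v_i to the standard basis vector e_i ...
for i in range(n):
    e_i = np.eye(n)[i]
    assert np.allclose(Vt @ V[:, i], e_i)
# ... and U sends each e_i to u_i
for i in range(m):
    assert np.allclose(U @ np.eye(m)[i], U[:, i])
```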

We can keep track of the column space and null space during each stage. The column space of $V^T$ is all of $\Bbb R^n$, since it moves things around without mapping anything to zero. The column space of $\Sigma V^T$ is the span of $e_1,\dots,e_r$ in $\Bbb R^m$, and its null space is the span of $v_{r+1},\dots,v_n$, since those are the $v_i$ which are sent to $e_i$ and from there to $0$. Finally, the column space of $U \Sigma V^T$ is the span of $u_1,\dots,u_r$, since those are the vectors that the $e_1,\dots,e_r$ from $\Sigma V^T$ are sent to. $U$ doesn't send anything new to $0$, so the null space is still the span of $v_{r+1},\dots,v_n$, as before.
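The same stage-by-stage bookkeeping can be done numerically (a NumPy sketch with an arbitrary rank-$2$ example of mine):

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, r = 3, 5, 2
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))   # rank 2

U, s, Vt = np.linalg.svd(A)
Sigma = np.zeros((m, n))
Sigma[:len(s), :len(s)] = np.diag(s)
V = Vt.T

# Stage 1: V^T is invertible, so nothing has been sent to 0 yet.
assert np.linalg.matrix_rank(Vt) == n
# Stage 2: Sigma V^T kills exactly v_{r+1}, ..., v_n ...
assert np.allclose(Sigma @ Vt @ V[:, r:], 0)
# ... and its column space is the span of e_1, ..., e_r
# (every row past the r-th is zero):
assert np.allclose((Sigma @ Vt)[r:, :], 0)
# Stage 3: U kills nothing new, so Nul A is still span(v_{r+1}, ..., v_n).
assert np.allclose(A @ V[:, r:], 0)
```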

Now, we have everything we need to know about $\operatorname{Nul} A$ and $\operatorname{Col} A$. By applying the same analysis to $A^T = V\Sigma^T U^T$, we can say similar things about the row space and left null space. Alternatively, it suffices to note that $\operatorname{Row} A = (\operatorname{Nul} A)^\perp$ and $\operatorname{Nul} A^T = (\operatorname{Col} A)^\perp$.
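And one can confirm numerically that $A^T = V\Sigma^T U^T$, i.e. transposing simply swaps the roles of $U$ and $V$ (a NumPy sketch with an arbitrary example matrix):

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 4, 6
A = rng.standard_normal((m, n))

U, s, Vt = np.linalg.svd(A)
Sigma = np.zeros((m, n))
Sigma[:len(s), :len(s)] = np.diag(s)

# A^T = V Sigma^T U^T is itself an SVD of A^T with U and V swapped,
# so the same column-space / null-space reading applies to Row A and Nul A^T.
assert np.allclose(A.T, Vt.T @ Sigma.T @ U.T)
```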

I hope that helps.