Why is the singular value matrix $\Sigma$ of size $m \times n$?


I am having a problem seeing why for the singular value decomposition the matrix of singular values $\Sigma$ has size $m \times n$.

Let's say we want to decompose $X \in \mathbf{R}^{m \times n}$

Then we have $X^\top X = V \Sigma^2 V^\top$. As $X^\top X$ is real symmetric (and positive semidefinite... not sure if that matters here), it has a full set of orthogonal eigenvectors, and therefore $V \in \mathbf{R}^{n\times n}$, as $X^\top X \in \mathbf{R}^{n\times n}$. Thus necessarily $\Sigma \in \mathbf{R}^{n\times n}$, otherwise the dimensions in the product $V \Sigma^2 V^\top$ would not match.

In the same fashion we have $X X^\top = U \Sigma^2 U^\top$. As $X X^\top$ is real symmetric, it has a full set of orthogonal eigenvectors, and therefore $U \in \mathbf{R}^{m\times m}$, as $X X^\top \in \mathbf{R}^{m\times m}$. Thus necessarily $\Sigma \in \mathbf{R}^{m\times m}$.

So... which is it? Is $\Sigma$ of size $n \times n$ or $m \times m$? Neither of these is $m \times n$, which is supposedly the size it actually has.

I am probably missing something very obvious. Can you point me to it?

Best answer

For a full SVD, $\Sigma$ has the same size as $X$, i.e. $m\times n$. When $m\ne n$, $\Sigma$ is not a square matrix and $\Sigma^2$ does not make sense.

$X^\top X$ is not $V\Sigma^2V^\top$ in general, but $(U\Sigma V^\top)^\top(U\Sigma V^\top)=V\Sigma^\top\Sigma V^\top$.

$XX^\top$ is not $U\Sigma^2U^\top$ in general, but $(U\Sigma V^\top)(U\Sigma V^\top)^\top=U\Sigma\Sigma^\top U^\top$.
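The two identities above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the sizes $m=5$, $n=3$ and the random test matrix are arbitrary choices made for illustration, not part of the question. It builds the $m\times n$ matrix $\Sigma$ from the singular values returned by `np.linalg.svd` and confirms that $\Sigma^\top\Sigma$ is $n\times n$, $\Sigma\Sigma^\top$ is $m\times m$, and $\Sigma^2$ is not defined.

```python
import numpy as np

# Arbitrary illustrative sizes with m != n (an assumption for this sketch).
m, n = 5, 3
rng = np.random.default_rng(0)
X = rng.standard_normal((m, n))

# Full SVD: U is m x m, Vt is n x n, s holds the min(m, n) singular values.
U, s, Vt = np.linalg.svd(X, full_matrices=True)

# Embed the singular values in an m x n matrix so that X = U @ Sigma @ Vt.
Sigma = np.zeros((m, n))
Sigma[:len(s), :len(s)] = np.diag(s)

print(Sigma.shape)                # (5, 3)
print((Sigma.T @ Sigma).shape)    # (3, 3)
print((Sigma @ Sigma.T).shape)    # (5, 5)

# The factorization and both Gram-matrix identities hold up to rounding.
full_ok = np.allclose(X, U @ Sigma @ Vt)
gram_n_ok = np.allclose(X.T @ X, Vt.T @ (Sigma.T @ Sigma) @ Vt)
gram_m_ok = np.allclose(X @ X.T, U @ (Sigma @ Sigma.T) @ U.T)

# Sigma @ Sigma is a (5, 3) by (5, 3) product, so it is not defined.
try:
    Sigma @ Sigma
    square_defined = True
except ValueError:
    square_defined = False
```

Here $\Sigma^\top\Sigma$ and $\Sigma\Sigma^\top$ are both diagonal with the squared singular values on the diagonal, but the second one is padded with $m-n$ extra zeros, which is why the two products have different sizes.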

There is also an "economic SVD" (often called a compact SVD), in which you strip away the zero singular values and the associated singular vectors in $U,\Sigma$ and $V$ from a full SVD, and keep only the nonzero singular values and their associated singular vectors. If $X$ is an $m\times n$ matrix of rank $k\le\min(m,n)$, then in an economic SVD, $\Sigma$ is a $k\times k$ positive diagonal matrix and $U,V$ are (usually rectangular) matrices of sizes $m\times k$ and $n\times k$ respectively with orthonormal columns. In this case, you do have $X^\top X=V\Sigma^2V^\top$ and $XX^\top=U\Sigma^2U^\top$.
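As a rough numerical illustration of this last case, the sketch below builds a rank-deficient matrix and truncates its SVD to the nonzero singular values. It assumes NumPy; the sizes, the rank-2 test matrix, and the relative tolerance `1e-10` used to decide which singular values count as zero are all assumptions made for the example. Note that `full_matrices=False` in `np.linalg.svd` keeps $\min(m,n)$ singular values, so the truncation to rank $k$ is done by hand.

```python
import numpy as np

# Build a 5 x 4 matrix of rank 2 as a product of thin factors (illustrative choice).
m, n, r = 5, 4, 2
rng = np.random.default_rng(1)
X = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

# Reduced SVD: U is m x min(m, n), Vt is min(m, n) x n.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Count singular values that are nonzero up to an assumed relative tolerance.
tol = 1e-10 * s.max()
k = int(np.sum(s > tol))

# Keep only the k nonzero singular values and their singular vectors.
Uk = U[:, :k]            # m x k, orthonormal columns
Sk = np.diag(s[:k])      # k x k, positive diagonal
Vk = Vt[:k, :].T         # n x k, orthonormal columns

print(k, Uk.shape, Sk.shape, Vk.shape)

# With a square Sigma, Sigma^2 is defined and both identities hold up to rounding.
recon_ok = np.allclose(X, Uk @ Sk @ Vk.T)
gram_n_ok = np.allclose(X.T @ X, Vk @ Sk @ Sk @ Vk.T)
gram_m_ok = np.allclose(X @ X.T, Uk @ Sk @ Sk @ Uk.T)
```

Because `Sk` is square and diagonal, `Sk @ Sk` is exactly the $\Sigma^2$ that appears in the question's formulas.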