What are the fields for low-rank approximation and principal component analysis?


I would like to start with practical applications of:

  • Principal component analysis
  • Low-rank approximation matrices

The problem I found is that, even after some internet research, I can't really place these two topics in a specific branch of mathematics, which also means I can't find good resources about them.

I work primarily with 3D geometry, so nothing related to applied statistics or other fields.

BEST ANSWER

These topics are both closely related to the SVD in linear algebra. Trefethen and Bau's book *Numerical Linear Algebra* has a good discussion of the SVD. But I've seen PCA discussed more directly in machine learning textbooks.

Here's my attempt at an explanation of PCA via the SVD:

Let $x_1,\ldots,x_N \in \mathbb R^D$, and assume for simplicity that the mean $\frac{1}{N} \sum_{n=1}^N x_n = 0$.

In order to reduce the dimensionality of our data, we can try to find a matrix $U = \begin{bmatrix} u_1 & \cdots & u_K \end{bmatrix}$ such that each vector $x_n$ belongs (approximately) to the column space of $U$: \begin{equation*} x_n \approx U c_n \end{equation*} for some vector of coefficients $c_n \in \mathbb R^K$. We will then be able to specify each $x_n$ (to a good approximation) using only $K$ numbers rather than $D$ numbers. If $K \ll D$ this may be a big savings.
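As a quick numerical illustration of this idea (a minimal numpy sketch; the data, dimensions, and the choice of $U$ here are all made up for demonstration): when $U$ has orthonormal columns, the best coefficients are simply $c_n = U^T x_n$, and if the data really lies near the column space of $U$, the residual is small.

```python
import numpy as np

rng = np.random.default_rng(0)
D, N, K = 10, 50, 3  # ambient dimension, number of points, subspace dimension

# Synthetic data lying near a K-dimensional subspace, then mean-centered.
U_true = np.linalg.qr(rng.standard_normal((D, K)))[0]  # orthonormal columns
X = U_true @ rng.standard_normal((K, N)) + 0.01 * rng.standard_normal((D, N))
X -= X.mean(axis=1, keepdims=True)

# With orthonormal columns, the least-squares coefficients are c_n = U^T x_n.
U = U_true  # stand-in; PCA would estimate U from the data via the SVD
C = U.T @ X  # K x N: only K numbers per point instead of D
residual = np.linalg.norm(X - U @ C)
print(residual)  # small, since X nearly lies in the column space of U
```

Each column of `C` is the $K$-dimensional encoding of the corresponding data point.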

In other words, if $X = \begin{bmatrix} x_1 & \cdots & x_N \end{bmatrix}$, we want to find a matrix $U$ such that \begin{equation*} X \approx U C \end{equation*} for some matrix of coefficients $C$. If we choose $U$ well, then $UC$ will be a good low rank approximation to $X$. (The rank of $UC$ is at most $K$.)

But we know how to find an optimal low rank approximation to $X$: the SVD gives it to us. That's why the SVD can be used for compression. Suppose that \begin{equation*} X = \sum_{i=1}^r \sigma_i u_i v_i^T \end{equation*} is an SVD of $X$. The first $K$ terms of this sum give us the best rank-$K$ approximation to $X$: \begin{align*} X &\approx \sum_{i=1}^K \sigma_i u_i v_i^T \\ &= U \Sigma V^T \\ &= UC, \end{align*} where $U = \begin{bmatrix} u_1 & \cdots & u_K \end{bmatrix}$, $\Sigma = \operatorname{diag}(\sigma_1, \ldots, \sigma_K)$, $V = \begin{bmatrix} v_1 & \cdots & v_K \end{bmatrix}$, and $C = \Sigma V^T$. The PCA algorithm returns this matrix $U$, computed from the SVD of $X$, as output.
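The construction above can be carried out directly with `numpy.linalg.svd` (a sketch on random data; the dimensions are arbitrary). The Frobenius error of the truncated approximation equals $\sqrt{\sum_{i>K} \sigma_i^2}$, the root-sum-of-squares of the discarded singular values.

```python
import numpy as np

rng = np.random.default_rng(1)
D, N, K = 8, 40, 2
X = rng.standard_normal((D, N))
X -= X.mean(axis=1, keepdims=True)  # center the data, as assumed above

# Full (thin) SVD: X = sum_i sigma_i u_i v_i^T
U_full, s, Vt = np.linalg.svd(X, full_matrices=False)

# Keep the first K terms: U is D x K, C = Sigma V^T is K x N.
U = U_full[:, :K]          # this is what PCA returns
C = np.diag(s[:K]) @ Vt[:K, :]
X_approx = U @ C           # rank-K approximation of X

err = np.linalg.norm(X - X_approx, "fro")
print(err, np.sqrt(np.sum(s[K:] ** 2)))  # the two values agree
```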

This matrix $U$ is optimal in the following sense: if $\hat{U} \in \mathbb R^{D \times K}$ and $\hat{C} \in \mathbb R^{K \times N}$, then \begin{equation*} \| X - UC \|_F \leq \| X - \hat{U} \hat{C} \|_F. \end{equation*} Indeed, out of all $D \times N$ matrices of rank at most $K$, $UC$ is an optimal approximation of $X$ (with respect to either the Frobenius norm or the 2-norm). This is the Eckart–Young–Mirsky theorem.
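One can spot-check this optimality numerically (a sketch; the competing factorization $\hat{U}\hat{C}$ here is just a random one, not an exhaustive search): any other rank-$\leq K$ factorization should do no better than the truncated SVD.

```python
import numpy as np

rng = np.random.default_rng(2)
D, N, K = 8, 40, 2
X = rng.standard_normal((D, N))

# Optimal rank-K approximation from the truncated SVD.
U_full, s, Vt = np.linalg.svd(X, full_matrices=False)
UC = U_full[:, :K] @ np.diag(s[:K]) @ Vt[:K, :]

# A competing (random) rank-K factorization U_hat @ C_hat.
U_hat = rng.standard_normal((D, K))
C_hat = rng.standard_normal((K, N))

err_svd = np.linalg.norm(X - UC, "fro")
err_hat = np.linalg.norm(X - U_hat @ C_hat, "fro")
print(err_svd, err_hat)  # err_svd is never larger, per Eckart-Young-Mirsky
```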