What's the difference/connection between PCA and inverse Fourier transform?

4.3k Views Asked by At

Principle Component Analysis (PCA) finds the component with the highest contribution, which is very similar to the idea of inverse Fourier transform, which finds the frequency with the highest weight. Could someone help clarify their difference/connection. It seems that they are connected in some mathematical forms.

1

There are 1 best solutions below

2
On

There is indeed a connection but with the Fourier transform, and not it's inverse.

PCA is also known by Karhunen-Loève Expansion, this finds the optimal basis for your data (optimal in the sense that it minimizes the euclidean distance between the data points and their projections onto the new basis). If your data is translationally invariant (i.e. permutation of one data point generates another data point), then the optimal basis consists of Fourier vectors.

In other words, PCA on translationally invariant data will yield the DFT.

See https://en.wikipedia.org/wiki/Circulant_matrix but I will give a brief, handwaving explanation:

If you have translationally invariant data matrix X:

$X = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 3 & 1 & 2 \\ \end{pmatrix} $

A circulant matrix $C_{ij} = c_{i-j}$ will circularly permute matrix X. Associated with $C$ is a permutation matrix $P$ which will rearrange the rows of $X$ into their original positions:

$PXC = X$

We can get the right singular vectors from:

$X^T X = (PXC)^ T PXC = C^T X^T P^{-1} PXC = C^T X^T XC$

and

$CX^T X = X^T XC$

(here I am assuming $C$ is also a permutation matrix, because I am both too lazy to prove the general case).

Since $C$ commutes with $X^T X$ they can be simultaneously diagonalized. The eigenvectors of a circulant matrix are the Fourier modes and thus the optimal basis for translationally invariant data are Fourier vectors with the singular values being the discrete Fourier coefficients.

Please also see https://en.wikipedia.org/wiki/Karhunen%E2%80%93Lo%C3%A8ve_theorem