In my application I encounter rectangular, $m \times n$ matrices in which every row represents a data point. Every component of a row contains noise, and the components belonging to a "pattern mask" additionally carry a signal on top of that noise.
For example, consider the following matrix: $$\mathbf{M} = \begin{pmatrix} 1.1 & 0.2 & -0.1 & 0.9 \\ 0.9 & 1.2 & 0.2 & -0.2 \\ 0.9 & -0.2 & 0.2 & 1.2\end{pmatrix},$$ which contains three data points: the first and last with pattern mask $\begin{pmatrix}1 & 0 & 0 & 1\end{pmatrix}$, and the middle one with mask $\begin{pmatrix}1 & 1 & 0 & 0\end{pmatrix}$.
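A quick numpy check on this example (the "dominance" judgment is eyeballed, not a formal test) shows two singular values clearly dominating the third, matching the two distinct masks:

```python
import numpy as np

# The example matrix from above: rows are noisy copies of two pattern masks.
M = np.array([[1.1, 0.2, -0.1, 0.9],
              [0.9, 1.2, 0.2, -0.2],
              [0.9, -0.2, 0.2, 1.2]])

# Two singular values dominate the third -> two independent mask directions.
s = np.linalg.svd(M, compute_uv=False)
print(np.round(s, 2))
```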
I want to use the singular value decomposition of this matrix to find the pattern masks, as well as how many there are, by identifying which singular values are "meaningful." My first thought was to use the Marchenko–Pastur distribution of the singular values of a random matrix and to see where the singular values I find deviate strongly from what a purely random matrix would produce. However, I see two potential problems with this:
- Marchenko–Pastur assumes zero-mean entries, which is not necessarily the case for my data.
- The variance/standard deviation of the noise is not known in advance.
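For reference, the baseline Marchenko–Pastur check in the idealized setting (zero-mean Gaussian noise with known standard deviation $\sigma$) can be sketched as follows; the data imitates the question's setup with made-up sizes, masks, and noise level:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, sigma = 300, 20, 0.3   # hypothetical sizes and noise level

# Two pattern masks, as in the question, assigned randomly to the rows.
masks = np.zeros((2, n))
masks[0, [0, 3]] = 1.0       # mask (1 0 0 1 0 ...)
masks[1, [0, 1]] = 1.0       # mask (1 1 0 0 0 ...)
labels = rng.integers(0, 2, size=m)
X = masks[labels] + rng.normal(0.0, sigma, size=(m, n))

sv = np.linalg.svd(X, compute_uv=False)

# Upper edge of the Marchenko-Pastur bulk for an m x n matrix with
# i.i.d. noise of standard deviation sigma: sigma * (sqrt(m) + sqrt(n)).
edge = sigma * (np.sqrt(m) + np.sqrt(n))

# Singular values above the edge are flagged as signal; here the two
# mask directions land well above it, while the noise bulk stays near it.
print(int(np.sum(sv > edge)))
```

In finite samples the largest noise singular value fluctuates around the edge (Tracy–Widom), so a borderline third value can occasionally be flagged; in practice a small safety margin is often added.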
Is there a simple way to adapt the theoretical results I mentioned above to my situation? My eventual goal is to cluster the data points, either directly from the SVD (which I am not sure is possible) or by using the SVD to denoise the matrix (by building a low-rank approximation) before running $k$-means on the denoised version.
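The denoise-then-cluster pipeline can be sketched like this (hypothetical data mimicking the masks above; a minimal Lloyd's $k$-means is inlined so the sketch needs only numpy):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, sigma, r = 300, 20, 0.3, 2   # hypothetical sizes; r = assumed rank

# Rows are one of two pattern masks plus i.i.d. Gaussian noise.
masks = np.zeros((2, n))
masks[0, [0, 3]] = 1.0
masks[1, [0, 1]] = 1.0
labels = rng.integers(0, 2, size=m)
X = masks[labels] + rng.normal(0.0, sigma, size=(m, n))

# Denoise: keep only the top-r singular triplets (low-rank approximation).
U, s, Vt = np.linalg.svd(X, full_matrices=False)
Xd = (U[:, :r] * s[:r]) @ Vt[:r]

# Plain Lloyd k-means on the denoised rows, k = 2,
# with a deterministic farthest-point initialization.
c0 = Xd[0]
c1 = Xd[np.argmax(np.linalg.norm(Xd - c0, axis=1))]
centers = np.stack([c0, c1])
for _ in range(25):
    assign = np.linalg.norm(Xd[:, None] - centers[None], axis=2).argmin(axis=1)
    centers = np.stack([Xd[assign == j].mean(axis=0) for j in range(2)])

# Thresholding the centroids recovers the binary pattern masks.
recovered = (centers > 0.5).astype(int)
```

Equivalently, one could run $k$-means on the $r$-dimensional coordinates `U[:, :r] * s[:r]` instead of the full denoised rows; the cluster geometry is the same.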
In practice, Marchenko–Pastur may only apply under special conditions, as in MP-PCA. You can subtract the mean (snapshot-wise), but you still need the variance of the noise. You can use the findings of this paper (highly recommended): the authors derive a way to estimate the rank of a non-square matrix when the noise level is unknown, and the code for their method is publicly available.
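As a concrete sketch of "center, then threshold with unknown noise level", the following uses the median-based hard threshold of Gavish and Donoho as a stand-in (an assumption on my part; it is not necessarily the method of the paper referred to above). One caveat worth noting: column-wise centering also removes the signal's component along the mean row, so with two balanced masks the centered signal has rank one, and the number of masks is the estimated rank plus one.

```python
import numpy as np

def estimate_rank(X):
    """Rank estimate for unknown noise level: column-wise centering
    followed by the Gavish-Donoho median-based hard threshold
    (approximation omega(beta) ~ 0.56 b^3 - 0.95 b^2 + 1.82 b + 1.43)."""
    m, n = X.shape
    Xc = X - X.mean(axis=0, keepdims=True)   # remove the nonzero mean
    sv = np.linalg.svd(Xc, compute_uv=False)
    beta = min(m, n) / max(m, n)
    omega = 0.56 * beta**3 - 0.95 * beta**2 + 1.82 * beta + 1.43
    return int(np.sum(sv > omega * np.median(sv)))

# Demo on data in the spirit of the question: two pattern masks plus noise.
rng = np.random.default_rng(2)
m, n, sigma = 300, 20, 0.3
masks = np.zeros((2, n))
masks[0, [0, 3]] = 1.0
masks[1, [0, 1]] = 1.0
X = masks[rng.integers(0, 2, size=m)] + rng.normal(0.0, sigma, size=(m, n))

r_signal = estimate_rank(X)                        # rank after centering
r_noise = estimate_rank(rng.normal(size=(m, n)))   # pure noise baseline
print(r_signal, r_noise)
```

The noise variance never appears explicitly: the median singular value of the centered matrix acts as the scale estimate.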