In non-negative matrix factorization, what is the difference between row clustering and column clustering in the following context?

In Orthogonal Nonnegative Matrix Tri-factorizations for Clustering,

$\min \|X-FG^T\|^2_{F}$,

s.t. $F^TF=I$, $G^TG=I$, $F \geq 0$, $G \geq 0$, with $X\in\mathbb{R}^{d\times n}$, $F\in\mathbb{R}^{d\times k}$, $G\in\mathbb{R}^{n\times k}$,

where $F$ is the cluster indicator matrix for clustering rows and $G$ is the cluster indicator matrix for clustering columns.

Can we understand it from another perspective, as follows?

For example, the $i$-th row of $X$ corresponds to a feature, and each element of that row is the value of this feature for one sample. The $i$-th row of $F$ has $k$ elements, so these $k$ elements correspond to the clusters of the $n$ samples in $X$.

There are 1 best solutions below

Nonnegative matrix tri-factorization (NMTF), also called 3-factor NMF, is known as a co-clustering method because the rows and columns of the data matrix $X$ are clustered simultaneously. Co-clustering applies to two-dimensional data matrices in which clustering along both dimensions is meaningful.

The interpretation of factors depends on the context and on how data elements are arranged into the data matrix $X$.

Let $X \in \mathbb{R}^{d \times n}_+$. We can approximate it by

$$X \approx AB,$$

where $A \in \mathbb{R}^{d \times k}_+$ and $B \in \mathbb{R}^{k \times n}_+$.

Using the same data matrix we can write

$$\tilde{X} = X^T \approx B^TA^T = \tilde{B}\tilde{A},$$ with $B^T= \tilde{B}$ and $A^T=\tilde{A}$.

So the meaning of the rows and columns of the factors $A$ and $B$ is completely context-dependent: transposing the data matrix simply swaps the roles of the two factors.
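As a quick numerical check of this symmetry, here is a minimal NumPy sketch. The sizes, the random factors, and the small noise term are assumptions chosen only for illustration. It shows that any approximation $X \approx AB$ gives an approximation of $X^T$ with exactly the same Frobenius error once the factors are transposed and swapped.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, k = 6, 8, 3          # illustrative sizes (assumed)

# Nonnegative factors and a noisy nonnegative data matrix X ~ A B
A = rng.random((d, k))
B = rng.random((k, n))
X = A @ B + 0.01 * rng.random((d, n))

# Error of the factorization of X
err = np.linalg.norm(X - A @ B, "fro")

# Same factors, transposed and swapped, factorize X^T
B_tilde, A_tilde = B.T, A.T
err_t = np.linalg.norm(X.T - B_tilde @ A_tilde, "fro")

print(err, err_t)  # the two errors agree up to rounding
```

So whether a given factor describes "rows" or "columns" is only a matter of how the data were laid out in $X$.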

In the model you mentioned, each column of the data matrix $X$ is the feature vector of the corresponding data sample. The double orthogonality helps to obtain sparse rows and columns: since the factors are nonnegative, two columns can be orthogonal only if their supports are disjoint, so each row of $F$ (and of $G$) has at most one nonzero entry. The $k$ columns of $F$ are constrained to be orthogonal (and hence linearly independent), so that each column contains the centroid of one of the $k$ clusters. The same holds for the $k$ rows of $G^T$: each row contains the centroid of one of the $k$ clusters.
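To see what the constraints $F^TF=I$, $F \geq 0$ (and likewise for $G$) allow, here is a small NumPy sketch. It is my own illustration, not an algorithm from the paper: the helper `indicator` and the label vectors `row_labels` and `col_labels` are hypothetical. It builds normalized cluster indicator matrices from given labels and checks that they are nonnegative with orthonormal columns, that every row has exactly one nonzero entry, and that the labels can be read back with an argmax.

```python
import numpy as np

def indicator(labels, k):
    """Normalized cluster indicator matrix (hypothetical helper).

    Entry (i, c) is 1/sqrt(size of cluster c) if item i belongs to
    cluster c, and 0 otherwise. Assumes every cluster is non-empty.
    """
    M = np.zeros((len(labels), k))
    M[np.arange(len(labels)), labels] = 1.0
    return M / np.sqrt(M.sum(axis=0))  # scale each column to unit norm

k = 3
row_labels = np.array([0, 0, 1, 2, 1, 2])        # assumed clusters of the d = 6 rows
col_labels = np.array([0, 1, 1, 2, 0, 2, 1, 0])  # assumed clusters of the n = 8 columns

F = indicator(row_labels, k)   # d x k, clusters the rows of X
G = indicator(col_labels, k)   # n x k, clusters the columns of X

# Both factors are nonnegative with orthonormal columns
print(np.allclose(F.T @ F, np.eye(k)), np.allclose(G.T @ G, np.eye(k)))

# Disjoint supports: exactly one nonzero entry per row
print((F > 0).sum(axis=1), (G > 0).sum(axis=1))

# Cluster memberships are recovered from the position of that entry
print(F.argmax(axis=1), G.argmax(axis=1))
```

In this reading, the $i$-th row of $F$ tells you which cluster the $i$-th row of $X$ belongs to, and the $j$-th row of $G$ does the same for the $j$-th column of $X$.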