prediction: find whether an NMF feature matrix is present in a new dataset


The problem is simple: first decompose a training set using non-negative matrix factorisation (NMF), which yields W (the so-called feature matrix) and H such that dot(W, H) approximates data_train. Then I want to find whether samples in an unseen test set (say, data_test) exhibit the features in W, i.e. find the H' which assigns test samples their labels/classifications/whatever the term is.

I believe there should be a neat way to do this, much like training a Gaussian mixture model and then using it to predict labels. So far I have tried the following methods/measurements on data_train itself (which means I can compare the results with the ground-truth H):

1) pseudoinverse, solving dot(W, H') = data_test for H';

2) non-negative least squares (NNLS) on the same equation;

3) treating W as a matrix encoding probability distributions, computing an expectation for each sample from the corresponding entries of W and data_test, and thresholding it;

4) Spearman correlation between samples and W, with a threshold.

and the performance is rather poor. With H plotted as the blue line, the estimates do agree on several peaks, but overall performance is terrible (pearson(H, H') is roughly 0.3).
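For concreteness, methods 1) and 2) can be sketched as follows. This is a minimal sketch on synthetic data; the matrix shapes and n_components are illustrative assumptions, not taken from the actual dataset:

```python
import numpy as np
from scipy.optimize import nnls
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)

# synthetic stand-in for data_train: 20 features x 100 samples, exactly rank 5
W_true = rng.random((20, 5))
H_true = rng.random((5, 100))
data_train = W_true @ H_true

# decompose: data_train ~= W @ H
model = NMF(n_components=5, init="nndsvda", max_iter=1000)
W = model.fit_transform(data_train)   # (20, 5) feature matrix
H = model.components_                 # (5, 100)

data_test = data_train  # evaluate on the training data, as in the post

# method 1: pseudoinverse -- fast, but H_pinv may contain negative entries
H_pinv = np.linalg.pinv(W) @ data_test

# method 2: NNLS, solving dot(W, h) = x one sample (column) at a time
H_nnls = np.column_stack(
    [nnls(W, data_test[:, j])[0] for j in range(data_test.shape[1])]
)
```

Note that the pseudoinverse ignores the non-negativity constraint entirely, so even when it fits the data well, its H' can contain negative entries that have no interpretation under the NMF model.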

The NMF decomposition itself is rather stable, but it seems I just cannot 'reverse' it, even when solving for H' with NNLS (whose convergence step, I believe, resembles the one used inside NMF). In other words, what confuses me is this: the existence of different (W, H) pairs makes sense, given that data = dot(dot(W, inverse(Q)), dot(Q, H)) for any suitable invertible matrix Q; but if W is fixed, shouldn't I be able to converge to a point near the local minimum at H?
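The non-uniqueness mentioned above is easy to demonstrate: any Q that keeps both factors non-negative (for example, a positive diagonal scaling) yields a second, equally valid factorisation. A minimal sketch, with arbitrary shapes chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.random((6, 3))
H = rng.random((3, 8))

# a positive diagonal Q keeps both transformed factors non-negative
Q = np.diag([2.0, 0.5, 3.0])
W2 = W @ np.linalg.inv(Q)   # W . inverse(Q)
H2 = Q @ H                  # Q . H

# both pairs reconstruct exactly the same data matrix
same = np.allclose(W @ H, W2 @ H2)
```

Permutation matrices (and products of permutations with positive diagonals) work the same way, which is why NMF factors are at best identifiable up to scaling and reordering of components.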

To sum up, my questions are:

1) Is it even possible to find a reasonable H' (that is, to reproduce H given W and the data)?

2) If not, how should I interpret W? In some biology papers NMF is used to find signatures (W) in genomic data (e.g. a matrix cataloguing expression levels), but if W cannot be used to make predictions or be compared across datasets, wouldn't the signatures be meaningless, other than as some fancy patterns?

1 Answer:

I did some tests and found out what is happening.

First of all, NMF is indeed able to find W or H given V; e.g. set update_H to False in Python's sklearn.decomposition.non_negative_factorization, and it is stable. The main issue is this: although data = dot(dot(W, inverse(Q)), dot(Q, H)), NMF is not robust in the sense that if we feed the model dot(W, inverse(Q)) as the fixed W, it does not converge H to dot(Q, H) (strategy: minimise the L2 norm). My data had in fact been tweaked by such a Q.
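A sketch of the fixed-factor solve described above, under the question's convention data = dot(W, H) with samples as columns (the shapes and n_components are illustrative assumptions). Since non_negative_factorization can only hold its H argument fixed, the problem is transposed so that our fixed W lands in that slot:

```python
import numpy as np
from sklearn.decomposition import non_negative_factorization

rng = np.random.default_rng(0)

# toy setup: data (features x samples) = W (features x k) @ H (k x samples)
W = rng.random((20, 4))
H = rng.random((4, 50))
data_test = W @ H

# sklearn solves X ~= W_sk @ H_sk and can hold H_sk fixed via update_H=False.
# Transposing gives data_test.T ~= H'.T @ W.T, so pass W.T as the fixed H_sk
# and read H' off the solved W_sk.
W_sk, _, _ = non_negative_factorization(
    data_test.T, H=W.T, n_components=4, update_H=False, max_iter=500
)
H_prime = W_sk.T  # (4, 50), non-negative by construction
```

With W held fixed, each column of H' is a convex NNLS problem, which is why this direction is stable even though joint NMF has many equivalent minima.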

I am not sure this is a fully rational explanation, though, as it does not make much sense to me that the model would converge to some other, random local minimum.