Solving for $X$ using the SVD of $QX$ when $Q$ is orthogonal

I inherited some code (see below), and I am not quite sure what it does. It is part of a factor analysis-type model that learns a latent variable $X \in \mathbb{R}^{N \times K}$ with $N > K$; $X$ is unidentifiable up to scale and rotation [1]. I think the basic idea is that since

$$ X = USV^{\top} \iff QX = WSV^{\top} $$

for $W = QU$ and any orthogonal matrix $Q$, we want to decompose $W$ into an orthonormal matrix and an orthogonal matrix with the same dimensions as $X$:

$$ QX = QU_k $$

where $U_k$ is $U$ truncated to the first $K$ left singular vectors. But we don't actually know $Q$ or $U$; we only have $W$ and $QX$. The code seems to solve for or define $X$ using the equation

$$ (W_k^{\top} W_k) X^{\top} = W_k^{\top} $$

which with some re-arrangement gives

$$ X^{\top} = (W_k^{\top} W_k)^{-1} W_k^{\top} = (U_k^{\top} U_k)^{-1} U_k^{\top} Q^{\top}, $$

using $W_k = QU_k$ and $Q^{\top}Q = I$.

I recognize the factor $(U_k^{\top} U_k)^{-1} U_k^{\top}$ as the pseudoinverse of $U_k$, but I don't see why this works. Is that correct? Is this an established trick for solving for $X$ when it is arbitrarily scaled and rotated?
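For what it's worth, the algebra above can be checked numerically (a sketch; the shapes, seed, and variable names here are arbitrary choices of mine, not from the inherited code). Since $U_k$ has orthonormal columns, $U_k^{\top} U_k = I$, so its pseudoinverse collapses to $U_k^{\top}$, and with $W_k = Q U_k$ the normal-equations solution matches NumPy's pseudoinverse of $W_k$:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 8, 3

# A random N x K matrix with orthonormal columns (stand-in for U_k) ...
U, _ = np.linalg.qr(rng.standard_normal((N, K)))
# ... and a random N x N orthogonal matrix (stand-in for Q).
Q, _ = np.linalg.qr(rng.standard_normal((N, N)))

W = Q @ U  # W_k = Q U_k, which also has orthonormal columns

# Orthonormal columns mean U^T U = I, so the pseudoinverse collapses:
#   (U^T U)^{-1} U^T = U^T
assert np.allclose(np.linalg.pinv(U), U.T)

# And pinv(W_k) = pinv(Q U_k) = U_k^+ Q^T = U_k^T Q^T.
assert np.allclose(np.linalg.pinv(W), U.T @ Q.T)
```

So applying the pseudoinverse of $W_k$ is, in this orthonormal case, just applying $U_k^{\top} Q^{\top}$.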

Code

    import numpy as np

    def stabilize(X):
        """Fix the rotation according to the SVD.
        """
        D       = X.shape[1]
        # Left singular vectors: an orthonormal basis for the column space of X.
        U, _, _ = np.linalg.svd(X, full_matrices=False)
        # Upper-triangular Cholesky factor of the covariance of U's columns,
        # with a small jitter on the diagonal for numerical stability.
        L       = np.linalg.cholesky(np.cov(U.T) + 1e-6 * np.eye(D)).T
        # Apply inv(L) to the rows of U.T, i.e. decorrelate the columns of U.
        X       = np.linalg.solve(L, U.T).T
        # Standardize each column to unit variance.
        X      /= np.std(X, axis=0)
        return X

[1] Not 100% sure about this. It could be identifiable up to just rotation.