Low rank linear regression


Consider the linear regression problem $$\min_A \lVert Y-AX\rVert_2,$$ where $A\in\mathbb{R}^{m\times n}$, $X\in\mathbb{R}^{n\times N}$, and $Y\in\mathbb{R}^{m\times N}$, and let $A^*$ denote a minimizer.

Then the best low rank approximation of $A^*$ is given by the Eckart–Young–Mirsky theorem - let's call it $A^*_r$.

My question: does $A^*_r$ solve $$\min_{A_r} \lVert Y-A_rX\rVert_2,$$ where $A_r$ ranges over matrices of the same rank as $A^*_r$?


Let me try. Take $Y = (1,10)'$, $A=\operatorname{diag}(1,1/2)$, and $X=(1,20)'$, so that $AX = Y$ exactly. Then the rank-1 matrix selected by the theorem is $\operatorname{diag}(1,0)$, which has error 10. If we instead took $\operatorname{diag}(0,1/2)$, the error would be 1, so it looks like you need to rephrase your question.
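The example above can be verified numerically in a few lines of NumPy (variable names here are ad hoc):

```python
import numpy as np

# With Y = (1, 10)' and X = (1, 20)', the matrix A = diag(1, 1/2) fits exactly.
# Its Eckart-Young rank-1 truncation is diag(1, 0), yet diag(0, 1/2)
# achieves a smaller regression error.
Y = np.array([[1.0], [10.0]])
X = np.array([[1.0], [20.0]])

A = np.diag([1.0, 0.5])
A_trunc = np.diag([1.0, 0.0])   # best rank-1 approximation of A
A_other = np.diag([0.0, 0.5])   # a different rank-1 matrix

err = lambda M: np.linalg.norm(Y - M @ X)
print(err(A), err(A_trunc), err(A_other))  # 0.0 10.0 1.0
```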


No, generally those two matrices are different. The paper "Optimal Exact Least Squares Rank Minimization" (Xiang, Zhu, Shen, Ye, 2012) gives a fairly simple closed-form solution to the second minimization problem.

Here is my implementation of that method in Python using NumPy, for the typical case where the columns of A are linearly independent:

import numpy

def low_rank_regression(A, Z, rank):
    """
    Solve for Theta minimizing error in 
    Z ~ A Theta
    under the constraint that Theta has at most the given rank
    Let n be the number of observations, p the number of exogenous variables,
    and k the number of endogenous variables

    A - the design matrix, n x p
    Z - the data matrix, n x k
    rank - the rank of Theta

    Returns: Theta, p x k

    Assumes: the rank of A is at least `rank`

    See:
    "Optimal Exact Least Squares Rank Minimization" Shuo Xiang, Yunzhang Zhu, Xiaotong Shen, Jieping Ye
    """

    n, k = Z.shape
    assert A.shape[0] == n
    n,p = A.shape
    assert rank <= min(n,k,p)

    U,D,VT = numpy.linalg.svd(A, full_matrices=False)

    # Epsilon tolerance as in numpy.linalg.matrix_rank:
    # largest singular value times max dimension times machine epsilon
    tolerance = D.max() * max(A.shape) * numpy.finfo(A.dtype).eps
    # Rank of A from the svd decomposition
    r = numpy.sum(D > tolerance)
    assert rank <= r

    W = U.T @ Z
    W_first_r_rows = W[:r]
    D_first_r_inv = numpy.diag(1/D[:r])

    W_low_rank = project_to_low_rank(W_first_r_rows, rank)
    Y_star = numpy.vstack(( D_first_r_inv @ W_low_rank, numpy.zeros(((p-r),k)) ))
    Theta = VT.T @ Y_star

    return Theta

def project_to_low_rank(M, rank):
    U,D,VT = numpy.linalg.svd(M, full_matrices=False)
    D[rank:] = 0
    return U @ numpy.diag(D) @ VT
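To see the difference the question asks about, here is a quick self-contained sanity check on random data (ad hoc names; a compact restatement of the closed form above, assuming A has full column rank) comparing the rank-constrained optimum against the Eckart-Young truncation of the ordinary least-squares solution:

```python
import numpy

def truncate_rank(M, r):
    # Best rank-r approximation of M (Eckart-Young).
    U, D, VT = numpy.linalg.svd(M, full_matrices=False)
    D[r:] = 0
    return U @ numpy.diag(D) @ VT

def rank_constrained_ls(A, Z, r):
    # Closed-form rank-constrained least squares, assuming A has full column rank.
    U, D, VT = numpy.linalg.svd(A, full_matrices=False)
    W = truncate_rank(U.T @ Z, r)
    return VT.T @ (numpy.diag(1 / D) @ W)

numpy.random.seed(0)
A = numpy.random.randn(50, 6)
Z = numpy.random.randn(50, 5)
r = 2

# Naive approach: truncate the unconstrained least-squares solution.
Theta_ols, *_ = numpy.linalg.lstsq(A, Z, rcond=None)
Theta_naive = truncate_rank(Theta_ols, r)

# Rank-constrained optimum.
Theta_opt = rank_constrained_ls(A, Z, r)

print(numpy.linalg.norm(Z - A @ Theta_naive))
print(numpy.linalg.norm(Z - A @ Theta_opt))
# The constrained-optimal residual is never larger than the naive one,
# and on generic data it is strictly smaller.
```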