In random matrix decomposition, how does a semi-orthogonal matrix capture the range of an input matrix?

147 Views Asked by At

I am reading this paper on probabilistic algorithms for matrix decomposition. I don't understand Section 1.2. Given a matrix $A \in \mathbb{R}^{m \times n}$, we want an "approximate basis for the range of $A$", $Q \in \mathbb{R}^{r \times n}$ such that:

$$ Q \text{ has orthonormal columns and } A \approx QQ^* A $$

I understand that the range of $A$ is the column space of $A$, but I don't understand how finding a matrix $Q$ s.t. $A \approx QQ^* A$ means that $Q$ is a good approximation of that column space.

2

There are 2 best solutions below

0
On BEST ANSWER

I was confused because I wondered why we cared that $Q$ was invertible because

$$ A \approx QQ^*A \implies QQ^* \approx I $$

But I don't think that's the right intuition. For me, the key was to write the expression like this:

$$ A \approx Q(Q^*A) $$

What we want is a matrix $Q$ with as few orthogonal columns as possible such that, if we project $A$ into this $k$-dimensional subspace and then reconstruct $A$, our reconstruction of $A$ is within some error tolerance.

3
On

There are a number of algorithms in the book for explicitly constructing this

On page $28$ you have algorithm $4.5$. Your question is how is finding matrix $Q$ a good approximaton of the column space. Given an $ m \times n$ matrix $A$ and an integer $l$ this computes an $m \times l$ orthonormal matrix $Q$ whose range approximates the range of $A$

  1. Draw an $n \times l $ SRFT test matrix $\Omega $ which is defined in $4.6 $
  2. Form the $m \times l$ matrix $Y = A \Omega $ using a (subsampled) FFT
  3. Construct an $ m \times l$ matrix $Q$ whose columns form an orthonorml basis for the range $Y$ e.g. using the $QR$ factorization.

So you generate this SFRT matrix, which is partly a Fourier matrix. Then you sample the original matrix $A$. This is how you get the column space of $A$. Then you take the reduced $QR$ decomposition and return the $Q$ matrix. So you would have $Q_{m \times l}$

Why would this be a good approximation? Well $ Y = A \Omega $ and we took $QR = Y$ right. All we did is a take a random matrix and form a bunch of random projections into its subspace using $\Omega$ then take this matrix $Q$ by using Gram Schmidt.

This paper is like $80$ pages long but there is actually analytical bounds on how approximate you get. It also depends on the method.