Efficiently computing the product of a vector with a matrix that is a chain of Kronecker products


The problem is this:

I need to compute $U.v$, where $v$ is a $2^6$-dimensional vector and $U$ is a chain of Kronecker products of six $2\times2$ matrices.

That is, I need to compute

$$U.v = (U_1\otimes U_2\otimes U_3\otimes U_4\otimes U_5\otimes U_6).v$$

I could not find a relevant property of such a product form, so any computational advice would also be appreciated.

The matrices $U_i$ are all drawn from the set of Pauli matrices, if that helps.
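For concreteness, the direct approach just builds $U$ with repeated Kronecker products and multiplies. A minimal NumPy sketch (the particular choice and order of Pauli factors here is illustrative, not from the question):

```python
import numpy as np
from functools import reduce

# The Pauli matrices (plus the identity, which is often included in this set).
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

Us = [X, Y, Z, X, I2, Z]            # six 2x2 factors (illustrative choice)
U = reduce(np.kron, Us)             # explicit 64 x 64 matrix U_1 ⊗ ... ⊗ U_6
v = np.random.default_rng(0).standard_normal(64)
y = U @ v                           # the product U.v
```

Since each factor is unitary, $U$ is unitary as well; that property is a convenient correctness check but plays no role in the algorithm below.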

Best answer

$\newcommand\vec{\textsf{vec}}\newcommand\reshape{\textsf{reshape}}$You should read Section 1.3.7 from Golub and Van Loan's book Matrix Computations. In particular, this question is a special case of their exercise P1.3.9:

Suppose $A^{(k)}\in \mathbb{R}^{n_k \times n_k}$ for $k=1,\ldots,r$ and that $x\in \mathbb{R}^n$ where $n = n_1\cdots n_r$. Give an efficient algorithm for computing $y = (A^{(r)}\otimes \cdots \otimes A^{(2)}\otimes A^{(1)})x$.

The solution relies on the equality $$Y = C X B^T \iff \vec(Y) = (B\otimes C)\vec(X),$$ where $\vec(X)$ is the (linear) map that stacks the columns of a matrix from left-to-right into a long vector. In particular, $$y = (A^{(r)}\otimes \cdots \otimes A^{(1)})x \iff \reshape(y,\tfrac{n}{n_r},n_r) = (A^{(r-1)}\otimes \cdots\otimes A^{(1)})\reshape(x,\tfrac{n}{n_r},n_r)(A^{(r)})^\mathsf{T}$$ where $\reshape(A,m,n)$ is the matrix of shape $m\times n$ defined by $\vec(\reshape(A,m,n)) = \vec(A)$. This gives us the following recursive algorithm for this problem:

import numpy as np

def kronmatvec(As, x):
    # As = [A_r, ..., A_2, A_1]; computes (A_r ⊗ ... ⊗ A_1) @ x.
    if len(As) == 1:
        return As[0] @ x
    Ar = As[0]
    n, nr = x.size, Ar.shape[0]
    X = x.reshape(n // nr, nr, order="F")   # column-major, so vec(X) == x
    Z = X @ Ar.T                            # apply A_r on the right
    Y = np.empty_like(Z)
    for i in range(nr):                     # recurse on each column of Z
        Y[:, i] = kronmatvec(As[1:], Z[:, i])
    return Y.reshape(-1, order="F")         # vec(Y)
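An equivalent non-recursive formulation applies each factor in turn by viewing $x$ as an $r$-way tensor and contracting one axis at a time. A self-contained sketch (the name `kron_matvec` and the `tensordot` phrasing are my own; with NumPy's row-major reshape, axis $k$ of the tensor corresponds to the $k$-th factor in $A^{(1)}\otimes\cdots\otimes A^{(r)}$ order):

```python
import numpy as np
from functools import reduce

def kron_matvec(As, x):
    """Compute (As[0] ⊗ As[1] ⊗ ... ⊗ As[-1]) @ x without forming the product."""
    # View x as an r-way tensor; axis k corresponds to factor As[k].
    T = x.reshape([A.shape[0] for A in As])
    for k, A in enumerate(As):
        # Contract A's second index against tensor axis k, then restore axis order.
        T = np.moveaxis(np.tensordot(A, T, axes=([1], [k])), 0, k)
    return T.reshape(-1)

# Sanity check against the explicit Kronecker product for six 2x2 factors.
rng = np.random.default_rng(0)
As = [rng.standard_normal((2, 2)) for _ in range(6)]
x = rng.standard_normal(64)
U = reduce(np.kron, As)
assert np.allclose(kron_matvec(As, x), U @ x)
```

Both formulations do the same arithmetic; the loop version simply makes the $r$ passes over the data explicit instead of recursing.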

To analyze the runtime in floating point operations, assume $n_i = c$ uniformly and write $T(r)$ for the number of flops used by the call kronmatvec(A[r,...,1], x). Note that $T(1) \leq 2c^2$. In general, $T$ satisfies the recurrence $$ T(r) \leq 2c^{r+1} + cT(r-1), \quad\text{ which gives }\quad T(r) \leq 2r c^{r+1} = 2\tfrac{c}{\log c}\,n\log n \text{ flops}. $$ For this problem ($c = 2$, $r = 6$, $n = 64$), the bound is $2\cdot 6\cdot 2^7 = 1536$ flops. Naively computing a matrix-vector product in this dimension takes on the order of $2n^2$ flops ($8192$ here), so this is an immense speedup for large $n$ when $c = o(n)$.

See a Python/NumPy implementation here along with runtime benchmarks. This recursive formulation appears to be faster in practice once $n > 1000$, and becomes much, much faster beyond that point.

Realistically, since your overall matrix $U$ is only $2^6 \times 2^6 = 64 \times 64$, the fastest thing to do without significant code optimization is to form the Kronecker product explicitly and compute the matrix-vector product directly.