My question concerns matrix-vector multiplication in the case where the matrix has Kronecker structure, which allows the product to be computed faster.
I know how to compute this for a matrix $A = A_1 \otimes A_2$, which has two components $A_i$: $$Ax = (A_1 \otimes A_2)x = (A_1 \otimes A_2)vec(X) = vec(A_2XA_1^T)$$ where $vec(X) = x$ is the vectorization of $X$.
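For concreteness, this two-factor identity can be checked numerically; here is a small NumPy sketch (using column-major, i.e. Fortran-order, reshapes so that the standard $vec$ convention holds):

```python
import numpy as np

rng = np.random.default_rng(0)
A1 = rng.standard_normal((3, 4))        # r1 x c1
A2 = rng.standard_normal((5, 2))        # r2 x c2
x = rng.standard_normal(4 * 2)          # length c1 * c2

# X is (c2 x c1) so that vec(X) = x under column-major vectorization
X = x.reshape(2, 4, order="F")
y_fast = (A2 @ X @ A1.T).reshape(-1, order="F")   # vec(A2 X A1^T)
y_slow = np.kron(A1, A2) @ x                      # explicit Kronecker product

print(np.allclose(y_fast, y_slow))      # True
```

The fast path never forms the $(15 \times 8)$ Kronecker product, only the two small matrix products.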
However, I have no idea how to proceed for more components $A_i$. I can imagine doing something as follows: $$Ax = (A_1 \otimes A_2 \otimes \cdots \otimes A_n)x = vec((A_2 \otimes \cdots A_n)XA_1^T) = (I_m \otimes A_2 \otimes \cdots \otimes A_n)vec(XA_1^T)$$ which provides me again with an actual matrix-vector multiplication. I was hoping to get a large identity matrix on the left-hand side this way, but no luck.
(EDIT: I realised it is impossible to do it this way, as the matrices $X$ and $A_1^T$ do not have the same dimensions.)
I tried looking it up on-line, but I mainly get the case for two components. Can anyone of you help me out? Thanks!
I will start by briefly introducing vector-Kronecker multiplication algorithms and related publications. Then I will try to explain how these algorithms work with a small running example. However, discussing these algorithms is not trivial because of the multi-dimensional indexing involved. If you want to implement them, you should read the related papers and work through some examples with Kronecker products of three matrices.
Introduction
The Shuffle algorithm (Plateau, 1985) is an efficient vector-Kronecker product algorithm that computes the product of a vector with a Kronecker product of rectangular matrices. Another approach, the Pot-RwCl algorithm, was proposed by Buchholz et al. (2000); there, the nonzero elements of the larger matrix are generated on the fly and multiplied with the corresponding elements of the vector.
Dayar (2012) discusses the Shuffle algorithm and its use in the numerical solution of multi-dimensional Markov chains. Dayar and Orhan (2015) propose a modification of the Shuffle algorithm so that it requires fewer floating-point operations when there are zero rows and columns in the smaller matrices.
Fackler (2019) proposes further improvements, showing how the shuffling of data can be (largely) avoided and providing a simple method to determine the optimal ordering of the workflow. You may also find a MATLAB implementation by Gleich.
You may find detailed pseudocode and examples for the Shuffle, Pot-RwCl, and modified Shuffle algorithms in (Dayar and Orhan, 2015). Note that since these algorithms are applied to Markov chains, they multiply the vector from the left of the Kronecker product of matrices.
Problem
Let $$ x \in \mathbb{R}^{c_1 \ldots c_n \times 1} ~~~\text{ and }~~~ A = (A_1 \otimes \ldots \otimes A_n), $$ where $A_k \in \mathbb{R}^{r_k \times c_k}$ for $k = 1, \ldots, n$. We are interested in computing $$ y = A x $$ without explicitly generating the matrix $A$. Without loss of generality, assume that the $A_k$ matrices are $0$-indexed; that is, the row and column indices are $\{0,\ldots,r_k-1\}$ and $\{0,\ldots,c_k-1\}$, respectively.
Notation becomes cumbersome once multi-dimensional indices are introduced. However, it is much more difficult to discuss an algorithm involving Kronecker products (or tensors) without using multi-dimensional indices. Let $A_k(i_k,j_k)$ denote the entry in row $i_k$ and column $j_k$ of $A_k$; then $A$ can be written elementwise as $$ A((i_1,\ldots,i_n),(j_1,\ldots,j_n)) = \prod_{k=1}^{n} A_k(i_k,j_k) $$ by the definition of the Kronecker product.
It will also be useful to define a subvector of a column vector. We let $$ x(i_1,\ldots,i_{k-1},:,i_{k+1},\ldots, i_n) = \left[ \begin{array}{c} x(i_1,\ldots,i_{k-1},0,i_{k+1},\ldots, i_n) \\ \vdots \\ x(i_1,\ldots,i_{k-1},c_{k}-1,i_{k+1},\ldots, i_n) \end{array} \right] $$ where $i_{k'} \in \{0,\ldots,c_{k'}-1\}$ for $k' = 1,\ldots,k-1,k+1,\ldots,n$.
Shuffle Algorithm
This algorithm is based on Shuffle algebra (Davio, 1981). Therein, it is shown that $$ A = \prod_{k = 1}^{n} ( I_{c_1} \otimes \ldots \otimes I_{c_{k-1}} \otimes A_{k} \otimes I_{r_{k+1}} \otimes \ldots \otimes I_{r_{n}} ) = \prod_{k = 1}^{n} ( I_{c_1 \ldots c_{k-1}} \otimes A_{k} \otimes I_{r_{k+1} \ldots r_{n}} ), \tag{1}\label{kronecker_identity} $$ where $I_{u}$ denotes $(u \times u)$ identity matrix.
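As a sanity check, the factorization (1) can be verified numerically; the sketch below uses three small rectangular factors with arbitrarily chosen sizes:

```python
import numpy as np
from functools import reduce

rng = np.random.default_rng(1)
r = [2, 3, 2]                           # row sizes r_k
c = [3, 2, 4]                           # column sizes c_k
A = [rng.standard_normal((r[k], c[k])) for k in range(3)]

def factor(k):
    # I_{c_1...c_{k-1}}  (x)  A_k  (x)  I_{r_{k+1}...r_n}  from Equation (1)
    left = int(np.prod(c[:k]))          # np.prod of an empty slice is 1
    right = int(np.prod(r[k+1:]))
    return np.kron(np.eye(left), np.kron(A[k], np.eye(right)))

full = reduce(np.kron, A)               # explicit A_1 (x) A_2 (x) A_3
prod = factor(0) @ factor(1) @ factor(2)
print(np.allclose(full, prod))          # True
```

Each factor in the product is sparse and structured, which is what the Shuffle algorithm exploits: it applies them one at a time instead of forming $A$.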
The Shuffle algorithm initializes a temporary vector $z^{(n)}$ to $x$. Then it computes the intermediate vectors $$ z^{(k-1)} = ( I_{c_1 \ldots c_{k-1}} \otimes A_{k} \otimes I_{r_{k+1} \ldots r_{n}} ) z^{(k)} \tag{2}\label{shuffle_step} $$ for each factor $k = n,\ldots,1$. At the final step, $y = z^{(0)}$ is obtained.
Now come more equations with many indices. Factor $k$ in Equation (1) can be written elementwise as $$ ( I_{c_1 \ldots c_{k-1}} \otimes A_{k} \otimes I_{r_{k+1} \ldots r_{n}} )((i_1,\ldots,i_n),(j_1,\ldots,j_n)) = \begin{cases} A_{k}(i_k,j_k) & \text{if } i_{k'} = j_{k'} \text{ for } k' \neq k \\ 0 & \text{otherwise} \end{cases} $$ where \begin{align} i_{k'}, j_{k'} \in \{ 0,\ldots,c_{k'}-1 \} & ~~~ \text{ for } ~~ k' = 1,\ldots,k-1 \\ i_{k'} \in \{ 0,\ldots,r_{k'}-1 \} ~~~ \text{ and } ~~~ j_{k'} \in \{ 0,\ldots,c_{k'}-1 \} & ~~~ \text{ for } ~~ k' = k \\ i_{k'}, j_{k'} \in \{ 0,\ldots,r_{k'}-1 \} & ~~~ \text{ for } ~~ k' = k+1, \ldots, n . \end{align}
Then, letting $z^{(k)} \in \mathbb{R}^{c_{1} \ldots c_{k} r_{k+1} \ldots r_{n}}$, Equation (2) can be written as $$ z^{(k-1)}(i_1,\ldots,i_{k-1},:,i_{k+1},\ldots,i_n) = A_{k} z^{(k)}(i_1,\ldots,i_{k-1},:,i_{k+1},\ldots, i_n) \tag{3}\label{elementwise_shuffle_step} $$ where \begin{align} i_{k'} \in \{ 0,\ldots,c_{k'}-1 \} & ~~~ \text{ for } ~~ k' = 1,\ldots,k-1 \\ i_{k'} \in \{ 0,\ldots,r_{k'}-1 \} & ~~~ \text{ for } ~~ k' = k+1, \ldots, n . \end{align}
Finally, pseudocode of the shuffle algorithm can be written as
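In essence the pseudocode is a loop over Equation (3): for each $k = n,\ldots,1$, apply $A_k$ to every subvector along the $k$-th dimension. A minimal NumPy sketch of this idea (assuming the row-major flat-index convention $\sum_k i_k \prod_{k'>k} c_{k'}$, and using a reshape to group the indices around dimension $k$) could look like:

```python
import numpy as np
from functools import reduce

def kron_matvec(As, x):
    """Compute (A_1 (x) ... (x) A_n) x without forming the Kronecker product.

    Sketch of the Shuffle idea: for k = n,...,1 apply A_k along the k-th
    axis of the reshaped vector; each pass realizes Equation (3).
    """
    n = len(As)
    r = [A.shape[0] for A in As]
    c = [A.shape[1] for A in As]
    z = np.asarray(x, dtype=float)                # z^{(n)} = x
    for k in range(n - 1, -1, -1):                # k = n,...,1 (0-indexed)
        left = int(np.prod(c[:k]))                # c_1 ... c_{k-1}
        right = int(np.prod(r[k+1:]))             # r_{k+1} ... r_n
        Z = z.reshape(left, c[k], right)          # group indices around axis k
        z = np.einsum("ij,ajb->aib", As[k], Z).reshape(-1)
    return z                                      # y = z^{(0)}

# Check against the explicit Kronecker product.
rng = np.random.default_rng(2)
As = [rng.standard_normal(s) for s in [(2, 3), (4, 2), (3, 3)]]
x = rng.standard_normal(3 * 2 * 3)
print(np.allclose(kron_matvec(As, x), reduce(np.kron, As) @ x))  # True
```

This is only an illustrative sketch; the published pseudocode also covers the data shuffling and the optimizations for zero rows and columns discussed above.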
Pot-RwCl Algorithm
This algorithm was initially proposed for square matrices, but it can easily be extended to rectangular matrices. The idea is to obtain (but not store) the elements of the large matrix recursively, by visiting each element of each factor and multiplying it with the corresponding vector element.
In the pseudocode of this algorithm, it is more convenient to use flat indices instead of multi-dimensional indices. Note that the multi-dimensional index $(i_1,\ldots,i_n)$ of a vector $x \in \mathbb{R}^{c_1 \ldots c_n \times 1}$ corresponds to the flat index $\sum_{k=1}^{n} i_k \prod_{k'=k+1}^{n} c_{k'}$.
Pseudocode of this algorithm can be written as
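To illustrate just the on-the-fly idea (not the actual Pot-RwCl pseudocode, which is more refined), here is a naive recursive sketch: the flat row and column indices are built up factor by factor, and each complete element of $A$ is multiplied with the matching entry of $x$ the moment it is generated, so $A$ is never stored.

```python
import numpy as np
from functools import reduce

def kron_matvec_onthefly(As, x):
    """Naive on-the-fly evaluation of y = (A_1 (x) ... (x) A_n) x.

    Elements of the Kronecker product are generated recursively from the
    factors and never stored; flat indices follow the row-major
    convention sum_k i_k * prod_{k' > k} c_{k'}.
    """
    n = len(As)
    r = [A.shape[0] for A in As]
    c = [A.shape[1] for A in As]
    y = np.zeros(int(np.prod(r)))

    def recurse(k, row, col, val):
        if val == 0.0:                  # skip subtrees rooted at a zero element
            return
        if k == n:                      # val is a complete element A(row, col)
            y[row] += val * x[col]
            return
        for i in range(r[k]):
            for j in range(c[k]):
                recurse(k + 1, row * r[k] + i, col * c[k] + j,
                        val * As[k][i, j])

    recurse(0, 0, 0, 1.0)
    return y

# Check against the explicit Kronecker product.
rng = np.random.default_rng(3)
As = [rng.standard_normal(s) for s in [(2, 2), (3, 2), (2, 3)]]
x = rng.standard_normal(2 * 2 * 3)
print(np.allclose(kron_matvec_onthefly(As, x), reduce(np.kron, As) @ x))
```

Note that for dense factors this naive version costs as much as multiplying by the explicit Kronecker product; the actual Pot-RwCl algorithm reuses partial products and exploits sparsity, which is where its efficiency comes from.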
References
Davio, M. (1981). Kronecker products and shuffle algebra. IEEE Transactions on Computers, 100(2), 116-125. doi: 10.1109/TC.1981.6312174
Plateau, B. (1985). On the stochastic structure of parallelism and synchronization models for distributed algorithms. SIGMETRICS Performance Evaluation Review, 13(2), 147–154. doi: 10.1145/317786.317819
Buchholz, P., Ciardo, G., Donatelli, S., & Kemper, P. (2000). Complexity of memory-efficient Kronecker operations with applications to the solution of Markov models. INFORMS Journal on Computing, 12(3), 203-222. doi: 10.1287/ijoc.12.3.203.12634
Dayar, T. (2012). Analyzing Markov chains using Kronecker products: theory and applications. Springer Science & Business Media. doi: 10.1007/978-1-4614-4190-8
Dayar, T., & Orhan, M. C. (2015). On vector-Kronecker product multiplication with rectangular factors. SIAM Journal on Scientific Computing, 37(5), S526-S543. doi: 10.1137/140980326
Fackler, P. L. (2019). Algorithm 993: Efficient computation with Kronecker products. ACM Transactions on Mathematical Software (TOMS), 45(2), 1-9. doi: 10.1145/3291041