Proof of vectorization of product of three matrices

823 Views Asked by At

On Wikipedia, it is written that $$\mathrm{vec}(ABC) = (C^\top\otimes A)\mathrm{vec}(B)$$ I am trying to show that it is true, but couldn't succeed. $A$ is $m\times n$, $B$ is $n\times p$ and $C$ is $p\times q$ matrix.

My attempts

The $ij$th element of the matrix $ABC$ is $$\sum_{k=1}^n\sum_{l=1}^pa_{ik}b_{kl}c_{lj}$$ Now, since $ABC$ is an $m\times q$ matrix, so if

$$i = am+b;\hspace{1cm}a\in\Bbb N,0<b<m$$ then the $i$th element of $\mathrm{vec}(ABC)$ is actually the element in the $b$th row and $a+1$th column in the matrix $ABC$ which can be found by the above double sum. But I don't know what to do for the RHS.

Please help

1

There are 1 best solutions below

4
On BEST ANSWER

Here's an approach I like, which avoids getting bogged down in summation and the vectorization reindexing wherein the $i,j$ entry of an $m \times n$ matrix becomes the $[1 + i + m(j-1)]$th entry of a length $mn$ column-vector.

We first observe that the maps $B \mapsto \operatorname{vec}(ABC)$ and $B \mapsto (C^T \otimes A)\operatorname{vec}(B)$ are linear maps from $\Bbb R^{n \times p}$ to $\Bbb R^{mq}$. In order to show that these maps are identical, it suffices to show that they agree on a basis.

One particularly nice basis of $\Bbb R^{n \times p} = \{E_{ij}: 1 \leq i \leq n, 1 \leq j \leq p\}$, where $E_{ij}$ denotes the matrix that has a $1$ as its $(i,j)$ entry and zeros everywhere else. We make two observations:

  • $E_{ij} = e_i^{(n)}{e_j^{(p)}}^T$ where $e_1^{(n)},\dots,e_n^{(n)}$ denotes the standard basis of $\Bbb R^n$,
  • $\operatorname{vec}(uv^T) = v \otimes u$ for any vectors $u,v$ (this in turn can be proven by first considering the case where $u,v$ are standard basis vectors), then making an argument by linearity.

We then have $$ \begin{align*} (C^T \otimes A)\operatorname{vec}(E_{ij}) &= (C^T \otimes A)\operatorname{vec}(e_i^{(n)}[e_j^{(p)}]^T) \\ &= (C^T \otimes A)(e_j^{(p)} \otimes e_i^{(n)}) \\ & = (C^Te_j^{(p)}) \otimes (Ae_i^{(n)}). \end{align*} $$ On the other hand, $$ \operatorname{vec}(AE_{ij}C) = \operatorname{vec}(Ae_i^{(n)}[e_j^{(p)}]^TC)= \operatorname{vec}([Ae_i^{(n)}][C^Te_j^{(p)}]^T) = (C^Te_j^{(p)}) \otimes (Ae_i^{(n)}). $$ The conclusion follows.