Proof of winograd convolution

500 Views Asked by At

I am trying to prove equation (7) in Fast Algorithms for Convolutional Neural Networks

Could anyone elaborate on the mathematical relationship between Vandermonte Matrix and winograd convolution ?

1

There are 1 best solutions below

13
On

Here are the transcribed equations from Fast Algorithms for Convolutional Neural Networks.

$F(2, 3)=\begin{bmatrix} d_0 & d_1 & d_2\\ d_1 & d_2 & d_3 \end{bmatrix}\begin{bmatrix} g_0\\ g_1\\ g_2 \end{bmatrix}=\begin{bmatrix} m_1+m_2+m_3\\ m_2-m_3-m_4 \end{bmatrix}$

where

$\begin{align} m_1&=(d_0-d_2)g_0\\ m_2&=(d_1+d_2)\frac{g_0+g_1+g_2}{2}\\ m_3&=(d_2-d_1)\frac{g_0-g_1+g_2}{2}\\ m_4&=(d_1-d_3)g_2 \end{align}$

As this should be easy to derive out as true, I would continue with explaining equation (7).

$\begin{align} B^T&=\begin{bmatrix} 1&0&-1&0\\ 0&1&1&0\\ 0&-1&1&0\\ 0&1&0&-1 \end{bmatrix}\\ G&=\begin{bmatrix} 1&0&0\\ \frac{1}{2}&\frac{1}{2}&\frac{1}{2}\\ \frac{1}{2}&-\frac{1}{2}&\frac{1}{2}\\ 0&0&1 \end{bmatrix}\\ A^T&=\begin{bmatrix} 1&1&1&0\\ 0&1&-1&-1 \end{bmatrix}\\ g&=\begin{bmatrix} g_0&g_1&g_2 \end{bmatrix}^T\\ d&=\begin{bmatrix} d_0&d_1&d_2&d_3 \end{bmatrix}^T \end{align}$

So, plugging these in into equation (6), we get:

$Y=A^T[(Gg)⊙(B^Td)]\\ =A^T\left[\left(\begin{bmatrix} 1&0&0\\ \frac{1}{2}&\frac{1}{2}&\frac{1}{2}\\ \frac{1}{2}&-\frac{1}{2}&\frac{1}{2}\\ 0&0&1 \end{bmatrix}\begin{bmatrix} g_0\\ g_1\\ g_2 \end{bmatrix}\right)⊙\left(\begin{bmatrix} 1&0&-1&0\\ 0&1&1&0\\ 0&-1&1&0\\ 0&1&0&-1 \end{bmatrix}\begin{bmatrix} d_0\\ d_1\\ d_2\\ d_3 \end{bmatrix}\right)\right]\\ =A^T\left(\begin{bmatrix} g_0\\ \frac{g_0+g_1+g_2}{2}\\ \frac{g_0-g_1+g_2}{2}\\ g_2 \end{bmatrix}⊙\begin{bmatrix} d_0-d_2\\ d_1+d_2\\ d_2-d_1\\ d_1-d_3 \end{bmatrix}\right)\\ =A^T\begin{bmatrix} m_1\\ m_2\\ m_3\\ m_4 \end{bmatrix}\\ =\begin{bmatrix} m_1+m_2+m_3\\ m_2-m_3-m_4 \end{bmatrix}$

As for the mathematical relationship between Vandermonde matrix and Winograd convolution, no idea. However, I did spot equation (15), which looks like a Vandermonde matrix.

$A^T=\begin{bmatrix} 1&1&1&1&1&0\\ 0&1&-1&2&-2&0\\ 0&1&1&4&4&0\\ 0&1&-1&8&-8&1 \end{bmatrix}$