I am trying to prove equation (7) in *Fast Algorithms for Convolutional Neural Networks*.
Could anyone elaborate on the mathematical relationship between the Vandermonde matrix and Winograd convolution?
Here are the transcribed equations from Fast Algorithms for Convolutional Neural Networks.
$F(2, 3)=\begin{bmatrix} d_0 & d_1 & d_2\\ d_1 & d_2 & d_3 \end{bmatrix}\begin{bmatrix} g_0\\ g_1\\ g_2 \end{bmatrix}=\begin{bmatrix} m_1+m_2+m_3\\ m_2-m_3-m_4 \end{bmatrix}$
where
$\begin{align} m_1&=(d_0-d_2)g_0\\ m_2&=(d_1+d_2)\frac{g_0+g_1+g_2}{2}\\ m_3&=(d_2-d_1)\frac{g_0-g_1+g_2}{2}\\ m_4&=(d_1-d_3)g_2 \end{align}$
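Before moving to the matrix form, here is a small numeric sanity check (my own, not from the paper) that the four multiplies $m_1,\dots,m_4$ reproduce $F(2,3)$, i.e. the two outputs of a 1-D convolution of a length-4 input $d$ with a length-3 filter $g$, for one concrete choice of values:

```python
import numpy as np

# Arbitrary test values for the input tile d and the filter g.
d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([5.0, 6.0, 7.0])

# Direct computation: each output is a dot product of a sliding window with g.
direct = np.array([d[0:3] @ g, d[1:4] @ g])

# Winograd's four multiplies, as defined above.
m1 = (d[0] - d[2]) * g[0]
m2 = (d[1] + d[2]) * (g[0] + g[1] + g[2]) / 2
m3 = (d[2] - d[1]) * (g[0] - g[1] + g[2]) / 2
m4 = (d[1] - d[3]) * g[2]
winograd = np.array([m1 + m2 + m3, m2 - m3 - m4])

print(direct, winograd)  # both [38. 56.]
```

Note that only four multiplies involving $d$ are needed, versus six for the direct method; the divisions by 2 can be folded into the filter transform since $g$ is fixed.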
These identities are straightforward to verify by expanding the products, so I will move on to equation (7).
$\begin{align} B^T&=\begin{bmatrix} 1&0&-1&0\\ 0&1&1&0\\ 0&-1&1&0\\ 0&1&0&-1 \end{bmatrix}\\ G&=\begin{bmatrix} 1&0&0\\ \frac{1}{2}&\frac{1}{2}&\frac{1}{2}\\ \frac{1}{2}&-\frac{1}{2}&\frac{1}{2}\\ 0&0&1 \end{bmatrix}\\ A^T&=\begin{bmatrix} 1&1&1&0\\ 0&1&-1&-1 \end{bmatrix}\\ g&=\begin{bmatrix} g_0&g_1&g_2 \end{bmatrix}^T\\ d&=\begin{bmatrix} d_0&d_1&d_2&d_3 \end{bmatrix}^T \end{align}$
So, plugging these into equation (6), we get:
$Y=A^T[(Gg)⊙(B^Td)]\\ =A^T\left[\left(\begin{bmatrix} 1&0&0\\ \frac{1}{2}&\frac{1}{2}&\frac{1}{2}\\ \frac{1}{2}&-\frac{1}{2}&\frac{1}{2}\\ 0&0&1 \end{bmatrix}\begin{bmatrix} g_0\\ g_1\\ g_2 \end{bmatrix}\right)⊙\left(\begin{bmatrix} 1&0&-1&0\\ 0&1&1&0\\ 0&-1&1&0\\ 0&1&0&-1 \end{bmatrix}\begin{bmatrix} d_0\\ d_1\\ d_2\\ d_3 \end{bmatrix}\right)\right]\\ =A^T\left(\begin{bmatrix} g_0\\ \frac{g_0+g_1+g_2}{2}\\ \frac{g_0-g_1+g_2}{2}\\ g_2 \end{bmatrix}⊙\begin{bmatrix} d_0-d_2\\ d_1+d_2\\ d_2-d_1\\ d_1-d_3 \end{bmatrix}\right)\\ =A^T\begin{bmatrix} m_1\\ m_2\\ m_3\\ m_4 \end{bmatrix}\\ =\begin{bmatrix} m_1+m_2+m_3\\ m_2-m_3-m_4 \end{bmatrix}$
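The same derivation can be checked numerically. This sketch evaluates $Y = A^T[(Gg)\odot(B^Td)]$ with the transform matrices from equation (7) and compares it against a direct sliding-window convolution on random inputs:

```python
import numpy as np

# Transform matrices from equation (7).
Bt = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])
At = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)

rng = np.random.default_rng(0)
d = rng.standard_normal(4)   # input tile
g = rng.standard_normal(3)   # filter

# ⊙ is the element-wise (Hadamard) product, i.e. numpy's * operator.
Y = At @ ((G @ g) * (Bt @ d))
direct = np.array([d[0:3] @ g, d[1:4] @ g])

print(np.allclose(Y, direct))  # True
```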
As for the mathematical relationship between the Vandermonde matrix and Winograd convolution, I am not certain. However, I did spot equation (15), which does look like a (transposed) Vandermonde matrix:
$A^T=\begin{bmatrix} 1&1&1&1&1&0\\ 0&1&-1&2&-2&0\\ 0&1&1&4&4&0\\ 0&1&-1&8&-8&1 \end{bmatrix}$
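To make the resemblance concrete, here is a hedged sketch of how that $A^T$ (from $F(4,3)$) appears to arise: the Toom-Cook / Winograd construction evaluates polynomials at a set of interpolation points, and the first five columns of $A^T$ match a transposed Vandermonde matrix over the points $\{0, 1, -1, 2, -2\}$, with a final column for the "point at infinity". The choice of points is my reading of the matrix, not something I can cite from the paper:

```python
import numpy as np

# Candidate interpolation points (an assumption inferred from the columns).
points = np.array([0.0, 1.0, -1.0, 2.0, -2.0])

# Column j of the first five columns of A^T is [1, p_j, p_j^2, p_j^3]^T,
# i.e. the transpose of a Vandermonde matrix with increasing powers.
V = np.vander(points, N=4, increasing=True).T   # shape (4, 5)

# The point at infinity contributes the column [0, 0, 0, 1]^T.
inf_col = np.array([[0.0], [0.0], [0.0], [1.0]])
At_reconstructed = np.hstack([V, inf_col])

At_paper = np.array([[1, 1,  1, 1,  1, 0],
                     [0, 1, -1, 2, -2, 0],
                     [0, 1,  1, 4,  4, 0],
                     [0, 1, -1, 8, -8, 1]], dtype=float)

print(np.allclose(At_reconstructed, At_paper))  # True
```

If this reading is right, $B^T$ and $G$ for each $F(m, r)$ would be built from (inverses of) Vandermonde matrices over the same points, which is presumably the relationship the question is after.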