Disadvantage of Cook-Toom algorithm

Since this question originated from another one of your questions, I assume that you are talking about using the Toom-Cook algorithm for generic convolution rather than for integer multiplication.

As summarized by the Wikipedia article Toom-Cook multiplication, the algorithm consists of the following steps:

  1. Splitting
  2. Evaluation
  3. Pointwise multiplication
  4. Interpolation
  5. Recomposition

For generic convolution, the recomposition step is not needed, so only four steps remain. The problem with the Toom-Cook algorithm lies in steps 2 and 4, which involve the pseudo-Vandermonde matrix. The article uses Toom-3 as an example, for which the evaluation matrix looks like this

$$ \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & -1 & 1 \\ 1 & -2 & 4 \\ 0 & 0 & 1 \\ \end{pmatrix} $$

This comes from evaluating the polynomial $p(x)=m_0 + m_1x + m_2x^2$ at five different points: $p(0)$, $p(1)$, $p(-1)$, $p(-2)$, $p(\infty)$. The interpolation uses the following matrix inversion

$$ \begin{pmatrix} 1 & 0 & 0 & 0 & 0\\ 1 & 1 & 1 & 1 & 1\\ 1 & -1 & 1 & -1 & 1\\ 1 & -2 & 4 & -8 & 16\\ 0 & 0 & 0 & 0 & 1 \\ \end{pmatrix}^{-1}=\begin{pmatrix} 1 & 0 & 0 & 0 & 0\\ {1\over2} & {1\over3} & -1 & {1\over6} & -2\\ -1 & {1\over2} & {1\over2} & 0 & -1\\ -{1\over2} & {1\over6} & {1\over2} & {-1\over6} & 2\\ 0 & 0 & 0 & 0 & 1 \\ \end{pmatrix} $$
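
To make steps 2 to 4 concrete, here is a minimal Python sketch of a Toom-3 convolution of two length-3 sequences. It hard-codes the two matrices above and uses exact fractions, so it shows the structure of the algorithm rather than an efficient implementation. The names `EVAL`, `INTERP`, `matvec`, `toom3_convolve` and `direct_convolve` are my own, not from the article.

```python
from fractions import Fraction as F

# Evaluation matrix for the points 0, 1, -1, -2, infinity.
EVAL = [
    [1, 0, 0],
    [1, 1, 1],
    [1, -1, 1],
    [1, -2, 4],
    [0, 0, 1],
]

# Inverse of the 5x5 pseudo-Vandermonde matrix, kept as exact fractions.
INTERP = [
    [F(1), F(0), F(0), F(0), F(0)],
    [F(1, 2), F(1, 3), F(-1), F(1, 6), F(-2)],
    [F(-1), F(1, 2), F(1, 2), F(0), F(-1)],
    [F(-1, 2), F(1, 6), F(1, 2), F(-1, 6), F(2)],
    [F(0), F(0), F(0), F(0), F(1)],
]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def toom3_convolve(m, n):
    """Linear convolution of two length-3 sequences via Toom-3."""
    pm = matvec(EVAL, m)                    # step 2: evaluate p at the 5 points
    pn = matvec(EVAL, n)                    # step 2: evaluate q at the 5 points
    r = [a * b for a, b in zip(pm, pn)]     # step 3: 5 pointwise multiplications
    return matvec(INTERP, r)                # step 4: interpolate the 5 coefficients


def direct_convolve(m, n):
    """Schoolbook convolution with 9 multiplications, for comparison."""
    out = [0] * 5
    for i, a in enumerate(m):
        for j, b in enumerate(n):
            out[i + j] += a * b
    return out
```

For example, `toom3_convolve([1, 2, 3], [4, 5, 6])` agrees with `direct_convolve([1, 2, 3], [4, 5, 6])`, which gives `[4, 13, 28, 27, 18]`. Note that the generic `matvec` hides exactly the multiplications by $1/3$ and $1/6$ that make step 4 expensive.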

Now, supposedly Toom-3's advantage is that it reduces 9 multiplications to 5. Indeed, there are only five multiplications in step 3. What is not mentioned is the multiplications that crop up in steps 2 and 4. In step 2, one can be careful to evaluate only at points of the form $x=\pm2^j$ for integer $j$ (alongside $0$ and $\infty$), so that the multiplications can be carried out using bit-shifts. But in step 4, multiplication (or division) by non-powers of 2 is unavoidable. A naive multiplication with the inverted matrix will require at least 4 multiplications (for the entries involving $1/3$ and $1/6$). Following Bodrato's sequence, as suggested by the article, will require only 1 extra multiplication instead of 4.
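
As a sketch of what that looks like, here is Bodrato's interpolation sequence as listed in the Wikipedia article, written in Python for integer inputs. The function name and argument names are my own. All divisions are exact for values that really are evaluations of an integer polynomial product, so `//` and `>>` lose nothing here. For non-integer data you would use ordinary division instead.

```python
def bodrato_interpolate(r0, r1, rm1, rm2, rinf):
    """Recover the 5 product coefficients from r(0), r(1), r(-1), r(-2), r(inf).

    Assumes integer inputs that come from an actual Toom-3 pointwise product.
    """
    c0 = r0
    c4 = rinf
    c3 = (rm2 - r1) // 3               # the only division by a non-power of 2
    c1 = (r1 - rm1) >> 1               # division by 2 as a shift
    c2 = rm1 - r0
    c3 = ((c2 - c3) >> 1) + (rinf << 1)
    c2 = c2 + c1 - c4
    c1 = c1 - c3
    return [c0, c1, c2, c3, c4]


# Pointwise products for (1 + 2x + 3x^2)(4 + 5x + 6x^2) at 0, 1, -1, -2, inf.
print(bodrato_interpolate(4, 90, 10, 162, 18))   # [4, 13, 28, 27, 18]
```

Everything except the single `// 3` is an addition, a subtraction, or a shift, which is where the "1 instead of 4" count comes from.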

The inverted matrix for Toom-3 already looks complicated enough. The matrix for higher-order Toom-Cook will look even worse. In Toom-$k$, the theoretical reduction in multiplication count is from $k^2$ to $2k-1$ (9 to 5 for Toom-3), but in practice it will require more than $2k-1$ multiplications even if you have an optimal interpolation sequence like Bodrato's.
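
If you want to see how the inverse behaves for larger $k$, the following sketch builds the pseudo-Vandermonde matrix for Toom-$k$, inverts it exactly, and counts the entries that are neither $0$ nor $\pm$ a power of two. The point set (`toom_points`) is a hypothetical choice of mine that reproduces $0, 1, -1, -2, \infty$ for $k=3$. Real implementations may pick other points, and the count is only a rough proxy for the cost of a naive matrix multiplication in step 4.

```python
from fractions import Fraction


def toom_points(k):
    """Hypothetical choice of the 2k-3 finite nonzero points: 1, -1, -2, 2, -4, 4, ..."""
    pts, j = [1, -1], 1
    while len(pts) < 2 * k - 3:
        pts += [-(2 ** j), 2 ** j]
        j += 1
    return pts[: 2 * k - 3]


def pseudo_vandermonde(k):
    """(2k-1) x (2k-1) matrix for the points 0, toom_points(k), infinity."""
    n = 2 * k - 1
    rows = [[Fraction(1)] + [Fraction(0)] * (n - 1)]                  # row for p(0)
    rows += [[Fraction(x) ** e for e in range(n)] for x in toom_points(k)]
    rows.append([Fraction(0)] * (n - 1) + [Fraction(1)])              # row for p(inf)
    return rows


def invert(m):
    """Exact Gauss-Jordan inversion over the rationals."""
    n = len(m)
    a = [row[:] + [Fraction(int(i == j)) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        piv = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[piv] = a[piv], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                f = a[r][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [row[n:] for row in a]


def is_shift_only(q):
    """True if q is 0 or +/- a power of two (positive or negative exponent)."""
    num, den = abs(q.numerator), q.denominator
    return num == 0 or (num & (num - 1) == 0 and den & (den - 1) == 0)


def count_true_multiplications(k):
    """Number of inverse-matrix entries that cannot be applied as a pure shift."""
    inv = invert(pseudo_vandermonde(k))
    return sum(not is_shift_only(q) for row in inv for q in row)
```

For $k=3$, `count_true_multiplications(3)` returns 4, matching the four entries $1/3$, $1/6$, $1/6$, $-1/6$ in the matrix above. Calling it with larger $k$ lets you check the trend for this particular point set.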

Another downside is the loss of accuracy. As you use larger $k$, the size of the bit-shifts also increases. This is because you need to evaluate at larger points $p(\pm2^{j})$ where $j<k-1$, and also because the degree of the polynomial becomes larger. With the increasing bit-shift amount, the intermediate numbers involved lose precision, so the final result is also less accurate.
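
A rough way to quantify this growth is to look at the largest entry of the evaluation matrix and of the matrix that has to be inverted for interpolation. The sketch below does that for a hypothetical point set of my own choosing ($1, -1, -2, 2, -4, 4, \dots$, which gives the Toom-3 points above for $k=3$). It reports the shift size in bits, not the actual rounding error, so treat it as an indication of how much headroom a fixed-width format has to give up.

```python
def toom_points(k):
    """Hypothetical choice of the 2k-3 finite nonzero points: 1, -1, -2, 2, -4, 4, ..."""
    pts, j = [1, -1], 1
    while len(pts) < 2 * k - 3:
        pts += [-(2 ** j), 2 ** j]
        j += 1
    return pts[: 2 * k - 3]


def growth_bits(k):
    """Return (evaluation_bits, interpolation_bits) for Toom-k.

    Each value is log2 of the largest absolute entry: x^(k-1) in the
    evaluation matrix and x^(2k-2) in the pseudo-Vandermonde matrix
    used for interpolation, taken at the largest point x.
    """
    biggest = max(abs(x) for x in toom_points(k))
    eval_max = biggest ** (k - 1)
    interp_max = biggest ** (2 * k - 2)
    return eval_max.bit_length() - 1, interp_max.bit_length() - 1


for k in (3, 4, 5):
    print(k, growth_bits(k))
# 3 (2, 4)
# 4 (6, 12)
# 5 (12, 24)
```

With this point set the largest point is $2^{k-2}$, so the shift grows quadratically in $k$: the larger points and the higher degree compound each other.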