I need to be able to explain the complexity of three Fast Fourier Transform algorithms: Cooley-Tukey's, Bluestein's and Prime-factor algorithm.
Unfortunatelly, I'm a little lost in the process.
Discrete Fourier Transform general formula
\begin{align} x &= \{x_0, ... , x_{N-1}\}\\ 0 &\leq n \leq N-1 \\ X_n &= \sum^{N-1}_{k=0}x_k \cdot e^{\frac{-i\cdot 2\pi \cdot k \cdot n}{N}} \end{align}
Here's the complexity pretty clear (at least I think :-) ):
- We have a fixed number of multiplications and one division in the sum $\Rightarrow$ complexity of the sum content is $$O(1)$$
- The sum itself contains a loop from 0 to N-1, so the loop runs N-times $\Rightarrow$ the complexity of the whole sum is $$O(N) \cdot O(1) = O(N)$$
- And now, we have this sum for every n $\Rightarrow$ the sum will be "executed" N-times (the size of n is N) $\Rightarrow$ The complexity is $$O(N) \cdot O(N) = O (N^2)$$
Cooley-Tukey's algorithm
The point is in dividing the sum according to the Danielson-Lanczos lemma: $$ X_n = \sum^{\frac{N}{2}-1}_{k=0} x_{2n} \cdot e^{\frac{-i\cdot 2\pi \cdot k \cdot n}{\frac{N}{2}}} + e^{\frac{-i \cdot 2\pi \cdot n}{N}}\sum^{\frac{N}{2}-1}_{k=0} x_{2n+1} \cdot e^{\frac{-i\cdot 2\pi \cdot k \cdot n}{\frac{N}{2}}} $$
Every sum can be recursively divided $log_2(N)$ times:
N
N/2 N/2
N/4 N/4 N/4 N/4
...
1 ... 1
- Every sum has $O(1)$ complexity as above, so the only thing that matters is the number of iterations: $O(N), O(2\cdot \frac{N}{2}), O(4 \cdot \frac{N}{4})$... So, every step has the complexity $$O(N)$$.
- We have $log_2(N)$ steps, so the complexity of $X_N$ will be $$O(N \cdot log N)$$
- But still, this is the result just for one $X_n \Rightarrow$ Why is not the complexity multiplied by 'n' again? (as in the step 3 in the general DFT) Or is there a mistake in previous steps?
Bluestein's algorithm
The point is here in rewriting the formula to the "special" shape, where the sum is the convolution of two sequences.
- The multiplication of the sum and the exponential before it has a constant complexity $$O(1)$$
- Because the sum is the convolution, we can use the convolution theorem, so we must compute Fourier transforms for both sequences (which can need to be padded with zeros) and multiply them. Multiplication and zero padding are $O(1)$, Fourier transforms can be again $O(N \cdot log N)$ (we'll use Cooley-Tukey for them), so the complexity is $$ 2 \cdot O(N \cdot log N) + O(1) = O(N \cdot log N) $$
- And again - this is just for one $X_n$ - Why shouldn't I multiply the complexity by O(N) for every 'n'?
Prime-factor algorithm
And here I have simply no idea how should I compute the complexity... I'd say that two nested sums are like two loops, so $O(N^2)$, but it's obviously wrong...
The FFT complexity is usually given in terms of number of multiplications ($mul$) and additions ($add$).
General formula
That's correct $$O(N^2 \times mul + N^2 \times add)$$
Cooley-Tukey
First remember the algorithm: $$X_k = E_k + t_k O_k, k \le \frac N 2$$ $$X_k = E_{k-\frac N 2} - t_{k-\frac N 2}O_{k-\frac N 2}, k > \frac N 2$$ where $E$ is DFT of the even indexed samples, $O$ DFT of the odd indexed samples, and $t$ is the twiddle factor.
We also know that $N = 2^L,L\ge 0$
The number of operations to compute $X$ of size $N=2^L$ if $E,O,t$ are known is: $$2^{L-1} \times mul + 2^{L} \times add$$ To compute the $E$ and $O$ we need to do the computation twice but of size $2^{L-1}$ and so on. For a $i \le L$ we have to do the computation $2^{L-i}$ times.
Putting this together the total complexity is: $$\sum_{i=1}^{L}2^{L-i}O(2^{i-1} \times mul + 2^{i} \times add) = \sum_{i=1}^{L}O(2^{L-1} \times mul + 2^{L} \times add) = O(L\cdot 2^{L-1} \times mul + L \cdot 2^{L} \times add) = O(log(N)\frac N 2 \times mul + log(N)N\times add)$$
Bluestein's algorithm
Unfortunately, I'm not familiar with this algorithm
Prime-factor algorithm
$$N = N_1\cdot N_2$$ where $N_1$ and $N_2$ are relatively prime.
Notice that the algorithm is essentially DFT applied twice. The inner sum is DFT of size $N_1$ that has to be calculated for each $n_2$ (i.e. $N_2$ times). The outer sum is DFT of size $N_2$ calculated $N_1$ times. The complexity thus is:
$$O(N_2\cdot O(DFT_{N_1}) + N_1 \cdot O(DFT_{N_2})) = O(N_2\cdot N_1^2\times (mul + add) + N_1 \cdot N_2^2\times (mul + add)) = O(N\cdot (N_1+N_2)\times (mul + add)) $$
For trivial factorization $N_1=N,N_2=1$, the complexity is $O(N^2)$, but for $N_1$ and $N_2$ of similar size, this is close to $O(N \sqrt N)$.