Does parallelism factor in when deriving the computational complexity of a parallel algorithm?
Suppose I have a perfect binary tree $T$ with leaves numbered $1$ to $n$, and an algorithm $\operatorname{Sum}()$ that computes the sum $1 + 2 + \dotsb + n$ in this way:
$\operatorname{Sum}(T)=$
If $T$ is a leaf node $k$, return $k$. Otherwise:
1. $a \leftarrow \operatorname{Sum}(\text{left subtree of }T)$
2. $b \leftarrow \operatorname{Sum}(\text{right subtree of }T)$
3. Return $a + b$.
The total number of additions performed is $2^0 + 2^1 + 2^2 + \dotsb + n/2 = n - 1$, so the sequential algorithm takes $O(n)$ time. However, if steps (1) and (2) are performed in parallel, is the computational complexity $O(\log n)$ instead?
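For concreteness, here is a small Python sketch of the recursion (my own illustration, not part of the algorithm statement above): each subtree is represented implicitly by the range of leaf labels it covers, and a counter tallies the additions.

```python
def sum_tree(lo, hi, counter):
    """Sum the leaves lo..hi of a perfect binary subtree, counting additions."""
    if lo == hi:                        # base case: a single leaf numbered k
        return lo
    mid = (lo + hi) // 2
    a = sum_tree(lo, mid, counter)      # step (1): recurse on the left subtree
    b = sum_tree(mid + 1, hi, counter)  # step (2): recurse on the right subtree
    counter[0] += 1                     # step (3): one addition per internal node
    return a + b

n = 16                                  # n must be a power of two (perfect tree)
adds = [0]
print(sum_tree(1, n, adds))             # 136 = 1 + 2 + ... + 16
print(adds[0])                          # 15 = n - 1 additions in total
```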
You (or the compiler) would have to rewrite the algorithm, but as long as you have at least $n/2$ processors and communication between them takes constant time, the parallel version does indeed run in $O(\log n)$ time, since all of the additions at a given level of the tree can be performed simultaneously.
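As a sanity check (my own sketch, not part of the answer), one can simulate that level-by-level schedule in Python: with enough processors, every level of the tree costs one round, so the number of rounds equals the depth $\log_2 n$.

```python
import math

def parallel_rounds(n):
    """Simulate summing leaves 1..n (n a power of two) with at least n/2
    processors: all additions on a level happen in the same round."""
    values = list(range(1, n + 1))
    rounds = 0
    while len(values) > 1:
        # one round: pair up adjacent values; all sums done "simultaneously"
        values = [values[i] + values[i + 1] for i in range(0, len(values), 2)]
        rounds += 1
    return values[0], rounds

total, rounds = parallel_rounds(16)
print(total, rounds, math.log2(16))     # 136, 4 rounds, log2(16) = 4.0
```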
Added: If we're restricted to a fixed number $k$ of processors, first observe that the top $O(\log k)$ levels of the tree (consisting of $O(k)$ nodes, no level of which is wider than $k$) can be summed in $O(\log k)$ parallel steps, as I noted above. However, for a tree with $N \gg k$ leaves, that still leaves $O(N - k)$ of the $N - 1$ additions to perform, and $k$ processors can carry out at most $k$ of them per step, so these take $O((N-k)/k)$ steps. Consequently, it takes $O(\log k + (N-k)/k) = O(N/k + \log k)$ steps to sum a tree with $N$ leaves, which is still $O(N)$ for any fixed $k$: a constant number of processors buys only a constant-factor speedup.
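To see the $O(N/k + \log k)$ behaviour numerically, here is a rough Python sketch (my own, with hypothetical names; `n` plays the role of $N$ above) of the straightforward schedule: process the tree level by level from the bottom, where a level with $m$ pending additions costs $\lceil m/k \rceil$ rounds on $k$ processors.

```python
import math

def rounds_with_k_processors(n, k):
    """Level-by-level schedule for summing n leaves with k processors:
    a level with m pending additions costs ceil(m / k) rounds."""
    rounds = 0
    m = n // 2                  # additions on the lowest internal level
    while m >= 1:
        rounds += math.ceil(m / k)
        m //= 2                 # each level above has half as many additions
    return rounds

n, k = 2**20, 4
print(rounds_with_k_processors(n, k))   # 262145: roughly n/k plus a few extra rounds
print(n // k)                            # 262144
```

The rounds for the wide levels (those with at least $k$ additions) sum to $O(n/k)$, and the narrow top levels contribute one round each, i.e. $O(\log k)$ more, matching the bound above.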