Computing a summation of products more efficiently via factorization

46 Views Asked by At

Let $P = \{p_1, \dots, p_n\}$ (may be a multiset). I wish to compute $\sum\limits_{S \subseteq P \\ |S| = k} \prod\limits_{i \in S} i \prod\limits_{i \not\in S} (1-i)$ in an efficient manner. The naive approach takes time proportional to $\binom{n}{k},$ which makes it too slow for the case $n=200, k=100.$

The case where all elements of $P$ are equal is trivial, but even the case where $n$ elements are $p$ and $n$ elements are $q$ reduces to computing $\sum\limits_{i=0}^k \binom{n}{i}\binom{n}{k-i}p^i (1-p)^{n-i} q^{k-i} (1-q)^{n-k+i},$ for which the evaluation of the tricky summation $\sum\limits_{i=0}^k \binom{n}{i}\binom{n}{k-i} x^i$ is required.

2

There are 2 best solutions below

6
On

You can factor this one element at a time. First let’s simplify the problem by dividing by $\prod_{i\in P}(1-i)$; then you just need to determine the sum

$$ a(Q,k)=\sum_{S\subseteq Q\atop|S|=k}\prod_{i\in S}i $$

for $Q=f(P)$, with $f(x)=\frac x{1-x}$. Start with the first element and write

$$ a(Q,k)=q_1a\left(Q\setminus\{q_1\},k-1\right)+a\left(Q\setminus\{q_1\},k\right)\;, $$

and continue recursively for all elements. That yields a Pascalesque triangle in which you only have to perform $O\left(n^2\right)$ operations. Instead of just adding two elements as you do for Pascal’s triangle, here you multiply the left element with a factor $q_i$ associated with the row before adding it to the right element.

Another way to think about it is in terms of generating functions: You’re looking for the coefficient of $x^k$ in

$$ \prod_{i\in Q}(1+ix)\;, $$

or, in your original formulation with $P$, the coefficient of $x^k$ in

$$ \prod_{i\in P}(1-i+ix)\;. $$

Of course you can also write the triangle approach directly in terms of $P$:

$$ a(P,k)=p_1a\left(P\setminus\{p_1\},k-1\right)+(1-p_1)a\left(P\setminus\{p_1\},k\right)\;; $$

it just seemed easier to think of it at first if we only have factors for the selected elements.

0
On

We can make use of generating functions here. Suppose we are given two lists or multisets of equal length $\, P = (p_1,\ldots,p_n)\,$ and $\, Q = (q_1,\dots,q_n).\,$ Use the useful notation $\,[n] := \{1,\ldots,n\}.\,$ Define the sum of products $$ a_k(P,Q) = a_k := \sum_{S\subseteq [n],|S|=k} \prod_{i\in S}P_i \prod_{i\notin S}Q_i. $$ Define the generating function of $\,a_k\,$ to be the polynomial $$ f(x) := \sum_{k=0}^n a_k(P,Q)x^k = \prod_{i=1}^n (x\,p_i+y_i) $$ where the equality to the product follows from the definition of $\,a_k\,$ and the distributivity of multiplication over addition. We want to calculate some of the coefficients of $\,f(x).\,$ That is, $\,a_k=[x^k]f(x).\,$ An efficient approach is to truncate $\,f(x)\,$ and keep coefficients up to order $\,x^k\,$ when evaluating the product. This is a generalization of the binomial theorem for evaluating $\,(p+q)^n\,$ using binomial coefficients and computing them by the usual Pascal triangle recursion.