An approximation of combinatorials using the geometric mean?


I'm reading a paper that uses a surprisingly accurate way of approximating a certain combinatorial expression. I'd like to understand why it is so effective before utilising it myself, but I just can't see how it works so well!

The problem is as follows:

Consider a sequence of $m$ numbers $\{ a_1, a_2, \dots , a_m \}$, where $0 < a_i < 1$ for every $i$. Consider also the following expression, which, for each $k$ from $0$ to $m$, takes the average of all $\binom{m}{k}$ products of $k$ distinct $a_i$ values and sums the results:

\begin{equation} 1 + \frac{1}{\binom{m}{1}} ( a_1 + a_2 + ... + a_m ) + \frac{1}{\binom{m}{2}} ( a_1 a_2 + a_1 a_3 + ... a_{m-1} a_m ) + ... + (a_1 a_2 ... a_m) \end{equation}

This can become incredibly computationally expensive for large $m$, so the paper says that this can be approximated using: \begin{equation} (a_1 a_2 \cdots a_m)^{0} + (a_1 a_2 \cdots a_m)^{\frac{1}{m}} + (a_1 a_2 \cdots a_m)^{\frac{2}{m}} + \dots + (a_1 a_2 \cdots a_m)^{1} \end{equation}
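To make the two expressions concrete, here is a quick Python sketch (my own code, not from the paper; the function names are arbitrary). The exact sum is evaluated via the elementary symmetric polynomials $e_k$ in $O(m^2)$ time, rather than by enumerating all $\binom{m}{k}$ products directly:

```python
from math import comb, prod

def exact_sum(a):
    """The exact expression: sum over k of e_k(a) / C(m, k), where
    e_k is the k-th elementary symmetric polynomial of the a_i."""
    m = len(a)
    e = [0.0] * (m + 1)
    e[0] = 1.0
    for ai in a:  # build the e_k incrementally, one a_i at a time
        for k in range(m, 0, -1):
            e[k] += ai * e[k - 1]
    return sum(e[k] / comb(m, k) for k in range(m + 1))

def geom_approx(a):
    """The paper's approximation: a geometric series in (a_1 ... a_m)^(1/m)."""
    m = len(a)
    p = prod(a)  # the full product a_1 a_2 ... a_m
    return sum(p ** (k / m) for k in range(m + 1))

print(exact_sum([0.2, 0.5, 0.8]), geom_approx([0.2, 0.5, 0.8]))
```

Note that when all the $a_i$ are equal to some $a$, each $e_k$ is exactly $\binom{m}{k}a^k$, so the two expressions coincide; the approximation error therefore measures how spread out the $a_i$ are.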

The paper claims that this approximation is always correct to within 3% for $m$ up to 50 when tested. Anyone think they can offer an explanation as to why this is a sensible approximation? The appearance of binomial coefficients makes me think there's some sort of clever expansion being used but I can't find anything relevant.

The paper simply quotes that "A good approximation of <the above> uses the fact that the calculation of <the above> involves multiplications of all combinations of $a_i$. Therefore, we might reduce <the above> to a geometric series using the geometric average of the respective $a_i$ values."

An example for $m = 3$ is that: \begin{equation} 1 + \frac{1}{3} (a_1 + a_2 + a_3) + \frac{1}{3} (a_1 a_2 + a_1 a_3 + a_2 a_3) + (a_1 a_2 a_3) \end{equation} is approximated by \begin{equation} (a_1 a_2 a_3)^0 + (a_1 a_2 a_3)^{\frac{1}{3}} + (a_1 a_2 a_3)^{\frac{2}{3}} + (a_1 a_2 a_3) \end{equation}

As a final note, the application is that a competition matrix is being transformed into a transition matrix, though I've reduced the problem above to its bare bones. The paper is here, however.

Many thanks for any insight that can be given!

Best Answer

Testing the claim for $m=2$, let $f,g$ be given by \begin{align*} f&=1+\Bigl({\small{\frac{1}{2}}}\Bigr)(a_1+a_2)+a_1a_2\\[4pt] g&=1+(a_1a_2)^{\large{\frac{1}{2}}}+a_1a_2\\[4pt] \end{align*} where $0 < a_1,a_2 < 1$.

The author's claim is that $g$ approximates $f$ to within $3$%.

But if $a_1$ approaches $1$ from below, and $a_2$ approaches $0$ from above, the ratio ${\large{\frac{g}{f}}}$ approaches ${\large{\frac{2}{3}}}$, which is an error of approximately $33$%.
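This limit is easy to check numerically (a quick sketch of my own, not part of the original answer):

```python
# f and g for m = 2, with a_1 near 1 and a_2 near 0
a1, a2 = 1 - 1e-9, 1e-9
f = 1 + 0.5 * (a1 + a2) + a1 * a2  # exact expression
g = 1 + (a1 * a2) ** 0.5 + a1 * a2  # geometric-mean approximation
print(g / f)  # close to 2/3
```

Here $f$ tends to $\frac{3}{2}$ while $g$ tends to $1$, which gives the ratio $\frac{2}{3}$.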

But how good would the claimed approximation be on average if $a_1,...,a_m$ are chosen independently and uniformly at random from $(0,1)$?

For each positive integer $m$, let $f_m$ be the exact expression, and let $g_m$ be the expression the author wants to use as an approximation.

For $2\le m\le 5$, using random values for $a_1,...,a_m$, we get the following simulation results: \begin{array}{c|c|c} m&\;\text{avg of}\;{\frac{g_m}{f_m}}\;&\;\text{avg error}\;\\ \hline 2&.9632&3.7\%\\ \hline 3&.9348&6.5\%\\ \hline 4&.9108&8.9\%\\ \hline 5&.8935&10.7\%\\ \hline \end{array} so the author's $3$% claim appears to fail even for random values of $a_1,...,a_m$.
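A simulation along these lines can be sketched as follows (my own Python; the exact sum $f_m$ is computed via elementary symmetric polynomials, and the trial count and seed are arbitrary choices):

```python
import random
from math import comb, prod

def ratio(a):
    """g_m / f_m for the given values a_1, ..., a_m."""
    m = len(a)
    e = [0.0] * (m + 1)  # elementary symmetric polynomials e_0 .. e_m
    e[0] = 1.0
    for ai in a:
        for k in range(m, 0, -1):
            e[k] += ai * e[k - 1]
    f = sum(e[k] / comb(m, k) for k in range(m + 1))  # exact expression
    p = prod(a)
    g = sum(p ** (k / m) for k in range(m + 1))  # geometric approximation
    return g / f

random.seed(0)
for m in range(2, 6):
    avg = sum(ratio([random.random() for _ in range(m)])
              for _ in range(100_000)) / 100_000
    print(m, round(avg, 4))
```

Since $g_m/f_m = 1$ exactly when all the $a_i$ coincide, the average ratio drifting below $1$ as $m$ grows reflects the increasing spread among the random $a_i$.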