Why should ${\sum_1^n {s_i} \over \sum_1^n {t_i}} = {n \over \sum_1^n{t_i \over s_i}}$? (harmonic mean)


I came across this identity while reading about computer performance. The book presents it in words: dividing the total distance by the total time gives the average execution rate, which is the left-hand side. The right-hand side is supposed to give the same value using the harmonic mean. I do not understand how this is proved.

I think I can prove it in the case where the distance is constant, $s_i = s$ for all $i$: $${\sum_1^ns_i \over \sum_1^n t_i} = {n \over \sum_1^n{t_i/s_i}}$$

because $${\sum_1^n {s_i} \over \sum_1^n {t_i}} = {ns \over \sum_1^n {(s/s) t_i}} = {n \over \sum{t_i \over s}} = {n \over \sum{t_i \over s_i}}$$ in this case.
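The constant-distance case can also be checked numerically. This is a minimal sketch in Python with exact rational arithmetic; the sample data and the function names `average_rate` and `harmonic_mean_rate` are my own choices for illustration.

```python
from fractions import Fraction

def average_rate(s, t):
    """Total distance divided by total time."""
    return Fraction(sum(s), sum(t))

def harmonic_mean_rate(s, t):
    """Unweighted harmonic mean of the rates s_i / t_i, i.e. n / sum(t_i / s_i)."""
    return len(s) / sum(Fraction(ti, si) for si, ti in zip(s, t))

# Illustrative data (an assumption): the same distance s = 7 on every trip.
t = [3, 5, 11, 2]
s = [7] * len(t)

print(average_rate(s, t))        # 4/3
print(harmonic_mean_rate(s, t))  # 4/3
```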

This is confirmed by the Wikipedia article, which says that the harmonic mean works for equidistant trips, $s_1 = s_2 = \ldots = s$, in Algebra and Physics. However, the book shows that this identity also holds for varying travel distances $s_i$ over tests $i$, $s_i = [130,160,115,252,187]$ (with corresponding execution times $t_i=[321,436,284,601,482]$, which give execution rates $v_i = s_i/t_i = [405,367,405,419,388]/1000$), in Table 3.2 on page 32! The identity seems to hold, because this is what the book says and I have checked it numerically: $${\sum_1^n s_i \over \sum_1^n t_i} = {130+160+115+252+187 \over 321+436+284+601+482 } = {844 \over 2124} = 0.39\ldots$$ is equal to $$ {n \over \sum{t_i/s_i}} = {5 \over 1/0.405 + 1/0.367 + 1/0.405 + 1/0.419 + 1/0.388} = 0.39\ldots$$ indeed.

So there seems to be no need to restrict the harmonic mean to the case of constant workload; it can be applied universally.

Surprisingly, when I plug in arbitrary numbers, say $s_1=20, s_2 = 30, t_1 = 10, t_2 = 90$, I get the average rate

$$Rate_{average} = \frac{s_1+s_2}{t_1+t_2} = {20+30 \over 10+90} = 50/100 = 1/2$$

whereas the harmonic mean of the rates gives a different value

$$Rate_{harmonic} = {n \over {{t_1 \over s_1} + {t_2 \over s_2}}} = {2 \over {{10\over 20} + {90 \over 30}}} = {2 \over 1/2 + 3} = 4/7 \neq Rate_{average}$$
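The same computation in exact arithmetic, as a quick sketch in Python (the variable names are mine):

```python
from fractions import Fraction

s = [20, 30]  # distances
t = [10, 90]  # times

# Total distance over total time.
rate_average = Fraction(sum(s), sum(t))

# Unweighted harmonic mean of the rates s_i / t_i: n / sum(t_i / s_i).
rate_harmonic = len(s) / sum(Fraction(ti, si) for si, ti in zip(s, t))

print(rate_average)   # 1/2
print(rate_harmonic)  # 4/7
```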

What is wrong with the book's method? Which of the average rates is correct? Why is the whole discrepancy they get just rounding error, while I see something more substantial?

update

The book's author has contacted me and apologized for the inconvenience. He keeps a list of errata; in particular, the weighted harmonic mean should have been used in the example. I still do not understand how the simple data in the table happen to work with the simple harmonic mean. This closes the question. Thanks for the answers; I've picked one at random.

There are 3 best solutions below

0
On BEST ANSWER

I could not open the book from your link, so I am missing the context and the author's exact claims. However,

What is wrong with the book's method?

As @Avitus has explained, the statement doesn't hold in general. You yourself have shown that it holds when $s_i = s_j$ for all $i,j$. If this condition fails but your input is "close" (in some sense) to such a constant case, your result will also be close (but not the same). Note that, for your example:

$$\frac{844}{2124} \approx 0.39736346516007532956,$$

which seems close, but is not equal to

$$\frac{5}{\frac{321}{130} + \frac{436}{160} + \frac{284}{115} + \frac{601}{252} + \frac{482}{187}} \approx 0.39600018497296518509.$$
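These two values can be reproduced with a short sketch in Python. It uses exact fractions and converts to floating point only for printing; the variable names are mine, and the data are the ones quoted in the question.

```python
from fractions import Fraction

s = [130, 160, 115, 252, 187]  # distances from the question
t = [321, 436, 284, 601, 482]  # execution times from the question

# Total distance over total time: 844 / 2124.
total_rate = Fraction(sum(s), sum(t))

# Unweighted harmonic mean of the rates: n / sum(t_i / s_i).
harmonic = len(s) / sum(Fraction(ti, si) for si, ti in zip(s, t))

print(float(total_rate))             # about 0.39736
print(float(harmonic))               # about 0.39600
print(float(total_rate - harmonic))  # about 0.00136, small but not zero
```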

Which of the average rates is correct?

When it comes to averages, there is no "correct one". Each has certain interpretations and properties, and which one you choose is a matter of your aims and assumptions.

Why is the whole discrepancy they get just rounding error, while I see something more substantial?

In the above example, the discrepancy is simply hidden by rounding. Both results are $0.39$ when truncated to $2$ decimal digits, but with more digits they no longer agree. In your second example, $4/7 \approx 1/2$ (up to the first decimal digit), so it's not all that different.

If you go very far from a "balanced" case, you get nowhere near the formula. For example:

$$s = (1, 100), \quad t = (1, 2)$$

yields

$$\frac{1+100}{1+2} = \frac{101}{3} = 33.\overline{6} \not \approx 1.96078431372549019607 \approx \frac{2}{\frac{1}{1} + \frac{2}{100}}.$$

0
On

Without further assumptions on the $t_i$ and $s_i$, the statement is not true in general. Let

$$H(s_1,\dots,s_n)=\frac{n}{\sum_i \frac{1}{s_i}}$$

be the harmonic mean of $(s_1,\dots,s_n)$. The weighted harmonic mean with weights $T=(t_1,\dots,t_n)$ is $$H_T(s_1,\dots,s_n)=\frac{\sum_i t_i}{\sum_i \frac{t_i}{s_i}}. $$

We have

$$H_T(s_1,\dots,s_n)=H(s_1,\dots,s_n) $$

if $T=(t,t,\dots,t)$, $t\neq 0$.

In this setting the statement

$$\frac{\sum_i s_i}{\sum_i t_i}=\frac{n}{\sum_i \frac{t_i}{s_i}} $$

is equivalent to

$$\frac{\sum_i s_i}{\sum_i t_i}=\frac{n}{\sum_i t_i}H_T(s_1,\dots,s_n)$$

or

$$H_T(s_1,\dots,s_n)=\frac{\sum_i s_i}{n},$$

which is not true, as the r.h.s. is independent of $T$. For example, with $n=2$, $s=(1,0.5)$ and $T=(1,1)$ one arrives at the contradiction

$$\frac{2}{3}=H_T(1,0.5)=\frac{3}{4}. $$
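For completeness, here is a minimal sketch in Python of the two means defined above. It also checks one observation that is not part of the argument above but is easy to verify by hand: if the values are the rates $v_i = s_i/t_i$ and the weights are the distances $s_i$, the weighted harmonic mean reduces to $\sum_i s_i / \sum_i t_i$. The function names and the sample distances and times are my own assumptions.

```python
from fractions import Fraction

def harmonic_mean(values):
    """H(x_1, ..., x_n) = n / sum(1 / x_i)."""
    return len(values) / sum(1 / Fraction(x) for x in values)

def weighted_harmonic_mean(values, weights):
    """H_W(x_1, ..., x_n) = sum(w_i) / sum(w_i / x_i)."""
    return sum(weights) / sum(
        Fraction(w) / Fraction(x) for x, w in zip(values, weights)
    )

# The counterexample from the text: s = (1, 0.5), T = (1, 1).
s = [Fraction(1), Fraction(1, 2)]
T = [1, 1]
print(weighted_harmonic_mean(s, T))  # 2/3
print(sum(s) / len(s))               # 3/4

# Equal weights give back the plain harmonic mean.
print(harmonic_mean(s))              # 2/3

# Hypothetical distances and times: weighting the rates by distance
# recovers total distance over total time.
dists = [20, 30]
times = [10, 90]
rates = [Fraction(d, h) for d, h in zip(dists, times)]
print(weighted_harmonic_mean(rates, dists))  # 1/2
print(Fraction(sum(dists), sum(times)))      # 1/2
```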

0
On

Introducing $u_k=t_k/s_k$, this is equivalent to the assertion that $$ \sum_{k=1}^ns_k\cdot\sum_{k=1}^nu_k=n\cdot\sum_{k=1}^ns_ku_k, $$ which is false in general, as nearly any example shows. Partial results when $n\geqslant2$ (Chebyshev's sum inequality) are that, if $(s_k)$ and $(u_k)$ are both increasing or both decreasing, then $$ \sum_{k=1}^ns_k\cdot\sum_{k=1}^nu_k\lt n\cdot\sum_{k=1}^ns_ku_k, $$ and that, if $(s_k)$ is increasing and $(u_k)$ is decreasing, or the other way round, then $$ \sum_{k=1}^ns_k\cdot\sum_{k=1}^nu_k\gt n\cdot\sum_{k=1}^ns_ku_k. $$
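Both inequalities are easy to try out on small sequences. The following is only a sketch in Python; the helper name `sides` and the sample sequences are assumptions chosen for illustration.

```python
def sides(s, u):
    """Return (sum(s) * sum(u), n * sum(s_k * u_k))."""
    n = len(s)
    return sum(s) * sum(u), n * sum(sk * uk for sk, uk in zip(s, u))

s = [1, 2, 5]        # increasing
u_up = [1, 2, 4]     # increasing
u_down = [4, 2, 1]   # decreasing
u_const = [2, 2, 2]  # constant rate t_k / s_k

lhs, rhs = sides(s, u_up)
print(lhs, rhs, lhs < rhs)   # 56 75 True

lhs, rhs = sides(s, u_down)
print(lhs, rhs, lhs > rhs)   # 56 39 True

# With constant u_k the two sides coincide.
lhs, rhs = sides(s, u_const)
print(lhs, rhs, lhs == rhs)  # 48 48 True
```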