Generalization of Geometric Mean, Standard Deviation, etc.

The geometric mean can be thought of as the exponential of the arithmetic mean of the logarithms of your dataset $\{a_i\}_{i=1}^n$: $$GM = \exp\left(\dfrac{1}{n}\sum_{i=1}^{n}\ln(a_i)\right)$$

Similarly, the standard deviation of a dataset is the square root of the arithmetic mean of the squares of your errors $\{e_i\}_{i=1}^{n}$: $$SD = \sqrt{\dfrac{1}{n}\sum_{i=1}^{n}(e_i)^2}$$

I find it interesting that both of these ideas seem to follow a more general pattern: $$f^{-1}\left( \dfrac{1}{n} \sum_{i=1}^{n} f(x_i) \right),$$ where the function in question is $f(x) = x^2$ for the standard deviation and $f(x) = \ln x$ for the geometric mean. Even the arithmetic mean is trivially of this form, with $f(x)= x$ the identity function.
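To make the pattern concrete, here is a minimal Python sketch. The name `f_mean` and the sample data are my own, chosen only for illustration, and the caller is assumed to supply a matching pair `f` and `f_inv`.

```python
import math
from typing import Callable, Sequence


def f_mean(xs: Sequence[float],
           f: Callable[[float], float],
           f_inv: Callable[[float], float]) -> float:
    """Apply f to each value, take the arithmetic mean, then undo f."""
    if not xs:
        raise ValueError("need at least one value")
    return f_inv(sum(f(x) for x in xs) / len(xs))


data = [1.0, 2.0, 4.0]  # illustrative values, all positive

arithmetic = f_mean(data, lambda x: x, lambda y: y)          # f is the identity
geometric = f_mean(data, math.log, math.exp)                 # f = ln, needs x > 0
root_mean_square = f_mean(data, lambda x: x * x, math.sqrt)  # f = x^2, needs x >= 0

print(arithmetic, geometric, root_mean_square)
```

With `f = x^2` the function computes the root mean square of whatever it is given; it becomes the standard deviation above only when the inputs are the errors $e_i$.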


Are there other widely-used variations of these kinds of "functional" averages? And is there anything we can say, more universally, about these kinds of averages as a whole?

In particular, I find it interesting that all three of the averages mentioned above, when applied to two values, give a result that lies between the two data points. What would have to be true about a function $f(x)$ for this "midpoint" property to hold? For example, $\arcsin\left( \frac{1}{2} (\sin a + \sin b)\right)$ is, for most choices of $a$ and $b$, not between $a$ and $b$.
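A quick numerical check of that failure, with $a = 2$ and $b = 3$ picked arbitrarily for illustration:

```python
import math

a, b = 2.0, 3.0  # arbitrary sample points outside [-pi/2, pi/2]

# "Average" built from sin and arcsin, following the same pattern as above.
m = math.asin((math.sin(a) + math.sin(b)) / 2)

print(m)                            # about 0.55
print(min(a, b) <= m <= max(a, b))  # False
```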

Best answer

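This type of average is called a quasi-arithmetic mean (or $f$-mean) when $f$ is continuous and injective on an interval containing the data. Such an $f$ is strictly monotonic, and so is its inverse $f^{-1}$, which is what guarantees the midpoint property. Your $\sin$ example fails because $\sin$ is not injective on $\mathbb{R}$: $\arcsin$ inverts it only on $[-\pi/2, \pi/2]$.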
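Spelled out for two points $a \le b$ and a strictly increasing $f$ (a sketch; the decreasing case is symmetric, with the first chain of inequalities reversed and $f^{-1}$ decreasing):

$$f(a) \le \tfrac{1}{2}\bigl(f(a)+f(b)\bigr) \le f(b) \quad\Longrightarrow\quad a \le f^{-1}\!\left(\tfrac{1}{2}\bigl(f(a)+f(b)\bigr)\right) \le b.$$

The implication applies the increasing function $f^{-1}$ to all three terms. The middle term is in the domain of $f^{-1}$ because the continuous image of an interval is an interval. Replacing $a$ and $b$ by $\min_i x_i$ and $\max_i x_i$ gives the same bound for $n$ points.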

As you observed, choosing $f(x)=\ln(x)$ (or $f(x)=\log_a(x)$ for any positive $a \neq 1$) results in the geometric mean. Another notable example is $f(x)=\frac{1}{x}$ on the positive reals, which gives the harmonic mean.
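Both claims are easy to check numerically. The following is a small sketch with made-up data; `log_mean` is a hypothetical helper name, not a library function.

```python
import math

data = [1.0, 2.0, 4.0]  # illustrative positive values
n = len(data)

# f(x) = 1/x is its own inverse, so the f-mean is n / sum(1/x_i).
harmonic = 1.0 / (sum(1.0 / x for x in data) / n)


def log_mean(xs, base):
    """f-mean with f = log_base and inverse y -> base**y."""
    return base ** (sum(math.log(x, base) for x in xs) / len(xs))


print(harmonic)                                # 12/7, about 1.714
print(log_mean(data, 2), log_mean(data, 10))   # both about 2.0
```

The base drops out because $\log_a x = \ln x / \ln a$: the constant factor $1/\ln a$ passes through the average and is undone by the inverse.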

For any such $f$, the $f$-mean $M_f(\vec{x})=f^{-1}\left(\frac{1}{n} \sum_{i=1}^n f(x_i)\right)$ enjoys many other properties that would be expected of an averaging function (see the Wikipedia page). It is easy to show, for instance, that the $f$-mean is idempotent: for all $x$, $M_f(x,\ldots,x)=x$.
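The idempotency claim follows in one line from the definition, using only that $f^{-1}$ is a left inverse of $f$:

$$M_f(x,\ldots,x)=f^{-1}\left(\frac{1}{n}\sum_{i=1}^n f(x)\right)=f^{-1}\left(\frac{1}{n}\cdot n\,f(x)\right)=f^{-1}\bigl(f(x)\bigr)=x.$$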