Axiomatic approach to the definition of variance


I'm trying to grasp the intuition behind the definition of variance. It seems plausible that we want to measure how much a random variable deviates from its expected value. But why use the square, exactly?

From what I can see, we are interested in an assignment of the form $X\mapsto E(f(|E(X)-X|))$ for some strictly monotone $f$ with $f(0)=0$ and $f(1)=1$. Are there further properties of the variance which, if used as axioms, let us derive $f(x)=x^2$?

For example, would additivity w.r.t. independent random variables, i.e. $$E(f(|E(X+Y)-X-Y|))=E(f(|E(X)-X|))+E(f(|E(Y)-Y|))$$ for $X,Y$ independent, suffice as such an axiom?
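For reference, with $f(x)=x^2$ the proposed axiom is exactly $\operatorname{Var}(X+Y)=\operatorname{Var}(X)+\operatorname{Var}(Y)$ for independent $X,Y$. Here is a quick exact check on small discrete distributions (a sketch of my own; the distributions are made up for illustration):

```python
from itertools import product

# With f(x) = x**2 the proposed axiom is just Var(X+Y) = Var(X) + Var(Y)
# for independent X, Y.  Verify it exactly on small discrete laws.

def E(dist, g):
    """Expectation of g(value) for dist = {value: probability}."""
    return sum(p * g(v) for v, p in dist.items())

def deviation(dist, f):
    """E[f(|E(X) - X|)] for the distribution dist."""
    mu = E(dist, lambda v: v)
    return E(dist, lambda v: f(abs(mu - v)))

f = lambda x: x**2
X = {0: 0.25, 1: 0.5, 3: 0.25}   # arbitrary example distributions
Y = {-2: 0.5, 2: 0.5}

# Law of X + Y under independence: convolve the two distributions.
XY = {}
for (x, px), (y, py) in product(X.items(), Y.items()):
    XY[x + y] = XY.get(x + y, 0) + px * py

lhs = deviation(XY, f)
rhs = deviation(X, f) + deviation(Y, f)
print(abs(lhs - rhs) < 1e-12)  # True
```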


Accepted answer

Yes, additivity for independent random variables does suffice.

To simplify matters a bit, we may assume $E[X] = 0$ and $E[Y]=0$. Let's also assume $X$ and $Y$ are bounded, to avoid questions about the existence of expected values. Also, since you only ever apply $f$ to absolute values of random variables, we may define $f$ to be an even function on $\mathbb R$. I'll also assume $f$ is continuous.

Now you want an even function $f$ such that $E[f(X+Y)] = E[f(X)] + E[f(Y)]$ for bounded independent random variables with $E[X] = E[Y]=0$. By linearity of expectation, this is equivalent to $E[f(X+Y) - f(X) - f(Y)] = 0$.

In particular, for constants $s$ and $t$, consider independent $X$ and $Y$ such that $P(X=s)=P(X=-s)=1/2$ and $P(Y=t)=P(Y=-t)=1/2$. Then $E[f(X)] = (f(s) + f(-s))/2 = f(s)$, $E[f(Y)] = f(t)$ similarly, and $E[f(X+Y)] = (f(s+t) + f(s-t))/2$. Thus our equation becomes

$$ \dfrac{f(s+t) + f(s-t)}{2} - f(s) - f(t) = 0 $$

Note that for $s=t=0$ we get $f(0) = 0$. Now take $s = kt$ for integers $k$: the equation becomes $f((k+1)t) + f((k-1)t) = 2f(kt) + 2f(t)$, and induction on $k$ yields $$ f(k t) = k^2 f(t). $$ Hence for rationals $a/b$, $$f\left(\frac{a}{b}\right) = a^2 f\left(\frac1b\right) = \frac{a^2}{b^2} f(1),$$ using $f(1) = b^2 f(1/b)$ in the second step. By continuity, this extends to the reals: $f(x) = x^2 f(1)$. With the normalization $f(1) = 1$, you have $f(x) = x^2$.
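The induction can be spot-checked numerically (my own sketch): setting $s=kt$ in the functional equation gives the recurrence $f((k+1)t)=2f(kt)+2f(t)-f((k-1)t)$, and iterating it from $f(0)=0$ reproduces $f(kt)=k^2 f(t)$:

```python
# Iterate the recurrence f((k+1)t) = 2 f(kt) + 2 f(t) - f((k-1)t),
# obtained by setting s = k t in the functional equation, from f(0) = 0.

def f_multiples(c, kmax):
    """Values f(0), f(t), ..., f(kmax*t) forced by the recurrence, given f(t) = c."""
    vals = [0.0, c]
    for k in range(1, kmax):
        vals.append(2 * vals[k] + 2 * c - vals[k - 1])
    return vals

c = 0.25                 # arbitrary value for f(t) (dyadic, so float arithmetic is exact)
vals = f_multiples(c, 10)
assert all(vals[k] == k**2 * c for k in range(11))
print(vals[:4])  # [0.0, 0.25, 1.0, 2.25]
```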

I'm pretty sure that, as with the Cauchy functional equation, the assumption of continuity may be replaced by measurability (and we certainly need $f$ to be measurable, else $E[f(X)]$ would be undefined for, say, uniform random variables).

Another answer

I started formulating this proof before I saw the recent version of Robert's answer. It's pretty much the same idea, but I still want to write it down.

Let $\Omega$ consist of $4$ elements with equal probabilities, I'll describe random variables over $\Omega$ just as $4$-tuples.

Now for $a\ge b\ge 0$ let $X=(a,a,-a,-a)$ and $Y=(b,-b,b,-b)$. These two are independent, and we have $E(f(|X+Y|))=(1/2)f(a-b)+(1/2)f(a+b)$, while $E(f(|X|))+E(f(|Y|))=f(a)+f(b)$. So if we know three of the four values $f(a-b)$, $f(a+b)$, $f(a)$ and $f(b)$, then the fourth one is uniquely determined by our axiom and the fact that $E(X)=E(Y)=0$.
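To double-check the expectations above, here is a small exact computation on the four-point space (my own addition, using $f(x)=x^2$ as one concrete instance and arbitrary sample values for $a,b$):

```python
from fractions import Fraction

# Exact check of the expectations over the four equiprobable points,
# with f(x) = x**2 as a concrete instance of f.
f = lambda x: x * x
a, b = Fraction(5, 7), Fraction(2, 7)   # any a >= b >= 0

X = [a, a, -a, -a]
Y = [b, -b, b, -b]
lhs = sum(f(abs(x + y)) for x, y in zip(X, Y)) / 4   # E f(|X+Y|)
assert lhs == Fraction(1, 2) * f(a - b) + Fraction(1, 2) * f(a + b)
assert sum(f(abs(x)) for x in X) / 4 + sum(f(abs(y)) for y in Y) / 4 == f(a) + f(b)
print(lhs)  # 29/49, which equals f(a) + f(b) since f is x^2
```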

First we see that $f$ is determined at the points $2^{-n}$ by induction on $n$, starting from $f(1)=1$: for the induction step choose $a=b=2^{-n-1}$, which (using $f(0)=0$) gives $f(2^{-n-1})=f(2^{-n})/4$. Then, by induction on $m$, $f$ is also determined at $m2^{-n}$: for the induction step choose $a=m2^{-n}$, $b=2^{-n}$.
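The two induction steps can be carried out mechanically. The following sketch (my own, assuming only $f(0)=0$ and $f(1)=1$) computes the forced values of $f$ on a dyadic grid and confirms they agree with $x^2$:

```python
from fractions import Fraction

# Starting only from f(0) = 0 and f(1) = 1, the two induction steps
# pin down f on the dyadic grid m * 2**-n; the values agree with x**2.

def dyadic_values(n, m_max):
    """f on {m * 2**-n : 0 <= m <= m_max}, forced by the axiom."""
    t = Fraction(1, 2**n)
    f = {Fraction(0): Fraction(0), Fraction(1): Fraction(1)}
    # Halving step (a = b = 2**-(k+1)):  f(t/2) = f(t) / 4.
    for k in range(n):
        f[Fraction(1, 2**(k + 1))] = f[Fraction(1, 2**k)] / 4
    # Stepping up (a = m t, b = t):
    #   f((m+1) t) = 2 f(m t) + 2 f(t) - f((m-1) t).
    for m in range(1, m_max):
        f[(m + 1) * t] = 2 * f[m * t] + 2 * f[t] - f[(m - 1) * t]
    return f

f = dyadic_values(3, 20)
assert all(v == x * x for x, v in f.items())
print(f[Fraction(5, 8)])  # 25/64
```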

So $f$ is uniquely determined on a dense subset of ${\bf R}_{\ge 0}$, where it must equal $x\mapsto x^2$. Since $x\mapsto x^2$ is continuous on ${\bf R}_{\ge 0}$ and $f$ is monotone, all the missing values are forced as well, giving $f(x)=x^2$ everywhere.