Law of Total Probability and Variance

Assume that $X$ is a discrete random variable with support $\left\{x_{1},x_{2},\dots,x_{n} \right\}$. Consider $\Pr \left[ f \left( X \right) <z \right]$. According to the Law of Total Probability, $\Pr \left[ f \left( X \right) <z \right]=E\left[ \Pr \left[f\left( X \right) <z \mid X \right] \right]= \sum_{i=1}^{n} \Pr \left[ f\left( X \right) <z \mid X=x_{i} \right] \Pr\left[X=x_{i} \right].$

The question is the following: can we plug $x_i$ directly into $f(X)$ to get $\sum_{i=1}^{n} \Pr \left[ f\left( x_i \right) <z \mid X=x_{i} \right] \Pr\left[X=x_{i} \right]$?

If $f(X)=X$, then the equations above work just fine. However, if $f(X)=\text{Var}(X)$, then we end up with nonsense. Indeed, $\Pr \left[ \text{Var} \left( X \right) <z \right]=\sum_{i=1}^{n} \Pr \left[ \text{Var}\left( X \right) <z \mid X=x_{i} \right] \cdot \Pr\left[X=x_{i} \right]=\sum_{i=1}^{n} \Pr \left[ \text{Var}\left( x_{i} \right) <z \mid X=x_{i} \right] \cdot \Pr\left[X=x_{i} \right] =\sum_{i=1}^{n} \Pr \left[ 0 <z \mid X=x_{i} \right] \cdot \Pr\left[X=x_{i} \right]$, which equals $\sum_{i=1}^{n} 1 \cdot \Pr\left[X=x_{i} \right]=1$ if $z > 0$, and $0$ if $z \leq 0$.

So why does the Law of Total Probability (or this version of it) not work in the case of variance?

There is 1 best solution below

Your mistake is in saying that on the event $\{X=x_i\}$ we have $\text{Var}(X)=\text{Var}(x_i)$. That is not the case. $\text{Var}(X)$ is simply a constant.
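
To spell out what the conditional probability looks like once $\text{Var}(X)$ is treated as a constant, here is a short worked version. The symbols $\sigma^2=\text{Var}(X)$ and the indicator $\mathbf{1}\{\cdot\}$ are notation introduced here, not in the question, and I am assuming $\Pr[X=x_i]>0$ for every $i$:

$$\Pr\left[\sigma^2<z \mid X=x_i\right]=\mathbf{1}\{\sigma^2<z\}, \qquad \Pr\left[\text{Var}(X)<z\right]=\sum_{i=1}^{n}\mathbf{1}\{\sigma^2<z\}\,\Pr\left[X=x_i\right]=\mathbf{1}\{\sigma^2<z\}.$$

So the Law of Total Probability still holds here. It simply returns $1$ or $0$ according to whether the fixed number $\text{Var}(X)$ is below $z$, and conditioning on $X=x_i$ changes nothing.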

A random variable $X$ can be described as a function. Here, perhaps $X$ is a function from $\{1,2,\dots, n\}$ with $X(i)=x_i$. This is a deterministic function, but we think about the input as being random. So the probability that $X$ takes on the value $x_i$ is the probability that $i$ is the input to the function $X$.

Now when we write $f(X)$, this is really a composition of functions: on input $i$ we have $f(X)(i)=(f\circ X)(i)=f(x_i)$. So $f(X)$ is itself a random variable, that is, another function with domain $\{1,2,\dots,n\}$. When we apply $f$ to the random variable $X$ we get out another random variable $f(X)$. Keep in mind that this $f$ must have a domain containing the range of the random variable $X$ (in this case, $\{x_1,\dots,x_n\}$).
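
If it helps to see the composition concretely, here is a minimal Python sketch. Everything in it is a made-up illustration rather than part of the question: the sample space `{0, 1, 2, 3}` (0-indexed instead of `1..n`), the uniform probabilities, the values in `xs`, the choice $f(x)=x^2$, and the threshold `z`. It checks that plugging $x_i$ into $f$ gives the same answer as computing $\Pr[f(X)<z]$ directly.

```python
from fractions import Fraction

# Hypothetical sample space {0, 1, 2, 3} with uniform probabilities.
prob = {i: Fraction(1, 4) for i in range(4)}
xs = [-2, -1, 1, 3]  # illustrative values x_i


def X(i):
    """The random variable X as a deterministic function of the input i."""
    return xs[i]


def f(x):
    """An ordinary function on numbers; its domain contains the range of X."""
    return x * x


def f_of_X(i):
    """The composition f o X: another random variable on the same inputs."""
    return f(X(i))


z = 5

# Direct computation: total probability of the inputs i with f(X(i)) < z.
direct = sum(p for i, p in prob.items() if f_of_X(i) < z)

# Law of Total Probability with x_i plugged into f:
# sum over the support of 1{f(x_i) < z} * Pr[X = x_i].
support = sorted({X(i) for i in prob})


def pr_X_equals(x):
    return sum(p for i, p in prob.items() if X(i) == x)


plugged_in = sum((1 if f(x) < z else 0) * pr_X_equals(x) for x in support)

print(direct, plugged_in)
```

The two numbers agree because `f` accepts the values $x_i$ as inputs, which is exactly the condition stated above.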

However, when we write $\text{Var}(X)$, this is not a composition of functions. There is no input to $\text{Var}(X)$. Rather, $\text{Var}$ is a function from the set of random variables (with finite second moment) to $\mathbb{R}$. So $\text{Var}$ takes in a random variable $X$ and spits out a number $\text{Var}(X)$. Notice that $\text{Var}$'s domain does not contain the range of $X$, since $\text{Var}$ takes in random variables, not numbers. So we cannot say $f=\text{Var}$ and apply the formula you derived for $f$.
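
The same point can be sketched in Python, again with a hypothetical sample space, probabilities, and values chosen only for illustration. Here `variance` takes a whole random variable (a function) plus the probabilities and returns a single number, so there is nothing to "plug $x_i$ into". Writing $\text{Var}(x_i)$ silently swaps $X$ for a different, constant random variable.

```python
from fractions import Fraction

# Hypothetical sample space {0, 1, 2, 3} with uniform probabilities.
prob = {i: Fraction(1, 4) for i in range(4)}
xs = [-2, -1, 1, 3]  # illustrative values x_i


def X(i):
    return xs[i]


def expectation(rv, prob):
    """Maps a random variable (a function of i) to a number."""
    return sum(rv(i) * p for i, p in prob.items())


def variance(rv, prob):
    """Also maps a random variable to a number; it never sees a single x_i."""
    mean = expectation(rv, prob)
    return expectation(lambda i: (rv(i) - mean) ** 2, prob)


var_X = variance(X, prob)  # one fixed number for the whole distribution

# "Var(x_i)" is the variance of a *different* random variable: the constant x_i.
var_of_constant = variance(lambda i: X(0), prob)


def pr_var_below(z):
    """Law of Total Probability applied to the event {Var(X) < z}."""
    total = Fraction(0)
    for i, p in prob.items():
        # The conditional probability is an indicator that does not depend on i.
        conditional = 1 if var_X < z else 0
        total += conditional * p
    return total


print(var_X, var_of_constant, pr_var_below(5), pr_var_below(3))
```

In this toy setup `var_X` is a single fraction, while `var_of_constant` is `0`, which is where the "nonsense" in the question comes from.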

The reason this may be confusing is that we don't like to write random variables as functions, since it's easier to just think of them as numbers that take on random values. Hence the notation gets overloaded, so that $f(X)$ denotes a random variable while $\text{Var}(X)$ denotes a number.