Special case of the variance decomposition formula


The end of the preamble of the Wikipedia page for the law of total variance provides the following formula for the variance of $X$, where $A_1,A_2,\ldots,A_n$ is a partition of the outcome space (i.e., the events $A_1,A_2,\ldots,A_n$ are mutually exclusive and exhaustive):

$$\tag{1}\operatorname{Var}(X)=\sum_{i=1}^n\operatorname{Var}(X|A_i)\operatorname{P}(A_i)-2\sum_{i=1}^n\sum_{j=1}^{i-1}\operatorname{E}(X|A_i)\operatorname{P}(A_i)\operatorname{E}(X|A_j)\operatorname{P}(A_j)$$

This is given without proof. I don't quite see how it follows from the variance decomposition formula in the same article, reproduced here with the variables $X$ and $Y$ renamed to $A$ and $X$, respectively, so that they correspond to equation (1):

$$\operatorname{Var}(X)=\operatorname{E}_A(\operatorname{Var}(X|A))+\operatorname{Var}_A(\operatorname{E}(X|A))\tag{2}$$

I think that $\operatorname{E}_A(\operatorname{Var}(X|A))=\sum_{i=1}^n\operatorname{Var}(X|A_i)\operatorname{P}(A_i)$, so the first terms on the right-hand sides of (1) and (2) match. However, the second term in (1) can be negative (e.g., when $\operatorname{E}(X|A_i)>0$ for all $i=1,2,\ldots,n$), whereas the second term in (2) is necessarily nonnegative (since a variance cannot be negative). Thus, I am confused.
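To make the sign observation concrete, here is a minimal sketch with made-up numbers: two equally likely events and $X$ constant on each, so every $\operatorname{Var}(X|A_i)$ is zero. The lists `p`, `m`, `v` and the variable names are illustrative assumptions, not taken from the article. The sketch evaluates the two terms on the RHS of (1) and the second term of (2), using exact fractions.

```python
from fractions import Fraction as F

# Hypothetical toy partition: two equally likely events, X constant on each.
p = [F(1, 2), F(1, 2)]  # P(A_1), P(A_2)
m = [F(1), F(3)]        # E(X | A_i)
v = [F(0), F(0)]        # Var(X | A_i); zero because X is constant on A_i

# First term on the RHS of (1) and (2): sum_i Var(X|A_i) P(A_i).
first_term = sum(vi * pi for vi, pi in zip(v, p))

# Second term on the RHS of (1): -2 sum_{j<i} E(X|A_i)P(A_i) E(X|A_j)P(A_j).
cross_term = -2 * sum(m[i] * p[i] * m[j] * p[j]
                      for i in range(len(p)) for j in range(i))

# Second term on the RHS of (2): Var_A(E(X|A)).
mean = sum(mi * pi for mi, pi in zip(m, p))
var_of_cond_mean = sum(pi * (mi - mean) ** 2 for mi, pi in zip(m, p))

print(first_term, cross_term, var_of_cond_mean)
```

In this toy case the second term of (1) is $-3/2$ while the second term of (2) is $1$, so the two expressions cannot both equal $\operatorname{Var}(X)$ as written.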

Perhaps I am missing something. Could anyone clarify and/or provide a reference with derivation of (1)?

Best answer

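Let $\Omega$ denote the sample space and let ${\bf{1}}_{A_i}$ be the indicator of the event $A_i$. Since the $A_i$ partition $\Omega$, we have $\sum_i{\bf{1}}_{A_i}={\bf{1}}_{\Omega}$, and therefore

$$
\begin{aligned}
\mathbb{V}\left(X\right)=\mathbb{V}\left(X{\bf{1}}_{\Omega}\right)
&=\mathbb{V}\left(X\sum_{i}{\bf{1}}_{A_i}\right)=\mathbb{V}\left(\sum_{i}X{\bf{1}}_{A_i}\right)\\
&=\sum_{i}\mathbb{V}\left(X{\bf{1}}_{A_i}\right)+2\sum_{i<j}\mathbb{C}ov\left(X{\bf{1}}_{A_i},X{\bf{1}}_{A_j}\right)\\
&=\sum_{i}\mathbb{V}\left(X{\bf{1}}_{A_i}\right)-2\sum_{i<j}\mathbb{E}\left[X{\bf{1}}_{A_i}\right]\mathbb{E}\left[X{\bf{1}}_{A_j}\right]\\
&=\sum_{i}\mathbb{V}\left(X\vert A_i\right)P(A_i)+\sum_{i}\mathbb{E}\left[X\vert A_i\right]^2P(A_i)\left(1-P(A_i)\right)-2\sum_{i<j}\mathbb{E}\left[X\vert A_i\right]P(A_i)\,\mathbb{E}\left[X\vert A_j\right]P(A_j)\,.
\end{aligned}
$$

The third line uses ${\bf{1}}_{A_i}{\bf{1}}_{A_j}=0$ for $i\neq j$, so that $\mathbb{C}ov\left(X{\bf{1}}_{A_i},X{\bf{1}}_{A_j}\right)=-\mathbb{E}\left[X{\bf{1}}_{A_i}\right]\mathbb{E}\left[X{\bf{1}}_{A_j}\right]$. The last line uses $\mathbb{E}\left[X{\bf{1}}_{A_i}\right]=\mathbb{E}\left[X\vert A_i\right]P(A_i)$ and

$$
\mathbb{V}\left(X{\bf{1}}_{A_i}\right)=\mathbb{E}\left[X^2\vert A_i\right]P(A_i)-\mathbb{E}\left[X\vert A_i\right]^2P(A_i)^2=\mathbb{V}\left(X\vert A_i\right)P(A_i)+\mathbb{E}\left[X\vert A_i\right]^2P(A_i)\left(1-P(A_i)\right).
$$

Note that the middle sum does not appear in (1) as quoted in the question. With it included, the last two sums together equal $\operatorname{Var}_A(\operatorname{E}(X|A))\geq 0$, in agreement with (2).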
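The indicator decomposition can also be sanity-checked numerically. The sketch below uses a made-up finite sample space; the outcomes, probabilities and the helper names `mean` and `var` are illustrative assumptions, not part of the question. Using exact rational arithmetic, it checks that $\mathbb{V}(X)=\sum_i\mathbb{V}(X{\bf{1}}_{A_i})-2\sum_{i<j}\mathbb{E}[X\vert A_i]P(A_i)\mathbb{E}[X\vert A_j]P(A_j)$, and that each $\mathbb{V}(X{\bf{1}}_{A_i})$ equals $\mathbb{V}(X\vert A_i)P(A_i)+\mathbb{E}[X\vert A_i]^2P(A_i)(1-P(A_i))$.

```python
from fractions import Fraction as F

# Hypothetical finite sample space: (probability, value of X, index i of the event A_i).
outcomes = [
    (F(1, 10), F(-2), 0),
    (F(2, 10), F(1), 0),
    (F(3, 10), F(4), 1),
    (F(1, 10), F(0), 1),
    (F(3, 10), F(5), 2),
]
n = 3  # number of events in the partition


def mean(pairs):
    """E[Y] for a list of (probability, value) pairs."""
    return sum(p * y for p, y in pairs)


def var(pairs):
    """Var(Y) = E[Y^2] - E[Y]^2 for a list of (probability, value) pairs."""
    return mean([(p, y * y) for p, y in pairs]) - mean(pairs) ** 2


var_x = var([(p, x) for p, x, _ in outcomes])

prob, cond_mean, cond_var, var_ind = [], [], [], []
for i in range(n):
    p_i = sum(p for p, _, k in outcomes if k == i)
    # Conditional law of X given A_i.
    cond = [(p / p_i, x) for p, x, k in outcomes if k == i]
    # Law of X * 1_{A_i}: equals X on A_i and 0 elsewhere.
    ind = [(p, x if k == i else F(0)) for p, x, k in outcomes]
    prob.append(p_i)
    cond_mean.append(mean(cond))
    cond_var.append(var(cond))
    var_ind.append(var(ind))

# 2 * sum_{j<i} E[X|A_i] P(A_i) E[X|A_j] P(A_j)
cross = 2 * sum(cond_mean[i] * prob[i] * cond_mean[j] * prob[j]
                for i in range(n) for j in range(i))

# Indicator decomposition of Var(X).
assert var_x == sum(var_ind) - cross

# Per-event identity relating Var(X 1_{A_i}) to the conditional moments.
for i in range(n):
    extra = cond_mean[i] ** 2 * prob[i] * (1 - prob[i])
    assert var_ind[i] == cond_var[i] * prob[i] + extra
```

Exact fractions are used instead of floats so that the identities can be asserted with `==` rather than with a tolerance.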