Covariance of Bernoulli random variables

434 Views Asked by At

Let $X_1,X_2,X_3,X_4\in\{-1,1\}$ be four independent Bernoulli random variables satisfying $\Pr[X_i=-1]=p=1-\Pr[X_i=1]$.

It is quite straightforward to calculate $\mathbb{E}[X_iX_j]=(1-2p)^2+\delta_{ij}4p(1-p)$, where $\delta_{ij}=1$ if $i=j$ and $0$ otherwise.

I would like to calculate $C=\mathbb{E}[X_iX_jX_kX_\ell]-\mathbb{E}[X_iX_j]\mathbb{E}[X_kX_\ell]$ for $i,j,k,\ell\in\{1,2,3,4\}$. If $i=j$ or $k=\ell$, then $C=0$. If $i\neq j$ and $k\neq\ell$, I have to enumerate various cases such as $i=k$, $i\neq k$, etc, but there are quite a few cases to consider. Is there an easy way to calculate $C$?

1

There are 1 best solutions below

2
On BEST ANSWER

To compute $C$, we study the dependency structure between $X_i X_j$ and $X_k X_l$ using a graphical representation. Let $\mathsf{G}$ be the undirected graph with the vertex set $\mathsf{V} = \{i, j, k, l\}$ and edge set $\mathsf{E} = \{ \{i, j\}, \{k, l\}\}$. Then

\begin{align*} C &= \prod_{v \in \mathsf{V}} \mathbf{E}[X_v^{\deg(v)}] - \prod_{\{e_1, e_2\} \in \mathsf{E}} \mathbf{E}[X_{e_1}X_{e_2}] \\ &= \prod_{v \in \mathsf{V}} (1 - 2p)^{\mathbf{1}[\text{$\deg(v)$ is odd}]} - \prod_{e \in \mathsf{E}} (1 - 2p)^{2\cdot\mathbf{1}[\text{$e$ is not a loop}]} \\ &= (1 - 2p)^{[\text{# of odd-degree vertices}]} - (1 - 2p)^{2\cdot[\text{# of non-loop edges}]} \end{align*}

where $\deg(v)$ is the degree of the vertex $v$ and $\mathbf{1}[\ldots]$ is the indicator function notation. This shows that the value of $C$ depends only on the graph structure of $\mathsf{G}$, and so, it suffices to enumerate all the possibilities of $\mathsf{G}$, compute the corresponding $C$, and if necessary, find the conditions for $i, j, k, l$ realizing $\mathsf{G}$:

$\qquad\mathsf{G}\qquad$ $C$ (where $\mu = 1 - 2p$) Relations among $i, j, k, l$
graph 1 $C = \mu^0 - \mu^{2\cdot0} = 0 $ $i = j = k = l$
graph 2 $C = \mu^0 - \mu^{2\cdot 0} = 0 $ $ i = j \neq k = l $
graph 3 $C = \mu^2 - \mu^{2\cdot 1} = 0$ $\begin{cases} i = j = k \neq l, \text{ or} \\ i = j = l \neq k, \text{ or} \\ i = k = l \neq k, \text{ or} \\ j = k = l \neq i \end{cases}$
graph 4 $C = \mu^0 - \mu^{2\cdot 2} = 1 - \mu^4 $ $\begin{cases} i = k \neq j = l, \text{ or} \\ i = l \neq j = k \end{cases}$
graph 5 $C = \mu^2 - \mu^{2\cdot 1} = 0 $ $\begin{cases} i = j, k, l \text{ distinct, or} \\ i, j, k = l \text{ distinct} \end{cases}$
graph 6 $C = \mu^2 - \mu^{2\cdot 2} = \mu^2(1 - \mu^2) $ $\begin{cases} i = k, j, l \text{ distinct, or} \\ i = l, j, k \text{ distinct, or} \\ j = k, i, l \text{ distinct, or} \\ j = l, i, k \text{ distinct} \\ \end{cases}$
graph 7 $C = \mu^4 - \mu^{2\cdot 2} = 0 $ $i, j, k, l \text{ all distinct}$

Remarks.

  1. In each of the cases 2, 5, 7, the two edges have no common vertices at all, implying that $X_i X_j$ and $X_k X_l$ are independent. This provides another reason why we have $C = 0$.

  2. Also note that $C = 0$ if $\mathsf{G}$ contains loops. Indeed, if $i = j$ so that the edge $\{i, j\}$ is a loop, then $X_i X_j = 1$ and hence $C = \mathbf{Cov}(1, X_k X_l) = 0$. This gives another explanation as to why we get $C = 0$ in each of the cases 1, 2, 3, 5.