Parity of the sum of consecutive Bernoulli random variables


$\newcommand{\Var}{\operatorname{Var}}$I have $X_1,X_2,\ldots,X_{n+1}$ i.i.d. random variables, where each $X_i$ is a Bernoulli random variable with parameter $p$, i.e.

$X_i \in \{0,1\}$, $P(X_i=0)=1-p$ and $P(X_i=1)=p$ with $0 \leq p \leq 1$

I define the rvs $Y_i$ for $i=1,2,\ldots,n$ as

  • $Y_i=0$ if $X_i+X_{i+1}$ is even
  • $Y_i=1$ if $X_i+X_{i+1}$ is odd

then I define $S=Y_1+Y_2+\cdots+Y_n$

I have to calculate $E(S)$ and $\Var(S)$.

First I calculate the distribution of $Y_i$

\begin{align} P(Y_i=0) & =P(X_i=0,X_{i+1}=0) + P(X_i=1,X_{i+1}=1) \\[6pt] & =P(X_i=0)P(X_{i+1}=0)+P(X_i=1)P(X_{i+1}=1) \\[6pt] & =(1-p)^2+p^2 \end{align}

$P(Y_i=1)=1-P(Y_i=0)=2p(1-p)$

then $E(Y_i)=2p(1-p)$, and by linearity of expectation

$E(S)=nE(Y_i)=2np(1-p)$
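As a quick sanity check on $E(S)=2np(1-p)$ (not part of the original post; the helper name `exact_ES` is made up for illustration), one can enumerate all $2^{n+1}$ outcomes of $X_1,\ldots,X_{n+1}$ for a small $n$ and compute $E(S)$ exactly:

```python
from itertools import product

def exact_ES(n, p):
    """E[S] computed exactly: sum S over all 2**(n+1) outcomes of X_1..X_{n+1},
    weighting each outcome by its probability."""
    total = 0.0
    for xs in product((0, 1), repeat=n + 1):
        prob = 1.0
        for x in xs:
            prob *= p if x == 1 else 1 - p
        # Y_i = (X_i + X_{i+1}) mod 2, and S = Y_1 + ... + Y_n
        total += prob * sum((xs[i] + xs[i + 1]) % 2 for i in range(n))
    return total

n, p = 5, 0.3
print(exact_ES(n, p))  # agrees with 2*n*p*(1-p) = 2.1 up to float rounding
```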

But what about $\Var(S)$? I can't use $\Var(S)=\Var(Y_1)+\Var(Y_2)+\cdots+\Var(Y_n)$ because the $Y_i$ are dependent. Maybe it would be useful that $Y_i=|X_i-X_{i+1}|$?


There are 2 best solutions below


By your result we have $\operatorname{Var}(S) = E[S^2] - E[S]^2 = E[S^2] - 4 n^2 p^2 (1 - p)^2$. Now $$E[S^2] = E\left[Y_1^2 + \dots + Y_n^2 + 2 \sum_{1 \le i < j \le n} Y_iY_j\right].$$ By the distribution you calculated we have $E[Y_i^2] = E[Y_i] = 2p(1-p)$, so $$E[S^2] = 2 n p (1 - p) + 2 \sum_{1 \le i < j \le n} E[Y_iY_j].$$

Next notice that $Y_i$ and $Y_j$ are independent if $j \ge i+2$, since they then depend on disjoint sets of the $X_k$. In this case we have $E[Y_i Y_j] = E[Y_i] E[Y_j] = 4 p^2 (1-p)^2$. If $j = i+1$, then $Y_i$ and $Y_j$ are not independent, and we have to calculate the distribution of $Y_i Y_{i+1}$. Since $Y_i Y_{i+1} = 1$ exactly when $X_{i+1}$ differs from both $X_i$ and $X_{i+2}$, $$P(Y_i Y_{i+1} = 1) = P(X_i = 0)P(X_{i+1} = 1)P(X_{i+2} = 0) + P(X_i = 1)P(X_{i+1} = 0)P(X_{i+2} = 1) = (1-p)^2 p + p^2 (1-p) = p(1-p),$$ so $E[Y_i Y_{i+1}] = p(1-p)$.

Thus, splitting the $\binom{n}{2}$ pairs into the $n-1$ adjacent ones and the rest, $$E[S^2] = 2n p(1-p) + 2 \left( {n \choose 2} 4p^2 (1-p)^2 - (n-1) 4 p^2 (1-p)^2 + (n-1) p(1-p)\right),$$ which we can simplify to $$E[S^2] = 2(2n-1)p(1-p) + 4(n-1)(n-2)p^2(1-p)^2,$$ so the variance is $$\operatorname{Var}(S) = p(1-p)\bigl(4(n-1)(n-2) p(1-p) + 2(2n-1) - 4n^2p(1-p)\bigr) = 2p(1-p)\bigl(2n - 1 - 2(3n-2)p(1-p)\bigr).$$
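The closed form for the variance can be checked against exact enumeration for small $n$ (not part of the original answer; the helper name `exact_moments` is invented for this sketch):

```python
from itertools import product

def exact_moments(n, p):
    """Exact E[S] and Var(S), obtained by enumerating all 2**(n+1)
    outcomes of X_1..X_{n+1} and accumulating E[S] and E[S^2]."""
    es = es2 = 0.0
    for xs in product((0, 1), repeat=n + 1):
        prob = 1.0
        for x in xs:
            prob *= p if x == 1 else 1 - p
        s = sum((xs[i] + xs[i + 1]) % 2 for i in range(n))  # S = Y_1 + ... + Y_n
        es += prob * s
        es2 += prob * s * s
    return es, es2 - es ** 2

n, p = 6, 0.4
q = p * (1 - p)
_, var_exact = exact_moments(n, p)
var_formula = 2 * q * (2 * n - 1 - 2 * (3 * n - 2) * q)
print(var_exact, var_formula)  # the two values agree up to float rounding
```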


$\newcommand{\var}{\operatorname{var}}$ $\newcommand{\cov}{\operatorname{cov}}$ The variance of the sum is $$ \var(S) = \sum_i \var(Y_i) + \sum_{(i,j)\,:\,i\ne j} \cov(Y_i,Y_j) $$ (where the notation $(i,j)$ indicates that these are ordered rather than unordered pairs, so that, for example, the pairs $(1,2)$ and $(2,1)$ are distinct and correspond to two different terms in this sum).

The variance of a Bernoulli random variable is given by $\var(Y) = (\mathbb E Y)(1-\mathbb E Y)$.

The covariance between two Bernoulli random variables is given by $\cov(Y_i,Y_j)$ $ = \mathbb E(Y_i Y_j) - (\mathbb E Y_i)(\mathbb E Y_j)$.

The covariance is nonzero only for adjacent indices, i.e. the $n-1$ unordered pairs with $j=i+1$, and hence $2(n-1)$ ordered pairs $(i,j)$; the same covariance occurs in each of those, so the variance is just $n\var(Y_1)+2(n-1)\cov(Y_1,Y_2)$.

Notice that $\mathbb E(Y_1 Y_2)= \Pr(Y_1=Y_2=1)=\Pr((X_1,X_2,X_3)\in\{(0,1,0),(1,0,1)\})$.
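Assembling these pieces numerically (a sketch, not part of the original answer; `var_S` is a made-up helper name, and it uses $\mathbb E(Y_1Y_2)=p(1-p)$, which follows from the probability of $(0,1,0)$ or $(1,0,1)$ above), one can confirm the decomposition agrees with the closed form from the other answer:

```python
def var_S(n, p):
    """Var(S) assembled as n*var(Y_1) + 2*(n-1)*cov(Y_1, Y_2)."""
    q = p * (1 - p)              # E[Y_i] = 2q
    var_Y = 2 * q * (1 - 2 * q)  # Bernoulli variance: (E Y)(1 - E Y)
    cov_Y = q - 4 * q * q        # E[Y_1 Y_2] - E[Y_1]E[Y_2] = q - (2q)^2
    return n * var_Y + 2 * (n - 1) * cov_Y

n, p = 7, 0.25
q = p * (1 - p)
# matches the closed form 2q(2n - 1 - 2(3n - 2)q)
print(var_S(n, p), 2 * q * (2 * n - 1 - 2 * (3 * n - 2) * q))
```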