Computing the covariance of flipping a coin in the first and last flips.


A fair coin is flipped $30$ times. Let $X$ denote the number of heads among the first $20$ coin flips and $Y$ denote the number of heads among the last $20$ coin flips. Compute the correlation coefficient of $X$ and $Y$.

I have set the indicator $I_{i}$ to be $1$ when the $i^{th}$ flip is heads. Then $$ X=I_{1}+\dots+I_{20}\\ Y=I_{11}+\dots+I_{30} $$ which yields $$\operatorname{Cov}(X,Y)=\sum_{j=1}^{20}\sum_{i=11}^{30}\operatorname{Cov}(I_{i},I_{j})=2\operatorname{Cov}(I_{i},I_{j})=2E[I_{1}I_{2}]-2E[I_{1}]E[I_{2}] $$ since the events are exchangeable. But I am stuck with $$P(\text{first two flips are heads})-P(\text{first flip is heads})^2=0, $$ which obviously isn't the case.


There are 3 best solutions below

BEST ANSWER

I will let $J_1 = I_1+\dots+I_{10}$, $J_2 = I_{11}+\dots+I_{20}$, $J_3 = I_{21}+\dots+I_{30}$, so that $X=J_1+J_2$ and $Y=J_2+J_3$. Then \begin{align*} \text{Corr}(X,Y) &= \frac{\text{Cov}(J_1+J_2,J_2+J_3)}{\text{SD}(X)\text{SD}(Y)}\\ & = \frac{\text{Cov}(J_1,J_2)+\text{Var}(J_2)+\text{Cov}(J_1,J_3)+\text{Cov}(J_2,J_3)}{\sqrt{20(1/4)\cdot 20(1/4)}}\tag{1}\\ &=\frac{10\cdot (1/4)}{20\cdot (1/4)}\\ &=\frac{1}{2} \end{align*} where in $(1)$ the covariances are zero because the three blocks are disjoint, so the numbers of heads in different blocks are independent, and $\text{Var}(J_2)=10\cdot(1/4)$ because $J_2$ is a sum of $10$ independent indicators, each with variance $1/4$.
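As a sanity check on the block decomposition above, here is a minimal Python sketch that computes the same quantities exactly with rational arithmetic. It assumes only that each block count is Binomial$(10, 1/2)$ and that the three blocks are independent; the names `block_pmf` and `expectation` are illustrative and not part of the answer.

```python
from fractions import Fraction
from itertools import product
from math import comb

# PMF of the number of heads in one block of 10 fair flips: Binomial(10, 1/2).
block_pmf = {k: Fraction(comb(10, k), 2**10) for k in range(11)}


def expectation(f):
    """Exact E[f(J1, J2, J3)] for three independent block counts."""
    return sum(
        block_pmf[a] * block_pmf[b] * block_pmf[c] * f(a, b, c)
        for a, b, c in product(range(11), repeat=3)
    )


mean_x = expectation(lambda a, b, c: a + b)  # X = J1 + J2
mean_y = expectation(lambda a, b, c: b + c)  # Y = J2 + J3
cov_xy = expectation(lambda a, b, c: (a + b) * (b + c)) - mean_x * mean_y
var_x = expectation(lambda a, b, c: (a + b) ** 2) - mean_x**2
var_y = expectation(lambda a, b, c: (b + c) ** 2) - mean_y**2

# Var(X) equals Var(Y) here, so SD(X) * SD(Y) = Var(X) and no square root is needed.
corr = cov_xy / var_x
print(cov_xy, var_x, var_y, corr)
```

Under these assumptions the script should reproduce $\text{Cov}(X,Y)=5/2$, $\text{Var}(X)=\text{Var}(Y)=5$, and a correlation of $1/2$.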


We use your notation. Apart from standard formulas, all we need is $E(XY)$.

Let $U$ be the sum of the first $10$ $I_j$, $V$ the sum of the next $10$, and $W$ the sum of the last $10$. We want $E((U+V)(V+W))$, which is $$E(UV)+E(UW)+E(VW)+E(V^2).$$ The first three items are easy by independence. And $E(V^2)=\text{Var}(V)+(E(V))^2$.
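Carrying the arithmetic through, as a worked check: each of $U$, $V$, $W$ is a sum of $10$ independent fair indicators, so each has mean $5$ and variance $10/4$. Then $$E(UV)=E(UW)=E(VW)=5\cdot 5=25,\qquad E(V^2)=\tfrac{10}{4}+5^2=\tfrac{55}{2},$$ $$E(XY)=3\cdot 25+\tfrac{55}{2}=\tfrac{205}{2},\qquad \text{Cov}(X,Y)=E(XY)-E(X)E(Y)=\tfrac{205}{2}-10\cdot 10=\tfrac{5}{2}.$$ Since $\text{Var}(X)=\text{Var}(Y)=20\cdot\tfrac14=5$, dividing by $\sqrt{5\cdot 5}=5$ gives a correlation of $\tfrac12$.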


They are not just exchangeable. Some of the indicators are for the exact same coin flip; the rest are independent. That's the reason others have partitioned $X,Y$ into three blocks. Using your notation, we just separate the sum.

$\begin{align} \mathsf {Cov}(X,Y) & = \mathsf {Cov}\Big(\sum_{i=1}^{20}I_i,\ \sum_{j=11}^{30}I_j\Big) \\[1ex] & = \sum_{i=1}^{20}\sum_{j=11}^{30}\mathsf {Cov}(I_i,I_j) \\[1ex] & = \sum_{i=11}^{20}\mathsf {Cov}(I_i,I_i) + \sum_{\substack{1\le i\le 20,\ 11\le j\le 30\\ i\neq j}}\mathsf {Cov}(I_i,I_j) \\[1ex] & = \sum_{i=11}^{20}\Big(\mathsf {E}(I_i^2)-\mathsf E(I_i)^2\Big) +0 \\[1ex] & = \sum_{i=11}^{20}\Big(1^2\cdot\mathsf {P}(I_i=1)-\big(1\cdot\mathsf P(I_i=1)\big)^2\Big) \\[1ex] & = 10\Big(\tfrac 1 2-\tfrac 1 4\Big) \\[1ex] & = \tfrac {10} 4 \end{align}$
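The double sum above can also be tallied mechanically. The following is a small Python sketch, not part of the derivation itself, that assumes $\mathsf{Cov}(I_i,I_j)=1/4$ when $i=j$ and $0$ otherwise (independent fair flips); `cov_indicator` is a hypothetical helper name.

```python
from fractions import Fraction


def cov_indicator(i, j):
    # Same flip: Cov(I_i, I_i) = Var(I_i) = 1/2 - 1/4 = 1/4.
    # Different flips are independent, so the covariance is 0.
    return Fraction(1, 4) if i == j else Fraction(0)


# X uses flips 1..20, Y uses flips 11..30.
pairs = [(i, j) for i in range(1, 21) for j in range(11, 31)]
shared = sum(1 for i, j in pairs if i == j)
cov_xy = sum(cov_indicator(i, j) for i, j in pairs)
print(len(pairs), shared, cov_xy)
```

Of the $400$ index pairs, only the $10$ with $i=j$ contribute, which is where the factor of $10$ (rather than $2$) comes from.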

Why would you anticipate this to be zero? $X$ and $Y$ are obviously going to have some linear relation, sharing ten component data points as they do.
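To see that linear relation empirically, here is a rough Monte Carlo sketch in Python using only the standard library. The function name `simulate_corr`, the trial count, and the seed are arbitrary choices for illustration, and the estimate will vary with them.

```python
import random


def simulate_corr(trials=20_000, seed=0):
    """Estimate Corr(X, Y) by simulating 30 fair flips per trial."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(trials):
        flips = [rng.randint(0, 1) for _ in range(30)]
        xs.append(sum(flips[:20]))  # heads among flips 1..20
        ys.append(sum(flips[10:]))  # heads among flips 11..30
    mean_x = sum(xs) / trials
    mean_y = sum(ys) / trials
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / trials
    var_x = sum((x - mean_x) ** 2 for x in xs) / trials
    var_y = sum((y - mean_y) ** 2 for y in ys) / trials
    return cov / (var_x * var_y) ** 0.5


estimate = simulate_corr()
print(estimate)
```

With this many trials the estimate should land close to the exact value $1/2$, though it will not match it exactly.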