Out of $n$ individual voters at an election, $r$ vote Republican and $n-r$ vote Democrat. At the next election, the probability of a Republican switching to vote Democrat is $p_1$, and the probability of a Democrat switching to vote Republican is $p_2$. Suppose individuals behave independently. Find $Var(X)$, where $X$ is the number of Republican votes at the second election.
Let $I_i$ be the indicator that the $i$th person votes for Republican, $1 \le i \le n$
$Var(X)$
$=Var(I_1 + I_2 + \ldots + I_n)$
$=Var(I_1) + Var(I_2) + \ldots + Var(I_n)$ since individuals behave independently.
$=\sum_{i=1}^{n}Var(I_i)$
$E(I_i) = P(I_i = 1)$
$= P($Voted for Republican$)P($Not Switching | Voted for Republican$) $
$+ P($Voted for Democrat$)P($Switching | Voted for Democrat$)$
$= \frac{r}{n} \cdot (1-p_1) + \frac{n-r}{n} \cdot p_2$
$= \frac{r(1-p_1)}{n} + \frac{(n-r)p_2}{n}$
$Var(I_i)$
$=E(I_i^2) - (E(I_i))^2$
$= (\frac{r(1-p_1)}{n} + \frac{(n-r)p_2}{n}) - (\frac{r(1-p_1)}{n} + \frac{(n-r)p_2}{n})^2$
We have:
$Var(X)$
$=\sum_{i=1}^{n}Var(I_i)$
$=\sum_{i=1}^{n}\left((\frac{r(1-p_1)}{n} + \frac{(n-r)p_2}{n}) - (\frac{r(1-p_1)}{n} + \frac{(n-r)p_2}{n})^2\right)$
$=n\left((\frac{r(1-p_1)}{n} + \frac{(n-r)p_2}{n}) - (\frac{r(1-p_1)}{n} + \frac{(n-r)p_2}{n})^2\right)$
Textbook Answer:
$r(1-p_1)p_1+(n-r)p_2(1-p_2)$
I checked the 2 solutions with $p_1 = 0.3, p_2 = 0.4, n = 8, r= 4$ and my solution gave 1.98, but the textbook solution gave 1.8.
I'm not sure where I went wrong, because I used the same method for another question and got it right.
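For anyone who wants to reproduce the numerical check, here is a minimal Python sketch. The function names `first_attempt_variance` and `textbook_variance` are just illustrative names chosen here; they evaluate the two closed-form expressions above and nothing more.

```python
def first_attempt_variance(n, r, p1, p2):
    # Expression from the first attempt: every voter is treated as a
    # Bernoulli(q) variable with the same success probability q.
    q = r * (1 - p1) / n + (n - r) * p2 / n
    return n * (q - q ** 2)


def textbook_variance(n, r, p1, p2):
    # Expression quoted as the textbook answer.
    return r * (1 - p1) * p1 + (n - r) * p2 * (1 - p2)


# Same test values as in the post: p1 = 0.3, p2 = 0.4, n = 8, r = 4.
print(round(first_attempt_variance(8, 4, 0.3, 0.4), 4))
print(round(textbook_variance(8, 4, 0.3, 0.4), 4))
```

The two expressions only agree in special cases (for example when $1-p_1 = p_2$), so a mismatch on generic inputs is expected.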
Edit 1:
Let $I_i$ be the indicator that the $i$th person votes for Republican, $1 \le i \le n$
$Var(X)$
$=Var(I_1 + I_2 + \ldots + I_n)$
$P(I_i = 1)$ changes depending on whether the voter was a Republican or a Democrat at the first election. And the probability that they were a Republican or a Democrat depends on the number of Republican/Democrat voters that have voted. Thus, the $I_i$ are dependent on each other, so $Var(I_1 + I_2 + \ldots + I_n)$ cannot be broken down using the addition rule for variances.
New Plan:
Let $X_1$ be the number of Republicans who remain voting for Republican.
Let $X_2$ be the number of Democrats who switch to vote for Republican.
$Var(X)$
$= Var(X_1 + X_2)$, where $X_1$ and $X_2$ are independent because the choices of the Republican voters do not affect the choices of the Democrat voters.
$= Var(X_1) + Var(X_2)$
Let $I_i$ be the indicator that the $i$th Republican voter votes for Republican, $1 \le i \le r$
$Var(X_1)$
$=Var(I_1 + I_2 + \ldots + I_r)$. This time the $I_i$ are independent of each other, because $P(I_i = 1)$, the probability of a given Republican voter voting Republican again, depends only on the probability of a Republican voter not switching, which is the constant $1-p_1$.
$=Var(I_1) + Var(I_2) + ... + Var(I_r)$
$= \sum_{i = 1}^{r}Var(I_i)$
$= \sum_{i = 1}^{r}((1-p_1)-(1-p_1)^2)$
$= \sum_{i = 1}^{r}(1-p_1)(1-(1-p_1))$
$= \sum_{i = 1}^{r}(1-p_1)p_1$
$= r(1-p_1)p_1$
The same is done for $Var(X_2)$, except that instead of $r$ Republicans we have $n-r$ Democrats, each switching with probability $p_2$, and we get:
$Var(X)$
$= Var(X_1) + Var(X_2)$
$= r(1-p_1)p_1 + (n-r)(1-p_2)p_2$
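As a sanity check on this result, a quick Monte Carlo sketch in Python is below. It assumes exactly the model in the question (each former Republican stays with probability $1-p_1$, each former Democrat switches with probability $p_2$, all independently); the helper names, the trial count and the seed are arbitrary choices made here for illustration.

```python
import random


def simulate_republican_votes(n, r, p1, p2, rng):
    # Former Republicans stay Republican with probability 1 - p1.
    stay = sum(1 for _ in range(r) if rng.random() >= p1)
    # Former Democrats switch to Republican with probability p2.
    switch = sum(1 for _ in range(n - r) if rng.random() < p2)
    return stay + switch


def empirical_variance(n, r, p1, p2, trials=100_000, seed=0):
    # Sample variance of X over many simulated second elections.
    rng = random.Random(seed)
    samples = [simulate_republican_votes(n, r, p1, p2, rng) for _ in range(trials)]
    mean = sum(samples) / trials
    return sum((x - mean) ** 2 for x in samples) / (trials - 1)


# Same test values as in the post; the estimate should land near
# r(1-p1)p1 + (n-r)(1-p2)p2, up to sampling noise.
print(empirical_variance(8, 4, 0.3, 0.4))
```

Because this is a simulation, the printed value fluctuates with the seed and the number of trials, so it is only evidence for the formula, not a proof.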
Note that we do not have $\operatorname{Var}(I_1)=\operatorname{Var}(I_n)$ in general.
Suppose the first person voted Republican previously; then his current distribution is $Bernoulli(1-p_1)$, since he votes Republican again with probability $1-p_1$.
Suppose the $n$-th person voted Democrat previously; then his current distribution is $Bernoulli(p_2)$.
We can't assume that $\operatorname{Var}(I_1)=\operatorname{Var}(I_2)$.
That $r$ people voted Republican previously is a known fact; it is not the case that each person voted Republican previously with probability $\frac{r}{n}$.
We can assume the first $r$ people voted Republican previously and the remaining $n-r$ people voted Democrat previously.
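To make the point concrete, here is a small Python sketch that lists the variance of each indicator under that labelling (voters $1,\dots,r$ former Republicans, voters $r+1,\dots,n$ former Democrats). The function name `per_voter_variances` and the sample values $n=8$, $r=4$, $p_1=0.3$, $p_2=0.4$ are only illustrative.

```python
def per_voter_variances(n, r, p1, p2):
    # Probability that each voter votes Republican at the second election:
    # 1 - p1 for the first r voters, p2 for the remaining n - r voters.
    success = [1 - p1] * r + [p2] * (n - r)
    # Variance of a Bernoulli(q) indicator is q(1 - q).
    return [q * (1 - q) for q in success]


variances = per_voter_variances(8, 4, 0.3, 0.4)
print([round(v, 4) for v in variances])
print(round(sum(variances), 4))
```

The list has two distinct values, one per group, which is why the indicators cannot all be given a single common variance.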
Guide:
Let $X_1$ be the number of Republicans who remain voting for Republican.
Let $X_2$ be the number of democrats who switch to vote for republican.
Let $X=X_1+X_2$, and note that $X_1$ and $X_2$ are independent. Think about which distribution each $X_i$ follows.
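If you want to check your answer to the guide numerically, the sketch below computes the exact distribution of $X$ in Python. It assumes one natural reading of the hint, namely that each $X_i$ counts successes in independent trials with a common success probability, so that $X_1$ is binomial with parameters $(r, 1-p_1)$ and $X_2$ is binomial with parameters $(n-r, p_2)$. The helper names and the sample values are illustrative only.

```python
from math import comb


def binomial_pmf(m, p):
    # pmf[k] = P(Binomial(m, p) = k) for k = 0..m
    return [comb(m, k) * p ** k * (1 - p) ** (m - k) for k in range(m + 1)]


def convolve(a, b):
    # pmf of the sum of two independent variables with pmfs a and b
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def variance_from_pmf(pmf):
    mean = sum(k * p for k, p in enumerate(pmf))
    return sum((k - mean) ** 2 * p for k, p in enumerate(pmf))


# n = 8, r = 4, p1 = 0.3, p2 = 0.4: X1 ~ Bin(4, 0.7), X2 ~ Bin(4, 0.4)
pmf_x = convolve(binomial_pmf(4, 0.7), binomial_pmf(4, 0.4))
print(round(variance_from_pmf(pmf_x), 6))
```

The convolution step is where the independence of $X_1$ and $X_2$ is used; without it the pmf of the sum would not factor this way.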