Voting Question: Find Variance


Out of $n$ individual voters at an election, $r$ vote Republican and $n-r$ vote Democrat. At the next election, the probability of a Republican switching to vote Democrat is $p_1$, and the probability of a Democrat switching to vote Republican is $p_2$. Suppose individuals behave independently. Find $Var(X)$, where $X$ is the number of Republican votes at the second election.

Let $I_i$ be the indicator that the $i$th person votes for Republican, $1 \le i \le n$

$Var(X)$

$=Var(I_1 + I_2 + \ldots + I_n)$

$=Var(I_1) + Var(I_2) + \ldots + Var(I_n)$ since individuals behave independently.

$=\sum_{i=1}^{n}Var(I_i)$


$E(I_i) = P(I_i = 1)$

$= P($Voted for Republican$)P($Not Switching | Voted for Republican$) $

$+ P($Voted for Democrat$)P($Switching | Voted for Democrat$)$

$= \frac{r}{n} \cdot (1-p_1) + \frac{n-r}{n} \cdot p_2$

$= \frac{r(1-p_1)}{n} + \frac{(n-r)p_2}{n}$


$Var(I_i)$

$=E(I_i^2) - (E(I_i))^2$

$= (\frac{r(1-p_1)}{n} + \frac{(n-r)p_2}{n}) - (\frac{r(1-p_1)}{n} + \frac{(n-r)p_2}{n})^2$


We have:

$Var(X)$

$=\sum_{i=1}^{n}Var(I_i)$

$=\sum_{i=1}^{n}\left((\frac{r(1-p_1)}{n} + \frac{(n-r)p_2}{n}) - (\frac{r(1-p_1)}{n} + \frac{(n-r)p_2}{n})^2\right)$

$=n\left((\frac{r(1-p_1)}{n} + \frac{(n-r)p_2}{n}) - (\frac{r(1-p_1)}{n} + \frac{(n-r)p_2}{n})^2\right)$

Textbook Answer:

$r(1-p_1)p_1+(n-r)p_2(1-p_2)$

I checked the 2 solutions with $p_1 = 0.3, p_2 = 0.4, n = 8, r= 4$ and my solution gave 1.98, but the textbook solution gave 1.8.
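The arithmetic of this check can be reproduced with a minimal Python sketch. The function names `first_attempt_variance` and `textbook_variance` are made up here for illustration; they simply evaluate the two formulas above.

```python
def first_attempt_variance(n, r, p1, p2):
    # q is the per-voter probability used in my derivation above
    q = (r * (1 - p1) + (n - r) * p2) / n
    # n identical Bernoulli(q) variances added together
    return n * (q - q**2)


def textbook_variance(n, r, p1, p2):
    # The textbook's closed form
    return r * (1 - p1) * p1 + (n - r) * p2 * (1 - p2)


n, r, p1, p2 = 8, 4, 0.3, 0.4
mine = first_attempt_variance(n, r, p1, p2)
book = textbook_variance(n, r, p1, p2)
print(round(mine, 4), round(book, 4))
```

Up to floating-point rounding, the two values agree with the 1.98 and 1.8 quoted above, so the gap is in the derivation rather than in the arithmetic.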

I'm not sure where I went wrong, because I used the same method for another question and got it right.


Edit 1:

Let $I_i$ be the indicator that the $i$th person votes for Republican, $1 \le i \le n$

$Var(X)$

$=Var(I_1 + I_2 + \ldots + I_n)$

$P(I_i = 1)$ changes depending on whether the voter was a Republican or a Democrat, and the probability that a given voter was a Republican or a Democrat depends on how many Republican/Democrat voters have already been counted. Thus the $I_i$ are dependent on each other, so $Var(I_1 + I_2 + \ldots + I_n)$ cannot be broken down using the addition rule for variances.


New Plan:

Let $X_1$ be the number of Republicans who continue to vote Republican.

Let $X_2$ be the number of Democrats who switch to vote for Republican.

$Var(X)$

$= Var(X_1 + X_2)$, where $X_1$ and $X_2$ are independent because the choices of the Republican voters do not affect the choices of the Democrat voters.

$= Var(X_1) + Var(X_2)$


Let $I_i$ be the indicator that the $i$th Republican voter votes for Republican, $1 \le i \le r$

$Var(X_1)$

$=Var(I_1 + I_2 + ... + I_r)$; this time the $I_i$ are independent of each other, because the probability $P(I_i = 1)$ that a given Republican voter votes Republican again depends only on the probability of a Republican voter not switching, which is the constant $(1-p_1)$.

$=Var(I_1) + Var(I_2) + ... + Var(I_r)$

$= \sum_{i = 1}^{r}Var(I_i)$

$= \sum_{i = 1}^{r}((1-p_1)-(1-p_1)^2)$

$= \sum_{i = 1}^{r}(1-p_1)(1-(1-p_1))$

$= \sum_{i = 1}^{r}(1-p_1)p_1$

$= r(1-p_1)p_1$


The same is done for $Var(X_2)$, with the $n-r$ Democrats in place of the $r$ Republicans and $p_2$ in place of $1-p_1$, and we get:

$Var(X)$

$= Var(X_1) + Var(X_2)$

$= r(1-p_1)p_1 + (n-r)(1-p_2)p_2$
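As a rough empirical check of this result, here is a small Monte Carlo sketch in Python. It assumes the model exactly as stated in the problem (each former Republican stays with probability $1-p_1$, each former Democrat switches with probability $p_2$, all independently); the function name, the seed, and the number of trials are arbitrary choices for illustration.

```python
import random
import statistics


def simulate_republican_votes(n, r, p1, p2, rng):
    # Former Republicans who stay: each stays with probability 1 - p1
    stay = sum(rng.random() >= p1 for _ in range(r))
    # Former Democrats who switch: each switches with probability p2
    switch = sum(rng.random() < p2 for _ in range(n - r))
    return stay + switch


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [simulate_republican_votes(8, 4, 0.3, 0.4, rng) for _ in range(100_000)]
estimate = statistics.pvariance(samples)
print(estimate)
```

With $p_1 = 0.3, p_2 = 0.4, n = 8, r = 4$ the estimate should land close to the textbook value of 1.8 rather than 1.98, up to sampling noise.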


There are 2 best solutions below

3
On BEST ANSWER

Note that we do not have $\operatorname{Var}(I_1)=\operatorname{Var}(I_n)$ in general.

Suppose the first person voted Republican previously; then his current vote indicator is distributed $Bernoulli(1-p_1)$.

Suppose the $n$-th person voted Democrat previously; then his current vote indicator is distributed $Bernoulli(p_2)$.

So $I_1$ and $I_n$ do not have the same distribution, and we can't assume that $\operatorname{Var}(I_1)=\operatorname{Var}(I_n)$.

That exactly $r$ people voted Republican previously is a known fact; it is not the case that each person independently voted Republican previously with probability $\frac{r}{n}$.

We can assume the first $r$ people voted Republican previously and the remaining $n-r$ people voted Democrat previously.

Guide:

Let $X_1$ be the number of Republicans who continue to vote Republican.

Let $X_2$ be the number of Democrats who switch to vote Republican.

Then $X=X_1+X_2$. Note that $X_1$ and $X_2$ are independent. Think about what distribution each $X_i$ follows.

0
On

You are assuming that the events "voter $i$ was formerly a Republican" are independent across voters. That's not the case, since there is a fixed number of each kind in the population (cf. drawing without replacement).
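To make the "without replacement" point concrete, here is a short worked sketch. It assumes the voters are labelled in a uniformly random order, and it introduces $R_i$ (the indicator that voter $i$ voted Republican last time), $a = 1-p_1$ and $b = p_2$ purely as shorthand. Since $\mathsf E(I_i \mid R_1,\ldots,R_n) = b + (a-b)R_i$ and the switching decisions are independent given the $R_i$, for $i \neq j$ we get

$$\mathsf{Cov}(I_i, I_j) = (a-b)^2\,\mathsf{Cov}(R_i, R_j) = (a-b)^2\left(\frac{r(r-1)}{n(n-1)} - \frac{r^2}{n^2}\right) = -\frac{(a-b)^2\, r(n-r)}{n^2(n-1)}$$

Summing over the $n(n-1)$ ordered pairs gives a total covariance contribution of

$$-\frac{r(n-r)}{n}\,(1-p_1-p_2)^2$$

which is exactly the amount by which the sum of the individual variances overshoots. With $p_1 = 0.3, p_2 = 0.4, n = 8, r = 4$ this is $-\frac{16}{8}\cdot 0.09 = -0.18$, matching the gap between 1.98 and 1.8 in the question.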

$X$ equals the count of voters who stay among the $r$ former Republican voters (call that count $Y$), plus the count of voters who switch among the $n-r$ former Democrat voters (call that count $Z$).

Then just use bilinearity of (co)variance: $$\begin{aligned}\mathsf{Var}(X)&=\mathsf{Var}(Y)+\mathsf{Var}(Z)+2\,\mathsf{Cov}(Y,Z)\\&=\mathsf{Var}(Y)+\mathsf{Var}(Z)+0\end{aligned}$$

Now, you may use indicator random variables to find the variances of these counts separately from first principles.

Alternatively, we can line everyone up so that the first $r$ voters are those who voted Republican, while the last $n-r$ are those who voted Democrat.

$$\mathsf E(I_i^2)=\mathsf E(I_i) =\begin{cases} 1-p_1&:& 1\leq i\leq r\\ p_2 &:& r+1\leq i\leq n\end{cases}$$


However, you should recognise the counts as being binomial random variables, and immediately know how to find their variance.

$Y\sim\operatorname{Binom}(r,1-p_1)$, $Z\sim\operatorname{Binom}(n-r,p_2)$, and so $$\begin{aligned}\mathsf{Var}(Y)&=r(1-p_1)p_1 \\ \mathsf{Var}(Z)&=(n-r)p_2(1-p_2)\end{aligned}$$


Though working from first principles is not wrong.
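For anyone who wants to double-check the binomial claim numerically, the following Python sketch builds the exact distribution of $Y+Z$ by convolving the two binomial distributions and computes its variance directly. The helper names `binom_pmf` and `variance_of_sum` are invented for this illustration, and the parameter values are the ones from the question.

```python
from math import comb


def binom_pmf(m, q):
    # Probabilities P(K = k) for K ~ Binom(m, q), k = 0..m
    return [comb(m, k) * q**k * (1 - q) ** (m - k) for k in range(m + 1)]


def variance_of_sum(n, r, p1, p2):
    pmf_y = binom_pmf(r, 1 - p1)   # former Republicans who stay
    pmf_z = binom_pmf(n - r, p2)   # former Democrats who switch
    # Convolve the two distributions; valid because Y and Z are independent
    pmf_x = [0.0] * (n + 1)
    for y, py in enumerate(pmf_y):
        for z, pz in enumerate(pmf_z):
            pmf_x[y + z] += py * pz
    mean = sum(k * p for k, p in enumerate(pmf_x))
    return sum((k - mean) ** 2 * p for k, p in enumerate(pmf_x))


exact = variance_of_sum(8, 4, 0.3, 0.4)
closed_form = 4 * (1 - 0.3) * 0.3 + (8 - 4) * 0.4 * (1 - 0.4)
print(exact, closed_form)
```

Both numbers should agree up to floating-point error, which is consistent with $\mathsf{Var}(X)=r(1-p_1)p_1+(n-r)p_2(1-p_2)$.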