Why can indicator variables be used this way?


Each day for six days Rachel uses one of her six snowboards, chosen at random. At the end of the week, what is the expected number of snowboards that were used?

I understand that we can use an indicator variable over the snowboards ($X_i$, for $1\leq i\leq 6$, is $1$ if snowboard $i$ was used in the six days and $0$ otherwise) to compute the answer: $$\sum_{i=1}^6E[X_i]=6\cdot\left(1-\left(\frac{5}{6}\right)^6\right)=\frac{31031}{7776}.$$

Today, my teacher also claimed that we can use an indicator variable over the days ($Y_i$, for $1\leq i\leq 6$, is $1$ if Rachel used a new snowboard on day $i$, and $0$ if she reused an already-used snowboard on that day). He said that $$P(Y_i=1)=\left(\frac{5}{6}\right)^{i-1}$$ since "on each of the $i-1$ days, Rachel had a $5/6$ chance of picking a different board independently." This method yields a solution of: $$\sum_{i=1}^6E[Y_i]=1+\frac{5}{6}+\dots+\left(\frac{5}{6}\right)^5=\frac{31031}{7776},$$ in agreement with before.

Why does this second method work? In particular, why does Rachel have an independent probability of $5/6$ to pick a different board on each of the $i-1$ days?
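As a sanity check (not part of either argument), both the exact value and a Monte Carlo estimate can be computed in a few lines of Python, assuming independent uniform choices each day:

```python
import random
from fractions import Fraction

# Exact value from the indicator argument: sum of (5/6)^(i-1) over the six days.
exact = sum(Fraction(5, 6) ** (i - 1) for i in range(1, 7))

# Monte Carlo estimate of the expected number of distinct boards used.
random.seed(0)
trials = 200_000
total = sum(len({random.randrange(6) for _ in range(6)}) for _ in range(trials))
estimate = total / trials

print(exact, float(exact), estimate)  # 31031/7776 is about 3.9906
```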


First, check that the sum of the indicators $$ \sum_{i=1}^6 Y_i $$ indeed counts the number of distinct snowboards used: each board that gets used is new on exactly one day, namely the first day it appears.

Next, note that the $Y_i$ are not independent random variables. But linearity of expectation holds even for dependent random variables, provided the expectations exist:

$$ E\left[\sum_{i=1}^6 Y_i\right] = \sum_{i=1}^6 E[Y_i] $$

From the definition of $Y_i$, the variables appear in chronological order. So when discussing the marginal distribution of $Y_i$, it is easy to confuse it with the conditional distribution of $Y_i \mid Y_1, Y_2, \ldots, Y_{i-1}$. We compute both the pmf and the conditional pmf below to illustrate the difference.

Label the snowboards from $1$ to $6$, and let $Z_i$ denote the number of the snowboard chosen on day $i$, for $i = 1, 2, \ldots, 6$. By assumption the choices on different days are independent, and each board is equally likely to be chosen, so the $Z_i$ are i.i.d. with the common discrete uniform distribution $$ \Pr\{Z_i = z\} = \frac {1} {6}, \quad z = 1, 2, \ldots, 6, \quad i = 1, 2, \ldots, 6. $$

By definition, $Y_1 = 1$ almost surely, because the snowboard on day $1$ is always new. So $Y_1$ is a constant, independent of every other random variable, and we do not need to worry about it.

And $$ Y_2 = \begin{cases} 1 & \text{if } Z_2 \neq Z_1 \\ 0 & \text{if } Z_2 = Z_1 \end{cases}$$

So we have $$ \begin{align} \Pr\{Y_2 = 0\} &= \Pr\{Z_2 = Z_1\} \\ &= \sum_{z=1}^6 \Pr\{Z_2 = Z_1 | Z_1 = z\} \Pr\{Z_1 = z\} \\ &= \sum_{z=1}^6 \Pr\{Z_2 = z\} \Pr\{Z_1 = z\} \\ &= \sum_{z=1}^6 \frac {1} {6} \times \frac {1} {6} \\ &= \frac {6} {36} \\ & = \frac {1} {6} \end{align}$$

and thus $$ \Pr\{Y_2 = 1\} = 1 - \frac {1} {6} = \frac {5} {6} $$

Similarly, $$ Y_3 = \begin{cases} 1 & \text{if } Z_3 \neq Z_1, Z_3 \neq Z_2\\ 0 & \text{otherwise} \end{cases}$$

So $$ \begin{align} \Pr\{Y_3 = 1\} &= \Pr\{Z_3 \neq Z_1, Z_3 \neq Z_2\} \\ &= \sum_{z=1}^6 \Pr\{Z_3 \neq Z_1, Z_3 \neq Z_2 | Z_3 = z\}\Pr\{Z_3 = z\} \\ &= \sum_{z=1}^6 \Pr\{z \neq Z_1, z \neq Z_2\}\Pr\{Z_3 = z\} \\ &= \sum_{z=1}^6 \Pr\{Z_1 \neq z\}\Pr\{Z_2 \neq z\}\Pr\{Z_3 = z\} \\ &= \sum_{z=1}^6 \left(1 - \frac {1} {6}\right) \left(1 - \frac {1} {6}\right)\frac {1} {6} \\ &= \left(\frac {5} {6}\right)^{3-1} \end{align} $$

One key observation here is that $\{Z_3 \neq Z_1\}, \{Z_3 \neq Z_2\}$ are not independent events in general. But conditional on $Z_3 = z$, they are independent events. The same argument can be generalized and thus we have the given result.
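Since there are only $6^3 = 216$ equally likely outcomes for $(Z_1, Z_2, Z_3)$, the marginal probability can also be confirmed by brute-force enumeration; a short Python sketch:

```python
from itertools import product
from fractions import Fraction

# Enumerate all 6^3 equally likely outcomes for (Z1, Z2, Z3) and count
# those where day 3's board differs from both earlier choices.
outcomes = list(product(range(6), repeat=3))
fav = sum(1 for z1, z2, z3 in outcomes if z3 != z1 and z3 != z2)
p_y3 = Fraction(fav, len(outcomes))
print(p_y3)  # 25/36, i.e. (5/6)^2
```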

Below we consider the conditional distribution of $Y_3|Y_2$, and try to illustrate the difference between the conditional distribution and the marginal distribution.

$$ \begin{align} \Pr\{Y_3 = 1|Y_2 = 0\} &= \Pr\{Z_3 \neq Z_1, Z_3 \neq Z_2 | Z_2 = Z_1\} \\ &= \Pr\{Z_3 \neq Z_1, Z_3 \neq Z_1 | Z_2 = Z_1\} \\ &= \Pr\{Z_3 \neq Z_1 | Z_2 = Z_1\} \\ &= \sum_{z=1}^6 \Pr\{Z_3 \neq Z_1 | Z_2 = Z_1, Z_1 = z\}\Pr\{Z_1 = z | Z_2 = Z_1\} \\ &= \sum_{z=1}^6 \Pr\{Z_3 \neq z | Z_2 = z, Z_1 = z\} \frac {\Pr\{Z_1 = z, Z_2 = Z_1\}} {\Pr\{Z_2 = Z_1\}} \\ &= \sum_{z=1}^6 \Pr\{Z_3 \neq z \} \frac {\Pr\{Z_1 = z, Z_2 = z\}} {\Pr\{Z_2 = Z_1\}} \\ &= \sum_{z=1}^6 \Pr\{Z_3 \neq z \} \frac {\Pr\{Z_1 = z\}\Pr\{Z_2 = z\}} {\Pr\{Z_2 = Z_1\}} \\ &= \sum_{z=1}^6 \left(1 - \frac {1} {6}\right) \frac {(1/6)(1/6)} {1/6} \\ &= \frac {5} {6} \end{align}$$

In plain terms: when $Z_1 = Z_2$, the choices on the first two days occupy only one snowboard, so $6 - 1 = 5$ boards would still be new if chosen on day $3$, and the conditional probability is just $5/6$.

$$ \begin{align} \Pr\{Y_3 = 1|Y_2 = 1\} &= \Pr\{Z_3 \neq Z_1, Z_3 \neq Z_2 | Z_2 \neq Z_1\} \\ &= \frac {\Pr\{Z_3 \neq Z_1, Z_3 \neq Z_2, Z_2 \neq Z_1\}} {\Pr\{Z_2 \neq Z_1\}} \\ &= \frac { (6 \times 5 \times 4) / 6^3} {5/6} \\ &= \frac {4} {6} \end{align} $$

The probability in the numerator comes from counting permutations of $(Z_1, Z_2, Z_3)$ with all three values distinct. The result tells a similar story: when the choices on the first two days differ, $2$ snowboards are occupied and only $6 - 2 = 4$ new boards remain for day $3$, so the probability is $4/6 = 2/3$.
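Both conditional probabilities can be verified by restricting the same $216$-outcome enumeration to the conditioning event; a short Python sketch:

```python
from itertools import product
from fractions import Fraction

triples = list(product(range(6), repeat=3))
# Condition on Y2 = 0 (Z2 == Z1) and on Y2 = 1 (Z2 != Z1) by restricting
# the sample space to each event.
same = [(a, b, c) for a, b, c in triples if b == a]
diff = [(a, b, c) for a, b, c in triples if b != a]
p_given_same = Fraction(sum(c != a and c != b for a, b, c in same), len(same))
p_given_diff = Fraction(sum(c != a and c != b for a, b, c in diff), len(diff))
print(p_given_same, p_given_diff)  # 5/6 and 2/3
```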

This shows that $Y_3$ and $Y_2$ are not independent: conditioning on the value of $Y_2$ changes the distribution of $Y_3$. However, if we recover the marginal distribution by the law of total probability, we have

$$ \begin{align} \Pr\{Y_3 = 1\} &= \Pr\{Y_3 = 1|Y_2 = 0\}\Pr\{Y_2 = 0\} + \Pr\{Y_3 = 1|Y_2 = 1\}\Pr\{Y_2 = 1\} \\ &= \frac {5} {6} \times \frac {1} {6} + \frac {4} {6} \times \frac {5} {6} \\ &= \frac {5} {6} \times \left(\frac {1} {6} + \frac {4} {6}\right) \\ &= \left(\frac {5} {6}\right)^{3-1} \end{align}$$

So we arrive at the same result. The point is that although $Y_3$ is realized "after" $Y_2$ in chronological order, the marginal distribution of $Y_3$ does not condition on the value of $Y_2$: we simply ignore it, as if the results from the other days were hidden and we only observed day $3$.

Conditioning on different values of $Y_2$ leads to different conditional distributions of $Y_3$. So one may view the sequence $Y_1, Y_2, Y_3, \ldots$ as a random process, where different values of $Y_2$ give different sample paths. When we compute the marginal distribution of $Y_3$, we are averaging over all possible paths (all the scenarios in parallel worlds), weighted by the probability of each path.

Once you can distinguish the marginal distribution from the conditional distribution, the result should be easy to accept.


There is a crucial assumption in how we read the problem:

Each day for six days Rachel uses one of her six snowboards, chosen at random.

We interpret this to mean that on any day, each snowboard has an equal chance to be chosen, regardless of which snowboards were chosen on previous days. Your teacher interprets the problem this way, I interpret it this way, and so do you in the first solution you presented.

For the second solution method, after defining the indicator variable $Y_i$ for $1\leq i\leq 6$ so that $Y_i=1$ if Rachel used a new snowboard on day $i$ and $Y_i=0$ if she reused a snowboard on day $i$, your teacher says that $P(Y_i=1)=\left(\frac{5}{6}\right)^{i-1}$ because

on each of the $i-1$ days, Rachel had a $5/6$ chance of picking a different board independently.

We might elaborate on this statement a little bit to remove ambiguity:

On each of the $i-1$ days prior to day $i$, Rachel had a $5/6$ chance of picking a board different from the one she eventually picked on day $i$, and these $5/6$ probabilities were independent.

Let's consider day $3$, for example. On day $1$, Rachel has a $5/6$ chance not to use board number $1$. Likewise, on day $2$ Rachel has a $5/6$ chance not to use board number $1$. Since the choices of board on day $1$ and day $2$ are independent, Rachel has a $\left(\frac56\right)^2$ chance not to pick board $1$ on either of the first two days.

For the same reason, Rachel has a $\left(\frac56\right)^2$ chance not to pick board $2$ on either of the first two days.

Same for board $3$, for board $4$, for board $5$, and for board $6$.

On day $3$, Rachel will choose one of the six boards. But as we've already seen, no matter which board it is, there is a $\left(\frac56\right)^2$ chance that Rachel did not pick that board on either of the first two days.

If neither of the choices Rachel made on the first two days matched the choice she made on day $3$, then she picked a new board on day $3$. Therefore $P(Y_3 = 1) = \left(\frac56\right)^2 = \left(\frac56\right)^{i-1}$ when $i=3$.

The other days work similarly. I chose $i=3$ as an example merely because it has enough prior days to be interesting but not too many, and because an example with a specific value of $i$ lets me list the "$i-1$ days" more explicitly than the general case allows.
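This argument can be checked empirically for every day at once. A minimal Monte Carlo sketch in Python, assuming the independent uniform daily choices described above:

```python
import random

# Empirically check P(Y_i = 1) = (5/6)**(i-1): the board chosen on day i
# was not chosen on any earlier day.
random.seed(1)
trials = 100_000
hits = [0] * 6
for _ in range(trials):
    days = [random.randrange(6) for _ in range(6)]
    for i in range(6):
        if days[i] not in days[:i]:  # new board on day i (0-based index)
            hits[i] += 1
for i in range(6):
    print(i + 1, hits[i] / trials, (5 / 6) ** i)
```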


Here's a third approach. Let $Z_i$ be an indicator variable indicating whether Rachel used a board for the last time on day $i$; that is, $Z_i=1$ if the board used on day $i$ was not used again, $Z_i=0$ if Rachel used the board again on any later day.

A board can be used for the last time only once, and any board that was used was used for the last time on one of the days, so the number of "last times" is the number of boards used: $$ \sum_{i=1}^6 Z_i = \sum_{i=1}^6 Y_i= \sum_{i=1}^6 X_i. $$

Since there are $6 - i$ days after day $i$, we can note whichever board Rachel used on day $i$: on each subsequent day she has an independent $5/6$ chance of using a different board. That is, whatever board Rachel picks on day $i$, with probability $\left(\frac56\right)^{6-i}$ she will not use it again. So

$$ P(Z_i = 1) = \left(\frac56\right)^{6-i} $$

and therefore

$$ \sum_{i=1}^6 E[Z_i] = \left(\frac56\right)^5 + \left(\frac56\right)^4 + \left(\frac56\right)^3 + \left(\frac56\right)^2 + \left(\frac56\right)^1 + \left(\frac56\right)^0 = \frac{31031}{7776}. $$
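Since there are only $6^6 = 46656$ equally likely six-day sequences, $P(Z_i = 1)$ can be checked exactly by enumeration; a short Python sketch (with $0$-based day indices):

```python
from itertools import product
from fractions import Fraction

# Exhaustively check P(Z_i = 1) = (5/6)**(6-i): the board used on day i
# never reappears on any later day.
seqs = list(product(range(6), repeat=6))
for i in range(6):
    fav = sum(1 for s in seqs if s[i] not in s[i + 1:])
    assert Fraction(fav, len(seqs)) == Fraction(5, 6) ** (5 - i)
print("all six probabilities match")
```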