Each day for six days Rachel uses one of her six snowboards, chosen at random. At the end of the week, what is the expected number of snowboards that were used?
I understand that we can use an indicator variable over the snowboards ($X_i$, for $1\leq i\leq 6$, is $1$ if snowboard $i$ was used in the six days and $0$ otherwise) to compute the answer: $$\sum_{i=1}^6E[X_i]=6\cdot\left(1-\left(\frac{5}{6}\right)^6\right)=\frac{31031}{7776}.$$
Today, my teacher also claimed that we can use an indicator variable over the days ($Y_i$, for $1\leq i\leq 6$, is $1$ if Rachel used a new snowboard on day $i$, and $0$ if she reused an already-used snowboard on that day). He said that $$P(Y_i=1)=\left(\frac{5}{6}\right)^{i-1}$$ since "on each of the $i-1$ days, Rachel had a $5/6$ chance of picking a different board independently." This method yields a solution of: $$\sum_{i=1}^6E[Y_i]=1+\frac{5}{6}+\dots+\left(\frac{5}{6}\right)^5=\frac{31031}{7776},$$ in agreement with before.
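As a sanity check (not part of either derivation), a quick Monte Carlo simulation of Rachel's week agrees with the exact value $31031/7776 \approx 3.9906$:

```python
import random
from fractions import Fraction

# Exact value from either indicator method
exact = Fraction(31031, 7776)  # ≈ 3.9906

# Monte Carlo estimate: simulate many weeks of 6 uniform choices among 6 boards
# and average the number of distinct boards used.
random.seed(0)
trials = 200_000
total = sum(len({random.randrange(6) for _ in range(6)}) for _ in range(trials))
estimate = total / trials

print(float(exact), estimate)  # the two values should agree to about 2 decimals
```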
Why does this second method work? In particular, why does Rachel have an independent probability of $5/6$ to pick a different board on each of the $i-1$ days?
First, check that the sum of the indicators $$ \sum_{i=1}^6 Y_i $$ indeed represents the number of distinct snowboards used.
Next, note that the $Y_i$ are not independent random variables. But even for dependent random variables, linearity of expectation still holds, provided that the expectations exist:
$$ E\left[\sum_{i=1}^6 Y_i\right] = \sum_{i=1}^6 E[Y_i] $$
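Before working through the probabilities, it is worth confirming the arithmetic: summing the claimed marginals $E[Y_i] = (5/6)^{i-1}$ exactly reproduces $31031/7776$. A short check with exact rational arithmetic:

```python
from fractions import Fraction

# Sum of E[Y_i] = (5/6)^(i-1) for i = 1, ..., 6, computed exactly
total = sum(Fraction(5, 6) ** i for i in range(6))
print(total)  # 31031/7776
```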
From the definition of $Y_i$, the indicators appear to be in chronological order, so the marginal distribution of $Y_i$ is easily confused with the conditional distribution of $Y_i \mid Y_1, Y_2, \ldots, Y_{i-1}$. We compute both the pmf and the conditional pmf below to illustrate the difference.
Label the snowboards from $1$ to $6$. Let $Z_i$ denote the snowboard number chosen on day $i$, for $i = 1, 2, \ldots, 6$. By assumption, the choices on different days are independent and each day's choice is uniformly random over the six snowboards, so the $Z_i$ are i.i.d. with the common discrete uniform distribution $$ \Pr\{Z_i = z\} = \frac {1} {6}, \quad z = 1, 2, \ldots, 6, \quad i = 1, 2, \ldots, 6 $$
By definition, $Y_1 = 1$ almost surely, because the snowboard on day $1$ is always new. So $Y_1$ is a constant, independent of every other random variable, and we do not need to worry about it.
And $$ Y_2 = \begin{cases} 1 & \text{if } Z_2 \neq Z_1 \\ 0 & \text{if } Z_2 = Z_1 \end{cases}$$
So we have $$ \begin{align} \Pr\{Y_2 = 0\} &= \Pr\{Z_2 = Z_1\} \\ &= \sum_{z=1}^6 \Pr\{Z_2 = Z_1 | Z_1 = z\} \Pr\{Z_1 = z\} \\ &= \sum_{z=1}^6 \Pr\{Z_2 = z\} \Pr\{Z_1 = z\} \\ &= \sum_{z=1}^6 \frac {1} {6} \times \frac {1} {6} \\ &= \frac {6} {36} \\ & = \frac {1} {6} \end{align}$$
and thus $$ \Pr\{Y_2 = 1\} = 1 - \frac {1} {6} = \frac {5} {6} $$
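The day-$2$ calculation can also be verified by brute force, since there are only $36$ equally likely pairs $(Z_1, Z_2)$. A minimal enumeration (not part of the derivation above):

```python
from itertools import product

# Enumerate all 36 equally likely (Z1, Z2) pairs and count those with Z2 != Z1
pairs = list(product(range(1, 7), repeat=2))
p_y2 = sum(z2 != z1 for z1, z2 in pairs) / len(pairs)
print(p_y2)  # 30/36 = 5/6 ≈ 0.8333
```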
Similarly, $$ Y_3 = \begin{cases} 1 & \text{if } Z_3 \neq Z_1, Z_3 \neq Z_2\\ 0 & \text{otherwise} \end{cases}$$
So $$ \begin{align} \Pr\{Y_3 = 1\} &= \Pr\{Z_3 \neq Z_1, Z_3 \neq Z_2\} \\ &= \sum_{z=1}^6 \Pr\{Z_3 \neq Z_1, Z_3 \neq Z_2 | Z_3 = z\}\Pr\{Z_3 = z\} \\ &= \sum_{z=1}^6 \Pr\{z \neq Z_1, z \neq Z_2\}\Pr\{Z_3 = z\} \\ &= \sum_{z=1}^6 \Pr\{Z_1 \neq z\}\Pr\{Z_2 \neq z\}\Pr\{Z_3 = z\} \\ &= \sum_{z=1}^6 \left(1 - \frac {1} {6}\right) \left(1 - \frac {1} {6}\right)\frac {1} {6} \\ &= \left(\frac {5} {6}\right)^{3-1} \end{align} $$
One key observation here is that $\{Z_3 \neq Z_1\}, \{Z_3 \neq Z_2\}$ are not independent events in general. But conditional on $Z_3 = z$, they are independent events. The same argument can be generalized and thus we have the given result.
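The generalized claim $\Pr\{Y_i = 1\} = (5/6)^{i-1}$ can be checked exhaustively, since a whole week has only $6^6 = 46656$ equally likely outcomes. A short exact enumeration (an illustration, not part of the proof):

```python
from itertools import product
from fractions import Fraction

n = 6
outcomes = list(product(range(n), repeat=n))  # all 6^6 = 46656 equally likely weeks
for i in range(1, n + 1):
    # Y_i = 1 iff day i's choice differs from every earlier day's choice
    hits = sum(all(w[i - 1] != w[j] for j in range(i - 1)) for w in outcomes)
    assert Fraction(hits, len(outcomes)) == Fraction(5, 6) ** (i - 1)
print("P(Y_i = 1) = (5/6)^(i-1) holds exactly for i = 1, ..., 6")
```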
Below we consider the conditional distribution of $Y_3|Y_2$, and try to illustrate the difference between the conditional distribution and the marginal distribution.
$$ \begin{align} \Pr\{Y_3 = 1|Y_2 = 0\} &= \Pr\{Z_3 \neq Z_1, Z_3 \neq Z_2 | Z_2 = Z_1\} \\ &= \Pr\{Z_3 \neq Z_1, Z_3 \neq Z_1 | Z_2 = Z_1\} \\ &= \Pr\{Z_3 \neq Z_1 | Z_2 = Z_1\} \\ &= \sum_{z=1}^6 \Pr\{Z_3 \neq Z_1 | Z_2 = Z_1, Z_1 = z\}\Pr\{Z_1 = z | Z_2 = Z_1\} \\ &= \sum_{z=1}^6 \Pr\{Z_3 \neq z | Z_2 = z, Z_1 = z\} \frac {\Pr\{Z_1 = z, Z_2 = Z_1\}} {\Pr\{Z_2 = Z_1\}} \\ &= \sum_{z=1}^6 \Pr\{Z_3 \neq z \} \frac {\Pr\{Z_1 = z, Z_2 = z\}} {\Pr\{Z_2 = Z_1\}} \\ &= \sum_{z=1}^6 \Pr\{Z_3 \neq z \} \frac {\Pr\{Z_1 = z\}\Pr\{Z_2 = z\}} {\Pr\{Z_2 = Z_1\}} \\ &= \sum_{z=1}^6 \left(1 - \frac {1} {6}\right) \frac {(1/6)(1/6)} {1/6} \\ &= \frac {5} {6} \end{align}$$
Intuitively, this just means that when $Z_1 = Z_2$, the choices on the first two days occupy only one snowboard, so there are still $6 - 1 = 5$ snowboards that would count as new on day $3$, and the conditional probability is just $5/6$.
$$ \begin{align} \Pr\{Y_3 = 1|Y_2 = 1\} &= \Pr\{Z_3 \neq Z_1, Z_3 \neq Z_2 | Z_2 \neq Z_1\} \\ &= \frac {\Pr\{Z_3 \neq Z_1, Z_3 \neq Z_2, Z_2 \neq Z_1\}} {\Pr\{Z_2 \neq Z_1\}} \\ &= \frac { (6 \times 5 \times 4) / 6^3} {5/6} \\ &= \frac {4} {6} \end{align} $$
The probability in the numerator comes from counting the permutations $(Z_1, Z_2, Z_3)$ with all three values distinct: $6 \times 5 \times 4$ out of $6^3$ outcomes. This expresses the same intuition: when the choices on the first two days differ, $2$ snowboards are occupied, leaving only $6 - 2 = 4$ new snowboards on day $3$, so the probability is $4/6 = 2/3$.
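Both conditional probabilities can be confirmed by enumerating the $6^3 = 216$ equally likely triples $(Z_1, Z_2, Z_3)$; the helper `cond_prob` below is my own illustration, not from the derivation:

```python
from itertools import product
from fractions import Fraction

# All 216 equally likely outcomes (Z1, Z2, Z3)
triples = list(product(range(6), repeat=3))

def cond_prob(event, given):
    """P(event | given), computed by counting equally likely triples."""
    num = sum(event(t) and given(t) for t in triples)
    den = sum(given(t) for t in triples)
    return Fraction(num, den)

y3_new = lambda t: t[2] != t[0] and t[2] != t[1]          # the event Y_3 = 1
p_given_reuse = cond_prob(y3_new, lambda t: t[1] == t[0])  # condition on Y_2 = 0
p_given_new = cond_prob(y3_new, lambda t: t[1] != t[0])    # condition on Y_2 = 1
print(p_given_reuse, p_given_new)  # 5/6 2/3
```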
This shows that $Y_3$ and $Y_2$ are not independent: conditioning on the value of $Y_2$ gives a different distribution for $Y_3$. However, if we recover the marginal distribution using the law of total probability, we have
$$ \begin{align} \Pr\{Y_3 = 1\} &= \Pr\{Y_3 = 1|Y_2 = 0\}\Pr\{Y_2 = 0\} + \Pr\{Y_3 = 1|Y_2 = 1\}\Pr\{Y_2 = 1\} \\ &= \frac {5} {6} \times \frac {1} {6} + \frac {4} {6} \times \frac {5} {6} \\ &= \frac {5} {6} \times \left(\frac {1} {6} + \frac {4} {6}\right) \\ &= \left(\frac {5} {6}\right)^{3-1} \end{align}$$
So we arrive at the same result. The point is that although $Y_3$ seems to be realized "after" $Y_2$, since the indicators have a chronological order, when we consider the marginal distribution of $Y_3$ we do not condition on the value of $Y_2$: we simply ignore it, as if the results from the other days were hidden and we only observed day $3$.
Conditioning on different values of $Y_2$ leads to different conditional distributions of $Y_3$. So one may view the sequence $Y_1, Y_2, Y_3, \ldots$ as a random process: different values of $Y_2$ give different sample paths. When we calculate the marginal distribution of $Y_3$, we are really averaging over all possible paths (all the possible scenarios in parallel worlds), weighted by the probability of each path.
Once you can distinguish the marginal distribution from the conditional distribution, the result should be easy to accept.