I will roll a fair 100-sided die. If I get a 1 or 2, I will roll a 40-sided die.
How many times will I have to repeat this experiment until each roll on the 40-sided die appears at least once, on average?
The random generation is uniform, so the probability of getting a 1 or 2 from the 100-sided die is 1/50, and the probability of getting any single number from the 40-sided die is 1/40.
I don't know how to calculate this. I'll show what I was thinking.
Looking at just the 40-sided die, I will roll for a number. If I'm looking for a unique number, then I will always get a unique number on the first roll, so that has probability 1 (or 40/40). The next time, it will be 39/40. But then assuming that I don't get it, it's 39/40 again, but if I do, it's 38/40.
The probability seems to be (1 - q/40), where q is the number of numbers from 1 to 40 that I've already gotten at least once.
So first, there is a (40/40) chance that I get a unique roll.
Then, there is a (39/40) chance that I get one and (1/40) that I don't.
If I did get a new number, there is a (38/40) chance that the next roll is new and a (2/40) chance that it isn't; if I didn't, there is a (39/40) chance that the next roll is new and a (1/40) chance that it isn't. So after three rolls the chance that I have 3 unique rolls should be (40/40)(39/40)(38/40), the chance that I have 2 unique rolls should be (40/40)(39/40)(2/40) + (40/40)(1/40)(39/40), and the chance that I have 1 unique roll should be (40/40)(1/40)(1/40).
The pattern looks to be (40!)/(40^40) that I get it in 40 rolls and it's not possible to get it in under 40 rolls.
But I'm not sure how to calculate 41 rolls, and so on. I assume there is a final summation at the end but I don't know how to get the terms.
And then all of this is dependent on a (1/50) chance of getting the 1 or 2 from the 100-sided die. So that means the average is the answer multiplied by 50, is it not?
Let us focus on the second stage to begin with, i.e. the expected number of throws needed to collect all numbers from $1$ to $40$.
Define a success as a new number appearing. The first success takes exactly one throw. After that, the number of throws until the second success is geometrically distributed with parameter $p=\frac{39}{40}$, hence has mean $\frac{1}{p} = \frac{40}{39}$; the third success has $p=\frac{38}{40}$ and mean $\frac{40}{38}$, and so on. The expected total number of throws to get all $40$ numbers is therefore $1+\frac{40}{39}+\frac{40}{38}+\dots+\frac{40}{1} \approx 171.14$.
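As a quick check of the sum above, here is a minimal Python sketch that adds up the geometric means exactly. The function name `expected_rolls_to_collect_all` is just an illustrative choice, not anything from the question.

```python
from fractions import Fraction

def expected_rolls_to_collect_all(n: int) -> Fraction:
    """Expected rolls of a fair n-sided die until every face has appeared."""
    # Once k distinct faces have been seen, a new face appears with
    # probability (n - k)/n, so the wait for it has mean n/(n - k).
    return sum(Fraction(n, n - k) for k in range(n))

expected_40 = expected_rolls_to_collect_all(40)
print(float(expected_40))  # about 171.14
```

Using `Fraction` keeps the sum exact until the final conversion to a float, which avoids any doubt about rounding in the individual terms.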
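Now coming to the first stage. If you have to clear it just once (and may then keep rolling the 40-sided die), add $50$ to the above, since it takes $50$ throws of the 100-sided die on average to get a 1 or 2.
On the other hand, if it has to be cleared each time, but the numbers already obtained in the second stage remain counted, multiply by $50$ instead, since each throw of the 40-sided die then costs $50$ experiments on average.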
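For the "cleared each time" reading, the multiply-by-$50$ claim can be sanity-checked with a small Monte Carlo simulation. This is only a sketch under that interpretation; the function name, parameter names, seed, and trial count are all arbitrary choices made for illustration. The analytic value to compare against is $50$ times the sum above, roughly $8557$ experiments.

```python
import random

def simulate_experiments(trigger_faces=2, first_die=100, second_die=40, rng=None):
    """Count experiments until every face of the second die has been seen.

    One experiment = one roll of the first die; the second die is rolled
    only when the first roll is at most trigger_faces (here a 1 or a 2).
    """
    if rng is None:
        rng = random.Random()
    seen = set()
    experiments = 0
    while len(seen) < second_die:
        experiments += 1
        if rng.randint(1, first_die) <= trigger_faces:
            seen.add(rng.randint(1, second_die))
    return experiments

rng = random.Random(12345)  # fixed seed so the run is repeatable
trials = 200
average = sum(simulate_experiments(rng=rng) for _ in range(trials)) / trials
print(average)
```

With only a few hundred trials the average is noisy (the spread of a single run is in the thousands), so expect agreement with the analytic value to within a few percent rather than exactly.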