This past Wednesday, I had my stat class do the following exercise:
Roll a fair six-sided die 25 times. Take the sample mean of the face values. Using the standard deviation of the discrete uniform probability distribution, construct confidence intervals with confidence levels $C=.90, .95, .99$.
I had 16 groups in total doing this exercise. All 16 groups found that their 90% confidence intervals contained the mean of the probability distribution (3.5).
I thought I would have at least one confidence interval that didn't contain the mean, considering that the probability that all 16 confidence intervals contain $\mu$ would be $.90^{16} \approx .1853$.
Is my prediction completely flawed? Or was I just unlucky? Or is there something inherently wrong with my experiment?
This seems an excellent classroom experiment to reinforce the practical meaning of the confidence level of an interval estimate, and I wish more instructors would do this kind of thing.
First, let's just check to make sure we're on the same page about the experiment. Each of the 16 groups rolls a fair die $n = 25$ times. The sum of the 25 values is divided by 25 to get $\bar X$.
For a single roll of a die, we get the value $X_i$, which has $\mu = E(X_i) = 3.5$ and $\sigma^2 = V(X_i) = 35/12$. Thus $V(\bar X) = \sigma^2/n = 35/300 = 7/60.$ Then you assume $\bar X$ is close enough to normal to use the 90% z-interval $\bar X \pm 1.645\sigma/\sqrt{n}$, that is, $\bar X \pm 0.5619.$
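As a quick check of the arithmetic above, here is a minimal Python sketch that recomputes $\mu$, $\sigma^2$, $V(\bar X)$ and the 90% margin of error. The variable names are my own, and the critical value 1.645 is simply the one quoted above.

```python
from fractions import Fraction
import math

faces = range(1, 7)                      # face values of a fair six-sided die
mu = Fraction(sum(faces), 6)             # E(X_i) = 7/2
var = sum((Fraction(x) - mu) ** 2 for x in faces) / 6   # V(X_i) = 35/12

n = 25                                   # rolls per group
var_mean = var / n                       # V(X-bar) = sigma^2 / n = 7/60

z90 = 1.645                              # 90% normal critical value quoted above
margin = z90 * math.sqrt(var_mean)       # half-width of the z-interval

print(mu, var, var_mean, round(margin, 4))   # 7/2 35/12 7/60 0.5619
```

Exact fractions are used for the variance so that $35/12$ and $7/60$ can be compared directly, with floating point entering only at the square root.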
You have 16 of these 90% CIs, and consider each of them to be a Success if it includes 3.5 and a Failure if it does not. You were surprised to get no Failures because you think the chances of that are about 18% or 19%. (Just to make sure I'm not having a 'senior moment' here, I ran a simulation in R with a million runs of what you did, and in around 18.5% of those 16-group experiments there were no Failures.)
If that is the scenario, then you were moderately unlucky. But you might also have considered yourself unlucky if you had gotten 13 or fewer Successes (an outcome that is even a little more likely, at about 21%). I suppose you would have been very pleased with 14 or 15 Successes (the two most likely results, to be sure), but the probability of that is only about 60%. And the THIRD most likely result is 16.
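For anyone who wants to repeat this kind of check, a rough sketch follows. It is in Python rather than R, it is not the code used for the figure quoted above, and it runs far fewer experiments than a million by default, so the result will be noisier. The function name and its parameters are my own choices.

```python
import math
import random

def fraction_with_no_failures(experiments=2000, groups=16, n=25,
                              z=1.645, seed=1):
    """Estimate how often all `groups` z-intervals cover mu = 3.5."""
    rng = random.Random(seed)            # seeded so the run is repeatable
    mu = 3.5
    margin = z * math.sqrt(35 / 12 / n)  # z * sigma / sqrt(n)
    no_failure = 0
    for _ in range(experiments):
        all_cover = True
        for _ in range(groups):
            # One group: roll the die n times and average the faces.
            xbar = sum(rng.randint(1, 6) for _ in range(n)) / n
            if abs(xbar - mu) > margin:  # interval misses 3.5: a Failure
                all_cover = False
                break
        if all_cover:
            no_failure += 1
    return no_failure / experiments

if __name__ == "__main__":
    print(fraction_with_no_failures())
```

With only a few thousand experiments, the estimate should land in the neighborhood of $0.9^{16}$ but can easily be off by a percentage point or two; raising `experiments` tightens it at the cost of run time.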
Maybe you can show your class a bar chart of the distribution BINOM(16, .9) and have a 'teachable moment' about variability.
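If a plotting package is not handy, the distribution can be tabulated with a crude text bar chart. This is only a sketch, assuming the number of Successes is exactly BINOM(16, .9); the helper name `binom_pmf` is my own.

```python
from math import comb

def binom_pmf(k, n=16, p=0.9):
    """P(exactly k Successes) for a Binomial(n, p) count."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

pmf = {k: binom_pmf(k) for k in range(17)}

# Text bar chart: one '#' per percentage point, for the outcomes that matter.
for k in range(10, 17):
    bar = "#" * round(100 * pmf[k])
    print(f"{k:2d}  {pmf[k]:.4f}  {bar}")

p_14_or_15 = pmf[14] + pmf[15]
p_13_or_fewer = sum(pmf[k] for k in range(14))
print(f"P(14 or 15)     = {p_14_or_15:.4f}")
print(f"P(13 or fewer)  = {p_13_or_fewer:.4f}")
print(f"P(all 16)       = {pmf[16]:.4f}")
```

The rows for 14, 15 and 16 carry most of the probability, which is the point worth showing the class: a result of 16 out of 16 is not the most likely outcome, but it is far from rare.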