Fundamentals of sampling without replacement.


I would like to ask a question about the fundamentals that are used intuitively when solving a simple probability task about sampling without replacement. When I start to think about these underlying fundamentals, things become complicated.

Consider a bucket with three balls: two red (R) and one black (B). We randomly select two balls without replacement. What is the probability of selecting two red balls? Additionally, assume that the sample space $S$ consists of the ordered outcomes $\{~(R, R), (R, B), (B, R)~\}$.

When I was a student, I would have said the following (and I assume that most people reason the same way). To select two balls, we select the first ball and then the second one. The probability that the first ball is red is $P(\mathrm{Ball}_1 = R) = 2/3$. The conditional probability that the second ball is red, given that the first ball is red, is $P(\mathrm{Ball}_2 = R \mid \mathrm{Ball}_1 = R) = 1/2$. Thus, the probability that both balls are red is $2/3 \cdot 1/2 = 1/3$.

It has always bothered me that here I implicitly use two sample spaces when I select the balls: $S_1 = \{R, R, B\}$ when I select the first ball and $S_2 = \{R, B\}$ when I select the second one. Specifically, I conclude that $P(\mathrm{Ball}_1 = R) = 2/3$ using the sample space $S_1$. Similarly, I conclude that $P(\mathrm{Ball}_2 = R \mid \mathrm{Ball}_1 = R) = 1/2$ using the sample space $S_2$.

But, and here is my main problem, the sample space of the task is $S$. To be formally correct, I must use only $S$ to find the probabilities $P(\mathrm{Ball}_1 = R)$ and $P(\mathrm{Ball}_2 = R \mid \mathrm{Ball}_1 = R)$. Of course, I can show that $P(\mathrm{Ball}_1 = R) = |\{\text{outcomes where the first ball is red}\}|~/~|S| = 2/3$. Analogously, I can show that $P(\mathrm{Ball}_2 = R \mid \mathrm{Ball}_1 = R) = |\{\text{outcomes where both balls are red}\}|~/~|\{\text{outcomes where the first ball is red}\}| = 1/2$.
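The counting on a single sample space can also be checked mechanically. Below is a minimal sketch, under the assumption that the two red balls are made distinguishable so that all ordered draws are equally likely. The labels `R1`, `R2` and the helper `prob` are names invented for this illustration, not part of the task.

```python
from fractions import Fraction
from itertools import permutations

# Assumption: label the red balls so that every ordered draw is equally likely.
balls = ["R1", "R2", "B"]
draws = list(permutations(balls, 2))  # 6 ordered draws without replacement


def prob(event, given=lambda d: True):
    """P(event | given), computed by counting equally likely draws."""
    conditioned = [d for d in draws if given(d)]
    hits = [d for d in conditioned if event(d)]
    return Fraction(len(hits), len(conditioned))


def first_red(d):
    return d[0].startswith("R")


def second_red(d):
    return d[1].startswith("R")


p_first = prob(first_red)                                  # P(Ball_1 = R)
p_second_given_first = prob(second_red, given=first_red)   # P(Ball_2 = R | Ball_1 = R)
p_both = prob(lambda d: first_red(d) and second_red(d))    # P(both red)

print(p_first, p_second_given_first, p_both)
```

Everything here is computed on one sample space of ordered draws. Conditioning on the first ball being red simply restricts the count to the draws where that holds.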

However, this way of reasoning is cumbersome and requires a lot of cognitive effort for such a simple task. Could you please tell me what the correct theoretical justification for the first approach is? To be more precise, how can I justify theoretically that I use the sample spaces $S_1$ and $S_2$ to derive probabilities defined on the sample space $S$?

Thank you very much for the help. If you are able to answer my question, you will have fixed my brain :) !


There is 1 best solution below


In the 1st method there is one event (picking two balls together), defined on $S$. In the 2nd method there are two events, which are not independent (picking one ball from $S_1$, then picking one ball from $S_2$).

The sample space(s) depend on your choice of method. There is no single sample space that applies to a problem regardless of the method used.

For example, a 3rd method uses the fact that picking two reds is the same as leaving one black ball behind, which has probability $1/3$ using the sample space $S_1$ instead of $S$.
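As a quick sanity check, the three methods can be compared numerically. This is only a sketch: it assumes the outcomes in $S$ are ordered and equally likely, and the variable names (`method_1`, `method_2`, `method_3`) are made up for this illustration.

```python
from fractions import Fraction

# Method 1: one event on S, assuming its three ordered outcomes are equally likely.
S = [("R", "R"), ("R", "B"), ("B", "R")]
method_1 = Fraction(sum(1 for outcome in S if outcome == ("R", "R")), len(S))

# Method 2: two dependent events, first on S1, then on S2 (one red ball removed).
S1 = ["R", "R", "B"]
S2 = ["R", "B"]
method_2 = Fraction(S1.count("R"), len(S1)) * Fraction(S2.count("R"), len(S2))

# Method 3: picking two reds is the same as leaving the black ball behind (uses S1 only).
method_3 = Fraction(S1.count("B"), len(S1))

print(method_1, method_2, method_3)
```

Each method uses a different sample space, yet all three give the same probability for the same event.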