Simple solution to coloured marble problem seems off?


I have a very basic marble problem (paraphrased):

There is a vase with 20 marbles, of which 7 are red, 3 are blue, 1 is orange, and 9 are green. We draw four marbles without putting them back. What is the probability of drawing a red, a blue, an orange and a green marble from the vase in this specific order ( $P(r b o g)$ )?

My solution to this problem was straightforward. I used the classical definition of probability: $P(A) = \frac{|A|}{|\Omega|}$, where $\Omega$ is the set of all possible, equally likely outcomes.

I then defined the event of drawing a red, a blue, an orange and a green marble as follows. I numbered my marbles 1 through 20: the red ones are 1 through 7, the blue ones 8 through 10, the orange one is 11 and the green ones are 12 through 20.

If we call the event of interest $A$, then the set description of $A$ becomes:

$$A = \{(\omega_1, \omega_2, \omega_3, \omega_4) : \omega_1 \in \{1, 2, \ldots, 7\},\ \omega_2 \in \{8, 9, 10\},\ \omega_3 = 11,\ \omega_4 \in \{12, 13, \ldots, 20\}\}$$

If we call the set of all possible outcomes $\Omega$, this set becomes:

$$\Omega = \{(\omega_1, \omega_2, \omega_3, \omega_4) : \omega_i \in \{1, 2, \ldots, 20\},\ \omega_k \neq \omega_j \text{ for } k \neq j\}$$

The sizes of the sets are easily calculated as $|A| = 7 \cdot 3 \cdot 1 \cdot 9 = 189$ and $|\Omega| = 20 \cdot 19 \cdot 18 \cdot 17 = 116280$. With this, the probability of the event becomes $\frac{189}{116280} \approx 0.001625387$.
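As a sanity check on the counting, here is a minimal Python sketch that enumerates $\Omega$ by brute force and counts the outcomes in $A$. It assumes the labelling described above (1 to 7 red, 8 to 10 blue, 11 orange, 12 to 20 green); the helper name `colour` and the variable names are only illustrative.

```python
from fractions import Fraction
from itertools import permutations


def colour(label):
    """Map a marble label (1..20) to its colour, per the labelling above."""
    if label <= 7:
        return "r"
    if label <= 10:
        return "b"
    if label == 11:
        return "o"
    return "g"


# Omega: every ordered draw of 4 distinct marbles out of 20.
omega = list(permutations(range(1, 21), 4))

# A: the draws whose colours are red, blue, orange, green in that order.
a = [w for w in omega if tuple(colour(x) for x in w) == ("r", "b", "o", "g")]

p_exact = Fraction(len(a), len(omega))
print(len(a), len(omega), p_exact, float(p_exact))
```

Because every ordered draw is listed explicitly, this checks the set sizes independently of the product formulas.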

Subsequently, I tried to check my work by doing a few simulations. I did three:

  1. A simulation in Python. I generate a list with the elements 1 through 20. Then, four times in a row, I let Python pick a random item from the list and remove it. This simulates drawing four marbles without putting them back. I then check whether I drew a red-blue-orange-green combination.

  2. A simulation in C++. I draw four numbers at random (1-20) and only use the result if all four numbers are different.

  3. A simulation in C++. I shuffle a list with the elements 1 through 20 and then use the first four elements.
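For reference, a draw without replacement along the lines of simulations 1 and 3 can be sketched in Python as below. This is not the code I actually ran, only a minimal illustration under the same labelling; the names `is_rbog` and `estimate` are made up for this sketch, and `random.sample` is used because it returns distinct items in selection order.

```python
import random


def is_rbog(draw):
    """True if the 4 labels are red, blue, orange, green in that order."""
    first, second, third, fourth = draw
    return (1 <= first <= 7          # red: labels 1..7
            and 8 <= second <= 10    # blue: labels 8..10
            and third == 11          # orange: label 11
            and 12 <= fourth <= 20)  # green: labels 12..20


def estimate(trials, seed=0):
    """Monte Carlo estimate of P(r b o g) when drawing without replacement."""
    rng = random.Random(seed)
    marbles = list(range(1, 21))
    hits = 0
    for _ in range(trials):
        # Four distinct marbles, in the order they were drawn.
        draw = rng.sample(marbles, 4)
        hits += is_rbog(draw)
    return hits / trials


if __name__ == "__main__":
    print(estimate(1_000_000))
```

With $n$ trials the standard error of such an estimate is roughly $\sqrt{p(1-p)/n}$, which for $p \approx 0.0016$ and $n = 10^8$ is about $4 \cdot 10^{-6}$.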

Simulation 2 confirms my result. Simulations 1 and 3 do not: they give a probability of 0.00158... when I let them iterate 100,000,000 times or more.

This seems strange, since I'm so sure of my maths. So the question becomes: is my calculation of the probability incorrect, or is my way of simulating flawed?