An Urn Simulator

I've written code to simulate drawing without replacement from an urn containing different numbers of orbs of different colors. It uses random sampling. However, I need to double-check that the "correct" answer I was shown is actually correct, since my simulator gives something different.

My urn is

  • [3 blue, 2 red, 6 green]

I am drawing 4 orbs without replacement

My event is

  • [2 blue, 1 green]

This experiment is repeated 1000 times, and each draw must contain at least the orbs in the event.
The expected probability is $27.2\%$. This site says there are 81 possibilities and lists 71 of them: https://www.mathepower.com/en/urn.php. I can't figure out why I'm getting a probability above 70%.

  1. Is the expected probability accurate?
  2. What might be the issue in my code logic (given below)?

Logic (pseudocode)

run_experiment ==
  set trials = []
  do this 1000 times:
    copy the original Urn
    draw 4 from the urn randomly without replacement (uses Python random.sample(maxint, num_to_take))
    put that result in the trials container/list
  expected = bl bl gr
  matching = []
  things_that_match = [item from expected in item in one trial] e.g. [bl] or [gr] or [bl gr] etc
  make a list (call it eval) containing the 1000 lists of things_that_match
  make a new list containing True if things_that_match == expected, for every item in eval
  sum the trues
  trues/1000 = probability of bl bl gr in 1000 trials of pick 4 from urn [3 blue, 2 red, 6 green] without replacement


code: https://gist.github.com/QuantVI/79a1c164f3017c6a7a2d860e55cf5d5b

There are 2 best solutions below

Solved: https://stackoverflow.com/questions/71277539/too-many-copies-poor-comparison-urn-probability-problem/71279128#71279128

The issue was with how matches were being confirmed. I couldn't fix the logic, so I resorted to changing the data structure used for the comparison. With that change I get an estimate close to 27% (only approximately, since there are only 1000 trials in the simulation).
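For readers who want to see what a count-based comparison can look like, here is a minimal Python sketch. It assumes the fix amounts to comparing color counts with `collections.Counter`; the linked Stack Overflow answer may do it differently, and the names `URN`, `EVENT`, `draw_matches`, and `estimate` are made up for illustration.

```python
import random
from collections import Counter

# Urn and event from the question, stored as multisets (color -> count).
URN = Counter(blue=3, red=2, green=6)
EVENT = Counter(blue=2, green=1)


def draw_matches(urn, event, k, rng):
    """Draw k orbs without replacement; True if the draw holds at least the event."""
    drawn = Counter(rng.sample(list(urn.elements()), k))
    # Multiset containment: each required color must appear at least `need` times.
    # A Counter returns 0 for colors that were not drawn.
    return all(drawn[color] >= need for color, need in event.items())


def estimate(urn, event, k=4, trials=1000, seed=None):
    """Fraction of trials in which the draw contains the event."""
    rng = random.Random(seed)
    hits = sum(draw_matches(urn, event, k, rng) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    print(estimate(URN, EVENT, trials=1000))
```

With only 1000 trials the standard error of the estimate is about $\sqrt{p(1-p)/1000} \approx 0.014$ for $p \approx 0.26$, so a result a percentage point or two away from the exact value is to be expected.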

As noted by joriki, the probability of drawing at least 2 blue orbs and at least 1 green orb when sampling 4 orbs without replacement is
$$ \frac{\binom{3}{3}\binom{2}{0}\binom{6}{1} + \binom{3}{2}\binom{2}{0}\binom{6}{2}+ \binom{3}{2}\binom{2}{1}\binom{6}{1}}{\binom{11}{4}}=\frac{87}{330}\approx 0.264 $$
Here is a simple R script that simulates the urn and counts the events the OP is describing.

    ######  Urn problem simulation
    library(parallel)
    # mclapply forks worker processes, so mc.cores > 1 is not supported on Windows
    numCores <- detectCores()
    numCores
    ##
    urn <- c(rep("g", 6), rep("b", 3), rep("r", 2))
    k <- 4 # sample size
    N <- 1e6 # number of repetitions of game
    ### check event happens: at least 2 blue and at least 1 green in the sample
    my_event <- function(urn, k){
      smp <- sample(urn, k, replace = FALSE)
      ifelse(sum(smp == "b") >= 2 & sum(smp == "g") >= 1, 1, 0)
    }
    ## produce N simulations of the sampling and estimate frequency of event
    freq_event <- function(urn, k, N){
      sum(unlist(mclapply(1:N, function(i){my_event(urn, k)},
                          mc.cores = numCores)))/N
    }
    #### simulate experiment N times, repeated 1000 times
    p_sim <- lapply(1:1000, function(x){freq_event(urn = urn, k = k, N = N)})
    hist(unlist(p_sim), freq = FALSE)
    abline(v = 87/330, col = 'red')
    mean(unlist(p_sim)) # mean of all simulations
    median(unlist(p_sim)) # median of all simulations
    87/330 # true probability

The resulting histogram of the simulated frequencies has a vertical red line at the "real" probability of the event described above.

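The count of 87 favorable draws out of 330 can also be checked without any simulation, by summing over the possible numbers of blue and red orbs in the draw. This is a small Python sketch using only `math.comb` and `fractions.Fraction` from the standard library; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

blue, red, green = 3, 2, 6  # urn contents from the question
k = 4                       # orbs drawn without replacement

# Sum over draws with b blue, r red and k - b - r green orbs,
# keeping only those with at least 2 blue and at least 1 green.
favorable = sum(
    comb(blue, b) * comb(red, r) * comb(green, k - b - r)
    for b in range(2, blue + 1)
    for r in range(0, red + 1)
    if k - b - r >= 1
)
total = comb(blue + red + green, k)

p = Fraction(favorable, total)
print(favorable, total, p, float(p))
```

The three terms that survive the filter are exactly the three products in the numerator of the formula: (3 blue, 0 red, 1 green), (2 blue, 0 red, 2 green) and (2 blue, 1 red, 1 green).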