A perplexing simulated conditional probability problem, non-simulated


I've been using random integers in Google Sheets with some formulas to simulate conditional probability problems. It takes a lot of processing power, which limits the sample size, and obviously it doesn't give the exact answer.

The question, when translated from gaming terms to a real-life scenario, goes like this:


Every X seconds, on average, I flip a coin. There is a 50% chance that it's heads. If I land on heads, I must stop flipping my coin for Y seconds. If I land on tails, I can proceed to flip. I stop flipping after 360 seconds. Let's say X = 2 and Y = 8.

My end goal is a formula into which I can input:

  1. The chance of success for each action | r = 50% | Chance of heads
  2. The rate of an action being performed | x = 2 | Rate of flipping a coin
  3. A specified time period | t = 360 | 6 minutes

And I will get in my output:

  1. The number of times that I have on average flipped heads in that specified time period.


I have asked some mathematically savvy people I know about this, and asked on a mathematics Discord, which directed me here :) I've thought about this problem for a while, and here's the link to a spreadsheet I made if it helps y'all understand. I'm new here, and if there are any unspoken rules or such that I've broken, it's not out of malice. Ask me anything about my question or post clarifying comments and I'll be happy to help! https://docs.google.com/spreadsheets/d/1qEKI2ZYZzlyJ1-IP6LYhzsK3spJ-ylAC_1lCvbOzORI/edit?usp=sharing


There are 2 best solutions below

On BEST ANSWER

My other answer was a simulation, and is easily generalised to other values of $X,Y,T$. It also gave a slightly biased but simple approximation.

This answer is a specific calculation, taking advantage of the fact that both $Y=8$ and $T=360$ are multiples of $X=2$.

First consider where you stop exactly at $360$ seconds after having seen $h$ heads.

  • You used $8h$ seconds waiting after the heads
  • leaving $360-8h$ seconds for waiting after tails
  • so you must have seen $180-4h$ tails
  • so $180-3h$ flips
  • which has probability ${180-3h\choose h} 2^{-180+3h}$
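
As a concrete check of these steps (the value $h=36$ is chosen here purely for illustration): $8\times 36=288$ seconds are spent waiting after heads, leaving $360-288=72$ seconds of waiting after tails, hence $36$ tails and $72$ flips in total, so the probability is ${72\choose 36}2^{-72}\approx 0.094$.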

But if the final flip is a head you might overshoot $360$ seconds (with these values of $X,Y,T$ a final tail always lands exactly on $360$); for example, a head flipped at $354$ seconds keeps you waiting until $362$ seconds. So you might instead stop at exactly $362$, $364$ or $366$ seconds after having seen $h$ heads, with the final flip a head. Similar calculations give probabilities of, respectively,

  • ${181-3h-1 \choose h-1} 2^{-181+3h}$
  • ${182-3h-1 \choose h-1} 2^{-182+3h}$
  • ${183-3h-1 \choose h-1} 2^{-183+3h}$

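Finally, consider the sub-case where you stop exactly at $360$ seconds after having seen $h$ heads and the final flip was a tail. This sub-case is already counted in the first probability above, so it adds no extra term to the total; on its own it has probability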

  • ${180-3h-1 \choose h} 2^{-180+3h}$

So the overall probability of stopping after seeing $h$ heads is

${180-3h \choose h} 2^{-180+3h} + {181-3h-1 \choose h-1} 2^{-181+3h} +{182-3h-1 \choose h-1} 2^{-182+3h} + {183-3h-1 \choose h-1} 2^{-183+3h}$

That makes the expected number of heads

$\sum\limits_{h=0}^{45} h \left({180-3h \choose h} 2^{-180+3h} + {181-3h-1 \choose h-1} 2^{-181+3h} +{182-3h-1 \choose h-1} 2^{-182+3h} + {183-3h-1 \choose h-1} 2^{-183+3h} \right)$

Carrying out the calculation gives an expected number of heads of about $36.24$.
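
For readers who want to check this number, here is a minimal R sketch that evaluates the sum above directly for $X=2$, $Y=8$, $T=360$ and a fair coin. The names `prob_heads`, `p_h` and `mean_heads` are illustrative choices, not part of the original calculation. The code relies on R's `choose` returning $0$ when its second argument is negative, which takes care of the $h=0$ terms.

```r
# Probability of finishing with exactly h heads (X = 2, Y = 8, T = 360, p = 1/2).
# choose(n, k) is 0 for negative k, so the "final head" terms vanish when h = 0.
prob_heads <- function(h) {
  choose(180 - 3 * h, h)         * 2^(-180 + 3 * h) +   # stop at exactly 360 s
  choose(181 - 3 * h - 1, h - 1) * 2^(-181 + 3 * h) +   # final head, stop at 362 s
  choose(182 - 3 * h - 1, h - 1) * 2^(-182 + 3 * h) +   # final head, stop at 364 s
  choose(183 - 3 * h - 1, h - 1) * 2^(-183 + 3 * h)     # final head, stop at 366 s
}

h <- 0:45
p_h <- prob_heads(h)

total_prob <- sum(p_h)      # sanity check: the probabilities should sum to 1
mean_heads <- sum(h * p_h)  # expected number of heads

print(c(total_prob = total_prob, mean_heads = mean_heads))
```

If the probabilities did not sum to $1$, that would indicate a miscounted case, so `total_prob` is a useful sanity check before trusting `mean_heads`.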

If you attempted a similar calculation for the expected number of tails, you would end up with the same result, making the expected number of flips $72.48$. These numbers are close to the simulated answers in my other answer.

I think you might be able to generalise the expectation of the number of heads to the following by considering whether the last flip was heads or tails, where $p$ is the probability of flipping heads:

$\sum\limits_{h=0}^{\lceil T/Y\rceil} h \left({h+ \lceil (T-Yh)/X\rceil -1 \choose h }p^h(1-p)^{\lceil (T-Yh)/X\rceil} +\sum\limits_{t=\lceil (T-Yh)/X\rceil}^{\lceil (T-Yh+Y)/X\rceil-1} {h+t-1 \choose h-1} p^h(1-p)^t \right) $

and the expected number of flips would be $\frac1p$ times this.

Note that for some $h,X,Y,T$ the inner sum $\sum\limits_{t=\lceil (T-Yh)/X\rceil}^{\lceil (T-Yh+Y)/X\rceil-1}$ may be empty because its upper limit is below its lower limit, though this can only happen when $Y < X$.
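
The generalised sum can be sketched in R as follows. This is only an illustration under the same assumptions as the formula (first flip at time $0$, a wait of $X$ after tails and $Y$ after heads, stopping once time $T$ is reached). The names `heads_pmf`, `expected_heads` and `Tmax` (standing for $T$) are made up here. The number of tails is clamped at $0$ from below, and the inner sum is skipped when it is empty. For non-integer inputs, floating-point rounding inside `ceiling` could shift a limit, so treat the sketch as most reliable for whole-number $X,Y,T$.

```r
# Distribution of the number of heads h = 0, 1, ..., ceiling(Tmax / Y).
# p = probability of heads, X = wait after tails, Y = wait after heads, Tmax = T.
heads_pmf <- function(p, X, Y, Tmax) {
  h_max <- ceiling(Tmax / Y)
  sapply(0:h_max, function(h) {
    t_lo <- max(0, ceiling((Tmax - Y * h) / X))   # fewest tails needed to reach Tmax
    t_hi <- ceiling((Tmax - Y * h + Y) / X) - 1   # most tails when the last flip is a head
    # Last flip is a tail: the h heads sit among the first h + t_lo - 1 flips.
    last_tail <- choose(h + t_lo - 1, h) * p^h * (1 - p)^t_lo
    # Last flip is a head: h - 1 heads sit among the first h + t - 1 flips.
    last_head <- 0
    if (h >= 1 && t_hi >= t_lo) {                 # guard against an empty sum
      t <- t_lo:t_hi
      last_head <- sum(choose(h + t - 1, h - 1) * p^h * (1 - p)^t)
    }
    last_tail + last_head
  })
}

expected_heads <- function(p, X, Y, Tmax) {
  pmf <- heads_pmf(p, X, Y, Tmax)
  sum((seq_along(pmf) - 1) * pmf)                 # entry i corresponds to h = i - 1
}

print(expected_heads(p = 0.5, X = 2, Y = 8, Tmax = 360))
```

With `p = 0.5, X = 2, Y = 8, Tmax = 360` this should agree with the specific calculation earlier in this answer, and dividing the result by `p` gives the corresponding expected number of flips.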

On

An approximate answer might be found by saying

  • If you flip tails, the next flip comes in $X$ seconds
  • If you flip heads, the next flip comes in $Y$ seconds
  • So the average time between flips is $(X+Y)/2$ seconds
  • And thus the average number of flips might be about $\frac{T}{(X+Y)/2}=\frac{2T}{X+Y}$
  • So the expected number of heads might be about $\frac{T}{X+Y}$

The last two points are both technically wrong: you may not stop at exactly time $T$, you should not expect to be able to divide expectations that way, and the proportion of heads is negatively correlated with the number of flips. But when $T$ is large compared with $X$ and $Y$, the error may be small.
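
The bullet points above assume a fair coin. As a hedged sketch, the same reasoning can be written as a small R function that also accepts a heads probability `p`. That extension is an assumption made here, not something derived above: it simply replaces the average gap $(X+Y)/2$ by $(1-p)X+pY$. The names `approx_flips`, `approx_heads` and `Tmax` are illustrative.

```r
# Rough approximation: total time divided by the average gap between flips.
# p = probability of heads, X = wait after tails, Y = wait after heads, Tmax = T.
approx_flips <- function(p, X, Y, Tmax) {
  mean_gap <- (1 - p) * X + p * Y   # average wait after a flip
  Tmax / mean_gap
}

# Each flip is a head with probability p, so scale the flip count by p.
approx_heads <- function(p, X, Y, Tmax) {
  p * approx_flips(p, X, Y, Tmax)
}

print(c(flips = approx_flips(0.5, 2, 8, 360),
        heads = approx_heads(0.5, 2, 8, 360)))
```

For $p=1/2$ this reduces to $\frac{2T}{X+Y}$ flips and $\frac{T}{X+Y}$ heads, and it carries the same small bias discussed above.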

As an illustration of the small bias, here is the result of $100000$ simulations in R. Changing the seed will change the result, but it will usually still show the small errors in the approximations of the expectations:

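simulation <- function(maxtime, tailsdelay, headsdelay, 
         samplesize=ceiling(maxtime / min(tailsdelay, headsdelay))){
  # samplesize is the most flips that could ever be needed (every flip followed by the shorter delay)
  flip <- sample(c(0,1), samplesize, replace=TRUE) # 1=heads 0=tails
  fliptimes <- cumsum(tailsdelay + flip*(headsdelay-tailsdelay)) # time at which the wait after each flip ends
  intime <- sum(fliptimes < maxtime)+1 # flips made before maxtime, counting the first flip at time 0
  c(intime, sum(flip[1:intime])) # (number of flips, number of heads)
  }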

set.seed(2020)
cases <- 10^5
results <- data.frame(flips=numeric(cases), heads=numeric(cases))
T <- 360 # note: this masks R's built-in shorthand T for TRUE
X <- 2
Y <- 8 
for (i in (1:cases)){
  results[i,] <- simulation(maxtime=T, tailsdelay=X, headsdelay=Y)
  }  
c(simulatedmeanflips = mean(results$flips), approxmeanflips=2*T/(X+Y))
# simulatedmeanflips    approxmeanflips 
#        72.45373           72.00000 
c(simulatedmeanheads = mean(results$heads), approxmeanheads=T/(X+Y))
# simulatedmeanheads    approxmeanheads 
#        36.24981           36.00000