First, I come from a Computer Engineering (software) background, not Math, so any Math formula that aren’t pemdas aren’t going to be super helpful or insightful for me.
Now, the problem I’m trying to solve. I’m porting some old software over to a new environment and I need the behavior to remain the same. But I’d like to improve the efficiency of the algorithm over the original naive approach.
Pass = 0
Fail = 0
while Pass < Total:
Percent = Random( 1 to 100 inclusive )
if Percent < 97
Pass += 1
else if Percent = 97
break
else if Percent > 97
Fail += 1
Pass = Total
return Pass, Fail
I'll try to state the algorithm in English here. Roll a 100 side dice until Total Passes have been achieved. If the result of the roll is less than 97 its a Pass, if the result is more than 97 its a Fail, if the result is 97 no more Fails can occur.
I’d like to completely get rid of the loop by calculating the probability of a Fail result for any number of Totals.
Failures = Weighted Random( 1 to Total, with weights[ 1 to Total ] )
I’m unsure how to calculate the weights. 3% of the time the Result should be Fail until 1% of the time Fail is no longer a valid choice. I wrote a program that ran the original algorithm with 3 as the Total. Before I got 3 Passes, I got this many failures, this frequently:
Fail Count, Frequency
0, 0.91353420
1, 0.08134433
2, 0.00486673
3, 0.00024296
4, 0.00001138
5, 0.00000039
6, 0.00000001
I couldn’t work out how to scale the weights to match.
Anyways, any help would be appreciated. I don’t mind reading up on specific math topics, but I’m unsure what this problem even falls under.
Edited to be clearer and added measured weights
Edit: This is the solution I came up with thanks to WW1 and Misha Lavrov's help below.
def P( n, t ):
return binomialcoeff( t + n - 1, n ) * pow( 0.96, t ) * pow( 0.03, n )
# 0 0.88473600
# 1 0.07962624
# 2 0.00477757
# 3 0.00023888
# 4 0.00001075
# 5 0.00000045
# 6 0.00000002
def S( n, t ):
sigma = 0
for k in 0 to t - 1 inclusive:
sigma += binomialcoeff( n + k, k ) * pow( 0.96, k ) * pow( 0.03, n )
return sigma * 0.01
# 0 0.02881600
# 1 0.00170544
# 2 0.00008469
# 3 0.00000380
# 4 0.00000016
# 5 0.00000001
# 6 0.00000000
The probability of $n$ failures when your total is $t$ is given by...
$$ P(n, t)= \binom{t+n-1}n(0.96)^t(0.03)^n+(0.01)\sum_{k=0}^{t-1}\binom{n+k}k(0.96)^k(0.03)^n $$ You will need to make a decision about what will be your highest value of $n$
Brief explanation
Either...