I am looking at a sample of tests that check whether a program runs to completion or fails.
In the first run of tests (around $1000$), we saw a failure rate of around $0.13$. There was then a fix applied to the program, and I'm trying to figure out how many tests would have to be run to have $95\%$ confidence that the program would not fail again, and therefore that the fix has succeeded.
My initial thought was to modify the binomial distribution by setting the number of successes equal to the number of trials, which just leaves you with $P(X = n) = p^n$. I wasn't sure whether this would be an accurate test for my data, though, because I wasn't sure whether the binomial distribution works correctly if you assume there are $0$ failures while also assuming the success rate is less than $1$. If you have any suggestions for other tests that might work, I would appreciate any help.
No amount of testing will guarantee the program will never fail, but you can test enough to verify that it is unlikely that the failure rate is $0.13$ or more.
If we assume (the null hypothesis) that the failure rate is $p=0.13$ and that the tests are independent, then the number of failures in $n$ trials follows a Binomial distribution with parameters $n$ and $p$, so the probability of zero failures in $n$ trials is $(1-p)^n$. If we would like that probability to be less than $0.01$, say, then we want to choose $n$ so that $$(1-p)^n < 0.01$$ or equivalently, $$ n > \frac{\log(0.01)}{\log(1-p)}$$ For $p = 0.13$, $n \ge 34$ will work, so if you observe zero failures in $34$ tests, it is unlikely that the failure rate is still $0.13$. For the $95\%$ confidence level you mention, replace $0.01$ with $0.05$: then $n > \log(0.05)/\log(0.87) \approx 21.5$, so $n \ge 22$ tests suffice.