How many Monte Carlo simulations must I run to get a $95\%$ confidence interval for some error $E$?


Suppose I want to use Monte Carlo to compute some probability $p$. A single MC simulation will run for $R$ iterations and calculate $p$ as the fraction of 'successes' (each iteration results in either a failure or a success).

Say I want to compute $p$ within an error of $E$ with $95\%$ confidence. That is, I want to find $R_0$ such that if I run the MC simulation for $R_0$ iterations and obtain $p_0$, then I am $95\%$ confident that the true $p$ lies in $[p_0 - E, p_0 + E]$.

I found two possible formulas for this: one and two but they are different (albeit similar), and they also don't really seem to take $R$ into account, which doesn't make intuitive sense to me.

For instance, the second link has the formula:

$$\bigg(\frac{z_{\alpha/2} \cdot \text{std}(p)}{E}\bigg)^2$$

I assume $\text{std}(p)$ will be computed by Monte Carlo sampling $p_1, \dots, p_n$ (with some fixed $R$ iterations for each $p_i$), and then finding the standard deviation of the $p_i$. But naturally this standard deviation would decrease as $R$ increases. So it seems to me that the formula should factor in $R$ somehow, which it doesn't.

Is my interpretation incorrect?

Is there a simple formula to determine the number of simulations required?


There are 1 best solutions below


> Is my interpretation incorrect?

You are correct that the standard deviation of the observed proportion of successes depends on $R$.

Let's say that $p$ is the "true" probability of success and $\hat{p}$ is the observed proportion of successes. As in your notation, let $R$ be the number of iterations.

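Then, the total number of observed successes is $R\hat{p}$. Assuming the iterations are independent, $R\hat{p}$ is a binomial random variable with variance $Rp(1-p)$. Dividing by $R$ scales the variance by $1/R^2$, so the variance of $\hat{p}$ is $\frac{p(1-p)}{R}$ and its standard deviation is $\sqrt{\frac{p(1-p)}{R}}$.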
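As a rough empirical check of this $1/\sqrt{R}$ behaviour, here is a minimal sketch using only the Python standard library. The function names (`simulate_p_hat`, `empirical_std`, `theoretical_std`) and the values of `p`, `R`, and the number of repeated runs are illustrative assumptions, not anything from the question.

```python
import math
import random


def simulate_p_hat(p, R, rng):
    """Run one MC simulation of R independent iterations, each succeeding
    with probability p, and return the observed fraction of successes."""
    successes = sum(1 for _ in range(R) if rng.random() < p)
    return successes / R


def empirical_std(p, R, n_runs, seed=0):
    """Repeat the R-iteration simulation n_runs times and return the
    sample standard deviation of the resulting estimates p_1, ..., p_n."""
    rng = random.Random(seed)  # fixed seed so the check is reproducible
    estimates = [simulate_p_hat(p, R, rng) for _ in range(n_runs)]
    mean = sum(estimates) / n_runs
    variance = sum((x - mean) ** 2 for x in estimates) / (n_runs - 1)
    return math.sqrt(variance)


def theoretical_std(p, R):
    """Standard deviation of p-hat implied by the binomial model."""
    return math.sqrt(p * (1 - p) / R)


if __name__ == "__main__":
    p = 0.3  # hypothetical "true" probability, chosen for illustration
    for R in (100, 400):
        print(R, empirical_std(p, R, n_runs=500), theoretical_std(p, R))
```

Quadrupling $R$ should roughly halve both columns, which is the dependence on $R$ that the question suspected.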

> Is there a simple formula to determine the number of simulations required?

I assume you mean the number of iterations required. Wikipedia cites the standard formula using the normal approximation, which is (in your notation):

$$E=z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{R}}$$

Rearranging gives

$$R = \hat{p}(1-\hat{p}) \frac{z_{\alpha/2}^2}{E^2}$$

This is all explained in great detail in this sample size calculator web app I found after a Google search. The formula above still depends on $\hat{p}$, which you generally don't know in advance. To deal with this, you can either estimate it via a pilot study with a small number of trials, or use $\hat{p}=0.5$. Since $\hat{p}(1-\hat{p})$ is largest at $\hat{p}=0.5$, that choice gives a conservative (high) estimate of the necessary $R$, namely $R = \frac{z_{\alpha/2}^2}{4E^2}$.
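The rearranged formula is easy to evaluate directly. The following is a minimal sketch, assuming the normal approximation above is adequate; the function name `required_iterations` and its parameters are hypothetical, and the result is rounded up to a whole number of iterations. It uses `statistics.NormalDist` from the standard library (Python 3.8+) to obtain $z_{\alpha/2}$.

```python
import math
from statistics import NormalDist


def required_iterations(E, confidence=0.95, p_guess=0.5):
    """Return the number of iterations R needed so that the normal-approximation
    confidence interval for p has half-width at most E.

    p_guess is a prior guess (or pilot estimate) of p; the default 0.5 is the
    conservative choice because it maximises p * (1 - p).
    """
    if not 0 < E < 1:
        raise ValueError("E must be in (0, 1)")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be in (0, 1)")
    if not 0 <= p_guess <= 1:
        raise ValueError("p_guess must be in [0, 1]")

    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}, about 1.96 for 95%
    return math.ceil(p_guess * (1 - p_guess) * (z / E) ** 2)


if __name__ == "__main__":
    # Conservative estimate for E = 0.01 at 95% confidence.
    print(required_iterations(0.01))
    # Smaller requirement if a pilot study suggests p is near 0.1.
    print(required_iterations(0.01, p_guess=0.1))
```

Because $R$ scales with $1/E^2$, halving the target error roughly quadruples the number of iterations required.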