Monte Carlo p-test and early stopping

Say you have a coin with some probability $p$ of falling on heads. You would like to determine if this probability is less than or equal to $0.05$ with some reasonable degree of confidence and stop as soon as you have that confidence.

This is meant to be a model for doing any Monte Carlo statistical test where you would like to reject with a $p$-value of $0.05$ or lower and where tests are expensive so you want to stop as early as possible.

What is known about the earliest time you can stop, depending on the outcome of the coin tosses and your desired degree of confidence?

There is 1 answer below.

If your confidence level is $1-\alpha$, then the absolute earliest you could stop is after an unbroken run of heads starting from the very first toss. Under the null hypothesis $p \le 0.05$, a run of $N$ heads has probability at most $0.05^N$, so the minimum length ($N_0$) of such a run is the smallest $N$ with $0.05^N \le \alpha$:

$N_0 = \lceil -\log_{20} \alpha \rceil = 1$ (if $\alpha = 0.05$), so if you get heads on your first toss, you're done. This makes sense because there is at most a 5% chance of this occurring under the null hypothesis.
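As a quick check of that arithmetic, here is a minimal Python sketch. The function name `min_heads_run` and its parameter `p0` (the null value of the heads probability) are illustrative choices, not part of the original answer. It simply searches for the smallest $N$ with $p_0^N \le \alpha$.

```python
def min_heads_run(alpha: float, p0: float = 0.05) -> int:
    """Smallest run length N of consecutive heads with p0**N <= alpha.

    p0 is the heads probability under the null hypothesis (assumed 0.05 here).
    """
    if not (0.0 < alpha < 1.0 and 0.0 < p0 < 1.0):
        raise ValueError("alpha and p0 must lie strictly between 0 and 1")
    n = 1
    # Keep extending the run until it is rare enough under the null.
    while p0 ** n > alpha:
        n += 1
    return n


print(min_heads_run(0.05))   # 1
print(min_heads_run(0.01))   # 2
```

Note that a smaller $\alpha$ pushes the earliest possible stopping time later: at $\alpha = 0.01$ you would need two heads in a row.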

Per OP Comment

Sequential testing is a vast area of statistics, and there is no single method for determining when to stop. However, below is a relatively straightforward way to get a decent "rejection horizon". Note that we are now analyzing entire sample paths, not individual results. In this case, the running count of heads is a Bernoulli random walk, where the step sizes are 0 and 1.

What I did was simulate 1000 sample paths under the null ($p = 0.05$), each 50 tosses in length. I then set a "cutoff" for each time step $t$ (ahead of time!) as the $(1-k)$ quantile of a $\mathrm{Bin}(t, 0.05)$ distribution. I had Excel count how many sample paths exceeded the cutoff at some point in the 50 tosses. I wanted this number to be below 50, so that fewer than 5% of the 1000 trajectories cross the cutoff. I achieved this by lowering $k$ until fewer than 50 trajectories hit the rejection region.
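The calibration described above was done in Excel; the following is only a rough Python sketch of the same idea, not the original spreadsheet. All function names, the random seed, and the geometric shrink factor of 0.9 for $k$ are assumptions made for illustration (the answer does not say how $k$ was lowered).

```python
import math
import random


def binom_quantile(t, p, q):
    """Smallest c with P(X <= c) >= q for X ~ Bin(t, p)."""
    cdf = 0.0
    for c in range(t + 1):
        cdf += math.comb(t, c) * p**c * (1 - p) ** (t - c)
        if cdf >= q:
            return c
    return t  # fallback if rounding keeps the summed CDF just below q


def horizon(k, n_steps=50, p0=0.05):
    """Cutoff on the running head count for each step t = 1..n_steps."""
    return [binom_quantile(t, p0, 1 - k) for t in range(1, n_steps + 1)]


def crosses(path, cutoffs):
    """True if the cumulative head count ever exceeds its cutoff."""
    heads = 0
    for toss, cutoff in zip(path, cutoffs):
        heads += toss
        if heads > cutoff:
            return True
    return False


def calibrate(n_paths=1000, n_steps=50, p0=0.05, k_start=0.05,
              max_cross=50, shrink=0.9, seed=0):
    """Lower k until fewer than max_cross null paths cross the horizon."""
    rng = random.Random(seed)
    # Simulate paths under the null: each toss is heads with probability p0.
    paths = [[1 if rng.random() < p0 else 0 for _ in range(n_steps)]
             for _ in range(n_paths)]
    k = k_start
    while True:
        cutoffs = horizon(k, n_steps, p0)
        n_cross = sum(crosses(path, cutoffs) for path in paths)
        if n_cross < max_cross:
            return k, cutoffs, n_cross
        k *= shrink  # assumed update rule; any downward search would do


if __name__ == "__main__":
    k, cutoffs, n_cross = calibrate()
    print(f"k = {k:.4f}, crossings = {n_cross} of 1000")
    print("cutoffs:", cutoffs)
```

The loop always terminates: as $k$ shrinks, the cutoff at step $t$ eventually reaches $t$ itself, which no path can exceed. Because the horizon is tuned on simulated paths, the resulting $k$ depends on the seed and on the number of paths.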

Once you've set your "rejection horizon" for each time step (again, ahead of time), you can start tossing until you either reach the end of the 50 tosses or hit the barrier. If you reach the end without crossing the barrier, you fail to reject the null hypothesis; otherwise, you reject as soon as the barrier is crossed.
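The stopping rule itself can be sketched in a few lines of Python. The name `sequential_test` and the `toss` callable are hypothetical, and the cutoff values in the usage example are made up purely to show the mechanics; they are not the calibrated horizon from the answer.

```python
from typing import Callable, Sequence, Tuple


def sequential_test(toss: Callable[[], int],
                    cutoffs: Sequence[int]) -> Tuple[bool, int]:
    """Toss until the running head count exceeds its cutoff or tosses run out.

    toss() returns 1 for heads and 0 for tails. cutoffs[t - 1] is the
    pre-specified barrier at step t. Returns (rejected, tosses_used).
    """
    heads = 0
    for t, cutoff in enumerate(cutoffs, start=1):
        heads += toss()
        if heads > cutoff:
            return True, t  # barrier crossed: reject and stop early
    return False, len(cutoffs)  # reached the end: fail to reject


# Illustrative barrier only (not calibrated).
example_cutoffs = [1, 2, 2, 3]
print(sequential_test(lambda: 1, example_cutoffs))  # (True, 3)
print(sequential_test(lambda: 0, example_cutoffs))  # (False, 4)
```

Passing the toss as a callable means each expensive Monte Carlo replicate is only run if the test has not already stopped.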

To show you what this looks like, I've attached a sample of 250 such trajectories along with the derived "rejection horizon" (dotted red line). There is a lot of literature out there, so if you're serious, you should look into sequential testing.

[Figure: 250 simulated trajectories with the derived "rejection horizon" shown as a dotted red line]