Hypothesis testing: does Program $1$ play a game better than Program $2$?


I have two programs for playing a $2$ player zero-sum perfect information game.

The game has a very high "branching factor".

No luck is involved, but game results are chaotic because of the rather large number of starting states, so when two programs play, the better program may win only slightly more often than the weaker one.

My question is: how many games must I let them play, using random starting states, before I am $95\%$ confident that I have identified the better program?
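To make the question concrete, here is a rough sketch of the usual normal-approximation sample-size calculation for a one-sided test of "win probability $= 0.5$" against "win probability $= p$". It is only an illustration under assumptions that are not given above: draws are ignored, games are independent, the better program's true win probability `p_better` is known in advance (I use $0.51$ as a made-up value), and the test should detect the difference $80\%$ of the time. The function name `games_needed` and its parameters are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def games_needed(p_better, alpha=0.05, power=0.80):
    """Approximate number of decisive games needed for a one-sided test
    of H0: p = 0.5 against H1: p = p_better (normal approximation).

    p_better, alpha and power are assumptions chosen for illustration.
    """
    if not 0.5 < p_better < 1.0:
        raise ValueError("p_better must lie strictly between 0.5 and 1")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)  # critical value under H0
    z_beta = NormalDist().inv_cdf(power)         # quantile for the desired power
    sd_null = 0.5                                # sqrt(0.5 * 0.5), per-game sd under H0
    sd_alt = sqrt(p_better * (1.0 - p_better))   # per-game sd under H1
    n = ((z_alpha * sd_null + z_beta * sd_alt) / (p_better - 0.5)) ** 2
    return ceil(n)


if __name__ == "__main__":
    for p in (0.51, 0.52, 0.55, 0.60):
        print(f"p = {p:.2f}: about {games_needed(p)} games")
```

The required number of games grows roughly like $1/(p - 0.5)^2$, so a $51\%$ edge needs on the order of $15{,}000$ games under these assumptions, while a $55\%$ edge needs only several hundred.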

This might be simple, but my statistics course was back in the 70s ;)

Alternate form of question: $X$ wins $255$ games and $Y$ only $245$.

How certain am I that $X$ is the better player?


It looks like the sign test is what I need: treat each game as a trial and test the null hypothesis that each program is equally likely to win. There is an online version at http://www.fon.hum.uva.nl/Service/Statistics/Sign_Test.html
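For anyone who prefers to compute it locally, below is a minimal sketch of an exact sign test (a binomial test with success probability $0.5$), applied to the $255$ vs. $245$ example from the question. It assumes draws are discarded and games are independent; the function name `sign_test` is my own, not taken from the linked page.

```python
from math import comb


def sign_test(wins_x, wins_y):
    """Exact sign test for H0: both programs are equally likely to win.

    Returns (one_sided_p, two_sided_p). Draws are assumed to have been
    discarded before counting wins.
    """
    n = wins_x + wins_y
    k = max(wins_x, wins_y)
    # Number of equally likely win/loss sequences in which the leading
    # program gets at least k wins out of n.
    tail = sum(comb(n, i) for i in range(k, n + 1))
    one_sided = tail / 2 ** n
    # By symmetry of Binomial(n, 0.5), the two-sided p-value is twice
    # the one-sided tail, capped at 1.
    two_sided = min(1.0, 2.0 * one_sided)
    return one_sided, two_sided


if __name__ == "__main__":
    one_sided, two_sided = sign_test(255, 245)
    print(f"one-sided p = {one_sided:.3f}, two-sided p = {two_sided:.3f}")
```

For $255$ vs. $245$ the two-sided p-value comes out well above $0.05$ (roughly $0.7$), so this result alone gives very little evidence that $X$ is the better player.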