Optimal stopping in sampling without replacement

92 Views Asked by At

A colleague came up with the following question the other day:

I have $m$ each of blue and red balls in a bag. We play a game where we draw balls randomly (without replacement) and each time you draw a blue ball you get +1 dollar and each time you draw a red ball you get -1 dollar. You can stop drawing at any time. What's the optimal stopping rule for the game and how much would you pay to play it?

MY ATTEMPT:

So you can always draw all the way to the end and come out even. I took a stab at the optimal stopping rule and got this:

Stop only if you're positive and when (number of blue balls remaining) $\leq$ (the number of blue balls drawn) - (the number of red balls drawn)

I derived this by the heuristic "stop when the probability of gaining a dollar on the next draw is no greater than the probability of never getting back to where you are now, in any of the remaining draws". Here's a sketch of how I computed this:

Let $B_n$ be the number of blue balls drawn and $R_n$ the number of red balls drawn. The probability of +1 on the next draw is just the proportion of blue balls remaining, $(m - B_n) / (2m - n)$. The probability of never getting back to where you are now (assuming you're in the positive) is the solution to the Ballot problem with $m-R_n$ and $m-B_n$ votes, respectively--which simplifies to $(B_n - R_n) / (2m - n)$.

Simplifying this out gives the stopping rule $m - B_n \leq B_n - R_n$.

QUESTIONS:

  1. This feels intuitive but somehow missing something. I don't think I formulated the notion of optimality entirely correctly. Is there a better solution?

  2. How does one compute the expected value of winnings at this stopping time? It feels like there is an optional stopping theorem argument involving the "proportion of blue balls remaining" martingale, but I kept running into dead ends (maybe I had some math errors, though).