I've been thinking about the following game for a while and am curious if anyone has any ideas of how to analyze it.
Problem description
Say I have two biased coins: coin 1 that shows heads with probability $p$ and coin 2 that shows heads with probability $q$. You and I both know the statistics of the coins.
The game proceeds in multiple rounds as follows:
In the starting round $n=0$:
- I (privately) pick a coin and flip it; we both observe the outcome
- you decide to make a guess of which coin I just flipped, or continue watching
- if you guess correctly, I pay you $\$100$; if you guess incorrectly, you receive no reward and the game is over
At each subsequent round $n\ge 1$:
- I decide to stay with my current coin or reach into my pocket and swap out the current coin for the other coin
- you can see whether I swapped out the coin or not (assume I must switch if I reach into my pocket)
- I flip the coin and we both observe the outcome
- you decide to guess which coin was just flipped, or keep watching
- if you guess correctly, I pay you $\$100\cdot\delta^n$, where $\delta\in(0,1)$; if you guess incorrectly the game ends with you getting nothing
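To make the rules concrete, here is a minimal simulation sketch of one game. The `switch_policy` and `guess_policy` callables (and the `max_rounds` cutoff) are my own placeholders for the two players' strategies, not part of the problem statement:

```python
import random

def play_game(p, q, delta, switch_policy, guess_policy, max_rounds=50):
    """Simulate one game. switch_policy(history) -> bool and
    guess_policy(history) -> None or coin index are illustrative
    strategy callables. Returns the guesser's discounted payout."""
    coin = random.choice([0, 1])        # 0 = coin 1 (heads prob p), 1 = coin 2 (heads prob q)
    heads_prob = (p, q)
    history = []                        # publicly observed (switched, flip) pairs
    for n in range(max_rounds):
        switched = n > 0 and switch_policy(history)
        if switched:
            coin = 1 - coin             # reaching into the pocket always swaps, and is visible
        flip = 1 if random.random() < heads_prob[coin] else 0
        history.append((switched, flip))
        guess = guess_policy(history)   # None = keep watching
        if guess is not None:
            return 100 * delta**n if guess == coin else 0.0
    return 0.0                          # guesser never guessed
```

For instance, with $p=0$ and $q=1$ the very first flip identifies the coin, so guessing immediately always earns the full $\$100$.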
Question
I want to find the "best" switching strategy to minimize the (expected) amount of money I have to pay you.
Notes
The probabilities $p$ and $q$ can take on any value, but let's assume that they cannot be equal.
Since you are trying to maximize your reward, the discount factor $\delta$ incentivizes you to guess correctly as quickly as possible.
Since there are only two coins and you observe when I switch, you are trying to discern between two possible coin sequences, one where the initial coin was coin 1 and the other where the initial coin was coin 2.
My first thoughts are that I would want to keep the empirical averages (of the two sequences) as close as possible to each other. Intuitively this will be easy if $p$ and $q$ are close, but hard if they are far apart.
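Because switches are public, each of the two hypotheses pins down exactly which coin produced every flip, so discerning the sequences reduces to a likelihood-ratio computation. A sketch (the `(switched, flip)` encoding of the observation history is my own assumption):

```python
from math import log

def log_likelihood_ratio(history, p, q):
    """Log-likelihood ratio of 'started with coin 1 (heads prob p)'
    vs 'started with coin 2 (heads prob q)' given an observed history
    of (switched, flip) pairs. Since switches are visible, each
    hypothesis determines which coin produced every flip."""
    coin_h1, coin_h2 = 0, 1             # current coin index under each hypothesis
    heads_prob = (p, q)
    llr = 0.0
    for switched, flip in history:
        if switched:                    # a visible swap flips both hypothesized coins
            coin_h1, coin_h2 = 1 - coin_h1, 1 - coin_h2
        ph1 = heads_prob[coin_h1] if flip else 1 - heads_prob[coin_h1]
        ph2 = heads_prob[coin_h2] if flip else 1 - heads_prob[coin_h2]
        llr += log(ph1) - log(ph2)
    return llr
```

Keeping the empirical averages close corresponds to keeping this statistic near zero: the closer $p$ and $q$, the smaller each flip's contribution to the ratio.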
I'll think about the simpler situation where you can't switch the coins at all. I think this is complicated enough. After each coin flip, I have three choices -- guess coin 1, guess coin 2, or see another flip.
Intuitively, I think my strategy should be driven by the expected value of each available action.
The expected value of guessing coin 1 (with heads probability $p$) after seeing $n$ of $N$ flips come up heads is:
$100 \cdot \delta^N \cdot p(c_1|n, N) = 100 \cdot \delta^N \cdot p(\theta = p|n, N) \propto \delta^N \cdot p(n, N|\theta = p),$
where the last step applies Bayes' rule with a uniform prior over the two coins.
This is easily computable, and the same can be done for coin 2. But what's the expected value of waiting to see another flip? If we had that, we could straightforwardly pick the option with the highest expected value.
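The guess-now value under a uniform prior is easy to compute directly; a sketch (function names are mine, and I keep the $\delta^N$ convention from the formula above):

```python
def posterior_coin1(n, N, p, q):
    """P(coin 1 | n heads in N flips), uniform prior, no switching.
    The binomial coefficient cancels in the ratio."""
    l1 = p**n * (1 - p)**(N - n)
    l2 = q**n * (1 - q)**(N - n)
    return l1 / (l1 + l2)

def value_of_guessing_now(n, N, p, q, delta):
    """Expected payout of immediately guessing the more likely coin."""
    post = posterior_coin1(n, N, p, q)
    return 100 * delta**N * max(post, 1 - post)
```

For example, with $p=0.9$, $q=0.1$, two heads in two flips gives posterior $0.81/0.82 \approx 0.988$ for coin 1.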
I think it should also have something to do with the specific values of the flips that I've seen so far. Supposing $0 < p < q < 1$, here are some "corner" cases: if every flip so far is heads, the likelihood strongly favors coin 2, and if every flip is tails, it strongly favors coin 1.
In general, if $p(n, N|\theta = p) > p(n, N|\theta = q)$, I should either guess coin 1 or wait -- never guess coin 2. Whether to guess coin 1 or wait... I'm not sure.
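One way to approximate the value of waiting in the no-switching game is backward induction over states $(n, N)$: at each state, compare guessing now against the posterior-weighted value of seeing one more flip. A sketch, assuming a uniform prior and a forced-guess horizon of my own choosing (real optimal stopping has no such cutoff, but a deep horizon approximates it well since $\delta^N \to 0$):

```python
from functools import lru_cache

def optimal_value(p, q, delta, horizon=30):
    """Backward induction for the no-switching game, uniform prior.
    Returns the guesser's optimal expected payout from the start;
    'horizon' is an assumed cutoff after which a guess is forced.
    Uses the delta**N convention from the formula above."""

    @lru_cache(maxsize=None)
    def value(n, N):
        l1 = p**n * (1 - p)**(N - n)
        l2 = q**n * (1 - q)**(N - n)
        post = l1 / (l1 + l2)                        # P(coin 1 | n, N)
        # value of guessing the more likely coin now (no guess before any flip)
        guess_now = delta**N * max(post, 1 - post) if N > 0 else 0.0
        if N >= horizon:
            return guess_now
        # predictive probability that the next flip is heads
        p_heads = post * p + (1 - post) * q
        wait = 0.0
        if p_heads > 0:                              # skip zero-probability branches
            wait += p_heads * value(n + 1, N + 1)
        if p_heads < 1:
            wait += (1 - p_heads) * value(n, N + 1)
        return max(guess_now, wait)

    return 100 * value(0, 0)
```

The recursion makes the missing piece explicit: the value of waiting is the expectation of the next state's optimal value, so the "guess vs. wait" boundary falls out of comparing `guess_now` with `wait` at each $(n, N)$.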