Solution to Primer Coin Flip Game

235 Views Asked by At

I wonder how to find the optimal action for every possible situation in Primer's Coin Flip game. There is an interactive version as well. A coin has a $0.5$ probability of being a cheater coin and if it is it has a $0.75$ probability of heads instead of $0.5$ if it is fair. Now you have 3 possible actions: Label as cheater, label as fair or flip another coin. The reward for a correct label is $15$ points, $-30$ for a wrong one and a new flip costs $-1$ points.

My Approach

We can define the probabilities for the coin like this:

$P(C|H,T)=\frac{0.75^H0.25^T}{0.75^H0.25^T + 0.5^{H+T}}$ and

$P(F|H,T)=\frac{0.5^{H+T}}{0.75^H0.25^T + 0.5^{H+T}}$

for the probability of having a cheater or fair coin given the number of heads $H$ and tails $T$. Which gives for the expected value: $V(C|H,T) = 15\cdot P(C|H,T) + (-30)\cdot P(F|H,T)$ and

$V(F|H,T) = 15\cdot P(F|H,T) + (-30)\cdot P(C|H,T)$, respecively.

The expected value of the new coin flip could then be defined as:

$V(N|H,T)=\frac{(0.75^{H+1}0.25^T + 0.5^{H+T+1})V(H+1,T)+(0.75^H0.25^{T+1} + 0.5^{H+T+1})V(H,T+1)}{0.75^H0.25^T + 0.5^{H+T}}-1$, with $V(H,T)=max(V(C|H,T), V(F|H,T), V(N|H,T))$

$V(H,T)$ is now recursive and I don't know how to compute the exact value.

I have computed the value for $V(0,0)$ and it seems to be converging. Is there a way way to compute a function $V(H,T)$ with the result being the limit with infinite recursion?

Any input is very much appreciated!

(This is my first question so if there is anything else you might want to point out, please do so)

Thanks a lot in advance

1

There are 1 best solutions below

8
On

This is not an answer. I'm just recording how I've thought about this problem, and an approach I think could be useful.

I think this problem can be simplified somewhat. Firstly, the coin is always eventually guessed $C$ or $F$, so we can add $30$ to the ending score, and then you get $0$ for guessing incorrectly, and $w=45$ for guessing correctly.

Let the sequence of flips be $X = (X_1,X_2, \dots X_n)$. Then define the likelihood ratio after $n$ flips

$$r_n = \frac{P(C|X)}{P(F|X)} = \frac{P(X|C)}{P(X|F)} \frac{P(C)}{P(F)}$$

by Bayes's rule. $\frac{P(C)}{P(F)}$ is the prior likelihood ratio, $r_0$ (which in this case is $1$)

The updating rule is much nicer when written in terms of the likelihood ratio.

$$r_{n+1} = \frac{P(X, X_{n+1}|C)}{P(X, X_{n+1}|F)} \frac{P(C)}{P(F)}= \frac{P(X|C)}{P(X|F)} \frac{P(C)}{P(F)} \cdot \frac{P(X_{n+1}|C)}{P(X_{n+1}|F)} = r_n \frac{P(X_{n+1}|C)}{P(X_{n+1}|F)}$$

Then if $X_{n+1}= H$, $r_{n+1} = \frac32 r_n$ and if $X_{n+1}= T$, $r_{n+1} = \frac12 r_n$. Now consider the expected winnings starting from a given point (i.e. excluding the cost of flips you have already paid for). This only depends on the relative probabilities of $C$ and $F$, so we can write the value of that position as a function of just one parameter, $V(r)$. Then as you point out, we can say

$$V(r) = \max\{ V_C(r),V_F(r),V_N(r) \}$$

and we know

$$V_N(r)=P(H|r)V(\frac32 r) + P(F|r)V(\frac12 r)-1$$

We have

$$P(H|r) = P(H|C)P(C|r)+P(H|F)P(F|r) = \frac34 \frac r{r+1} + \frac12 \frac 1{r+1} = \frac{3r+2}{4r+4}$$

so we can find the recurrence relation explicitly in this one parameter form (clearly $V_C(r) = \frac{wr}{r+1}$ and $V_F(r) = \frac{w}{r+1}$ so we have all three). As such, we have the functional equation

$$V(r) = \max\{ \frac{wr}{r+1},\frac{w}{r+1},\frac{3r+2}{4r+4} V(\frac32 r) + \frac{r+2}{4r+4} V(\frac12 r)-1 \} \space (*)$$

This isn't quite the same as the original problem. The issue is that you can get solutions to this which never end up making a guess, where the value of $V(r)$ gets higher and higher the more turns you have (as you can still recover the number of flips from $r$, it is the power of $2$ in the denominator) so that you will always choose another flip. But this can be ruled out by saying the solution is bounded.

I'm pretty sure the functional equation then has a unique solution, and is equivalent to your question. In fact, more strongly, we can say the solution is continuous on $[0,\infty]$.

As long as $w$ is sufficiently large (and I'm sure $45$ is), the solution to this will have a left region where $V(r)=\frac{w}{r+1}$, a right region where $V(r)=\frac{wr}{r+1}$ and a middle region where

$$V(r) = \frac{3r+2}{4r+4} V(\frac32 r) + \frac{r+2}{4r+4} V(\frac12 r)-1$$

That funtional equation is linear, and I think it will have a finite dimensional set of solutions on any sufficiently large closed interval $[a,b]$. If so, solving this would solve your problem, as you could easily determine which solutions give a valid $V(r)$. I do not know if this functional equation has a closed form solution.


Edit: I think you can take the limit as the cost of a flip goes to $0^+$. Then all of $(0,\infty)$ is eventually in the middle region, but $0$ and $\infty$ still aren't. so we get the boundary conditions $V(0)=V(\infty)=w$. The middle region has to obey

$$V(r) = \frac{3r+2}{4r+4} V(\frac32 r) + \frac{r+2}{4r+4} V(\frac12 r)$$

which has a constant solution. The boundary conditions fix that the constant is $w$. However, this limiting procedure isn't valid if it turns out there isn't a unique solution to the whole functional equation, ($*$).


It's also noteworthy that if you chose a setup where the coin has a known bias, but the direction isn't known, we would find the effect of $H$ and $T$ on the log likelihood have the same size, and so the functional equation just turns into a recurrence relation.