Just watched a game show with rules as follows:
6 people each choose to enter Room A (\$200), Room B (\$300), or Room C (\$400), and the reward in each room is divided evenly among the people in that room. Each decision is anonymous; that is, no one knows which room anyone else chooses.
Which room should you choose for the best/safest chance of getting the most reward (or at least not the least)? Is this calculable? I'm curious...
My crude thoughts: Assuming each person chooses a room equally likely, I presume
- the most likely scenario would be to have 2 people in each room,
- the least likely scenario would be all 6 people choose a single room.
Therefore I think going with room C anticipating 2 people choosing it yields the best chance of getting the most(?).
But if everyone thought like me then everyone would choose room C...
I don't have the solution for this, just thought it would be fun to see if there's an optimal mathematical strategy for it.
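The uniform-choice guess in the question can be checked by brute force. The following is a minimal Python sketch (the function name `occupancy_distribution` is made up for illustration) that enumerates all $3^6 = 729$ equally likely choice profiles and tallies how many people end up in each room.

```python
from collections import Counter
from itertools import product


def occupancy_distribution(players=6, rooms="ABC"):
    """Probability of each occupancy vector (n_A, n_B, n_C),
    assuming every player picks a room uniformly at random."""
    counts = Counter()
    for profile in product(rooms, repeat=players):
        occupancy = tuple(profile.count(room) for room in rooms)
        counts[occupancy] += 1
    total = len(rooms) ** players
    return {occ: c / total for occ, c in counts.items()}


dist = occupancy_distribution()
print(dist[(2, 2, 2)])   # 90/729: two people in each room
print(dist[(6, 0, 0)])   # 1/729: everyone in room A
```

Under this assumption the single most likely occupancy vector is indeed $(2,2,2)$, with probability $90/729$. Note, though, that a 3-2-1 split in some order is more likely overall ($6\cdot 60/729$), so "two per room" is far from guaranteed.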
Let us find a mixed-strategy Nash equilibrium. Furthermore, since the game is symmetric with respect to the players, we require the Nash equilibrium to be symmetric. That is, we suppose everyone follows the strategy of choosing room $A$ with probability $p_A$, room $B$ with probability $p_B$, and room $C$ with probability $p_C$, where $p_A,p_B,p_C$ are nonnegative and satisfy $p_A+p_B+p_C=1$.
Define

$$ W_A=\sum_{k=0}^5 \frac{200}{k+1}\binom 5k p_A^k(1-p_A)^{5-k}=\frac{200[1-(1-p_A)^6]}{6\cdot p_A} $$

Here $k$ is the number of other players who also choose room $A$, so $W_A$ is the conditional expectation of a player's winnings given that they chose room $A$. If we define $W_B$ and $W_C$ similarly, then everyone's expected winnings are

$$ p_A\cdot W_A+p_B\cdot W_B+p_C\cdot W_C\tag1 $$

Now, imagine one player is considering deviating from the strategy $(p_A,p_B,p_C)$, and instead using the distribution $(p_A',p_B',p_C')$, while everyone else still uses $(p_A,p_B,p_C)$. The expected winnings for the deviating player would be

$$ p_A'\cdot W_A+p_B'\cdot W_B+p_C'\cdot W_C\tag2 $$

with the same $W_A,W_B,W_C$ as before. The question is, can a player benefit by deviating? They certainly cannot benefit if $W_A=W_B=W_C$, because in this case, every choice of $(p_A',p_B',p_C')$ results in the same value of $(2)$. This implies that if we can solve the equation $W_A=W_B=W_C$, then we have found a strategy profile where no one can profit by deviating, which is by definition a Nash equilibrium.

As long as all of the probabilities $p_A,p_B,p_C$ are positive*, the converse holds as well: any profile with $W_A\neq W_B$ or $W_B\neq W_C$ cannot be an equilibrium, since a player could benefit by reducing their $p_X'$ to zero for all rooms $X$ such that $W_X$ is not maximal.
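As a sanity check on the closed form for $W_A$, here is a small Python sketch that evaluates both the binomial sum and the simplified expression. The function names are hypothetical, chosen only for this illustration; `others=5` reflects the five other players.

```python
from math import comb


def expected_share_sum(prize, p, others=5):
    """Expected winnings in a room, summing over k = number of the
    other players who pick the same room (each does so with prob. p)."""
    return sum(
        prize / (k + 1) * comb(others, k) * p**k * (1 - p) ** (others - k)
        for k in range(others + 1)
    )


def expected_share_closed(prize, p, others=5):
    """Closed form prize * [1 - (1-p)^n] / (n p) with n = others + 1.
    Requires p > 0."""
    n = others + 1
    return prize * (1 - (1 - p) ** n) / (n * p)


for p in (0.1, 0.3, 0.5, 0.9):
    print(p, expected_share_sum(200, p), expected_share_closed(200, p))
```

The two columns should agree up to floating-point rounding. The identity behind this is $\frac{1}{k+1}\binom5k=\frac16\binom6{k+1}$, which turns the sum into a binomial expansion missing only its $(1-p)^6$ term.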
It turns out that the unique solution to $W_A=W_B=W_C$ is given by

$$ (p_A,p_B,p_C)\approx (0.165, 0.345, 0.490) $$

I found this numerically using Mathematica's `NSolve` function. Therefore, if an optimal player were to play this game many times, they would choose the \$200 room around $16.5\%$ of the time, the \$300 room about $34.5\%$ of the time, and the \$400 room about $49\%$ of the time.

\* I will not go into the details of proving that a Nash equilibrium assigns a positive probability to each room. This happens to be true since the prizes are close enough in value. If room A's reward were only \$1 while the other rooms' prizes remained the same, then certainly it would be optimal to have $p_A=0$. If this were the case, then $W_A=W_B=W_C$ would have no solution, and you would instead try to solve $W_B=W_C$.
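For readers without Mathematica, the system $W_A=W_B=W_C$, $p_A+p_B+p_C=1$ can also be solved with nested bisection. This is a sketch using a different technique from `NSolve`, not a reproduction of it, and all function names are made up for illustration. It assumes an interior equilibrium (every room gets positive probability) and relies on each room's expected payoff being decreasing in the probability that others choose it.

```python
def share_factor(p, n=6):
    """[1 - (1-p)^n] / (n p): expected fraction of a room's prize you
    receive when each of the other n-1 players picks it with prob. p."""
    if p == 0.0:
        return 1.0
    return (1 - (1 - p) ** n) / (n * p)


def prob_for_payoff(prize, w, n=6, iters=60):
    """Bisection for the p in [0, 1] with prize * share_factor(p) = w."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if prize * share_factor(mid, n) > w:
            lo = mid   # payoff still too high: room must be more crowded
        else:
            hi = mid
    return (lo + hi) / 2


def symmetric_equilibrium(prizes=(200, 300, 400), n=6, iters=60):
    """Bisection on the common payoff w until the probabilities sum to 1.
    Assumes every room is used with positive probability."""
    lo, hi = max(prizes) / n, min(prizes)
    for _ in range(iters):
        w = (lo + hi) / 2
        total = sum(prob_for_payoff(r, w, n) for r in prizes)
        if total > 1:
            lo = w     # probabilities too large: common payoff must rise
        else:
            hi = w
    w = (lo + hi) / 2
    return [prob_for_payoff(r, w, n) for r in prizes], w


probs, w = symmetric_equilibrium()
print([round(p, 3) for p in probs], round(w, 2))
```

Running this should give probabilities of roughly $0.165$, $0.345$, and $0.490$ for the \$200, \$300, and \$400 rooms respectively, with a common expected payoff of a little over \$133 per player.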