N perfect Logicians box opening game

62 Views Asked by At

Question:

There is a game that involves $n$ ordered boxes each with a hidden value associated with it. The value is sampled from a probability distribution density function $P(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}$. You observe each box’s value in order from box 1 to $n$. After observing any given box’s value, you must decide to pick it or leave it, forever discarding the option to choose its value again. A player wins this game if and only if they pick the box with the highest sampled value.

There is a group of perfect Logicians $L_1, L_2, \ldots, L_m$ competing in the game. The only information they know is that they are all perfect logicians, the rules of the game, and if any other logicians have lost or won the game. They play the game in order starting with $L_1$. The game resets for each logician, but the random sampling is the same.

a) Prove that if $m > n$, a win would always occur

b) We find out that only $L_m$ wins the game. What is the winning box? Give your answer in terms of $n, m$.

Thoughts so far:

Initially, I was confused about how any given logician is influenced by the results of previous logicians since they can't see the outcome of their decisions, only whether they won or lost. My best guess is that, for instance, if Logician 2 knows Logician 1 lost, then Logician 2 can infer that the optimal choice Logician 1 would have made led to failure. Therefore, by the time Logician 2 encounters the box that Logician 1 would have chosen using the optimal strategy, he would know that this particular box was not the winning choice. Additionally, logicians can observe the number of failures preceding their turn, although I’m not sure how this would impact their decision-making process.

For part a, my intuition tells me that logicians can always deduce the choices made by their predecessors because the outcome is deterministic, regardless of the random seed. Since each logician knows the choices of the previous logicians, once the number of logicians $m$ equals the number of boxes $n$, they would, by the pigeonhole principle, be guaranteed to find the solution.

For part b, I'm uncertain how the distribution of random sampling affects the optimal stopping problem. Is there a more optimal strategy when the distribution is known? Furthermore, once an optimal strategy is discovered, how does this information benefit subsequent logicians in their turn order?

Edit: The logicians are not working together nor have they discussed a strategy beforehand together. They wish only to maximize their own chances of winning