Consider the following puzzle:
- I have just flipped $n$ fair coins.
- Before I start revealing them to you, you can ask me one yes/no question.
- Then, you can make $n$ Head/Tail guesses.
- What's your question and strategy for maximizing the number of correct guesses?
The baseline (no question asked a priori) is to guess H every time (or T every time). This gives you $\dfrac{n}{2}$ correct guesses in expectation.
A first question could be: "Is the first coin H?" If yes, guess H for every coin; otherwise guess T for every coin. You now get $1 + \dfrac{n-1}{2} = \dfrac{n}{2} + \dfrac12$ correct guesses in expectation.
Another question: "Are there more H than T?" Guess all H or all T depending on the answer. Either way you get $\max(\#H, \#T)$ correct guesses, so this strategy yields $\sum_{k=0}^{n} \binom{n}{k} \cdot \dfrac{\max(k,\, n-k)}{2^n}$ correct guesses in expectation (from the Binomial PMF), which is the best so far.
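As a sanity check, here is a short Python sketch (the function name is mine, not from the post) that computes the exact expected number of correct guesses for each of the three strategies above:

```python
from math import comb

def expected_correct(n):
    """Exact expected correct guesses for n fair coins under each strategy."""
    baseline = n / 2                      # no question, guess H every time
    first_coin = 1 + (n - 1) / 2          # ask about coin 1, then all H or all T
    # Majority question: guess all H if #H >= #T, else all T;
    # either way the number of correct guesses is max(#H, #T).
    majority = sum(comb(n, k) * max(k, n - k) for k in range(n + 1)) / 2 ** n
    return baseline, first_coin, majority

print(expected_correct(4))  # -> (2.0, 2.5, 2.75)
```

For $n \geq 3$ the majority question strictly beats the first-coin question (e.g. $2.75$ vs $2.5$ at $n = 4$).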
My questions are the following:
- How much more can we maximize the number of correct guesses? Is there a (tractable) way to quantify this (an upper bound)?
- One yes/no question means one bit of entropy. Can we derive some sort of relationship between this quantity and the max correct guesses? For example, if one asks two questions (2 bits of entropy), how much better off are we? Clearly, if we're allowed $n$ bits of entropy, then we can guess the entire sequence.
The optimal question is "Are there more heads than tails?" The guess is then obvious: either all heads or all tails.
We model this problem as follows. Define $2 = \{0, 1\}$; heads is represented by $0$ and tails by $1$. We are looking for a partition of $2^n$ into 2 sets, $A$ and $B$, and choosing two sequences $a$ and $b$ so as to minimise $F(A, B, a, b) = \sum\limits_{x \in A} d(x, a) + \sum\limits_{x \in B} d(x, b)$, where $d$ is the Hamming distance. We will treat $2$ as the group $\mathbb{Z} / 2 \mathbb{Z}$, which gives $2^n$ a group structure as well.
We then ask the question "Is the sequence in $A$?" If so, we guess the sequence $a$; otherwise, we guess the sequence $b$. Note that $F(A, B, a, b)$ counts the total number of incorrect guesses over all $2^n$ equally likely outcomes, so $\frac{1}{2^n} F(A, B, a, b)$ is the expected number of incorrect guesses; maximising the number of correct guesses is the same as minimising the number of incorrect ones.
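To make the bookkeeping concrete, here is a small Python sketch (my own encoding, following the model above: $0$ for heads, $1$ for tails) that evaluates $F$ for the majority strategy with $a = 0$ and $b = 1$:

```python
from itertools import product

def hamming(x, y):
    return sum(xi != yi for xi, yi in zip(x, y))

def F(A, B, a, b):
    """Total number of incorrect guesses over all outcomes."""
    return sum(hamming(x, a) for x in A) + sum(hamming(x, b) for x in B)

n = 4
seqs = list(product((0, 1), repeat=n))    # 0 = heads, 1 = tails
a, b = (0,) * n, (1,) * n                 # guess all heads / all tails
A = [x for x in seqs if sum(x) < n / 2]   # strictly more heads than tails
B = [x for x in seqs if x not in A]       # ties answered "no", guess all tails

total_incorrect = F(A, B, a, b)
print(total_incorrect / 2 ** n)           # expected incorrect guesses -> 1.25
```

For $n = 4$ this gives $F = 20$, i.e. $1.25$ expected incorrect guesses, matching $n$ minus the expected correct count of the majority strategy.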
The first thing to note is that when we minimise $F$, we have $a + b = 1$ WLOG, where $1$ denotes the all-ones sequence (so $b$ is the coordinatewise complement of $a$). Indeed, we should always pick $a_i$ to be the most common value of $x_i$ for $x \in A$; since across all of $2^n$ each coordinate takes the values $0$ and $1$ equally often ($2^{n-1}$ times each), the most common value of $x_i$ over $A$ is the least common value of $x_i$ over $B$, and is therefore not equal to $b_i$; then $a_i + b_i = 1$. Translating back to English, this means that every guess we make should WLOG depend on whether the answer is "yes" or "no"; we should never make the same guess for coin $i$ in both cases.
The second thing to note is that WLOG, we can take $a = 0$. For if we have an optimal solution $(A, B, a, b)$, we can equally take $(A - a, B - a, a - a, b - a)$ and get an optimal solution. So there is an optimal solution with $a = 0$ and $b = 1$. In other words, without loss of generality, in the case of a “yes”, we should say “all heads”, and in the case of a “no”, we should say “all tails”.
Now that this is the case, it's clear that we should put $x \in A$ iff $d(x, a) < d(x, b)$: that is, if and only if $x$ has more heads than tails (ties cost the same on either side, so they may be assigned arbitrarily). This completes the proof of optimality.
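For small $n$ the optimality claim can also be verified exhaustively; the following Python sketch (a brute force over all $2^{2^n}$ partitions and all guess pairs, feasible only for tiny $n$) confirms it for $n = 3$:

```python
from itertools import product

n = 3
seqs = list(product((0, 1), repeat=n))    # 0 = heads, 1 = tails

def hamming(x, y):
    return sum(xi != yi for xi, yi in zip(x, y))

def F(A, a, b):
    """Incorrect guesses, with B implicitly the complement of A."""
    return (sum(hamming(x, a) for x in A)
            + sum(hamming(x, b) for x in seqs if x not in A))

# Exhaust every partition (A given by a bitmask) and every choice of a, b.
best = min(
    F({seqs[i] for i in range(2 ** n) if mask >> i & 1}, a, b)
    for mask in range(2 ** 2 ** n)
    for a in seqs
    for b in seqs
)

# The majority strategy: A = "more heads than tails", guess all H / all T.
A_majority = {x for x in seqs if sum(x) < n / 2}
assert best == F(A_majority, (0,) * n, (1,) * n)
print(best)  # -> 6, i.e. 6/8 expected incorrect guesses
```

This matches the claim: no partition and no pair of guess sequences beats "more heads than tails?" with the all-heads/all-tails guesses.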