Upper bound in information theory puzzle


Consider the following puzzle:

  • I have just flipped $n$ fair coins.
  • Before I start revealing them to you, you can ask me one yes/no question.
  • Then, you can make $n$ Head/Tail guesses.
  • What's your question and strategy for maximizing the number of correct guesses?

The baseline (no question asked a priori) is to guess only H (or T). This will give you $\dfrac{n}{2}$ correct guesses (in expectation).

A first question would be: "Is the first coin H?". If yes, guess only H, otherwise guess only T. So you now get $1 + \dfrac{n-1}{2} = \dfrac{n}{2} + \dfrac12$ correct guesses.

Another question: "Are there more H than T?" -- then guess all H or all T depending on the answer. Guessing all H gets $H$ correct (the number of heads) and all T gets $n - H$, so this strategy yields $\mathbb{E}[\max(H, n - H)]$ correct guesses in expectation, where $H \sim \mathrm{Bin}(n, 1/2)$ -- the best so far.
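The three strategies above can be compared by a quick simulation; this is a minimal Monte Carlo sketch (the function name and parameters are my own, not part of the puzzle):

```python
import random

def simulate(n=20, trials=100_000, seed=0):
    """Average score of the three strategies over random coin sequences."""
    rng = random.Random(seed)
    base = first = majority = 0
    for _ in range(trials):
        coins = [rng.randint(0, 1) for _ in range(n)]  # 1 = heads
        h = sum(coins)
        base += h                           # baseline: always guess heads
        # "Is the first coin H?": repeat the revealed value for every guess
        first += h if coins[0] == 1 else n - h
        # "Are there more H than T?": guess all-H or all-T accordingly
        majority += max(h, n - h)
    return base / trials, first / trials, majority / trials

print(simulate())
```

For $n = 20$ the three averages should come out near $10$, $10.5$, and $10 + 10\binom{20}{10}2^{-20} \approx 11.76$ respectively.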


My questions are the following:

  • How much more can we maximize the number of correct guesses? Is there a (tractable) way to quantify this (an upper bound)?
  • One yes/no question means one bit of entropy. Can we derive some sort of relationship between this quantity and the max correct guesses? For example, if one asks two questions (2 bits of entropy), how much better off are we? Clearly, if we're allowed $n$ bits of entropy, then we can guess the entire sequence.

There are 2 best solutions below


The correct question is “are there more heads than tails?” The guess is then obvious: either all heads or all tails.

We model this problem as follows. Define $2 = \{0, 1\}$; heads is represented by $0$ and tails by $1$. We are looking for a partition of $2^n$ into 2 sets, $A$ and $B$, and choosing two sequences $a$ and $b$ so as to minimise $F(A, B, a, b) = \sum\limits_{x \in A} d(x, a) + \sum\limits_{x \in B} d(x, b)$, where $d$ is the Hamming distance. We will treat $2$ as the group $\mathbb{Z} / 2 \mathbb{Z}$, which gives $2^n$ a group structure as well.

We then ask the question “Is the sequence in $A$?” If so, we guess the sequence $a$; otherwise, we choose the sequence $b$. Note that $F(A, B, a, b)$ counts the number of incorrect guesses over all possible states, with $\frac{1}{2^n} F(A, B, a, b)$ being the expected number of incorrect guesses; to maximise the number of correct guesses, we minimise the number of incorrect ones.

The first thing we can note is that when we minimise $F$, we WLOG have $a + b = 1$ (the all-ones sequence). This is because we should always pick $a_i$ to be the most common value of $x_i$ for $x \in A$; since each value of $x_i$ occurs in exactly half of all $2^n$ sequences, this is the least common value of $x_i$ for $x \in B$ and therefore not equal to $b_i$; then $a_i + b_i = 1$. Translating back to English, this means that every guess we make should depend WLOG on whether the answer is “yes” or “no”; we should never make the same guess for coin $i$ in both cases.

The second thing to note is that WLOG, we can take $a = 0$. For if we have an optimal solution $(A, B, a, b)$, we can equally take $(A - a, B - a, a - a, b - a)$ and get an optimal solution. So there is an optimal solution with $a = 0$ and $b = 1$. In other words, without loss of generality, in the case of a “yes”, we should say “all heads”, and in the case of a “no”, we should say “all tails”.

Now that this is the case, it’s clear that we should put $x \in A$ iff $d(x, a) < d(x, b)$: that is, if and only if $x$ has more heads than tails. This completes the proof of optimality.
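For small $n$, the optimality claim can be verified by brute force over all partitions of $2^n$; here is a sketch (the helper names are mine, and the search is doubly exponential in $n$, so only tiny cases are feasible):

```python
from itertools import product

def best_score(n):
    """Minimum of F(A, B, a, b) over all proper partitions (A, B) of 2^n."""
    seqs = list(product((0, 1), repeat=n))
    def cost(S):
        # Optimal guess for a block S takes the majority bit per position,
        # so its cost is the number of minority bits, summed over positions.
        return sum(min(c, len(S) - c)
                   for c in (sum(x[i] for x in S) for i in range(n)))
    m = len(seqs)
    best = None
    for mask in range(1, 2 ** m - 1):        # every proper partition (A, B)
        A = [seqs[i] for i in range(m) if mask >> i & 1]
        B = [seqs[i] for i in range(m) if not mask >> i & 1]
        F = cost(A) + cost(B)
        best = F if best is None else min(best, F)
    return best

def majority_score(n):
    """F for the majority question: each x costs min(#heads, #tails)."""
    seqs = product((0, 1), repeat=n)
    return sum(min(sum(x), n - sum(x)) for x in seqs)

for n in (1, 2, 3):
    print(n, best_score(n), majority_score(n))
```

The two scores agree for each $n$ tried, matching the proof.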


Just following up a little on my comments earlier:

For the question "More heads or tails", the resulting score is $S_n := \mathbb{E}[\max(H, n-H)],$ where $H \sim \mathrm{Bin}(n, 1/2)$. As $n \to \infty,$ this quantity grows as $n/2 + \sqrt{n/2\pi}(1 + o(1))$.

Non-asymptotically, for $n = 2m$ we have (essentially by a standard calculation $\sum_{j = 0}^m j \binom{2m}{j} 2^{-2m} = m/2$) that $$ S_{2m} = m + m \binom{2m}{m} 2^{-2m}.$$

For $n = 2m + 1$, we use a similar standard calculation, $2\sum_{j = 0}^m j \binom{2m+1}{j} 2^{-(2m + 1)} = (2m + 1)/2 - 2^{-(2m+1)} (2m + 1)\binom{2m}{m}$ to derive $$ S_{2m + 1} = \frac{2m + 1}{2} + \frac{(2m+1)}{2} \binom{2m}{m} 2^{-2m}.$$

In either case, the advantage over the mean, $S_n - n/2$, is roughly $m^2 \cdot 2^{-2m}$ times the $m$th Catalan number.

The $\sqrt{n/2\pi}$ advantage is then seen by the usual asymptotics of Catalan numbers: $$ \frac{1}{m+1}\binom{2m}{m} 2^{-(2m)} = \sqrt{\frac{1}{m^3\pi}} + O(m^{-5/2}). $$
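The closed forms above can be checked against a direct computation of $S_n = \mathbb{E}[\max(H, n - H)]$; a small sketch (helper names are mine):

```python
from math import comb

def S_exact(n):
    """E[max(H, n - H)] for H ~ Bin(n, 1/2), computed from the PMF."""
    return sum(max(k, n - k) * comb(n, k) for k in range(n + 1)) / 2 ** n

def S_even(m):   # closed form for n = 2m
    return m + m * comb(2 * m, m) / 4 ** m

def S_odd(m):    # closed form for n = 2m + 1
    return (2 * m + 1) / 2 + (2 * m + 1) / 2 * comb(2 * m, m) / 4 ** m

for m in range(1, 6):
    print(m, S_exact(2 * m), S_even(m), S_exact(2 * m + 1), S_odd(m))
```

Both closed forms match the direct expectation for every $m$ tried.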


This naturally leads to an analysis for the following "tensorised" scheme with $k$ queries, at least when $k \ll n:$

Divide the $n$ coins into $k$ groups of size (roughly) $n/k,$ and ask whether there are more heads or tails in each group. In each such group, play this majority.

Letting $H_i$ denote the number of heads in group $i$, this scheme gets the score $$ S(n,k) = \sum_{i = 1}^k \mathbb{E}[\max(H_i, n/k - H_i)] = k\, \mathbb{E}[\max(\tilde{H}, \lfloor n/k\rfloor - \tilde{H})] + O(k),$$ where $\tilde{H} \sim \mathrm{Bin}(\lfloor n/k \rfloor, 1/2),$ and the final term is due to the possible rounding errors when assigning groups. As long as $\sqrt{n/k} \gg \max(1,\sqrt{k/n}),$ i.e., if $k \ll n,$ this can be expressed as $$ S(n,k) = n/2 + \sqrt{nk/2\pi} + O(k).$$
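Assuming $k$ divides $n$ (so the $O(k)$ rounding term vanishes), the approximation $S(n,k) \approx n/2 + \sqrt{nk/2\pi}$ can be checked by simulation; a sketch with names of my own choosing:

```python
import random
from math import pi, sqrt

def simulate_groups(n, k, trials=50_000, seed=1):
    """Average score of the k-query group-majority scheme (k divides n)."""
    rng = random.Random(seed)
    size = n // k
    total = 0
    for _ in range(trials):
        for _ in range(k):
            # number of heads among one group of `size` fair coins
            h = bin(rng.getrandbits(size)).count("1")
            total += max(h, size - h)   # play the majority within the group
    return total / trials

n, k = 240, 4
est = simulate_groups(n, k)
approx = n / 2 + sqrt(n * k / (2 * pi))
print(est, approx)   # the two values should be close
```

With $n = 240$ and $k = 4$ the simulated score and the formula agree to within a few hundredths.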