A lottery on an interval

155 Views Asked by At

You play the following game.

  1. There is a set $S_1=\{1,\dots,4n\}$.

  2. There is a lottery in which each element of $S_1$ is selected with probability 1/2, independently of the others. Call the subset of selected elements $S_2$.

  3. You select an interval $I$ of length exactly $2n$ in $S_1$, i.e, you select $k\in\{0,\dots,2n\}$ and the interval is $\{k+1,\dots,k+2n\}$.

  4. Your prize is the number of elements in $I\cap S_2$.

What is the largest prize that you can guarantee to yourself in probability at least $1-1/n$, by a sophisticated selection of an interval?

NOTES:

  • A. If you have to select the interval before the lottery (steps 2 and 3 are swapped), then your prize is a binomial variable distributed like $Binom[2n,1/2]$. Its expected value is $n$, and by known tail-bounds, with probability at least $1-1/n$ the prize is at least $n-O(\sqrt{n\log n})$. Here you can select the interval after seeing lottery outcomes, so you can potentially do better.
  • B. In contrast, if you are allowed to select any subset of size $2n$ of $S_1$ (not necessarily an interval) after seeing the lottery outcomes, then of course you can just select the $2n$ coins of $S_2$ and your prize is $2n$.

So I expect the answer to be something between $n-\sqrt{n\log n}$ and $2n$.


EDIT:

I have just learned about McDiarmid's inequality and it seems to be relevant. It relates to $N$ independent random variables, $X_1, X_2, \dots, X_N$. We are interested in some function of these variables, $f(x_1,\dots,x_N)$. If the change in the function is bounded (i.e, when the value of one of the $X_i$-variables changes, the change in $f$ is bounded by a constant $C$), then with high probability, the value of $f$ is bounded around its expected value:

$$ \Pr \left\{ |E[f(X_1, X_2, \dots, X_N)] - f(X_1, X_2, \dots, X_N)| \ge \varepsilon \right\} \le 2 \exp \left( - \frac{2 \varepsilon^2}{N C^2} \right). \; $$

In our case:

  • $N=4n$ (the total number of coins)
  • Each variable $X_i$ is 1 if the coin is selected and 0 if it is not selected.
  • The function $f$ is: $$ f(X_1,\dots,X_{4n}) = \max_{k\in\{0,\dots,2n\}} \left( \sum_{i=k+1}^{k+2n} X_i \right) $$
  • $C=1$, since when one of the X-variables changes, the change in the sum is at most 1.

Hence, we can use the inequality... if only we know the expected value of $f$.

So, to complete the answer, I only need to know: what is the expected value of:

$$\max_{k\in\{0,\dots,2n\}} \left( \sum_{i=k+1}^{k+2n} X_i \right) $$

??