Deviating from the mean +-1 variables

68 Views Asked by At

I stumbled across the following exercise (the book Probability and Computing):

Let $b_1, b_2, \cdots b_{m/2}$ all equal 1 and $b_{m/2 + 1}, \cdots, b_m$ all equal $-1$. We now pick $Y_1, \cdots, Y_m$ independently and uniformly $1$ or $0$. Show that there exists a positive constant $c$ such that for sufficiently large $m$:

$\mathbb{P}\left(|\sum_{i=1}^m b_iY_i| > c\sqrt{m}\right) > 1/2$

The hint is to condition on the number of $Y_i$ that are 1.

Already, all the inequalities I know, go in the other direction, meaning $\mathbb{P}\left(\cdots\right) < c$. So it would be helpful if you could give me links to inequalities going in the good direction.

What I tried so far: The probability that the number of $1$'s we pick is outside of the range $[m/2 - d\sqrt{m}, m/2 + d\sqrt{m}]$ is very small (by some Chernoff bound for some $d$). So I only considered the case where we have between $m/2 - d\sqrt{m}$ and $ m/2 + d\sqrt{m}$ many points. Then, I would have liked to show that all these probabilities are more or less comparable. Also, in the case of picking exactly $n/2$ many points, the probability of deviating by more than $c\sqrt{n}$ is relatively large. However, I was not able to conclude yet since the calculations are very messy (estimating binomial coefficients) and I am not sure I can make the idea work / convinced there is a better way.

Any help is much appreciated.

1

There are 1 best solutions below

1
On BEST ANSWER

It is easier to convert it first into a single binomial summation. Let $X_1, \dots, X_m$ be independent $\pm1$-valued random variables; use them to define $Y_1, \dots, Y_m$ as $Y_i = \frac{1+X_i}{2}$ when $i \le \frac m2$ and $Y_i = \frac{1-X_i}{2}$ when $i>\frac m2$. Then the sum simplifies to $$\sum_{i=1}^m b_i Y_i = \sum_{i=1}^{m/2} \frac{1+X_i}{2} - \sum_{i=m/2+1}^m \frac{1-X_i}{2} = \frac12\sum_{i=1}^m X_i.$$

The most likely value for this sum is $0$, but this happens with a probability of $\frac{\binom{m}{m/2}}{2^m} \sim \sqrt{\frac{2}{\pi m}}$ as $m \to \infty$ by Stirling's formula. So we have $$ \Pr\left[\left|\frac12\sum_{i=1}^m X_i\right| \le c \sqrt m\right] \le (2c\sqrt m + 1) \Pr\left[\sum_{i=1}^m X_i=0\right] \sim 2c\sqrt{\frac 2\pi} $$ since there are $2c\sqrt m + 1$ integer values in this range, all of which are at most as likely as $0$.

Now first pick a $c$ for which this is less than $\frac12$, and then take $m$ sufficiently large that that we can ignore the error in the asymptotic approximation.

Also, if you want more general anticoncentration inequalities, Proposition 4 on this page (from Ryan O'Donnell's Analysis of Boolean Functions) is a good start. It is not quite strong enough to prove the result we want, because the sum in this problem is only $3$-reasonable and so the bound $\frac{(1-t^2)^2}{B}$ can't reach $\frac12$. But it's much more general than the approach above, which only works because of specifics of binomial random variables.