Is there a tail bound for the sum of Bernoulli RVs where $\Pr[X_i = 1]$ is a decreasing function of $X_1, \dots, X_{i - 1}$?


Suppose I have a sequence $X_1, \dots, X_n$ of Bernoulli RVs with the property that for all $i = 1, \dots, n$, the function $$f(x_1, \dots, x_{i - 1}) := \Pr[X_i = 1 \mid X_1 = x_1, \dots, X_{i - 1} = x_{i - 1}]$$ is (non-strictly) decreasing in each coordinate; that is, flipping any $x_j$ from 0 to 1 does not increase $\Pr[X_i = 1]$. Is there an exponential tail bound for $X := \sum_{i = 1}^n X_i$?

Intuitively, there should be, because the dependence between the $X_i$'s only helps concentration: if there are many 1's among the first $i-1$ variables, then $X_i$ is more likely to be 0, and vice versa. But I can't figure out how to prove such a bound. If you find one, you get an acknowledgement in my learning theory paper!

Approaches I tried: (1) coupling arguments; (2) showing that the $X_i$'s are negatively associated, which implies (exponential) Chernoff bounds. As an example of (1), a random walk whose step sizes (depending on the current position) are biased towards the origin is concentrated, via a coupling with an unbiased random walk (a martingale argument). I think (2) is more promising, but there are not many tools for proving negative association. I did manage to prove negative association in a coupon-collector-type scenario, where the variables clearly have the property above.

Claim: Suppose each day, each of $m$ species of butterflies appears independently (with probability, say, $1 / m$, it doesn't matter). Let $X_t = 1$ if on day $t$, you see a new species of butterfly you haven't seen before, and $X_t = 0$ otherwise. Then the variables $X_1, \dots, X_n$ are negatively associated (implying their sum obeys Chernoff bounds).

Proof: We make use of the following three lemmas about negatively associated (NA) random variables, which appear to be the only general tools for proving negative association:

1. If $Y_1, \dots, Y_m$ are 0/1-valued and $\sum_{i = 1}^m Y_i \leq 1$ with probability 1, then the $Y_i$ are NA.
2. The union of several mutually independent NA collections of random variables is NA.
3. If $Y_1, \dots, Y_n$ are NA and non-decreasing functions $f_1, \dots, f_m$ are applied to disjoint subsets of the $Y_i$, then the resulting random variables are NA.

To prove the claim, let $X_{tj} = 1$ if butterfly species $j$ is seen on day $t$. Let $Y_{tj} = 1$ iff $X_{tj} = 1$ but $X_{sj} = 0$ for all $s < t$; that is, iff you see species $j$ for the first time on day $t$. For fixed $j$, the variables $\{Y_{tj}\}_{1 \leq t \leq n}$ satisfy the hypothesis of Lemma 1, and thus are NA. Moreover, for $j \neq k$, the collections $\{Y_{tj}\}_{1 \leq t \leq n}$ and $\{Y_{tk}\}_{1 \leq t \leq n}$ are independent. Hence by Lemma 2, the whole collection $\{Y_{tj}\}_{1 \leq t \leq n, 1 \leq j \leq m}$ is NA. Now for $1 \leq t \leq n$, let $X_t = f(Y_{t1}, \dots, Y_{tm})$, where $f$ is the OR function, i.e., $f(Y_{t1}, \dots, Y_{tm}) = 1$ iff there exists a $j$ for which $Y_{tj} = 1$. In other words, $X_t = 1$ iff you see a new species on day $t$. Since $f$ is non-decreasing and is applied to disjoint subsets (the day-$t$ variables) of an NA collection, Lemma 3 gives that the $X_t$ are NA. $\square$
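As a sanity check (not a proof), the negative association in the butterfly model can be probed empirically: NA implies in particular that $\mathrm{Cov}(X_s, X_t) \leq 0$ for $s \neq t$. Here is a minimal Monte Carlo sketch; the function name and the parameters `days`, `species`, `trials` are arbitrary choices for illustration:

```python
import random

def butterfly_cov(days=2, species=10, trials=50000, seed=0):
    """Estimate Cov(X_1, X_2) in the butterfly model: each of `species`
    species appears independently each day with probability 1/species;
    X_t = 1 iff a never-before-seen species appears on day t."""
    rng = random.Random(seed)
    s1 = s2 = s12 = 0
    for _ in range(trials):
        seen = set()
        x = []
        for _t in range(days):
            new = False
            for j in range(species):
                if rng.random() < 1 / species:
                    if j not in seen:
                        new = True
                    seen.add(j)
            x.append(int(new))
        s1 += x[0]
        s2 += x[1]
        s12 += x[0] * x[1]
    return s12 / trials - (s1 / trials) * (s2 / trials)

print(butterfly_cov())  # expected to be <= 0, up to sampling noise
```

A negative estimate is of course only consistent with NA, not evidence of it in full; the proof above is what carries the claim.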

Thanks in advance! :)


Accepted answer:

Here is a counter-example. Define the two infinite sequences A = 101101101101... and B = 010010010010..., and put probability mass 1/2 on each trajectory. This process satisfies the OP's condition: at step 1, $f() = 1/2$; at step 2, $f(1) = 0 \leq 1 = f(0)$; and for $i \geq 3$, the only two reachable histories (the prefixes of A and B) are coordinatewise incomparable, so $f$ extends to a monotone non-increasing function. But A averages out to 2/3 while B averages out to 1/3, hence no concentration around a single value.
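For concreteness, a minimal sketch of this counterexample (the helper name `trajectory` and the horizon `n` are just illustrative choices): the process puts mass 1/2 on each of two deterministic 3-periodic trajectories, so the empirical mean $X/n$ is 2/3 or 1/3, each with probability 1/2.

```python
def trajectory(pattern, n):
    # Repeat a periodic 0/1 pattern out to length n.
    return [pattern[i % len(pattern)] for i in range(n)]

n = 3000  # any multiple of 3 gives the exact densities
A = trajectory([1, 0, 1], n)  # 101101101... (density of 1's: 2/3)
B = trajectory([0, 1, 0], n)  # 010010010... (density of 1's: 1/3)

# Each trajectory occurs with probability 1/2, so X/n is 2/3 or 1/3:
# the distribution of the sum is bimodal, ruling out any exponential
# tail bound around a single value.
print(sum(A) / n, sum(B) / n)
```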

Another answer:

UPDATE: this answer is wrong, see the correct (accepted) one. Leaving this up for pedagogical reasons only.

Since you allow non-strictly decreasing functions, constant ones qualify in particular. Thus the process with $P(X_1=X_2=\ldots=X_n=0)=P(X_1=X_2=\ldots=X_n=1)=1/2$ is covered by this setting, and there you obviously don't get any non-trivial concentration. I'm afraid requiring strict decrease won't make things any better: just perturb my example infinitesimally.
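(A brief sketch of why this example fails, consistent with the update above: in this process the conditional probability is not constant on the reachable histories; it equals the first history bit, which makes it increasing rather than decreasing. The function name `f` below is just illustrative.)

```python
# With P(all zeros) = P(all ones) = 1/2, the conditional probability
# Pr[X_i = 1 | X_1 = x_1, ..., X_{i-1} = x_{i-1}] equals x_1 on the
# two reachable histories -- an increasing function of the history.
def f(history):
    # Conditional probability of X_i = 1 given a reachable history.
    return history[0]

assert f([0, 0, 0]) == 0  # after zeros, the next bit is surely 0
assert f([1, 1, 1]) == 1  # after ones, the next bit is surely 1
# f(0,0,0) < f(1,1,1): f increases when the history flips 0 -> 1,
# violating the required (non-strict) decrease.
print("f is increasing, not decreasing")
```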