Hoeffding/Bernstein inequality for negatively correlated variables

162 Views Asked by At

Suppose $\xi_1, \ldots, \xi_n \in [0, N]$ be negatively correlated random variables with expectation $\mu$. Let $S_n = \sum_{i=1}^n \xi_i$. We want to prove that for any $\lambda \geq 0$, we have $$ P(S_n - {E}(S_n) \geq \lambda) \leq e^{-\frac{\lambda^2}{2 \mu n N}}. $$ The following is my attempt. We want to use the property that $\xi_i$ are negatively correlated, the variance is a good way to reflect this property: $$ Var(S_n) \leq \sum_{i = 1}^n Var(\xi_i) \leq n \mu (N - \mu). $$ Then we can use Chernoff bound to connect the variance and the MGF of $S_n$ to achieve the exponential decay as follows: $$ P(S_n - ES_n \geq \lambda) \leq \inf_{t \geq 0} e^{-t \lambda} Ee^{t (S_n - E S_n)} \leq \inf_{t \geq 0} e^{-t \lambda} e^{t Var(S_n)}. $$ However, for getting the second $\leq$, we need $0 \leq N n t \leq 1$ by $0 \leq |S_n - E S_n| \leq nN$, which leads to a very constrained condition for $\lambda$. Does there exist an alternative way to prove the original inequality? Thanks in advance!

1

There are 1 best solutions below

2
On BEST ANSWER

In the best case that $S_n$ is strictly sub-Gaussian (i.e., its variance proxy is equal to the its variance), we can prove the inequality (actually, it does not generally hold; see the point coming at the end). In this case, we have

$$P(S_n - {E}(S_n) \geq \lambda) \leq e^{-\frac{\lambda^2}{2 Var(S_n)}}$$

where $S_n = \sum_{i=1}^n \xi_i$. Assume that $\xi_1, \ldots, \xi_n$ are equally correlated with $Var(\xi_i)=\sigma^2$ and $Cov(\xi_i,\xi_j )=\rho \sigma^2, i\neq j$. Then,

$$Var(S_n)=n \left( 1-(n-1) \rho \right)\sigma^2 \le n \left( 1+(n-1) \rho \right)(N-\mu)\mu\le n \left( 1+(n-1) \rho \right)\frac{N}{4}^2.$$

Thus, we have

$$P(S_n - {E}(S_n) \geq \lambda) \leq e^{-\frac{\lambda^2}{2n \left( 1+(n-1) \rho \right)(N-\mu)\mu}}.$$

Note that the correlation $\rho$ needs to satisfy $1+(n-1) \rho>0$, and cannot be arbitrarily close to -1.

Finally, for $\rho\leq 0$, we have

$$P(S_n - {E}(S_n) \geq \lambda) \leq e^{-\frac{\lambda^2}{2n \left( 1+(n-1) \rho \right)(N-\mu)\mu}} \leq e^{-\frac{\lambda^2}{2nN\mu}}.$$

Moreover, consider that the following $$P(S_n - {E}(S_n) \geq \lambda) \leq e^{-\frac{\lambda^2}{2n (N-\mu)\mu}} \leq e^{-\frac{\lambda^2}{2nN\mu}}$$can be obtained only by assuming that $\xi_1, \ldots, \xi_n$ are strictly sub-Gaussian because the variance proxy of the sum of (dependent or independent) sub-Gaussian random variables equals the sum of their variance proxies (a proof is given here; more details on strict sub-Gaussianity can be found here)

If $S_n$ or all of $\xi_1, \ldots, \xi_n$ are not strictly sub-Gaussian, considering that $S_n$ is sub-Gaussian with variance proxy $\frac{nN^2}{4}$, the following can be obtained:

$$P(S_n - {E}(S_n) \geq \lambda) \leq e^{-\frac{2\lambda^2}{n {N}^2}},$$

in which the dependence structure is not used. There are some results on how Hoeffding's bounds can be adjusted for dependent random variables (see 3 and 4 and references therein).

I need to remark that the variance proxy of a sub-Gaussian random variable is always greater than or equal to its variance. Hence, even for $n=1$, the inequality $$P(S_n - {E}(S_n) \geq \lambda) \leq e^{-\frac{\lambda^2}{2n (N-\mu)\mu}}$$ can be violated for some values of $\lambda$; see the counterexamples provided here. However, it may be possible to prove it for a limited range of $\lambda$ values.