Concentration inequalities for $P(\sum_{i=1}^n \epsilon_i X_i > t)$

Asked by guy (https://math.techqa.club/user/guy/detail)

Let $\epsilon_i \sim \text{Bernoulli}(p)$ and $X_i \sim \text{Normal}(0, \sigma^2 / n)$ for $i=1,\ldots,n$, all mutually independent. I am interested in getting a sub-Gaussian-type upper bound for $$ P\left(\sum_i \epsilon_i X_i > t\right). $$ Ideally, it would look something like \begin{align} P\left(\sum_i \epsilon_i X_i > t\right) \le K\exp\left(-C\frac{t^2}{2\sigma^2 p}\right) \tag{1} \end{align} for some constants $K$ and $C$, although I suspect this isn't true as stated; I'm also not sure how $n$ should enter into $K$ or $C$. It's important that the bound quantifies the deviation in terms of $p$, so that I have good control over what is going on as $p \to 0$.

The furthest I've gotten is that, by iterated expectation and the usual Gaussian concentration argument, \begin{align} P\left(\sum_i \epsilon_i X_i > t\right) \le E\left\{\exp\left(\frac{-t^2 n}{2\sigma^2 \sum_i \epsilon_i}\right)\right\}, \tag{2} \end{align} which, for fixed $p$, gives $$ \limsup_n P\left(\sum_i \epsilon_i X_i > t\right) \le \exp\left(\frac{-t^2}{2\sigma^2 p}\right) $$ by the law of large numbers. But this approach doesn't seem strong enough, since a little numeric investigation shows that (2) is quite a bit worse than (1) as $p \to 0$. Some quantification of how $p$ and $n$ enter the bound (2) would also be welcome.
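For reference, here is a minimal sketch of the kind of numeric investigation mentioned above (Python/NumPy; the values of $n$, $\sigma$, $t$, and the grid of $p$ values are arbitrary choices for illustration). It estimates the tail probability by Monte Carlo and compares it with the conjectured bound (1), taking $K = C = 1$, and with a Monte Carlo estimate of the right-hand side of (2).

```python
import numpy as np

# Sketch: compare a Monte Carlo estimate of P(sum_i eps_i X_i > t) with the
# conjectured bound (1) (assuming K = C = 1) and with the iterated-expectation
# bound (2). Parameter values are arbitrary and only for illustration.
rng = np.random.default_rng(0)
n, sigma, t, reps = 100, 1.0, 0.5, 50_000

for p in (0.5, 0.1, 0.01):
    eps = rng.random((reps, n)) < p                       # eps_i ~ Bernoulli(p)
    X = rng.normal(0.0, sigma / np.sqrt(n), (reps, n))    # X_i ~ Normal(0, sigma^2 / n)
    tail = np.mean((eps * X).sum(axis=1) > t)             # Monte Carlo tail estimate

    bound1 = np.exp(-t**2 / (2 * sigma**2 * p))           # conjectured bound (1)

    m = eps.sum(axis=1)                                   # sum_i eps_i per replication
    inner = np.where(m > 0, np.exp(-t**2 * n / (2 * sigma**2 * np.maximum(m, 1))), 0.0)
    bound2 = inner.mean()                                 # Monte Carlo estimate of (2)

    print(f"p={p:5.2f}  empirical={tail:.2e}  (1)={bound1:.2e}  (2)={bound2:.2e}")
```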
There are 2 best solutions below
Note we can write the MGF of a single element in the sum in the following way:
$$ E e^{\lambda \epsilon X} = p\, E e^{\lambda X} + (1-p)\, E e^{0} = 1 - p + p e^{\lambda^2 \sigma^2 / 2n} \le e^{\lambda^2 \sigma^2 / 2n}, $$ where the inequality holds because $e^{\lambda^2 \sigma^2 / 2n} \ge 1$. So each term is $\sigma^2/n$-sub-Gaussian. Since the terms are independent, the sum is $\sigma^2$-sub-Gaussian, and the standard sub-Gaussian tail bound gives $$ \Pr\Big(\sum_i \epsilon_i X_i > t\Big) \le e^{-t^2 / 2 \sigma^2}. $$
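As a quick sanity check, the MGF identity and the sub-Gaussian envelope above can be verified numerically; the following sketch uses arbitrary illustrative values of $\lambda$, $p$, $\sigma$, and $n$.

```python
import numpy as np

# Sketch: check E exp(lam * eps * X) = 1 - p + p * exp(lam^2 sigma^2 / (2n))
# and that it sits below the sub-Gaussian envelope exp(lam^2 sigma^2 / (2n)).
rng = np.random.default_rng(1)
n, sigma, p, lam, reps = 50, 1.0, 0.3, 2.0, 1_000_000   # arbitrary illustrative values

eps = rng.random(reps) < p
X = rng.normal(0.0, sigma / np.sqrt(n), reps)

mgf_mc = np.mean(np.exp(lam * eps * X))                        # Monte Carlo E exp(lam * eps * X)
mgf_exact = 1 - p + p * np.exp(lam**2 * sigma**2 / (2 * n))    # closed form above
envelope = np.exp(lam**2 * sigma**2 / (2 * n))                 # sigma^2/n sub-Gaussian envelope

print(mgf_mc, mgf_exact, envelope)   # expect mgf_mc ~ mgf_exact <= envelope
```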
Here are a couple of incomplete thoughts on the problem that may be useful. In case it is useful for others to see how you derive $(2)$, and to get some quantified tradeoffs between $n$ and $p$, we have for any $\eta>0$, \begin{align} \Pr_{X,\epsilon}\bigg(\sum_{i=1}^n \epsilon_i X_i>t\bigg)&=\mathbb{E}_{X,\epsilon}\bigg[\mathbf{1}\bigg\{\sum_{i=1}^n \epsilon_i X_i>t\bigg\}\bigg]\\ &=\mathbb{E}_{\epsilon}\bigg[\Pr_X\bigg(\sum_{i=1}^n \epsilon_i X_i>t\bigg)\bigg]\\ &\leq \mathbb{E}_{\epsilon}\bigg[\exp\bigg(\frac{-t^2n}{2\sigma^2\sum_{i=1}^n \epsilon_i}\bigg)\bigg]\\ &=\sum_{m=1}^n \exp\bigg(\frac{-t^2n}{2\sigma^2m}\bigg)\Pr_{\epsilon}\bigg(\sum_{i=1}^n \epsilon_i=m\bigg)\\ &= \sum_{m=1}^{np+\eta}\exp\bigg(\frac{-t^2n}{2\sigma^2m}\bigg)\Pr_{\epsilon}\bigg(\sum_{i=1}^n \epsilon_i=m\bigg)+\sum_{m=np+\eta+1}^{n}\exp\bigg(\frac{-t^2n}{2\sigma^2m}\bigg)\Pr_{\epsilon}\bigg(\sum_{i=1}^n \epsilon_i=m\bigg)\\ &\leq \exp\bigg(\frac{-t^2n}{2\sigma^2(np+\eta)}\bigg)+\Pr_{\epsilon}\bigg(\sum_{i=1}^n\epsilon_i>np +\eta\bigg)\exp\bigg(\frac{-t^2}{2\sigma^2}\bigg), \end{align} where the first inequality uses the standard sub-Gaussian estimate $\Pr\big(\sum_{i=1}^n a_i X_i>t\big)\leq \exp\big(\frac{-t^2n}{2\sigma^2 \sum_{i=1}^n a_i^2}\big)$ with $a_i=\epsilon_i$ (so $\sum_i a_i^2=\sum_i\epsilon_i$); the $m=0$ term vanishes since the probability is $0$ when all $\epsilon_i=0$. The last inequality bounds each exponential in the first sum by the one with $m=np+\eta$ and the probabilities by $1$ in total, and uses $m\leq n$ in the second sum.
Recall one version of the multiplicative form of the Chernoff bound (the one on Wikipedia): for $0\leq \delta\leq 1$, $$ \Pr\bigg(\sum_{i=1}^n \epsilon_i>np(1+\delta)\bigg)\leq \exp\bigg(\frac{-\delta^2np}{3}\bigg). $$ Letting $\eta=np\delta$ for $\delta\in [0,1]$ and plugging this into the bound above gives $$ \Pr_{X,\epsilon}\bigg(\sum_{i=1}^n \epsilon_i X_i>t\bigg)\leq \exp\bigg(\frac{-t^2}{2\sigma^2p(1+\delta)}\bigg)+\exp\bigg(\frac{-\delta^2np}{3}-\frac{t^2}{2\sigma^2}\bigg). $$
This holds for all $n$, $p$, and $0\leq \delta\leq 1$, and recovers your LLN bound: for any fixed $\delta>0$ the second term vanishes as $n\to \infty$, and then taking $\delta\to 0$ gives $\exp(-t^2/2\sigma^2 p)$. There is a weaker version of the Chernoff bound valid for all $\delta\geq 0$, so $\delta$ could be chosen more judiciously as a function of $n$ and $p$ if desired. A numeric sketch of this tradeoff appears after the next paragraph's discussion.
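To see the tradeoff concretely, here is a small sketch (arbitrary illustrative values of $n$, $p$, $\sigma$, $t$) that evaluates the two-term bound over a grid of $\delta \in [0,1]$ and reports the best value alongside the target rate $\exp(-t^2/2\sigma^2 p)$.

```python
import numpy as np

# Sketch: minimize exp(-t^2/(2 sigma^2 p (1+delta))) + exp(-delta^2 n p / 3 - t^2/(2 sigma^2))
# over a grid of delta in [0, 1]. Illustrative parameter values only.
sigma, t = 1.0, 1.0
deltas = np.linspace(0.0, 1.0, 1001)

for n, p in [(100, 0.5), (1000, 0.1), (10_000, 0.01)]:
    term1 = np.exp(-t**2 / (2 * sigma**2 * p * (1 + deltas)))
    term2 = np.exp(-deltas**2 * n * p / 3 - t**2 / (2 * sigma**2))
    bound = term1 + term2
    best = deltas[np.argmin(bound)]
    target = np.exp(-t**2 / (2 * sigma**2 * p))             # the LLN-type benchmark
    print(f"n={n:6d} p={p:5.2f}  best delta={best:.2f}  bound={bound.min():.3e}  target={target:.3e}")
```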
However, this bound doesn't capture the desired behavior that, for any fixed $n$, the bound should tend to $0$ as $p\to 0$. Unfortunately, I couldn't find a way to use these sorts of Chernoff bounds to simultaneously get the proper large-$n$, fixed-$p$ behavior and the $p\to 0$, fixed-$n$ behavior. If the latter regime is the one of interest, one can use the easy inequality $$ \Pr\bigg(\sum_{i=1}^n \epsilon_i>0\bigg)=1-(1-p)^n\leq np. $$
Then, for $p<1/n$ and with $\eta=0$ (so that the first sum in the bound above is empty and $\{\sum_i\epsilon_i>np\}=\{\sum_i\epsilon_i>0\}$), the bound above gives $$ \Pr_{X,\epsilon}\bigg(\sum_{i=1}^n \epsilon_i X_i>t\bigg)\leq np\exp\bigg(\frac{-t^2}{2\sigma^2}\bigg), $$ which at the very least goes to $0$ with $p$, but only like $p=\exp(\ln p)$ rather than like $\exp(-1/p)$ as one would prefer.
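For the $p\to 0$, fixed-$n$ regime, a short sketch (illustrative values again) comparing $np\,e^{-t^2/2\sigma^2}$ with the hoped-for rate $e^{-t^2/2\sigma^2 p}$ makes the gap explicit.

```python
import numpy as np

# Sketch: for p < 1/n, compare the bound n*p*exp(-t^2/(2 sigma^2)) with the
# hoped-for rate exp(-t^2/(2 sigma^2 p)). Illustrative values only.
n, sigma, t = 100, 1.0, 1.0
for p in (5e-3, 2e-3, 1e-3):
    small_p_bound = n * p * np.exp(-t**2 / (2 * sigma**2))
    hoped_for = np.exp(-t**2 / (2 * sigma**2 * p))
    print(f"p={p:.0e}  n*p*exp(-t^2/2sigma^2)={small_p_bound:.3e}  exp(-t^2/2sigma^2 p)={hoped_for:.3e}")
```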
Last point: one can try going through the usual proof of Chernoff bounds using MGFs, but it gets a little unwieldy to optimize the bound for generic $p$, except in the easy special cases $p=0$ and $p=1$, the latter exactly recovering the usual Gaussian bound. If one can make clever, tight approximations to the MGF that make it easier to optimize, this could furnish an answer.
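To illustrate this last point, here is a rough numerical sketch (arbitrary illustrative values, a grid search rather than a clever closed-form optimization) of the exact Chernoff bound $\inf_{\lambda > 0} e^{-\lambda t}\big(1 - p + p e^{\lambda^2 \sigma^2 / 2n}\big)^n$ that follows from the per-term MGF computed in the other answer; any tight approximation to the MGF would need to roughly match this.

```python
import numpy as np

# Sketch: numerically optimize the exact Chernoff bound
#   inf_{lam > 0} exp(-lam * t) * (1 - p + p * exp(lam^2 sigma^2 / (2n)))^n
# over a grid of lam, and compare with exp(-t^2/(2 sigma^2 p)). Illustrative values only.
n, sigma, t = 100, 1.0, 1.0
lams = np.linspace(1e-3, 50.0, 20_000)

for p in (0.5, 0.1, 0.01):
    # n * log(1 - p + p * e^{lam^2 sigma^2 / (2n)}), computed stably via log1p/expm1
    log_mgf_term = n * np.log1p(p * np.expm1(lams**2 * sigma**2 / (2 * n)))
    log_bound = -lams * t + log_mgf_term
    k = np.argmin(log_bound)
    print(f"p={p:5.2f}  best lam={lams[k]:6.2f}  Chernoff bound={np.exp(log_bound[k]):.3e}  "
          f"exp(-t^2/2sigma^2 p)={np.exp(-t**2 / (2 * sigma**2 * p)):.3e}")
```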