Tighter tail bounds for subgaussian random variables

Let $X$ be a random variable on $\mathbb{R}$ satisfying $\mathbb{E}\left[e^{tX}\right] \leq e^{t^2/2}$ for all $t \in \mathbb{R}$. What is the best explicit upper bound we can give on $\mathbb{P}[X \geq x]$ for $x > 0$?

A well-known upper bound can be obtained by applying Markov's inequality to the moment generating function of $X$: for $t>0$, $$\mathbb{P}[X \geq x] = \mathbb{P}\left[e^{tX}\geq e^{tx}\right] \leq \frac{\mathbb{E}\left[e^{tX}\right]}{e^{tx}} \leq e^{t^2/2-tx}.$$ Setting $t=x$ (which minimizes the exponent) yields $$\mathbb{P}[X \geq x] \leq e^{-x^2/2}.$$ How tight is this?

Examples of such $X$ include the standard Gaussian and Rademacher distributions. Both satisfy stronger tail bounds, as illustrated by the following plot. This makes me wonder whether stronger bounds hold for all such $X$.

[Plot: tail probabilities $\mathbb{P}[X \geq x]$ of the standard Gaussian and Rademacher distributions, compared to the bound $e^{-x^2/2}$.]
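The data behind such a plot is easy to reproduce; here is a minimal sketch (function names are mine) comparing the exact tails of the standard Gaussian and Rademacher distributions to the generic subgaussian bound:

```python
import math

def gaussian_tail(x):
    """P[N(0,1) >= x], via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def rademacher_tail(x):
    """P[X >= x] for X uniform on {-1, +1}."""
    if x <= -1:
        return 1.0
    if x <= 1:
        return 0.5   # only the atom at +1 contributes
    return 0.0

def chernoff_bound(x):
    """The generic subgaussian tail bound e^{-x^2/2}."""
    return math.exp(-x * x / 2)

for x in [0.5, 1.0, 2.0, 3.0]:
    print(x, gaussian_tail(x), rademacher_tail(x), chernoff_bound(x))
```

For every $x > 0$ both exact tails sit strictly below $e^{-x^2/2}$; at $x=1$ the gap is between $1/2$ (Rademacher) and $e^{-1/2} \approx 0.6065$ (the bound).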

Some more precise questions:

Define $$f(x) = \sup \left\{ \mathbb{P}[X\geq x] : X \text{ a random variable satisfying } \mathbb{E}\left[e^{tX}\right] \leq e^{t^2/2} \text{ for all } t \in \mathbb{R} \right\}.$$
  • What is $f(1)$?

    (We know $f(1) \in \left[\frac12,e^{-1/2}\right]$, by the Rademacher example and the above tail bound.)

  • What is $\limsup_{x \to \infty} f(x) \cdot e^{x^2/2}$?

    (We know it is $\leq 1$, by the above tail bound.)

  • What is $\limsup_{x \to \infty} f(x) \cdot e^{x^2/2} \cdot {x}$?

    (We know it is $\geq 1/\sqrt{2\pi}$, by the Gaussian example.)

Any improved bounds on $f$ would interest me. Is there a closed-form expression for $f$, or some way to numerically approximate it?
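One way to numerically approximate $f(x)$ is to discretize: restrict $X$ to a finite support grid and enforce the MGF constraint only on a finite grid of $t$ values, which turns the supremum into a linear program over the probability vector. This is a sketch under those discretization assumptions (grids and function name are mine), not an exact computation of $f$:

```python
import numpy as np
from scipy.optimize import linprog

def f_estimate(x, support=np.linspace(-5, 5, 201), ts=np.linspace(-5, 5, 101)):
    """Approximate f(x) = sup P[X >= x] over distributions supported on a
    finite grid, with E[e^{tX}] <= e^{t^2/2} enforced only at grid t's."""
    # Inequality constraints: sum_i p_i * e^{t * x_i} <= e^{t^2/2}, one row per t.
    A_ub = np.exp(np.outer(ts, support))
    b_ub = np.exp(ts ** 2 / 2)
    # Equality constraint: the p_i form a probability distribution.
    A_eq = np.ones((1, len(support)))
    b_eq = np.array([1.0])
    # linprog minimizes, so negate the mass on [x, infinity) to maximize it.
    c = -(support >= x).astype(float)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    return -res.fun

print(f_estimate(1.0))  # should land in [1/2, e^{-1/2}]
```

Since the grids include $\pm 1$ and $t = 1$, the LP value at $x = 1$ is provably squeezed into $[1/2, e^{-1/2}]$: the Rademacher distribution is feasible, and the Chernoff argument at $t = 1$ still applies.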

A follow-up question would be: are there any natural assumptions that, in addition to subgaussianity, would imply $\mathbb{P}[X \ge x] \le O\left(\frac{e^{-x^2/2}}{x}\right)$?

There are 2 best solutions below

---

Using saddlepoint approximations, you can typically sharpen such bounds considerably, often down to the correct asymptotic order of the tail.

If the moment generating function of a distribution is written as $M(t)$ and the cumulant generating function as $K(t) = \log(M(t))$, then the saddlepoint approximation to the PDF of the distribution is defined as $$\hat{f}(x) = \frac{1}{\sqrt{2 \pi K''(\hat{s})}} \exp\left(K(\hat{s}) - \hat{s}x\right),$$ where $\hat{s} = \hat{s}(x)$ solves the saddlepoint equation $K'(\hat{s}) = x$.
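To make the formula concrete, here is a small sketch (function names are mine) that solves the saddlepoint equation $K'(\hat{s}) = x$ by Newton's method and evaluates $\hat{f}(x)$. For the standard Gaussian, where $K(t) = t^2/2$, the saddlepoint is $\hat{s} = x$ and the approximation recovers the exact density $\varphi(x)$:

```python
import math

def saddlepoint_pdf(K, K1, K2, x, s0=0.0):
    """Saddlepoint density: solve K'(s) = x by Newton's method, then
    evaluate (2*pi*K''(s))^{-1/2} * exp(K(s) - s*x)."""
    s = s0
    for _ in range(50):            # Newton iterations on K'(s) - x = 0
        s -= (K1(s) - x) / K2(s)
    return math.exp(K(s) - s * x) / math.sqrt(2 * math.pi * K2(s))

# Standard Gaussian cumulant generating function and its derivatives.
K  = lambda t: t * t / 2
K1 = lambda t: t
K2 = lambda t: 1.0

x = 1.3
phi = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
print(saddlepoint_pdf(K, K1, K2, x), phi)  # identical for the Gaussian
```

The Gaussian is the degenerate case where the approximation is exact; for other distributions $\hat{f}$ is typically accurate to a relative error of $O(1/n)$ in sums of $n$ i.i.d. terms.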

You should also check out exponential tilting as a more general method for obtaining "medium deviations" bounds.
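As an illustration of exponential tilting, here is a sketch (names and parameters are mine) of the standard importance-sampling use: to estimate $\mathbb{P}[N(0,1) \ge x]$, sample from the tilted law $N(\theta, 1)$ with $\theta = x$, so the rare tail event becomes typical, and reweight by the likelihood ratio $e^{-\theta Y + \theta^2/2}$:

```python
import math
import random

def tilted_tail_estimate(x, n=200_000, seed=0):
    """Estimate P[N(0,1) >= x] by exponential tilting / importance sampling."""
    rng = random.Random(seed)
    theta = x                      # tilt parameter: center the tilted law at x
    total = 0.0
    for _ in range(n):
        y = rng.gauss(theta, 1.0)  # sample from the tilted distribution N(theta, 1)
        if y >= x:
            total += math.exp(-theta * y + theta * theta / 2)  # likelihood ratio
    return total / n

x = 4.0
exact = 0.5 * math.erfc(x / math.sqrt(2))   # roughly 3.17e-5
print(tilted_tail_estimate(x), exact)
```

Naive Monte Carlo would need on the order of $10^7$ samples to even see this event; the tilted estimator has small relative variance because every sample that lands in the tail carries a weight of the right magnitude $\approx e^{-x^2/2}$.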

---

Consider a random variable $X$ with distribution function $F(x) = (1 - e^{-\frac{x^2}2})I_{x \ge 0}$ (the standard Rayleigh law, with density $x e^{-x^2/2}$ on $x \ge 0$). Then $P(X \ge x) = e^{-\frac{x^2}2}$ holds exactly for all $x \ge 0$, while completing the square in the MGF gives $E e^{tX} = e^{\frac{t^2}2}\left(e^{-\frac{t^2}2} + t\sqrt{2\pi}\,\Phi(t)\right) \le \left(1 + \sqrt{2\pi}\,t\right) e^{\frac{t^2}2}$ for $t \ge 0$, i.e. $X$ is subgaussian up to a polynomial factor in $t$. This is not a solution of the problem, but it shows that $e^{-\frac{x^2}2}$ is essentially the sharpest possible tail estimate under only slightly weaker MGF control.
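Both claims about this distribution can be checked numerically; here is a sketch (the integration helper and its grid are mine) that verifies the tail identity $P(X \ge x) = e^{-x^2/2}$ and tracks how $E e^{tX}$ grows relative to $e^{t^2/2}$:

```python
import math

# Density of X with F(x) = 1 - e^{-x^2/2} on x >= 0.
density = lambda u: u * math.exp(-u * u / 2)

def integrate(g, a, b, n=100_000):
    """Simple trapezoid rule on [a, b]; fine for these smooth integrands."""
    h = (b - a) / n
    s = 0.5 * (g(a) + g(b)) + sum(g(a + i * h) for i in range(1, n))
    return s * h

# Tail identity: P[X >= x] = e^{-x^2/2}.
for x in [0.5, 1.0, 2.0]:
    print(x, integrate(density, x, 40.0), math.exp(-x * x / 2))

# MGF growth: the ratio E[e^{tX}] / e^{t^2/2} grows roughly linearly in t.
for t in [0.5, 1.0, 2.0]:
    mgf = integrate(lambda u: math.exp(t * u) * density(u), 0.0, 40.0)
    print(t, mgf / math.exp(t * t / 2))
```

The printed ratios match the closed form $e^{-t^2/2} + t\sqrt{2\pi}\,\Phi(t)$, confirming that the MGF exceeds $e^{t^2/2}$ only by a factor linear in $t$ while the tail equals $e^{-x^2/2}$ exactly.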