Proof of Chernoff's Inequality


For an exercise, I have to prove Chernoff's inequality:

$$\operatorname{P}(X \ge a) \le e^{-ta} \operatorname{M}_X(t)$$

The exercise specifies that $X$ is a random variable, that $\operatorname{M}_X(t)$ is its moment-generating function (which is finite on a small interval containing $0$), and that I must prove the inequality holds for every $t \ge 0$.

I don't know whether the spirit of the exercise is to use integrals and produce a long, exhaustive proof, or whether something shorter and simpler is acceptable, which is the route I took.

I thought the inequality looked similar to Markov's inequality (which, by the way, only holds for nonnegative random variables; I suspect this is one of the problems with my proof). Here is Markov's:

$$\operatorname{P}(X \ge c) \le \frac{\operatorname{E}(X)}{c}$$
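As an aside, Markov's inequality can be sanity-checked numerically. A minimal sketch, assuming $X \sim \text{Exponential}(1)$ (my choice of distribution, purely for illustration), for which $\operatorname{E}(X) = 1$ and $\operatorname{P}(X \ge c) = e^{-c}$ exactly:

```python
import math

# Check Markov's inequality for X ~ Exponential(1), a nonnegative
# random variable with E[X] = 1 and exact tail P(X >= c) = exp(-c).
def tail_and_markov_bound(c):
    exact = math.exp(-c)   # P(X >= c)
    bound = 1.0 / c        # E[X] / c
    return exact, bound

# The exact tail probability never exceeds the Markov bound.
for c in [0.5, 1.0, 2.0, 5.0]:
    exact, bound = tail_and_markov_bound(c)
    assert exact <= bound
```

Note the bound is vacuous for $c < 1$ here (it exceeds $1$), which is typical of Markov's inequality: it is valid everywhere but only informative for large $c$.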

So I went ahead and derived:

$$\begin{align} \operatorname{P} (X \ge a) & = \operatorname{P} (e^{tX} \ge e^{ta}) & \text{because } x \mapsto e^{tx} \text{ is increasing for } t > 0 \\ & \le \frac{\operatorname{E}(e^{tX})}{e^{ta}} & \text{Markov's inequality, since } e^{tX} \ge 0 \\ & = e^{-ta} \operatorname{E}(e^{tX}) \\ & = e^{-ta} \operatorname{M}_X(t) & \text{Q.E.D.} \\ \end{align}$$

This proof seems to ignore the fact that $X$ can be negative, as well as the assumption that "$\operatorname{M}_X(t)$ is finite on a small interval containing $0$". It does hold for every $t \ge 0$, though.
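The derived bound can also be checked numerically. A minimal sketch, assuming $X \sim N(0,1)$ (my choice, for illustration), whose MGF is the standard fact $\operatorname{M}_X(t) = e^{t^2/2}$; the exact tail comes from the error function:

```python
import math

# Sanity check of P(X >= a) <= exp(-t*a) * M(t) for X ~ N(0, 1),
# whose MGF is M(t) = exp(t**2 / 2).
def normal_tail(a):
    # Exact P(X >= a) for a standard normal, via the error function.
    return 0.5 * (1.0 - math.erf(a / math.sqrt(2.0)))

def chernoff_bound(a, t):
    return math.exp(-t * a) * math.exp(t * t / 2.0)

# The inequality must hold for every t >= 0 (including t = 0,
# where the bound is trivially M(0) = 1).
a = 2.0
for t in [0.0, 0.5, 1.0, 2.0, 4.0]:
    assert normal_tail(a) <= chernoff_bound(a, t)
```

For this distribution the bound $e^{-ta + t^2/2}$ is minimized at $t = a$, giving $e^{-a^2/2}$, which is the usual Gaussian Chernoff bound.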

Firstly, is this proof correct? If not – what is wrong with it?

Furthermore, how can I improve it while retaining the "short and sweet" idea?

Finally, are there any other longer and more exhaustive proofs (that, for example, include integrals when using $\operatorname{E}(X)$ or $\operatorname{E}(e^{tX}) = \operatorname{M}_X(t)$)?

There are 2 answers below.

Answer 1:

Let $(\Omega,\mathcal{F},P)$ be a probability space and let $X:\Omega\rightarrow\mathbb{R}$ be a random variable. Let $\mu$ be the probability measure on $(\mathbb{R},\mathcal{B}(\mathbb{R}))$ induced by $X$: $\mu(A)=P\left(X^{-1}(A)\right)$ for $A\in\mathcal{B}(\mathbb{R})$. Let $M:\mathbb{R}\rightarrow[0,\infty]$ be the moment generating function of $X$, defined by \begin{eqnarray*} M(t) & = & \int\exp(tX)\,dP\\ & = & \int\exp(tx)\,d\mu(x). \end{eqnarray*}

Claim: For any $a\in\mathbb{R}$ and any $t\in[0,\infty)$, we have $P\left(\left[X\geq a\right]\right)\leq e^{-ta}M(t)$.

Proof: Fix $a\in\mathbb{R}$ and $t\in[0,\infty)$. Firstly observe that for any $x\in\mathbb{R}$, $$ 1_{[a,\infty)}(x)\leq e^{t(x-a)}. $$ For, if $x<a,$ then $LHS=0$ while $RHS>0$. If $x\geq a$, $LHS=1$ while $RHS\geq1$ because $t(x-a)\geq0$. Integrating both sides, we have $$ \int1_{[a,\infty)}(x)\,d\mu(x)\leq\int e^{t(x-a)}\,d\mu(x) $$ (It may happen that RHS = $\infty$). Note that $\int1_{[a,\infty)}(x)\,d\mu(x)=P\left([X\geq a]\right)$ and $\int e^{t(x-a)}\,d\mu(x)=e^{-ta}\int e^{tx}\,d\mu(x)=e^{-ta}M(t)$. Q.E.D.
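The pointwise inequality $1_{[a,\infty)}(x)\leq e^{t(x-a)}$ that drives this proof is easy to spot-check numerically. A minimal sketch, with the values of $a$, $t$, and the grid of sample points chosen by me for illustration:

```python
import math

# Spot-check the pointwise inequality 1_{[a, oo)}(x) <= exp(t*(x - a))
# for t >= 0, on a handful of sample points (illustrative values only).
def indicator(x, a):
    return 1.0 if x >= a else 0.0

a, t = 1.5, 0.7
for x in [-3.0, 0.0, 1.4, 1.5, 2.0, 10.0]:
    # For x < a: LHS = 0 < exp(anything).
    # For x >= a: LHS = 1 <= exp(t*(x - a)) since t*(x - a) >= 0.
    assert indicator(x, a) <= math.exp(t * (x - a))
```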

Answer 2:

This is just a comment:

One cannot assume in general that $M(t)<\infty$ for all $t$ in a neighborhood of $0$; this finiteness is a genuine hypothesis. In fact, we can construct a random variable $X$ whose moment generating function $M_{X}(t)$ satisfies $M_{X}(0)=1$ and $M_{X}(t)=\infty$ for all $t\in\mathbb{R}\setminus\{0\}$.

Proof: Instead of constructing the random variable $X$ directly, we define a probability measure $\mu$ on $\mathbb{R}$ such that $M(t)$ defined by $M(t)=\int\exp(tx)\,d\mu(x)$ has the property that $$ M(t)=\begin{cases} 1, & \mbox{ if }t=0\\ \infty, & \mbox{ if }t\neq0 \end{cases}. $$ For each $x\in\mathbb{R}$, let $\delta_{x}$ be the Dirac measure at $x$. That is, $$ \delta_{x}(A)=\begin{cases} 1, & \mbox{ if }x\in A\\ 0, & \mbox{ if }x\notin A \end{cases}. $$ Define $$ \mu=\frac{1}{2}\sum_{n=1}^{\infty}\frac{1}{2^{n}}\delta_{2^{n}}+\frac{1}{2}\sum_{n=1}^{\infty}\frac{1}{2^{n}}\delta_{-2^{n}}. $$ Clearly $\mu$ is a probability measure on $\mathbb{R}$. Note that $M(0)=\mu(\mathbb{R})=1$. Now let $t\neq0$; we show that $M(t)=\infty$. Observe that $$ M(t)=\frac{1}{2}\sum_{n=1}^{\infty}\frac{1}{2^{n}}\exp(2^{n}t)+\frac{1}{2}\sum_{n=1}^{\infty}\frac{1}{2^{n}}\exp(-2^{n}t). $$ If $t>0$, we have $\sum_{n=1}^{\infty}\frac{1}{2^{n}}\exp(2^{n}t)=\infty$ because the terms satisfy $\lim_{n\rightarrow\infty}\frac{1}{2^{n}}\exp(2^{n}t)=\infty$. If $t<0$, we have $\sum_{n=1}^{\infty}\frac{1}{2^{n}}\exp(-2^{n}t)=\infty$ for the same reason. It follows that $M(t)=\infty$.
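The behavior of this $M(t)$ can be illustrated with partial sums. A minimal sketch (the cutoffs $N$ and the test value $t=0.1$ are my choices, kept small enough that $\exp$ does not overflow):

```python
import math

# Partial sums of M(t) = (1/2) * sum_{n>=1} 2^{-n} exp(2^n * t)
#                      + (1/2) * sum_{n>=1} 2^{-n} exp(-2^n * t)
# for the measure mu built from Dirac masses at +-2^n.
def partial_M(t, N):
    pos = sum(math.exp((2 ** n) * t) / 2 ** n for n in range(1, N + 1))
    neg = sum(math.exp(-(2 ** n) * t) / 2 ** n for n in range(1, N + 1))
    return 0.5 * (pos + neg)

# At t = 0 the partial sums converge to mu(R) = 1 from below.
assert abs(partial_M(0.0, 60) - 1.0) < 1e-12

# For t != 0 the partial sums blow up: the terms 2^{-n} exp(2^n |t|)
# themselves tend to infinity, so the series cannot converge.
assert partial_M(0.1, 12) > 1e6
```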