Empirical Process and Azuma's Inequality

161 Views Asked by At

Given independent random variables $X_1,\dots,X_n$ and a family $\mathcal{F}$ of functions $f:\mathbb{R}\to [0,1]$, let $$Z(X_1,\dots,X_n)=\frac{1}{\sqrt{n}}\sup_{f\in \mathcal{F}}\left|\sum_{i=1}^n\left[f(X_i)-\mathbb{E}f(X_i)\right]\right|$$ Show that for any $t\geq 0$ $$\mathbb{P}\left(\left|Z-\mathbb{E}Z\right|\geq t\right)\leq 2e^{-\frac{t^2}{2}}$$

Attempt: Apparently I have to apply Azuma's inequality to $Z$. Given two random vectors $X=(X_1,\dots,X_k,\dots,X_n)$ and $X'=(X_1,\dots,X_k',\dots,X_n)$ which agree in all but the $k$-th coordinate, if $$\left|Z(X)-Z(X')\right|\leq \frac{1}{\sqrt{n}}$$ then we are done. For each $f\in \mathcal{F}$, define $F_f:\mathbb{R}^n\to [0,n]$ by $$F_f(X)=\sum_{i=1}^n f(X_i)$$ Then we have to show that $$\left|\sup_{f\in \mathcal{F}}\left|F_f(X)-\mathbb{E}F_f(X)\right|-\sup_{f\in \mathcal{F}}\left|F_f(X')-\mathbb{E}F_f(X')\right|\right|\leq 1$$ However, although it's clear that the difference above is at most $n$, why cannot it exceed $1$? Any comment will be appreciated. Thanks!

1

There are 1 best solutions below

3
On BEST ANSWER

The domain of the function $F_{f}$ you define should be $\mathbb{R}^n$ rather than $\mathbb{R}$.

Let me use slightly different notations. Define $\varphi$ by \begin{equation*} \varphi(x_1, \cdots, x_n) := \sup_{f \in \mathcal{F}} | \sum_{i=1}^n f(x_i) - \mathbb{E}[f(X_i)]| \end{equation*} so that $\varphi(X_1, \cdots, X_n) = \sqrt{n} Z(X_1, \cdots X_n)$. If two vectors $(x_1, \cdots, x_i, \cdots, x_n)$ and $(x_1, \cdots, x_i', \cdots, x_n)$ agree on all but the $i$th coordinate, \begin{equation*} | \varphi(x_1, \cdots, x_i, \cdots, x_n) - \varphi(x_1, \cdots, x_i', \cdots, x_n) | \leq \sup_{f \in \mathcal{F}} |f(x_i) - f(x_i')| \leq 1 \end{equation*} Then you can apply McDiarmid's inequality to deduce the exponential bound


Added:

The first inequality is derived as follows: \begin{align*} &\phantom{==}| \varphi(x_1, \cdots, x_i, \cdots, x_n) - \varphi(x_1, \cdots, x_i', \cdots, x_n) | \\ &= \bigg| \sup_{f \in \mathcal{F}} |(f(x_1) + \cdots + f(x_i) + \cdots f(x_n) - \sum_{j=1}^n \mathbb{E}[f(X_j)] | - \sup_{f \in \mathcal{F}} |(f(x_1) + \cdots + f(x_i') + \cdots f(x_n) - \sum_{j=1}^n \mathbb{E}[f(X_j)] | \bigg| \\ &\leq \sup_{f \in \mathcal{F}} \Bigg| \left\{ (f(x_1) + \cdots + \color{red}{f(x_i)} + \cdots f(x_n) - \sum_{j=1}^n \mathbb{E}[f(X_j)] \right\} - \left\{ (f(x_1) + \cdots + \color{red}{f(x_i')} + \cdots f(x_n) - \sum_{j=1}^n \mathbb{E}[f(X_j)] \right\} \Bigg| \\ &= \sup_{f \in \mathcal{F}} |f(x_i) - f(x_i')| \end{align*}