Find an example where bounded difference inequality is useful

216 Views Asked by At

Question: Is there an example, where the bounded difference property gives better concentration than what we get from using the sub-gaussian bound?

I could not find any examples myself. But I believe there must exist some, otherwise I see no point in introducing the bounded difference concentration inequality.


Definitions:

Bounded difference property. We say that $f:\mathbb{R}^n\rightarrow\mathbb{R}$ satisfies the bounded difference property, if for all $k\in[n]$ there exists a $L_k\geq 0$, such that for all $x\in\mathbb{R}^n$ and any $t\in\mathbb{R}$ it holds that $|f(x)-f(x+te_k)|\leq L_k$. Here, $e_k\in\mathbb{R}^n$ is the $k$-th unit vector.

One can show that $f$ satisfies the bounded difference property, if and only if $\|f\|_\infty<+\infty$. To be specific, if $f$ is bounded, using the triangular inequality, $$ |f(x)-f(x+te_k)|\leq 2\|f\|_\infty $$ On the other hand, suppose that $f$ satisfies the bounded difference inequality. Let $x^k:=(x_1,\ldots,x_k,0,\ldots,0)$. Then: $$ |f(x)|= \left|f(0)+\sum_{k=1}^n f(x^k)-f(x^{k-1})\right |\leq |f(0)|+\sum_{k=1}^nL_k $$ Since the choice of $0$ was arbitrary, we find: $$ \|f\|_\infty\leq \inf_{y\in\mathbb{R}} |f(y)|+\sum_{k=1}^nL_k $$

The following is Corollary 2.21 in Wainwright's "High-Dimensional Statistics":

Concentration from bounded difference property. Suppose that $f$ satisfies the bounded difference property, and that the random vector $X:=(X_1,\ldots,X_n)$ has independent components. Then, for all $t>0$, $$ \mathbb{P}\left[ |f(X)-\mathbb{E}[f(X)]|\geq t\right] \leq 2\exp\left(-\frac{2t^2}{\sum_{k=1}^n L_k^2}\right). $$

On the other hand, as we have shown above, any $f$ which satisfies the bounded difference property is also bounded. Hence, we can directly use a sub-gaussian tail bound. From (2.11) in Wainwright's "High-Dimensional Statistics", we get the following.

Sub-gaussian concentration for bounded random variables. If there exist $a,b>0$, such that $a<f(x)<b$, then, for all $t>0$: $$ \mathbb{P}\left[|f(X)-\mathbb{E}[f(X)]|\geq t\right]\leq 2\exp\left(-\frac{2t^2}{(b-a)^2}\right) $$

1

There are 1 best solutions below

0
On BEST ANSWER

I guess I have always thought of Hoeffding's inequality as applying to sums of random variables, and Bounded differences extending this to more general functions. I wasn't able to find the specific form you use to define sub-Gaussian concentration, what would that result say in the following example?

Let $X_1,\dots, X_n$ be i.i.d. real valued random variables with CDF $F$, and let $$ \Delta_n = \Delta_n(X_1,\dots,X_n) = \sup_x|F_n(x) - F(x)|, $$ where $F(x) = P(X\le x) $ and $F_n(x) = \frac{1}{n} \sum_{i=1}^n \mathbb{1}\{ X_i \le x \}$ are the CDF and empirical CDF respectively. Clearly $F_n$ and $F$ take values in $[0,1]$ and so $0\le\Delta_n\le 1$, so applying the sub-Gaussian concentration result in your post, we have $$ P(|\Delta_n - \mathbb{E}\Delta_n| > t) \le 2 \exp \left ( -2t^2\right), $$ if I'm not mistaken, there is no dependence on $n$ ?

Whereas, it is easily shown that perturbing one of the coordinates $X_i$ then $\Delta_n$ changes by at most $L_i = 1/n$ so by bounded differences: $$ P(|\Delta_n - \mathbb{E}\Delta_n| > t) \le 2 \exp \left ( -2nt^2\right), $$