Why does Hanson-Wright inequality give a poor bound in this example?

275 Views Asked by At

The following is from High-Dimensional Probability by Roman Vershyni

(Hanson-Wright inequality) Let $X = (X_1, \dots, X_n) \in \mathbb{R}^n$ be a random vector with independent, zero-mean, and sub-Gaussian coordinates. Let $A$ be an $n \times n$ matrix. Then, for every $t\geq 0$, we have $$P(|X^\intercal A X - E[X^\intercal A X ]| \geq t) \leq 2 \exp \Big[ -c \min (\frac{t^2}{K^4 \|A\|_F^2}, \frac{t}{K^2 \|A\|_2})\Big],$$ where $K = \max_i \|X_i\|_{\psi_2}$.

I give an example where this bound is quite poor. Suppose that $A$ is a positive definite matrix and $X_i$ are i.i.d. standard Gaussian. For any vector $X$ it is trivial that $P(X^\intercal A X \leq 0) = 0$. I apply Hanson-Wright to this probability:

\begin{align} P(X^\intercal A X \leq 0) = P(- X^\intercal A X + E[X^\intercal A X] \geq E[X^\intercal A X]) = P(- X^\intercal A X + E[X^\intercal A X] \geq \text{tr}(A)]) \end{align}

Hanson-Wright gives the following bound \begin{align} P(- X^\intercal A X + E[X^\intercal A X] \geq \text{tr}(A)]) \leq \Big[ -c \min (\frac{\text{tr}(A)^2}{K^4 \|A\|_F^2}, \frac{\text{tr}(A)}{K^2 \|A\|_2})\Big] \end{align}

Now, this bound is poor when $A$ is not well-conditioned, even though the correct probability is zero. For example, if eigenvalues of $A$ are $(1, \epsilon/(n-1), \dots, \epsilon/(n-1))$, then

$$\frac{\text{tr}(A)}{\|A\|_2} = 1+\epsilon.$$

Why does this bound poor in this example? Simulations show that this is also true when $A$ is nearly positive definite (e.g. $A = B^\intercal B - \epsilon x x^\intercal$).

1

There are 1 best solutions below

0
On

For the lower tail that you are interested in, there is a sharper bound: $P(X^TAX - trace[A] < - 2 t ) \le e^{ - t^2/\|A\|_F^2}$. If you take $t$ of order $trace[A]$ you obtain something more reasonable.

See Equation (4.2) in Laurent and Massart (2000.