Is the inequality of the random matrices correct?


I am not familiar with random matrices, but I need to confirm the correctness of the inequality below.

Let $\xi_i\in\{\pm 1\}$ be independent random signs, and let $A_1,\ldots, A_n$ be $m\times m$ Hermitian matrices. Let $\sigma^2 = \big\|\sum_{i=1}^n \operatorname{Var}[\xi_i]\,A_i^2\big\|$. Then $$\Pr\bigg(\bigg\|\sum_{i=1}^n\mathbb{E}[\xi_i]A_i-\sum_{i=1}^n\xi_iA_i\bigg\|\geq t\sigma\bigg)\leq 2m\exp(-t^2/2).$$
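For concreteness, the quantities appearing in the inequality can be computed numerically. Here is a minimal NumPy sketch; the dimensions, the particular choice of the $A_i$, and the symmetric distribution of the signs are my own illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 4, 10

# Fixed Hermitian matrices A_1, ..., A_n (illustrative choice).
A = []
for _ in range(n):
    G = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
    A.append((G + G.conj().T) / 2)

# Symmetric random signs: E[xi_i] = 0, Var[xi_i] = 1.
xi = rng.choice([-1, 1], size=n)

# sigma^2 = || sum_i Var[xi_i] A_i^2 ||, with ||.|| the spectral norm.
sigma = np.sqrt(np.linalg.norm(sum(Ai @ Ai for Ai in A), ord=2))

# Deviation || sum_i E[xi_i] A_i - sum_i xi_i A_i ||; the first sum is 0 here.
dev = np.linalg.norm(-sum(x * Ai for x, Ai in zip(xi, A)), ord=2)

t = 2.0
print(dev, t * sigma, 2 * m * np.exp(-t**2 / 2))
```

For symmetric signs the claimed bound at $t=2$ evaluates to $2m\,e^{-2}\approx 1.08$ when $m=4$, so it only becomes informative for larger $t$.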

It is said to come from the paper "User-Friendly Tail Bounds for Sums of Random Matrices", but I cannot find a result in that paper that implies the inequality. Is the inequality correct?


BEST ANSWER

It's not correct.

Take the random variables $\xi_i$ with $\mathbb{P}(\xi_i=1)=1$ for each $i$. Then $\operatorname{Var}[\xi_i]=0$ for each $i$, and thus $\sigma = 0$. The event on the left-hand side reads $0 \geq 0$, so it holds with probability $1$, while the right-hand side $2m\exp(-t^2/2)$ is smaller than $1$ if $t$ is taken sufficiently large.
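This counterexample is easy to check numerically. A small sketch; the particular Hermitian matrices are an illustrative assumption, since any choice works:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, t = 3, 5, 10.0

# Arbitrary Hermitian (here real symmetric) matrices A_i; any choice works.
A = [(G + G.T) / 2 for G in (rng.standard_normal((m, m)) for _ in range(n))]

# Degenerate signs: P(xi_i = 1) = 1, so E[xi_i] = 1 and Var[xi_i] = 0.
xi = np.ones(n)
sigma = 0.0  # sigma^2 = || sum_i Var[xi_i] A_i^2 || = 0

# The deviation sum_i E[xi_i] A_i - sum_i xi_i A_i is identically zero...
dev = np.linalg.norm(sum((1.0 - x) * Ai for x, Ai in zip(xi, A)), ord=2)

# ...so the event ||.|| >= t * sigma = 0 holds with probability 1,
# while the claimed bound 2m exp(-t^2/2) is below 1 for large t.
rhs = 2 * m * np.exp(-t**2 / 2)
print(dev >= t * sigma, rhs < 1)  # → True True
```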

Second answer:

So, it's a corollary of their Theorem 1.2. I'll restate that inequality and work it into the form you have.

To avoid a conflict of notation, I'm going to change the notation in the paper you linked such that $t\mapsto\alpha$ and $\sigma\mapsto\sigma_0$, and I will hold your notation fixed. Also, that theorem is stated for the rectangular case where $\mathbf{B}_k$ has dimension $d_1\times d_2$. Here we have $d_1=d_2=m$, and I've simplified my restatement of their theorem below to match the square case.

In that notation, the inequality in Theorem 1.2 reads

$$P\bigg( \bigg\| \sum_k \xi_k \mathbf{B}_k \bigg\| \geq \alpha\bigg) \leq 2m \cdot e^{-\alpha^2/2\sigma_0^2},$$ where the $\xi_k$ are either independent standard Gaussian or independent Rademacher variables (the same as the random signs you have), and where $\sigma_0^2=\left\| \sum_k \mathbf{B}_k^2 \right\|$ (that's the square, Hermitian version of their definition of $\sigma_0$).

Now it's just a game of translating this inequality into yours. First, let $t=\alpha/\sigma_0$. Then the inequality becomes $$P\bigg( \bigg\| \sum_k \xi_k \mathbf{B}_k \bigg\| \geq t\sigma_0\bigg) \leq 2m \cdot e^{-t^2/2}.$$ That's a first step.
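As a sanity check of this rescaled form, one can estimate the tail probability by Monte Carlo under the assumption that the $\xi_k$ are symmetric Rademacher signs; the matrices, dimensions, and sample size below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, trials, t = 4, 8, 2000, 3.0

# Fixed Hermitian (real symmetric) matrices B_k, illustrative choice.
B = [(G + G.T) / 2 for G in (rng.standard_normal((m, m)) for _ in range(n))]

# sigma_0^2 = || sum_k B_k^2 || in the square Hermitian case.
sigma0 = np.sqrt(np.linalg.norm(sum(Bk @ Bk for Bk in B), ord=2))

# Empirical tail probability of || sum_k xi_k B_k || >= t * sigma_0.
hits = 0
for _ in range(trials):
    xi = rng.choice([-1.0, 1.0], size=n)
    if np.linalg.norm(sum(x * Bk for x, Bk in zip(xi, B)), ord=2) >= t * sigma0:
        hits += 1

empirical = hits / trials
bound = 2 * m * np.exp(-t**2 / 2)
print(empirical, bound)
```

The empirical frequency should come out well below the bound $2m\,e^{-t^2/2}$, which is typically quite loose.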

Now let's deal with the $\mathbb{E}[\xi_i]$ sum. Clearly $\mathbb{E}[\xi_i]=0$, since each $\xi_i$ takes the values $\pm 1$ with equal probability. So the first sum $\sum_{i=1}^n \mathbb{E}[\xi_i]A_i$ is identically $0$, and it does not affect the inequality at all. Even better, the variance of each Rademacher variable is $\operatorname{Var}[\xi_i]=0.5\cdot 1^2 + 0.5\cdot (-1)^2 - 0^2 = 1$, so that $\sigma=\sigma_0$.

So, the inequality you have is exactly the same as the one here, with $A$ changed to $\mathbf{B}$ and some extra terms that have no effect. Let me know if any details need clarification and I'll edit those in.