Proof check: Using Hanson-Wright inequality to concentrate a quadratic form $y^\top A y$ where both $y$ and $A$ are random but independent

230 Views Asked by At

Disclaimer. I don't know if this is the right venue to ask this. I'm working out a bigger proof, in a critical step, I'ved used an argument I'm not quite sure about.


Let $n$ be a large positive integer and let $X$ be a random $n \times n$ matrix such that such $\|X\|_{op} \le 1$. with probability $1-o(1)$. For concreteness, one may consider $X \sim N(0,s^2/n)^{n \times n}$ for an appropriate absolute constant $s>0$. Let $y$ be random vector in $\mathbb R^n$, with iid coordinates uniformly distributed on $\{\pm 1\}$, and independent of $X$.

Goal. I wish to argue that $\|Xy\|_2=\Omega(\sqrt{n})$ w.p $1-o(1)$.

Here is my argument. Let $\mathcal E$ be an event such that $\{\|X\|_{op} \le 1\} \subseteq \mathcal E^c$. Note that by assumption, $\mathbb P(\mathcal E^c) \ge \mathbb P(\|X\|_{op} \le 1) = 1-o(1)$. Let $A := XX^\top$. Considitioning on $\mathcal E^c$ and applying the Hanson-Wright inequality (see Theorem 1.1 of Rudelson and Vershynin), we know that there exists an absolute constant $K>0$ such that for all $t \ge 0$,

$$ \mathbb P(|y^\top A y - \mathbb E [y^\top Ay \mid \mathcal E^c]| \ge t \mid \mathcal E^c) \le 2\exp(-\frac{t^2}{K^4\|A\|_F^2}\land \frac{t}{K^2\|A\|_{op}}). \tag{1} $$

Further, conditioned on $\mathcal E^c$, one does the following computations

  • $\mathbb E[y^\top A y \mid \mathcal E^c]=\mbox{trace}(A) = \mbox{trace}(XX^\top) = \|X\|_F^2 \le n\|X\|_{op}^2 \le n$, since $\|X\|_{op} \le 1$ on the event $\mathcal E^c$.

  • Still using the fact that $\|X\|_{op} \le 1$ on $\mathcal E^c$, one computes $$ \begin{split} \|A\|_F^2 &= \mbox{trace}(XX^\top XX^\top) = \mbox{trace}((X^\top X)^2) = \sum_{i=1}^{n} \lambda_i(X^\top X)^2\\ & \le n\lambda_{\max}(X^\top X)^2 = n\|X\|_{op}^4 \le n. \end{split} $$

  • $\|A\|_{op} \le \|A\|_F \le \sqrt{n}$.

Taking $t=n/2$, the RHS of (1) simplifies to $2\exp(-\frac{n}{4K^4} \land \frac{\sqrt{n}}{2K^2})=e^{-\Omega(\sqrt{n})} = o(1).$ Putting things together then gives

$$ \begin{split} \mathbb P(\|Xy\|_2 \ge \sqrt{n/2}) &= \mathbb P(y^\top A y - n \ge -n/2) \ge \mathbb P(y^\top A y - \mathbb E[y^\top A y \mid \mathcal E^c] \le n/2)\\ &\ge \mathbb P(\mathcal E^c)\mathbb P(|y^\top A y - \mathbb E [y^\top Ay \mid \mathcal E^c]| \le n/2 \mid \mathcal E^c)\\ &= (1 - o(1))\cdot(1 - o(1)) = 1 - o(1). \end{split} $$

We conclude that $\|X^\top y\|_2 = \Omega(\sqrt{n})$ w.p $1-o(1)$.

Question 1. Is the above proof sound / correct ?

Question 2. In case my proof is incorrect, is the statement I'm trying to proof, i.e "$\|Xy\|_2 = \Omega(\sqrt{n})$ w.p $1-o(1)$" true ? (in which case I'd be interested in an alternative proof thereof).

Thanks in advance for any input.