Verifying the Bernstein condition for random matrices


Theorem 6.2 of the following paper by J. Tropp states the matrix Bernstein inequality for the subexponential case. Given two i.i.d. sequences $X_1,\dots,X_n \sim N(0,\Sigma)$ and $Y_1,\dots,Y_n \sim N(0, \Gamma)$, I am trying to bound the maximal eigenvalue of $$ \sum_{i=1}^n Z_i=\sum_{i=1}^n X_i Y_i^T. $$ The summands are products of Gaussian vectors, so they are subexponential, and I should be able to apply the Bernstein bound directly. I am a bit confused about verifying (and identifying the relevant parameters in) the condition: $$ E[Z_i^p] \preceq\frac{p!}{2} R^{p-2} A_k^2, \qquad p=2,3,4,\dots. $$ Is there a simple way to show that this holds?

Update: a slightly more straightforward statement of the matrix Bernstein result is given in Wainwright's High-Dimensional Statistics text (Theorem 6.17). It is stated as follows:

Let $\{ Q_i\}$ be a sequence of independent, zero mean and symmetric random matrices, and assume all matrices in the sequence satisfy the Bernstein condition with parameter $b>0$, that is, for $i=1,2,\dots,n$ \begin{align*} E[Q_i^p] \preceq \frac{1}{2}p! b^{p-2} \text{var}(Q_i), \quad p=2,3,4,\dots, \end{align*} where $\text{var}(Q) := E[Q^2]-(E[Q])^2$. Then for all $\varepsilon \ge 0$, \begin{align*} P \left(\frac{1}{n} \| \sum_{i=1}^n Q_i\| \ge \varepsilon \right) \le 2 \text{rank} \left ( \sum_{i=1}^n \text{var}(Q_i) \right ) \exp \left \{-\frac{n\varepsilon^2}{2(\sigma^2 + b \varepsilon)} \right \}, \end{align*} where \begin{align*} \sigma^2 := \frac{1}{n} \|\sum_{i=1}^n \text{var}(Q_i)\|. \end{align*}

This result is easier to work with since it states the Bernstein condition directly in terms of the variance matrix. For my problem, we can show that $$ \text{var}(Z_i) = \Sigma \Gamma, $$ and since $Z_i$ is a product of two Gaussian vectors, it must be subexponential, so we can conclude that there exists some $b>0$ (which I do not know how to solve for) such that $$ E[Z_i^p] \preceq\frac{p!}{2} b^{p-2} \Sigma \Gamma. $$ Further, each $\text{var}(Z_i) = \Sigma\Gamma$ is a $d \times d$ matrix, where $d$ is the dimension of the $X_i$'s, so $$ \text{rank}\left(\sum_{i=1}^n \text{var}(Z_i)\right) \le d, $$ with equality when $\Sigma, \Gamma$ are full rank. We can therefore show that for some $b>0$ and for all $\varepsilon>0$, \begin{align*} P \left(\frac{1}{n} \| \sum_{i=1}^n Z_i\| \ge \varepsilon \right) \le 2 d \exp \left \{-\frac{n\varepsilon^2}{2(\|\Sigma\Gamma\| + b \varepsilon)} \right \}. \end{align*}
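As a quick sanity check, the identity $\text{var}(Z_i) = E[Z_i^2] = \Sigma\Gamma$ (using $E[Z_i]=0$, since $X_i$ and $Y_i$ are independent and centered) can be verified by Monte Carlo. The dimension, covariances, and sample size below are arbitrary illustrative choices:

```python
import numpy as np

# Monte Carlo sanity check of E[Z^2] = Sigma @ Gamma for Z = X Y^T.
# Sigma, Gamma, d, and the sample size are illustrative choices.
rng = np.random.default_rng(0)
d, n_samples = 3, 300_000

A = rng.standard_normal((d, d)); Sigma = A @ A.T
B = rng.standard_normal((d, d)); Gamma = B @ B.T
X = rng.multivariate_normal(np.zeros(d), Sigma, size=n_samples)
Y = rng.multivariate_normal(np.zeros(d), Gamma, size=n_samples)

# Z_i^2 = (Y_i^T X_i) X_i Y_i^T, so average the scalar Y_i . X_i
# against the rank-one matrices X_i Y_i^T.
s = (X * Y).sum(axis=1)               # Y_i^T X_i for each sample
Z2_hat = (X.T * s) @ Y / n_samples    # empirical E[Z^2]

rel_err = np.linalg.norm(Z2_hat - Sigma @ Gamma) / np.linalg.norm(Sigma @ Gamma)
print(rel_err)  # should be small (vanishes as n_samples grows)
```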

This pretty much answers most of my question, but if anyone has a way to identify the correct $b$, that would be very helpful.

Edit 2: My previous approach is incorrect because I assumed that the $Z_i$ are symmetric, which of course they are not. The correct way to use the Bernstein result is to define the block matrix (the Hermitian dilation):

$$ Q_i = \begin{bmatrix} O & X_iY_i^T\\ Y_iX_i^T & O\\ \end{bmatrix} $$

Then note that $Q_i$ is mean zero and has variance $\text{diag}(\text{tr}(\Gamma)\Sigma, \text{tr}(\Sigma)\Gamma)$.
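This variance can also be checked numerically: the diagonal blocks of $Q_i^2$ are $\|Y_i\|^2 X_iX_i^T$ and $\|X_i\|^2 Y_iY_i^T$, whose expectations (by independence) are $\text{tr}(\Gamma)\Sigma$ and $\text{tr}(\Sigma)\Gamma$. A Monte Carlo sketch, with arbitrary illustrative choices of $\Sigma$, $\Gamma$, $d$, and sample size:

```python
import numpy as np

# Monte Carlo check that E[Q^2] = diag(tr(Gamma) Sigma, tr(Sigma) Gamma).
# Sigma, Gamma, d, and the sample size are illustrative choices.
rng = np.random.default_rng(1)
d, n_samples = 3, 200_000

A = rng.standard_normal((d, d)); Sigma = A @ A.T
B = rng.standard_normal((d, d)); Gamma = B @ B.T
X = rng.multivariate_normal(np.zeros(d), Sigma, size=n_samples)
Y = rng.multivariate_normal(np.zeros(d), Gamma, size=n_samples)

# Diagonal blocks of Q^2: ||Y||^2 X X^T and ||X||^2 Y Y^T
w_y = (Y * Y).sum(axis=1)             # ||Y_i||^2
w_x = (X * X).sum(axis=1)             # ||X_i||^2
top = (X.T * w_y) @ X / n_samples     # empirical E[||Y||^2 X X^T]
bot = (Y.T * w_x) @ Y / n_samples     # empirical E[||X||^2 Y Y^T]

err_top = np.linalg.norm(top - np.trace(Gamma) * Sigma) / np.linalg.norm(np.trace(Gamma) * Sigma)
err_bot = np.linalg.norm(bot - np.trace(Sigma) * Gamma) / np.linalg.norm(np.trace(Sigma) * Gamma)
print(err_top, err_bot)  # both small for large sample sizes
```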

The odd powers of $Q_i$ with $p=2k+1$ take the form: $$ Q_i^{2k+1} = \begin{bmatrix} O & \|X_i\|^{2k} \|Y_i\|^{2k} X_iY_i^T\\ \|X_i\|^{2k} \|Y_i\|^{2k} Y_iX_i^T & O\\ \end{bmatrix} $$ and the even powers $p=2k$ take the form $$ Q_i^{2k} = \begin{bmatrix} \|X_i\|^{2k-2} \|Y_i\|^{2k} X_iX_i^T & O\\ O &\|X_i\|^{2k} \|Y_i\|^{2k-2} Y_iY_i^T \\ \end{bmatrix} $$
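Since these power formulas hold deterministically for any fixed vectors (no expectation involved), they can be confirmed for a random draw; the dimension and the exponent $k$ below are arbitrary choices:

```python
import numpy as np

# Deterministic check of the closed forms for Q^(2k) and Q^(2k+1)
# for the dilation Q of X Y^T; d, k, and the draw are arbitrary.
rng = np.random.default_rng(0)
d = 4
X = rng.standard_normal(d)
Y = rng.standard_normal(d)

O = np.zeros((d, d))
Q = np.block([[O, np.outer(X, Y)],
              [np.outer(Y, X), O]])

k = 2                       # check p = 2k = 4 and p = 2k + 1 = 5
nx2, ny2 = X @ X, Y @ Y     # ||X||^2 and ||Y||^2

Q4 = np.linalg.matrix_power(Q, 2 * k)
Q4_claim = np.block([
    [nx2**(k - 1) * ny2**k * np.outer(X, X), O],
    [O, nx2**k * ny2**(k - 1) * np.outer(Y, Y)]])
assert np.allclose(Q4, Q4_claim)

Q5 = np.linalg.matrix_power(Q, 2 * k + 1)
Q5_claim = np.block([
    [O, nx2**k * ny2**k * np.outer(X, Y)],
    [nx2**k * ny2**k * np.outer(Y, X), O]])
assert np.allclose(Q5, Q5_claim)
print("power formulas verified")
```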

I am pretty stumped on how to argue that the expectation of either of these matrices is bounded above, in the semidefinite ordering, by the right-hand side of the Bernstein condition.