Calculating the variance of $x^T A x$


I have checked my work three times but can't find the mistake; I would appreciate any input.

Let $x \in \{ \pm 1 \}^{2}$ be a random vector where $\Bbb P (x_1 = 1) = \Bbb P (x_2 = 1) = \frac12$ and let $$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$$ be some matrix. Find $\mbox{Var} \left(x^T A x \right)$.


The correct answer is $2a_{12}^2 + 2a_{21}^2$. Here's my attempt:

$$\Bbb E[x^TAx] = \Bbb E[\begin{pmatrix}x_1 & x_2\end{pmatrix}\begin{pmatrix}a_{11} & a_{12}\\a_{21} & a_{22}\end{pmatrix}\begin{pmatrix}x_1 \\ x_2\end{pmatrix}] = \Bbb E[a_{11}x_1^2+a_{12}x_1x_2+a_{21}x_1x_2+a_{22}x_2^2]$$

By independence of $x_1, x_2$ we have $\Bbb E[x_1x_2] = \Bbb E[x_1]\Bbb E[x_2] = 0$, so the above simplifies to $a_{11}\Bbb E[x_1^2]+a_{22}\Bbb E[x_2^2]$. Since $x_i \in \{\pm 1\}$, these expectations are $1$, so overall $\Bbb E[x^TAx] = a_{11}+a_{22}$.

$\Bbb E[(x^TAx)^2] = \Bbb E[(a_{11}x_1^2+a_{12}x_1x_2+a_{21}x_1x_2+a_{22}x_2^2)^2]$, which, using independence again together with $\Bbb E[x_i^2] = \Bbb E[x_i^4] = 1$, equals $a_{11}^2+2a_{11}a_{22}+a_{12}^2+2a_{12}a_{21}+a_{21}^2+a_{22}^2$.

So the variance should be $\mbox{Var}[x^TAx] = \Bbb E[(x^TAx)^2] - \Bbb E[x^TAx]^2 = a_{11}^2+2a_{11}a_{22}+a_{12}^2+2a_{12}a_{21}+a_{21}^2+a_{22}^2 - (a_{11}+a_{22})^2 = (a_{12}+a_{21})^2$

This is not the result we wanted; the stated correct answer is $2a_{12}^2 + 2a_{21}^2$.
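As a quick sanity check (a short enumeration script I'm adding here, not part of the original post), we can compute the variance exactly by brute force over all four vectors in $\{\pm 1\}^2$; for an asymmetric $A$ the result agrees with $(a_{12}+a_{21})^2$, not $2a_{12}^2+2a_{21}^2$:

```python
import itertools

def quad_form_variance(A):
    """Exact Var[x^T A x] for x uniform over {-1, +1}^n, by enumeration."""
    n = len(A)
    vals = [sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
            for x in itertools.product((-1, 1), repeat=n)]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

# Asymmetric 2x2 example with a12 = 3, a21 = 5
A = [[1.0, 3.0], [5.0, 2.0]]
print(quad_form_variance(A))  # 64.0, i.e. (a12 + a21)^2, not 2*a12^2 + 2*a21^2 = 68
```

For $a_{12}=3$, $a_{21}=5$ this gives $(3+5)^2 = 64$, whereas $2\cdot 9 + 2\cdot 25 = 68$, so the discrepancy is real for asymmetric $A$.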

I can't seem to find the mistake.


The paper you're reading is all about estimating the trace of a symmetric matrix, and Hutchinson's method (as described in Lemma 2.1 of that paper) is also stated for the case where the matrix $A$ is symmetric, though I admit the paper is not clear about stating this assumption upfront.

In the $2 \times 2$ case, if $a_{12} = a_{21}$, then the expressions $(a_{12} + a_{21})^2$ and $2a_{12}^2 + 2a_{21}^2$ give the same result.

In general, the formula $$\text{Var}[\mathbf z^{\mathsf T}\!A \mathbf z] = 2 \left(\|A\|_F^2 - \sum_{i=1}^n A_{ii}^2\right) = 2\sum_{i \ne j} A_{ij}^2$$ is only valid for symmetric $n \times n$ matrices. The general formula is $$\text{Var}[\mathbf z^{\mathsf T}\!A \mathbf z] = \sum_{i<j} (A_{ij} + A_{ji})^2.$$ To get this expression, replace the matrix $A$ by the symmetric matrix $B = \frac12(A + A^{\mathsf T})$: this should not change the variance, since $\mathbf z^{\mathsf T}\!A \mathbf z = \mathbf z^{\mathsf T} \!B \mathbf z$. Applying the formula for $\text{Var}[\mathbf z^{\mathsf T}\!B \mathbf z]$ gives the expression $$ 2\sum_{i \ne j} B_{ij}^2 = 2 \sum_{i \ne j} \left(\frac{A_{ij} + A_{ji}}{2}\right)^2 = \frac12 \sum_{i\ne j} (A_{ij} + A_{ji})^2 = \sum_{i<j} (A_{ij} + A_{ji})^2. $$ (This is also convincing evidence that the original formula was only intended for symmetric matrices; if it were correct for all matrices, then replacing $A$ by $B$ would not have changed anything.)
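To make the distinction concrete, here is a small check (hypothetical code, not from the answer) comparing the exhaustive variance against the general formula $\sum_{i<j}(A_{ij}+A_{ji})^2$ for random $3\times 3$ matrices:

```python
import itertools
import random

def exact_variance(A):
    """Var[z^T A z] for z uniform over {-1,+1}^n, by enumerating all 2^n vectors."""
    n = len(A)
    vals = [sum(A[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
            for z in itertools.product((-1, 1), repeat=n)]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

def formula_variance(A):
    """The general formula: sum over i < j of (A_ij + A_ji)^2."""
    n = len(A)
    return sum((A[i][j] + A[j][i]) ** 2 for i in range(n) for j in range(i + 1, n))

random.seed(0)
A = [[random.randint(-3, 3) for _ in range(3)] for _ in range(3)]
print(exact_variance(A), formula_variance(A))  # the two values agree
```

Note that the symmetric-only formula $2\sum_{i\ne j} A_{ij}^2$ would fail this check whenever $A_{ij} \ne A_{ji}$.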


We can also derive this formula from scratch. Since $z_i^2 = 1$, the quadratic form $\mathbf z^{\mathsf T}\!A \mathbf z$ expands out to $$\sum_{i=1}^n A_{ii} + \sum_{i < j} (A_{ij} + A_{ji}) z_i z_j.$$ The first sum is a constant and does not affect the variance. In the second sum, each product $z_i z_j$ is itself a Rademacher random variable (equally likely to be $+1$ or $-1$), and the different products in the sum are pairwise independent (though not mutually independent), which is enough for their variances to be additive. So we have $$ \text{Var}[\mathbf z^{\mathsf T}\!A \mathbf z] = \sum_{i < j} \text{Var}[(A_{ij} + A_{ji}) z_i z_j] = \sum_{i<j} (A_{ij} + A_{ji})^2 \text{Var}[z_i z_j] = \sum_{i<j} (A_{ij} + A_{ji})^2. $$
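The pairwise-but-not-mutual independence claim can itself be verified by enumeration (again a sketch I'm adding, taking $n = 3$): each pair of products $z_iz_j$ is uniform over $\{\pm 1\}^2$, yet the product of all three is always $+1$, so the full collection is dependent.

```python
import itertools
from collections import Counter

# All products (z1*z2, z1*z3, z2*z3) as z ranges over {-1,+1}^3.
triples = [(z[0] * z[1], z[0] * z[2], z[1] * z[2])
           for z in itertools.product((-1, 1), repeat=3)]

# Pairwise independence: every pair of products is uniform on {-1,+1}^2.
for a, b in [(0, 1), (0, 2), (1, 2)]:
    counts = Counter((t[a], t[b]) for t in triples)
    print(counts)  # each of the 4 sign combinations appears exactly twice

# Not mutually independent: the triple product is deterministic.
print({t[0] * t[1] * t[2] for t in triples})  # {1}
```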