Let $A$ be a random $m$ by $n$ sign matrix with $m < n$, chosen uniformly at random; that is, each entry of $A$ is drawn independently and uniformly from $\{1,-1\}$.
Let $B = A^T A$. We know, for example, that $B$ is a symmetric $n$ by $n$ matrix with all its diagonal entries exactly equal to $m$. I am trying to learn how to calculate the expected Frobenius and spectral norms of $B$. We can assume both $m$ and $n$ are large if that helps give a good bound or estimate.
How can you calculate $\mathbb{E}(||B||_F)$ and $\mathbb{E}(||B||_2)$?
The expected Frobenius norm of $B$ is defined to be
$$ \mathbb{E}(||B||_F)=\mathbb{E}\left(\sqrt{\sum_{i=1}^n\sum_{j=1}^n |b_{ij}|^2}\right) $$
where $b_{ij}$ are the elements of $B$.
The expected spectral norm of $B$ is defined to be
$$ \mathbb{E}(||B||_2)= \mathbb{E}\left(\max_{|x|_2 \ne 0}\frac{|Bx|_2}{|x|_2}\right). $$
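For reference, here is a minimal Monte Carlo sketch in MATLAB of how these two expectations can be estimated numerically; the sizes $m = 50$, $n = 200$ and the trial count are arbitrary illustrative choices, not part of the question.
% Monte Carlo estimates of E(||B||_F) and E(||B||_2) for random sign matrices
m = 50; n = 200; trials = 200;       % arbitrary illustrative sizes with m < n
fro = zeros(trials,1); spec = zeros(trials,1);
for t = 1:trials
    A = 2*randi([0 1], m, n) - 1;    % i.i.d. +/-1 entries
    B = A'*A;                        % n-by-n, diagonal entries equal to m
    fro(t)  = norm(B, 'fro');
    spec(t) = norm(B);               % spectral norm (largest singular value)
end
mean(fro)                            % empirical estimate of E(||B||_F)
mean(spec)                           % empirical estimate of E(||B||_2)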
Cross-posted to https://mathoverflow.net/questions/222994/how-to-calculate-expected-value-of-matrix-norms-of-ata now. I will update this question if anything substantive is added there.
Here is a try. The matrix entries are symmetric Bernoulli (Rademacher) random variables taking the values $\pm 1$ with equal probability. Such a variable is zero-mean and sub-Gaussian, with finite moments of every order.
If we define $\lambda = m/n$, then for large values of $m,n$:
$\Vert A \Vert_2 \rightarrow \sqrt{n}\left(1+\sqrt\lambda \right)$
and therefore, since $\Vert B \Vert_2 = \Vert A^T A \Vert_2 = \Vert A \Vert_2^2$, we get $\Vert B \Vert_2 \rightarrow n\left(1+\sqrt\lambda \right)^2$.
Substituting $\lambda = m/n$, this says that asymptotically $\Vert A \Vert_2 \approx \sqrt{n}\left(1+\sqrt{m/n}\right) = \sqrt m+\sqrt n$ and $\Vert B \Vert_2 \approx (\sqrt m + \sqrt n)^2$.
Here is the reference I used: http://www-personal.umich.edu/~romanv/papers/rv-rectangular-matrices.pdf
Another good reference is "Smallest singular value of random matrices and geometry of random polytopes" by Litvak et al.
I tried it in MATLAB and it seems to work well (the test matrix below is $10^5 \times 10$, i.e. tall rather than wide, but the prediction $\sqrt m + \sqrt n$ is symmetric in $m$ and $n$):
A = sign(rand(1e5,10)-0.5); c = 10/1e5;
sqrt(1e5)*(1+sqrt(c))
ans =
319.3900
norm(A)
ans =
318.8748
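The same kind of check can be done for $B$ directly; here is a minimal sketch (with arbitrary illustrative sizes satisfying $m < n$, as in the question) comparing $\Vert B\Vert_2$ to $(\sqrt m + \sqrt n)^2$:
m = 200; n = 1000;                   % arbitrary illustrative sizes with m < n
A = 2*randi([0 1], m, n) - 1;        % i.i.d. +/-1 entries
B = A'*A;
(sqrt(m) + sqrt(n))^2                % predicted asymptotic value of ||B||_2
norm(B)                              % observed spectral norm of B
norm(A)^2                            % same quantity, since ||A'A||_2 = ||A||_2^2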
Now, for the Frobenius norm. As a first step, the goal is to compute $\mathbb{E}(\Vert B \Vert_F^2)$. Let's split the sum over the entries into two cases:
1) The diagonal part: very simple; since the diagonal elements are always equal to $m$, the total contribution of the diagonal is $nm^2$.
2) The off-diagonal part ($i \neq j$): note that $\mathbb{E}(b_{ij})=0$ for $i\neq j$, so, using the fact that the entries of $A$ are i.i.d., $\mathbb{E}(b_{ij}^2)=\mathrm{Var}\left(\sum_{k=1}^m a_{ki}a_{kj}\right)=\sum_{k=1}^m \mathrm{Var}(a_{ki}a_{kj})=m$ for $i \neq j$, since each product $a_{ki}a_{kj}$ of two independent signs is again a symmetric $\pm 1$ variable and so has variance $1$. There are $n(n-1)$ such off-diagonal elements, each contributing $m$ in expectation.
Summing it all gives: $\mathbb{E}(\Vert B\Vert_F^2)=nm^2+n(n-1)m=nm^2+n^2m-mn$.
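For a concrete sense of scale, plugging in the illustrative values $m = 50$ and $n = 200$ (arbitrary choices, not from the question) gives
$$ \mathbb{E}(\Vert B\Vert_F^2) = 200\cdot 50^2 + 200\cdot 199\cdot 50 = 500{,}000 + 1{,}990{,}000 = 2{,}490{,}000, \qquad \sqrt{\mathbb{E}(\Vert B\Vert_F^2)} \approx 1578. $$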
Finally, $\Vert B\Vert_F^2$ is a sum of many bounded terms built from $\pm 1$ variables with probability $\frac{1}{2}$ each, and Bernstein's inequality for such sums gives exponentially small deviation probabilities. Hence, for large matrices, $\Vert B\Vert_F^2$ concentrates tightly around its mean, and therefore $\mathbb{E}(\Vert B\Vert_F) \approx \sqrt{\mathbb{E}(\Vert B\Vert_F^2)}$ (the ratio of the two tends to $1$).
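To illustrate the concentration numerically, here is a minimal sketch (same arbitrary illustrative sizes as above) comparing $\sqrt{nm^2+n(n-1)m}$ with the empirical mean of $\Vert B\Vert_F$ and reporting the relative spread:
% Monte Carlo check: ||B||_F concentrates, so E(||B||_F) is close to sqrt(E(||B||_F^2))
m = 50; n = 200; trials = 500;       % arbitrary illustrative sizes with m < n
f = zeros(trials,1);
for t = 1:trials
    A = 2*randi([0 1], m, n) - 1;    % i.i.d. +/-1 entries
    f(t) = norm(A'*A, 'fro');        % ||B||_F for one sample
end
sqrt(n*m^2 + n*(n-1)*m)              % sqrt(E(||B||_F^2)), about 1578 here
mean(f)                              % empirical estimate of E(||B||_F)
std(f)/mean(f)                       % relative spread -- small, confirming concentration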