For a matrix $A\in\mathbb{R}^{m\times n}$ with $A_{ij}\sim \mathcal{N}(0,1)$ i.i.d., the spectral norm is bounded by $C(\sqrt{n} + \sqrt{m})$ with high probability, where $C$ is an absolute constant. This is a known result.
My question is: given a diagonal matrix $B \in\mathbb{R}^{m\times m}$ with $B_{ii} \sim \mathcal{N}(0,1)$ i.i.d., what is the spectral norm of the product $A^T B A$?
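To make the setup concrete, here is a minimal NumPy sketch of the quantity I am asking about. The helper name `aba_spectral_norm`, the choice $n=20$, the values of $m$, and the number of trials are my own illustrative choices, and I am assuming $B$ is drawn independently of $A$. It only prints empirical averages so the growth in $m$ can be eyeballed; it proves nothing.

```python
import numpy as np

def aba_spectral_norm(A, b):
    """Spectral norm of A^T B A, where B = diag(b) (hypothetical helper)."""
    # Scale the rows of A by b instead of forming the m x m matrix B.
    M = A.T @ (b[:, None] * A)      # n x n symmetric matrix
    return np.linalg.norm(M, 2)     # largest singular value

rng = np.random.default_rng(0)
n, trials = 20, 10                  # illustrative values, not from any reference
for m in (200, 2000, 20000):
    norms = []
    for _ in range(trials):
        A = rng.standard_normal((m, n))   # A_ij ~ N(0,1) i.i.d.
        b = rng.standard_normal(m)        # B_ii ~ N(0,1) i.i.d., independent of A
        norms.append(aba_spectral_norm(A, b))
    print(f"m={m:6d}  n={n}  mean norm={np.mean(norms):.2f}")
```

Dividing the printed averages by a candidate rate and checking whether the ratio stays roughly constant as $m$ grows is one crude way to test a guess.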
If $B$ is the identity matrix, the spectral norm of $A^TA$ is bounded by $C'(n+m)$ with high probability, since $\|A^TA\| = \|A\|^2$ and $(\sqrt{n}+\sqrt{m})^2 \le 2(n+m)$. But with the extra random matrix $B$, I suspect the spectral norm is closer to something like $\sqrt{n}\sqrt{m}$ (this would make a difference when $m \gg n$). For example, if $n=1$, the norm of $A^TBA$ is $O(\sqrt{m})$ with high probability.
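To spell out the $n=1$ case (the shorthand $a_i = A_{i1}$, $b_i = B_{ii}$ is mine, and I am assuming $B$ is independent of $A$): the product is a scalar,

$$A^TBA = \sum_{i=1}^m b_i a_i^2, \qquad \mathbb{E}\big[b_i a_i^2\big] = 0, \qquad \operatorname{Var}\Big(\sum_{i=1}^m b_i a_i^2\Big) = m\,\mathbb{E}\big[b_i^2\big]\,\mathbb{E}\big[a_i^4\big] = 3m,$$

so by Chebyshev's inequality $|A^TBA| \le t\sqrt{3m}$ with probability at least $1 - 1/t^2$. The sum of $m$ independent mean-zero terms is what gives $\sqrt{m}$ here, instead of the $m$ one gets when $B$ is the identity.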
Can anyone give me some intuition on how to analyze this norm? Thanks!