Consider an $n \times n$ matrix $X$ with entries
$$ X_{ij} = \begin{cases} C, & \text{w.p. } p\\ 0, & \text{w.p. } 1-p,\\ \end{cases} $$ where $p$ is very small.
I am interested in bounding the spectral norm $\|X\|$. The entries $X_{ij}$ are sub-Gaussian with $\|X_{ij}\|_{\psi_2} = \frac{C}{\sqrt{\log(2/p)}}$, and as such, Theorem 4.4.5 of Vershynin gives
$$\|X\| \lesssim \|X_{ij}\|_{\psi_2}\sqrt{n}$$
with high probability. The definition of sub-Gaussian norm $\| \cdot \|_{\psi_2}$ I am using here is Definition 2.5.6 in Vershynin.
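For reference, the sub-Gaussian norm of this two-point distribution can be worked out directly. The following is a short calculation, assuming the definition $\|X\|_{\psi_2} = \inf\{t > 0 : \mathbb{E}\exp(X^2/t^2) \le 2\}$:

$$ \mathbb{E}\exp\!\left(X_{ij}^2/t^2\right) = (1-p) + p\,e^{C^2/t^2} \le 2 \iff e^{C^2/t^2} \le 1 + \frac{1}{p} \iff t \ge \frac{C}{\sqrt{\log(1 + 1/p)}}. $$

So the infimum is $C/\sqrt{\log(1+1/p)}$, which agrees with $C/\sqrt{\log(2/p)}$ up to an absolute constant factor for $p \in (0,1]$. Either way, the norm shrinks only logarithmically in $1/p$, so the resulting bound is of order $C\sqrt{n/\log(1/p)}$ for small $p$.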
This is fine if $p=0.5$ or so, but in my case $p$ is very small, and the bound is then far from tight. I would intuitively expect the spectral norm to scale as $\sqrt{pn}$ or something similar.
In my case, $X_{ij}$ is typically small because it is nonzero only with very small probability. This is not captured by the sub-Gaussian norm, because all it cares about are the tails (which are sub-Gaussian for any bounded random variable).
There is an analogous issue in the scalar setting. Sub-Gaussian random variables are exactly those that obey a Hoeffding-type inequality (Theorem 2.2.2 in Vershynin). However, as he points out in Section 2.3, Hoeffding's inequality is useless for Bernoulli random variables with small $p$. Instead, one should use Chernoff's inequality (Theorem 2.3.1), which is sensitive to small $p$.
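To make the scalar gap concrete, here is a minimal sketch comparing the two tail bounds for a sum $S_N$ of $N$ independent Bernoulli($p$) variables. It assumes the standard Hoeffding bound for variables in $[0,1]$, $\exp(-2(t-\mu)^2/N)$, and the Chernoff bound in the form $e^{-\mu}(e\mu/t)^t$ for $t > \mu = Np$; the function names and the example parameters are illustrative only.

```python
import math


def hoeffding_tail(N, p, t):
    """Hoeffding bound on P(S_N >= t) for N independent [0,1]-valued terms with mean p."""
    mu = N * p
    return math.exp(-2.0 * (t - mu) ** 2 / N)


def chernoff_tail(N, p, t):
    """Chernoff bound exp(-mu) * (e*mu/t)**t on P(S_N >= t), valid for t > mu = N*p."""
    mu = N * p
    if t <= mu:
        raise ValueError("the Chernoff bound in this form requires t > N*p")
    # Computed in log space to avoid overflow for large t.
    return math.exp(-mu + t * (1.0 + math.log(mu / t)))


if __name__ == "__main__":
    N, p = 10_000, 0.001   # small p: the mean of S_N is only 10
    t = 30                 # ask for three times the mean
    print("Hoeffding:", hoeffding_tail(N, p, t))
    print("Chernoff: ", chernoff_tail(N, p, t))
```

With these parameters the Hoeffding bound is close to 1, while the Chernoff bound is many orders of magnitude smaller. That is the same insensitivity to small $p$ that shows up in the matrix bound.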
Are there any bounds for $\|X\|$ when the entries are Bernoulli with small $p$?
It's a real shame that you never got an answer to this problem. I came across it in my own research, and ran some basic simulations to empirically validate that the norm is approximately $\sqrt{pn}$. I spent quite a lot of effort trying to prove this, but I wasn't able to. I think you're right that the sub-Gaussian norm isn't the appropriate tool here, as it only captures tail behavior and not sparsity.
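A simulation of this kind could be sketched as follows. This is only an illustration, assuming i.i.d. entries (the question does not state independence explicitly) and using NumPy; `sparse_bernoulli_norms` is a hypothetical helper name. The sub-Gaussian matrix bound is usually stated for mean-zero entries, and the mean matrix $Cp\,\mathbf{1}\mathbf{1}^\top$ alone has spectral norm $Cpn$. The sketch therefore reports the norm of both $X$ and the centered matrix $X - \mathbb{E}X$, next to the reference scales $C\sqrt{pn}$ and $Cpn$.

```python
import numpy as np


def sparse_bernoulli_norms(n, p, C=1.0, seed=0):
    """Spectral norms of X and of X - E[X] for i.i.d. entries in {0, C}.

    Hypothetical helper: each entry equals C with probability p, else 0.
    """
    rng = np.random.default_rng(seed)
    X = C * (rng.random((n, n)) < p)          # n x n matrix with entries 0 or C
    raw = np.linalg.norm(X, 2)                # largest singular value of X
    centered = np.linalg.norm(X - C * p, 2)   # subtract the entrywise mean C*p
    return raw, centered


if __name__ == "__main__":
    n, C = 500, 1.0
    print(f"{'p':>6} {'||X||':>10} {'||X-EX||':>10} {'C*sqrt(pn)':>11} {'C*p*n':>8}")
    for p in (0.5, 0.1, 0.02):
        raw, centered = sparse_bernoulli_norms(n, p, C)
        print(f"{p:>6} {raw:>10.2f} {centered:>10.2f} "
              f"{C * np.sqrt(p * n):>11.2f} {C * p * n:>8.2f}")
```

Comparing the two norm columns against the two reference columns shows which scale each quantity tracks for a given $n$ and $p$. Changing `seed` gives a rough sense of the fluctuations.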
Were you able to come up with anything sharper? I think there would be great interest in the math community in getting a sharper result.