Suppose a sample $\boldsymbol{x}_1,\dots, \boldsymbol{x}_N \in \mathbb{R}^d$ is given with sample matrix $\boldsymbol{X}\in\mathbb{R}^{N\times d}$ and covariance matrix $\boldsymbol{\Sigma}=\frac{1}{N-1}\boldsymbol{X}^T\boldsymbol{X}$. It is know, that for the trace of the matrix $\text{tr}(\boldsymbol{\Sigma})=\frac{1}{N-1}\sum_{j=1}^N ||\boldsymbol{x}_j||^2$, that means, $\text{tr}(\boldsymbol{\Sigma})$ can be calculated without calculating the covariance matrix or its eigenvalues.
My question: Can $\text{tr}(\boldsymbol{\Sigma}^2)$ be calculated (or good estimated) out of the data without calculating explicitly $\boldsymbol{\Sigma}$ or its eigenvalues?
$\text{tr}(\boldsymbol{\Sigma}^2)=\sum_{i=1}^d \lambda_i^2$ for the eigenvalues $\lambda_i$ of $\boldsymbol{\Sigma}$, so $\text{tr}(\boldsymbol{\Sigma})^2$ is obiously an upper bound.
I am grateful for any ideas.
Let $M$ be a matrix for which I have a magic oracle which tells me $Mx$ when presented with $x$. Let $\omega$ be a random vector which isotropic in the sense that $\mathbb{E}[\omega\omega^*] = I$. (For instance, $\omega$ can be a vector of IID standard normal random variables.) Then one can easily see that $\mathbb{E}[\omega^*(M\omega)] = \operatorname{tr}(M)$. (This follows from the cyclic property of the trace.) Thus, averaging the quantity $\omega^*(M\omega)$ for $k$ random instances of $\omega$ yields an unbiased estimate of the trace. Like other sampling-based approximations the root-mean-square error of this approximation with $k$ samples scales like $k^{-1/2}$, not great but not terrible if you only need an estimate.
In your case for $M = \Sigma^2$, the oracle proceeds by multiplying $x$ by $\Sigma$ twice. Each multiplication by $\Sigma$ is done as $\Sigma y = (N-1)^{-1} X^T (Xy)$, which takes $O(Nd)$ time.
This idea for trace estimation is very well-studied. See Chapter 4 of the following review for more analysis. There may be more sophisticated approaches for this particular problem, but this technique is a fully general technique which works for any $M$ with a matrix-vector multiplication oracle.