Symmetrization and the CLT


Let $(X_n)$ be an independent sequence of real standard normal random variables and let $(\boldsymbol{\Sigma}_n)$ be a sequence of $n \times n$ (growing size) real positive definite matrices. Define the triangular array $(Y_{i,n})_{1\leq i\leq n}$ as follows:

$$\boldsymbol{Y}_n:=\boldsymbol\Sigma_n^{1/2}\boldsymbol{X}_n$$

where $\boldsymbol{Y}_n=(Y_{1,n},\dots,Y_{n,n})^\top$ and $\boldsymbol{X}_n=(X_{1},\dots,X_{n})^\top$. Let $\boldsymbol \Sigma_n=\boldsymbol Q_n^\top\boldsymbol \Lambda_n \boldsymbol Q_n$ be an eigendecomposition of $\boldsymbol \Sigma_n$. Then

$$\boldsymbol{Y}_n^\top \boldsymbol{Y}_n=\boldsymbol{Z}_n^\top \boldsymbol \Lambda_n \boldsymbol{Z}_n$$

where $\boldsymbol{Z}_n:=\boldsymbol Q_n \boldsymbol{X}_n$ is a standard normal vector of uncorrelated, hence independent, random variables. I want to apply the CLT for triangular arrays to the normalized sum
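The identity above can be sanity-checked numerically. The following is only a minimal sketch: the helper name `quadratic_form_check`, the 5 by 5 test matrix, and the seed are arbitrary choices for illustration and are not part of the question.

```python
import numpy as np

def quadratic_form_check(sigma, rng):
    """Return (Y^T Y, Z^T Lambda Z) for one draw of X ~ N(0, I)."""
    n = sigma.shape[0]
    # eigh gives sigma = V diag(lam) V^T, so Q = V^T in the notation above
    lam, v = np.linalg.eigh(sigma)
    q = v.T
    sqrt_sigma = v @ np.diag(np.sqrt(lam)) @ v.T  # symmetric square root
    x = rng.standard_normal(n)
    y = sqrt_sigma @ x
    z = q @ x
    return y @ y, z @ (lam * z)

rng = np.random.default_rng(0)  # arbitrary seed
a = rng.standard_normal((5, 5))
sigma = a @ a.T + 5 * np.eye(5)  # arbitrary positive definite test matrix
lhs, rhs = quadratic_form_check(sigma, rng)
print(lhs, rhs)  # the two values should agree up to rounding error
```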

$$\xi_n:=\frac{1}{s_n}\sum_{i=1}^n (Y^2_{i,n}-\sigma_{ii,n})=\frac{1}{s_n}\sum_{i=1}^n \lambda_{i,n}(Z_{i,n}^2-1)$$

where $s^2_n=\operatorname{Var}(\sum_{i=1}^n Y^2_{i,n})=\operatorname{Var}(\sum_{i=1}^n \lambda_{i,n}Z_{i,n}^2)=2\sum_{i=1}^n \lambda^2_{i,n}$. Lyapunov's condition for $\delta=2$ is

$$\frac{E[(Z_{i,n}^2-1)^4]}{4} \frac{\sum_{i=1}^n \lambda^4_{i,n}}{(\sum_{i=1}^n \lambda^2_{i,n})^2}=15 \frac{Tr(\boldsymbol\Sigma_n^4)}{Tr(\boldsymbol\Sigma_n^2)^2} \to 0 \quad \text{as} \quad n\to \infty $$
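
For completeness, the constant $15$ can be traced as follows, using the even moments $E[Z^{2k}]=(2k-1)!!$ of a standard normal variable $Z$ and $s_n^4=4(\sum_{i=1}^n \lambda^2_{i,n})^2$:

$$E[(Z^2-1)^4]=E[Z^8]-4E[Z^6]+6E[Z^4]-4E[Z^2]+1=105-60+18-4+1=60, \qquad \frac{1}{s_n^4}\sum_{i=1}^n \lambda^4_{i,n}\,E[(Z_{i,n}^2-1)^4]=\frac{60}{4}\,\frac{\sum_{i=1}^n \lambda^4_{i,n}}{(\sum_{i=1}^n \lambda^2_{i,n})^2}.$$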

The specific example I am considering is the tridiagonal matrix

$$\boldsymbol\Sigma_n = \begin{bmatrix} 1 & 1/2 \\ 1/2 & 1 & 1/2 \\ & 1/2 & 1 & \ddots \\ & & \ddots & \ddots & \ddots \\ & & & \ddots & 1 & 1/2 \\ & & & & 1/2 & 1 & 1/2 \\ & & & & & 1/2 & 1 \end{bmatrix}$$

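whose eigenvalues are given by $\lambda_{i,n}=1+\cos(\frac{i\pi}{n+1})$, $i=1,\dots,n$. One checks that Lyapunov's condition above is satisfied in this case: the eigenvalues lie in $(0,2)$, so $Tr(\boldsymbol\Sigma_n^4)\leq 16n$, while $Tr(\boldsymbol\Sigma_n^2)=n+\frac{n-1}{2}$, and hence the ratio is $O(1/n)$.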
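
A quick numerical check of the eigenvalue formula and of the Lyapunov ratio might look like the sketch below. The function names `tridiag_sigma`, `closed_form_eigs` and `lyapunov_ratio` are hypothetical helpers introduced here, and the grid of sizes is arbitrary.

```python
import numpy as np

def tridiag_sigma(n):
    """n x n matrix with 1 on the diagonal and 1/2 on both off-diagonals."""
    return np.eye(n) + 0.5 * (np.eye(n, k=1) + np.eye(n, k=-1))

def closed_form_eigs(n):
    """Eigenvalues 1 + cos(i*pi/(n+1)), i = 1..n, as stated above."""
    i = np.arange(1, n + 1)
    return 1.0 + np.cos(i * np.pi / (n + 1))

def lyapunov_ratio(lam):
    """15 * Tr(Sigma^4) / Tr(Sigma^2)^2, written in terms of the eigenvalues."""
    return 15.0 * np.sum(lam**4) / np.sum(lam**2) ** 2

n = 200
numeric = np.sort(np.linalg.eigvalsh(tridiag_sigma(n)))
formula = np.sort(closed_form_eigs(n))
max_gap = np.max(np.abs(numeric - formula))
print("max |numeric - formula| =", max_gap)

# The ratio should shrink as n grows if Lyapunov's condition holds.
for m in (50, 200, 800, 3200):
    print(m, lyapunov_ratio(closed_form_eigs(m)))
```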

Simulations with $n=200$ show that the tails of the distribution of $\xi_n$ are not particularly well approximated by the normal distribution. On the other hand, the tails of the symmetrized statistic $\xi_n-\xi'_n$, where $\xi'_n$ is an independent copy of $\xi_n$, seem to be fitted much better by the normal distribution. Are there any theoretical reasons for this?
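
A simulation along these lines can be sketched as follows. This is not the original simulation code: the number of replications, the seed, the thresholds, and the rescaling of $\xi_n-\xi'_n$ by $\sqrt{2}$ (so that it has unit variance and can be compared with $N(0,1)$) are all assumptions made for illustration. The sample skewness is printed as one possible diagnostic.

```python
import numpy as np
from math import erfc, sqrt

def simulate_xi(n, reps, rng):
    """Draw `reps` copies of xi_n via the eigenvalue representation."""
    i = np.arange(1, n + 1)
    lam = 1.0 + np.cos(i * np.pi / (n + 1))
    s_n = np.sqrt(2.0 * np.sum(lam**2))
    z = rng.standard_normal((reps, n))
    return (z**2 - 1.0) @ lam / s_n

def skewness(x):
    """Sample skewness (third central moment over variance^(3/2))."""
    c = x - x.mean()
    return np.mean(c**3) / np.mean(c**2) ** 1.5

rng = np.random.default_rng(12345)    # arbitrary seed
n, reps = 200, 20000                  # reps is an assumed value
xi = simulate_xi(n, reps, rng)
xi_prime = simulate_xi(n, reps, rng)  # independent copies
sym = (xi - xi_prime) / np.sqrt(2.0)  # assumed rescaling to unit variance

# Empirical tail frequencies next to the standard normal tail probability.
for t in (2.0, 2.5, 3.0):
    normal_tail = 0.5 * erfc(t / sqrt(2.0))
    print(t, np.mean(xi > t), np.mean(xi < -t), np.mean(sym > t), normal_tail)
print("skewness:", skewness(xi), skewness(sym))
```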

Thanks a lot for your help.