Weak LLN for U-Statistics


Question

Let $(X_n)_{n\geq 1}$ be an i.i.d. sequence of random variables such that $EX_1=\mu$ and $\sigma^2=\text{Var}(X_1)<\infty$. Then $$ U_n=\binom{n}{2}^{-1}\sum_{1\leq i<j\leq n} X_iX_j\to\mu^2 $$ in probability as $n\to \infty$.

My attempt

I was able to prove this claim in the case $\mu=0$. Indeed, in that case $EU_n=0$ and $\text{Var}(U_n)=\binom{n}{2}^{-1}\sigma^4$, so Chebyshev's inequality gives $$ P(|U_n|>\varepsilon)\leq \frac{n(n-1)\sigma^4}{2\varepsilon^2\binom{n}{2}^2}\to 0 $$ as $n\to \infty$ for each $\varepsilon>0$.

Problem

I am having difficulty proving the claim when the random variables are not centered. I tried to apply Chebyshev's inequality in general, since $$ EU_n=\frac{2}{n(n-1)}\frac{n(n-1)}{2}\mu^2=2\mu^2 $$ but I am unable to compute the variance of $U_n$ fully. I know that $$ \begin{align} \text{Var}(U_n)&=\binom{n}{2}^{-2}\text{Cov}\left(\sum_{1\leq i<j\leq n }X_i X_j, \sum_{1\leq i<j\leq n} X_i X_j\right)\\ &=\binom{n}{2}^{-2}\left(\binom{n}{2}(2\mu^{2}\sigma^{2}+\sigma^4)+\dotsb\right). \end{align} $$ The term written out in the sum collects the covariances of the form $\text{Cov}(X_iX_j, X_iX_j)$. All covariances of the form $$\text{Cov}(X_iX_j, X_kX_l)$$ with $i,j,k, l$ distinct contribute zero. I am having trouble counting and computing the covariances of the form $$\text{Cov}(X_iX_j, X_kX_l)$$ where exactly one of $k, l$ equals one of $i, j$. Any help is appreciated.


There are 2 best solutions below

0
On BEST ANSWER

For the expectation, it should be $$ EU_n=\frac{2}{n(n-1)}\frac{n(n-1)}{2}\mu^2=\mu^2 $$ rather than $2\mu^2$.

We first compute

\begin{align} Cov(X_1X_2,X_1X_2) &=E(X_1^2X_2^2)-E(X_1X_2)^2\\ &=(\sigma^2+\mu^2)^2 - \mu^4\\ &= \sigma^4+2\sigma^2\mu^2 \end{align}

We also have

\begin{align} Cov(X_1X_2,X_2X_3) &=E(X_1X_2^2X_3)-E(X_1X_2)E(X_2X_3)\\ &=\mu^2(\sigma^2+\mu^2) - \mu^4\\ &= \sigma^2\mu^2 \end{align}

and

\begin{align} Cov(X_1X_2,X_3X_4) &=E(X_1X_2X_3X_4)-E(X_1X_2)E(X_3X_4)\\ &=0 \end{align}

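Hence we only need the terms whose index pairs overlap. Expanding $\text{Var}\left(\sum_{i<j}X_iX_j\right)$ as a double sum over ordered pairs of index pairs, there are $\binom{n}{2}$ terms of the first kind (identical index pairs) and $2n \cdot \binom{n-1}2 = n(n-1)(n-2)$ terms of the second kind (exactly one shared index): choose the shared index in $n$ ways, then the two remaining indices, in order, in $(n-1)(n-2)$ ways. By Chebyshev's inequality,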
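The two counts can be checked by brute force for small $n$. The following is only a sanity-check sketch; the helper name `count_overlaps` is made up for this illustration.

```python
from itertools import combinations
from math import comb


def count_overlaps(n):
    """Classify all ordered pairs (p, q) of index pairs {i, j} from {0, ..., n-1}
    by how many indices p and q share. Returns (same, one_shared, disjoint)."""
    pairs = list(combinations(range(n), 2))
    same = one_shared = disjoint = 0
    for p in pairs:
        for q in pairs:
            shared = len(set(p) & set(q))
            if shared == 2:
                same += 1          # Cov(X_i X_j, X_i X_j)
            elif shared == 1:
                one_shared += 1    # Cov(X_i X_j, X_j X_k)
            else:
                disjoint += 1      # covariance is zero
    return same, one_shared, disjoint


for n in range(3, 9):
    same, one_shared, disjoint = count_overlaps(n)
    assert same == comb(n, 2)
    assert one_shared == 2 * n * comb(n - 1, 2)
    assert same + one_shared + disjoint == comb(n, 2) ** 2
```

The last assertion confirms that the three cases exhaust all $\binom{n}{2}^2$ terms of the double sum.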

$$Pr(|U_n-\mu^2|> \epsilon) \le \frac{\binom{n}2(\sigma^4+2\sigma^2\mu^2) + 2n \cdot \binom{n-1}2(\sigma^2\mu^2)}{\binom{n}{2}^2\epsilon^2}$$

Notice that the denominator is of order $n^4$ while the numerator is of order $n^3$, so the bound is $O(1/n)$ and tends to $0$ for every fixed $\epsilon>0$.
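To make the orders explicit, one can substitute $\binom{n}{2}=\frac{n(n-1)}{2}$ and $2n\binom{n-1}{2}=n(n-1)(n-2)$ into the bound displayed above, which gives

$$\Pr(|U_n-\mu^2|>\epsilon)\le\frac{\text{Var}(U_n)}{\epsilon^2}=\frac{1}{\epsilon^2}\left(\frac{2(\sigma^4+2\sigma^2\mu^2)}{n(n-1)}+\frac{4(n-2)\sigma^2\mu^2}{n(n-1)}\right).$$

The first term is of order $n^{-2}$ and the second is of order $n^{-1}$. For $\mu=0$ the second term vanishes and the bound reduces to $\frac{2\sigma^4}{n(n-1)\epsilon^2}$, which agrees with the bound obtained in the question for the centered case.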

Hence $U_n$ converges in probability to $\mu^2$.
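As a sanity check of the variance computation, here is a small sketch that enumerates every outcome exactly. It assumes, purely for illustration, that each $X_i$ is a fair coin on $\{0,1\}$ (so $\mu=1/2$, $\sigma^2=1/4$); the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb


def u_statistic(xs):
    """U_n = average of X_i * X_j over all pairs i < j (integer inputs)."""
    return Fraction(sum(a * b for a, b in combinations(xs, 2)), comb(len(xs), 2))


def exact_moments(n, values=(0, 1)):
    """Exact mean and variance of U_n when each X_i is uniform on `values`."""
    outcomes = list(product(values, repeat=n))
    p = Fraction(1, len(outcomes))  # every outcome is equally likely
    mean = sum(p * u_statistic(xs) for xs in outcomes)
    second = sum(p * u_statistic(xs) ** 2 for xs in outcomes)
    return mean, second - mean ** 2


def variance_formula(n, mu, var):
    """Var(U_n) assembled from the two covariance cases and their counts."""
    pairs = comb(n, 2)
    same = pairs * (var ** 2 + 2 * var * mu ** 2)           # identical index pairs
    one_shared = 2 * n * comb(n - 1, 2) * var * mu ** 2     # one shared index
    return Fraction(same + one_shared, pairs ** 2)


mu, var = Fraction(1, 2), Fraction(1, 4)  # fair coin on {0, 1}
for n in range(2, 9):
    mean, variance = exact_moments(n)
    assert mean == mu ** 2
    assert variance == variance_formula(n, mu, var)
    print(n, float(variance))
```

Exact rational arithmetic is used so that the comparison is an equality rather than a tolerance check.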

0
On

A different solution to the original problem suggests itself, in addition to the brilliant answer from Siong Thye Goh. It does not use Chebyshev's inequality, only the LLN and properties of convergence in probability.

The LLN, together with the continuity of $x\mapsto x^2$, implies that $$ \left(\frac{X_1+\ldots+X_n}{n}\right)^2 \xrightarrow{p} \mu^2. $$ Moreover, since $2\sum_{1\leq i<j\leq n}X_iX_j=n(n-1)U_n$, $$ \left(\frac{X_1+\ldots+X_n}{n}\right)^2 = \frac{X_1^2+\ldots+X_n^2}{n^2} + U_n \frac{n-1}{n}. $$ The first summand on the right-hand side tends to zero in probability, by the LLN applied to the $X_i^2$, which have finite mean $\mu^2+\sigma^2$: $$ \frac{X_1^2+\ldots+X_n^2}{n^2} =\frac1n \cdot \frac{X_1^2+\ldots+X_n^2}{n} \xrightarrow{p} 0 \cdot (\mu^2+ \sigma^2) = 0. $$ Therefore, $$ U_n = \frac{n}{n-1}\left(\left(\frac{X_1+\ldots+X_n}{n}\right)^2 - \frac{X_1^2+\ldots+X_n^2}{n^2}\right)\xrightarrow{p} 1\cdot (\mu^2-0)=\mu^2. $$
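The algebraic identity behind this argument can be checked numerically, and it also gives an $O(n)$ way to evaluate $U_n$. The sketch below is illustrative only: the function names, the seed, and the normal distribution with $\mu=1.5$, $\sigma=2$ are assumptions, not part of the problem.

```python
import random
from fractions import Fraction
from itertools import combinations
from math import comb


def u_direct(xs):
    """U_n from the definition: average of X_i * X_j over pairs i < j."""
    return sum(a * b for a, b in combinations(xs, 2)) / comb(len(xs), 2)


def u_via_sums(xs):
    """U_n from the identity U_n = (S^2 - Q) / (n (n - 1)),
    where S is the sum of the X_i and Q is the sum of the X_i^2."""
    n = len(xs)
    s = sum(xs)
    q = sum(x * x for x in xs)
    return (s * s - q) / (n * (n - 1))


rng = random.Random(0)

# Exact check of the identity on random integer samples (as Fractions).
for _ in range(200):
    xs = [Fraction(rng.randint(-5, 5)) for _ in range(rng.randint(2, 12))]
    assert u_direct(xs) == u_via_sums(xs)

# Illustration of the convergence: U_n should approach mu^2 = 2.25 as n grows.
mu, sigma = 1.5, 2.0
for n in (10, 1_000, 100_000):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    print(n, u_via_sums(sample))
```

The sum-based form avoids the $\binom{n}{2}$ products of the definition, which is why it is used for the larger sample sizes.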