Proof of Central Limit Theorem via Fourier Transform: Case of non-zero mean and non-unit variance


I apologize in advance for the informal language and non-rigorous presentation.

I don't have a formal background in probability theory, but I am familiar with Fourier transforms (FT). When I learned that the PDF of the sum of two IID random variables (rvs) is the convolution of their individual PDFs, it gave me the idea of proving the CLT via the FT, by converting repeated convolution in the time domain into a product of Fourier transforms in the frequency domain. I've since learned that the idea is (of course) not new, and that there is a standard device called the "characteristic function," which is essentially the FT of a PDF up to a sign flip.

Nevertheless, I want to fill in the gaps in the proof, but find myself stuck. I found an informal proof on page 116 of these course notes which is close to the approach I was following. However, that proof assumes standardized rvs (i.e., zero mean and unit variance), which eliminates some tricky terms (see below). I don't want to standardize my rvs; I want to prove the statement explicitly for arbitrary mean and variance.

The informally stated CLT I'm working with is as follows: if $X_i$ are IID rvs with mean $\mu$ and variance $\sigma^2$, and $$S_n = \frac{1}{n}\sum_{i=1}^n X_i,$$ then, as $n\to\infty$, the PDF of $S_n$ tends to a Gaussian with mean $\mu$ and variance $\sigma^2/n$: $$p_{S_n}(x) \to \sqrt{\frac{n}{2\pi\sigma^2}}\,e^{-n(x-\mu)^2/(2\sigma^2)}$$
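As a quick numerical sanity check of this statement (a Monte Carlo sketch, not part of any proof, using Exp(1) variables for which $\mu = 1$ and $\sigma^2 = 1$):

```python
import random
import statistics

random.seed(0)

n = 50                  # number of summands in each sample mean
trials = 20_000         # Monte Carlo repetitions
mu, sigma2 = 1.0, 1.0   # Exp(1) has mean 1 and variance 1

# Draw `trials` realizations of S_n = (1/n) * sum(X_i) with X_i ~ Exp(1)
means = [sum(random.expovariate(1.0) for _ in range(n)) / n
         for _ in range(trials)]

print(statistics.mean(means))      # expect approximately mu = 1
print(statistics.variance(means))  # expect approximately sigma2 / n = 0.02
```

The empirical mean and variance of $S_n$ land near $\mu$ and $\sigma^2/n$, as the statement predicts, though this by itself says nothing about the Gaussian shape.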

I realize I may be a bit off in some factors, but I can adjust for these later. The following are my steps, as far as I could get.

Define $p(x)$ as the PDF of $X$, $P(X\leq x) = \int_{-\infty}^x p(t) dt$ as the CDF of $X$ and $F_p(\omega) = \int_{-\infty}^\infty e^{-i\omega t} p(t) dt$ as the Fourier transform of the PDF of $X$.

Then for any scalar $\alpha > 0$, the PDF of $X/\alpha$ is $\alpha p(\alpha x)$. In particular, for $\alpha = n$, the PDF of $\frac{X}{n}$ is $n p(nx)$. Going to the Fourier domain: if the FT of $p(x)$ is $F(\omega)$, then by the scaling property of the Fourier transform, the FT of $n p(nx)$ is $F(\omega/n)$. Before going further, let's expand $F(\omega/n)$: $$ \begin{align} F(\omega/n) &= \int_{-\infty}^{\infty}e^{-i\omega t/n}p(t)\,dt\\ &= \int_{-\infty}^{\infty}\left(1 - \frac{i\omega t}{n} - \frac{\omega^2t^2}{2n^2} + \cdots\right) p(t)\,dt\\ &= \int_{-\infty}^{\infty} p(t)\,dt - \frac{i\omega}{n}\int_{-\infty}^{\infty} tp(t)\,dt - \frac{\omega^2}{2n^2}\int_{-\infty}^{\infty}t^2p(t)\,dt + \cdots\\ &= 1 - \frac{i\omega\mu}{n} - \frac{\omega^2}{2n^2}(\sigma^2 + \mu^2) + O(1/n^3) \end{align} $$ In the above, we used the normalization condition (first term), $E[t]=\int t p(t)\,dt = \mu$ (second term), and $E[t^2] = \int t^2 p(t)\,dt = \sigma^2 + \mu^2$ (third term).
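This truncation can be checked against a distribution whose FT is known in closed form. A sketch, assuming $p(t) = e^{-t}$ for $t \ge 0$ (the Exp(1) density, with $\mu = \sigma^2 = 1$), whose FT under the convention above is $F(\omega) = 1/(1+i\omega)$; the truncation error should shrink like $1/n^3$:

```python
mu, sigma2 = 1.0, 1.0   # Exp(1): mean 1, variance 1
w = 1.0                 # a fixed frequency

def F_exact(w):
    # FT of p(t) = exp(-t), t >= 0, under the e^{-i w t} convention
    return 1.0 / (1.0 + 1j * w)

def F_approx(w, n):
    # truncated expansion: 1 - i*mu*w/n - w^2*(sigma^2 + mu^2)/(2 n^2)
    return 1 - 1j * mu * w / n - w**2 * (sigma2 + mu**2) / (2 * n**2)

err10 = abs(F_exact(w / 10) - F_approx(w, 10))
err20 = abs(F_exact(w / 20) - F_approx(w, 20))
print(err10, err20, err10 / err20)  # ratio should be near 2^3 = 8
```

Doubling $n$ cuts the error by roughly a factor of $8$, consistent with the remainder being $O(1/n^3)$.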

Now $S_n = \sum_{i=1}^n \frac{X_i}{n}$, which is a sum of $n$ IID rvs, so its PDF is the $n$-fold convolution of $np(nx)$ with itself. In the Fourier domain, this is just $[F(\omega/n)]^n$. But I'm having difficulty taking the limit as $n\to\infty$ because of the $n$ and $n^2$ terms in the denominators.
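The convolution-to-product step itself is easy to sanity-check by Monte Carlo. A sketch, again assuming Exp(1) variables, comparing an empirical estimate of $E[e^{-i\omega(X_1+X_2)}]$ against $F(\omega)^2$ with $F(\omega) = 1/(1+i\omega)$:

```python
import cmath
import random

random.seed(1)
w = 1.0
N = 100_000

F = 1.0 / (1.0 + 1j * w)  # exact FT of the Exp(1) density

# Monte Carlo estimate of E[exp(-i w (X1 + X2))] for independent X1, X2 ~ Exp(1)
acc = 0j
for _ in range(N):
    s = random.expovariate(1.0) + random.expovariate(1.0)
    acc += cmath.exp(-1j * w * s)
estimate = acc / N

print(abs(estimate - F**2))  # small: the FT of the sum's PDF is F(w)^2
```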

Standardizing to zero mean and unit variance, and showing $W_n = \sum \frac{X_i - \mu}{\sigma\sqrt{n}} \to N(0, 1)$, works as demonstrated in the course notes referenced above. However, without standardizing, showing $S_n = \frac{1}{n}\sum X_i \to N(\mu, \sigma^2/n)$ does not seem to work, because of the heterogeneous powers of $n$ in the approximated Fourier transform. There is also an inherent difficulty in "retaining" $n$ in the final answer when we are taking the limit $n\to \infty$. Yet this version of the CLT is quite common, and I have seen it used often in hypothesis testing.

I'd highly appreciate being pointed to a conceptual flaw or an error in my steps. Thanks.

Best Answer

I think your deduction is correct up to the last step. The goal is to quantify how close $[F(\omega/n)]^{n}$ is to the Fourier transform of $N(\mu,\sigma^2/n)$, rather than to prove their equality when $n$ becomes large: the two are asymptotically equal, never identical. An intuitive line of reasoning:

  • Compute the FT of $N(\mu,\sigma^2/n)$; the result is $$e^{-\frac{\sigma^2 \omega^{2}}{2 n}-i\mu\omega}.$$
  • Its $n$-th root is $$e^{-\frac{\sigma^2 \omega^{2}}{2 n^{2}}-\frac{i\mu\omega}{n}}.$$
  • Its Taylor series is $$1-\frac{\sigma^2 \omega^{2}}{2 n^{2}}-\frac{i\mu\omega}{n}-\frac{\mu^2 \omega^2}{2n^2}+\frac{i \sigma^2 \mu \omega^3}{2n^3}+\frac{\sigma^4 \omega^4}{8 n^4}+\dotsm$$
  • The first four terms of this series are the same as those of $F(\omega/n)$, since $-\frac{\sigma^2\omega^2}{2n^2}-\frac{\mu^2\omega^2}{2n^2} = -\frac{\omega^2}{2n^2}(\sigma^2+\mu^2)$. The respective third-order terms are $\frac{i \omega^3}{6n^3}(\mu^3+3\mu\sigma^2)$ [where $\mu^3+3\mu\sigma^2$ is the third moment of $N(\mu,\sigma^2)$] and $\frac{i \omega^3}{6n^3}\int_{-\infty}^{\infty}t^3p(t)\,dt$. As $n$ grows, the coefficients $\frac{\mu^3+3\mu\sigma^2}{6n^3}$ and $\frac{1}{6n^3}\int_{-\infty}^{\infty}t^3p(t)\,dt$ become small, shrinking the numerical difference; higher-order terms shrink faster still. A stricter deduction is $$F(\omega/n)=1-\frac{i\mu\omega}{n}-\frac{\omega^2}{2n^2}(\mu^2+\sigma^2)+O\!\left(\frac{1}{n^3}\right),$$ so, writing $a = -\frac{i\mu\omega}{n}-\frac{\omega^2(\mu^2+\sigma^2)}{2n^2}$, $$n\log F(\omega/n) = n\left(a - \frac{a^2}{2} + \cdots\right) = -i\mu\omega-\frac{\omega^2\sigma^2}{2n}+O\!\left(\frac{1}{n^2}\right),$$ where the $-\frac{na^2}{2} \approx \frac{\mu^2\omega^2}{2n}$ piece cancels the $\mu^2$ contribution. Hence $$[F(\omega/n)]^n \approx e^{-i\mu\omega-\frac{\omega^2\sigma^2}{2n}}$$ for large $n$. So we conclude roughly that:

    1. A prerequisite of the IID CLT is that the expansion of $F(\omega/n)$ exists, i.e., the moments of $p(x)$ must be finite. The Cauchy distribution, which has no finite mean or variance, is therefore an exception.
    2. If $p(x)$ deviates greatly from a Gaussian distribution, then $n$ needs to be larger to overcome the greater discrepancy in the third and higher-order moments ($\frac{\mu^3+3\mu\sigma^2}{6n^3}$ vs. $\frac{1}{6n^3}\int_{-\infty}^{\infty}t^3p(t)\,dt$, $\dotsm$). For example, following section 5.11 of the book "Bayesian Logical Data Analysis for the Physical Sciences": suppose $p(x)$ is $U(0,1)$, so $\mu=0.5$ and $\sigma^2=\frac{1}{12}$. Its third raw moment is $0.25$, and the third raw moment of the corresponding $N(\mu,\sigma^2)$ is also $0.25$. The fourth moments are $0.2$ and $0.2083$, the fifth are $\frac{1}{6}\approx 0.1667$ and $0.1875$, the sixth are $\frac{1}{7}\approx 0.1429$ and $0.1806$, $\dotsm$. Already at $n=4$, the distribution of the sum looks close to Gaussian.
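The central approximation, $[F(\omega/n)]^n \approx e^{-i\mu\omega - \omega^2\sigma^2/(2n)}$, can also be checked numerically. A sketch assuming Exp(1) variables ($\mu=\sigma^2=1$), for which $F(\omega) = 1/(1+i\omega)$ is known exactly; the neglected third-order term suggests the discrepancy should decay like $1/n^2$:

```python
import cmath

mu, sigma2, w = 1.0, 1.0, 1.0

def lhs(n):
    # [F(w/n)]^n with F(w) = 1/(1 + i w), the FT of the Exp(1) density
    return (1.0 / (1.0 + 1j * w / n)) ** n

def rhs(n):
    # FT of N(mu, sigma2/n): exp(-i mu w - sigma2 w^2 / (2 n))
    return cmath.exp(-1j * mu * w - sigma2 * w**2 / (2 * n))

err100 = abs(lhs(100) - rhs(100))
err200 = abs(lhs(200) - rhs(200))
print(err100, err200, err100 / err200)  # ratio near 4 => error is O(1/n^2)
```

Doubling $n$ cuts the discrepancy by roughly a factor of $4$, so the two characteristic functions are asymptotically equal but never identical, exactly as argued above.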