Approximating using the normal distribution


Assume all $X_i$'s are independent and follow the same distribution with mean $u$ and variance $v$. If there are $n$ of them, where $n$ is large, the distribution of $$Y= X_1+X_2+\dots+X_n = nX$$ is approximately normal. The mean of $Y$ will be $n\cdot u$ and its variance will be $n\cdot v$.

What confuses me is this: since $\operatorname{var}(nX)=n^2\cdot\operatorname{var}(X)$, why don't we follow that rule in this approximation, and instead multiply the variance only by $n$?

There are 3 answers below.

Accepted answer

\begin{align} \operatorname{var}(X_1+\cdots +X_n) = {} & \operatorname{var}(X_1)+\cdots + \operatorname{var} (X_n) \\ & \text{if } X_1,\ldots,X_n \text{ are independent,} \\[10pt] = {} & n\operatorname{var}(X_1) \\ & \text{if all of the variances are equal.} \\[10pt] \operatorname{var}(X_1+\cdots+X_1) = {} & \operatorname{var}(nX_1) \\[10pt] = {} & n^2 \operatorname{var}(X_1). \\ & \text{In this case } X_1,\ldots, X_n \text{ are not only} \\ & \text{not independent, but are all the same.} \end{align}
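The distinction above is easy to check numerically. The sketch below (using arbitrary choices of $n=30$, mean $1$, and variance $4$) estimates both variances by Monte Carlo: summing $n$ independent draws gives a variance near $n\cdot v$, while scaling a single draw by $n$ gives a variance near $n^2\cdot v$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 30, 200_000
# Each X_i ~ Normal(mean=1, sd=2), so v = var(X_i) = 4 (arbitrary choices)
X = rng.normal(loc=1.0, scale=2.0, size=(trials, n))

# Sum of n INDEPENDENT copies: variance should be about n*v = 120
var_sum = X.sum(axis=1).var()

# One copy scaled by n (i.e. n perfectly dependent copies):
# variance should be about n^2*v = 3600
var_scaled = (n * X[:, 0]).var()

print(var_sum)     # close to 120
print(var_scaled)  # close to 3600
```

The gap between the two estimates is exactly the asker's confusion: $nX_1$ is a sum of $n$ identical (hence maximally dependent) copies, not of $n$ independent ones.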

Answer

It is not true that $X_1+\dots+X_n=nX$ (with $X$ a random variable having the same distribution as the $X_i$), as you seem to suggest.

The LHS is a sum of independent random variables.

Then $$\operatorname{\mathsf{Var}}(X_1+\cdots+X_n)=\operatorname{\mathsf{Var}}X_1+\cdots+\operatorname{\mathsf{Var}} X_n = n\operatorname{\mathsf{Var}}X_1$$

The first equality holds because the $X_i$ are independent.

The second equality holds because the $X_i$ are identically distributed, hence have equal variances.

The RHS can be viewed as a sum of random variables that are dependent in the strongest possible way: $X_1=\cdots=X_n=X$.

Then $$\operatorname{\mathsf{Var}}(X_1+\cdots+X_n)=\operatorname{\mathsf{Var}}(nX_1) = n^2\operatorname{\mathsf{Var}}X_1$$

Answer

You're a little confused here; $nX_1$ has variance $n^2\operatorname{Var}X_1$, but $\sum_i X_i$ has variance $n\operatorname{Var}X_1$. Why? Because the covariance $\operatorname{Cov}(X,\,Y)=\overline{(X-\overline{X})(Y-\overline{Y})}$ is linear in each argument, and $\operatorname{Var}Y=\operatorname{Cov}(Y,\,Y)$. So $$\operatorname{Var} nX_1 = \operatorname{Cov}(nX_1,\,nX_1)=n^2 \operatorname{Cov}(X_1,\,X_1)=n^2\operatorname{Var} X_1,$$while if the $X_i$ are pairwise uncorrelated we have $$\operatorname{Var} \sum_i X_i=\operatorname{Cov} \Big(\sum_i X_i,\,\sum_j X_j\Big)=\sum_i\operatorname{Cov}(X_i,\,X_i)=\sum_i\operatorname{Var}X_i=n\operatorname{Var}X_1.$$
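The bilinearity being used here can also be verified on sample covariances. This is a minimal sketch, with arbitrary constants $a$, $b$, $n$ and a simple hand-rolled `cov` helper (an assumption of this sketch, not part of the answer above): pulling a constant out of either argument of the covariance scales it by that constant, so pulling $n$ out of both arguments of $\operatorname{Cov}(nX,\,nX)$ gives the factor $n^2$.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100_000)
y = rng.normal(size=100_000)

def cov(u, w):
    # sample covariance: mean of (u - mean(u)) * (w - mean(w))
    return np.mean((u - u.mean()) * (w - w.mean()))

# linearity in each argument: Cov(a*x, b*y) = a*b*Cov(x, y)
a, b = 3.0, 5.0
lhs = cov(a * x, b * y)
rhs = a * b * cov(x, y)
print(abs(lhs - rhs))  # tiny (floating-point error only)

# hence Var(n*x) = Cov(n*x, n*x) = n^2 * Cov(x, x) = n^2 * Var(x)
n = 7
print(np.isclose(cov(n * x, n * x), n**2 * cov(x, x)))
```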