Finding the probability that a sum of squared z-scores is less than a certain value


Let $X_i, i = 1,\dots,10$ denote a random sample of size $n = 10$ from a distribution which is $N(\mu, \sigma^2)$. Find:

  1. $P[\sum_{i=1}^{10} \frac{(X_i-\mu)^2}{\sigma^2} < 18.31]$
  2. $P[\sum_{i=1}^{10} \frac{(X_i-\bar{X})^2}{\sigma^2} < 18.31]$

I'm not sure exactly how to approach this problem as no other information is given. I can see that the top probability is a summation of squared z-scores but I really do not know if that helps me at all.


In (1), clearly $Z_i = \frac{X_i -\mu}{\sigma} \sim \mathsf{Norm}(0, 1).$ Then you can use integration or moment generating functions (MGFs) to show that $Z_i^2 \sim \mathsf{Chisq}(df = 1).$ Because the $Z_i$ are independent, an argument with MGFs also shows that $Q = \sum_{i=1}^{10}Z_i^2 \sim \mathsf{Chisq}(df=10).$ (Both arguments are straightforward and you can do them whether or not your professor has already mentioned this use of chi-squared distributions.) Thus, using printed chi-squared tables or software, you can find that $P(Q < 18.31) = .95,$ as @Karl has commented.

pchisq(18.31, 10)
## 0.9500458
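
As a sketch of the MGF argument for (1), valid for $t < 1/2$ and relying on the independence of the $Z_i$:

$$M_{Z_i^2}(t) = E\left[e^{tZ_i^2}\right] = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-(1-2t)z^2/2}\,dz = (1-2t)^{-1/2},$$

$$M_{\sum_{i=1}^{10} Z_i^2}(t) = \prod_{i=1}^{10} M_{Z_i^2}(t) = (1-2t)^{-10/2} = (1-2t)^{-5}.$$

These are the MGFs of $\mathsf{Chisq}(df=1)$ and $\mathsf{Chisq}(df=10),$ respectively, so the distributions follow from the uniqueness of MGFs.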

For (2), there is a standard theorem that for a random sample $X_1, X_2, \dots, X_n$ from $\mathsf{Norm}(\mu, \sigma),$ the sample variance $S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar X)^2$ has $\frac{(n-1)S^2}{\sigma^2} \sim \mathsf{Chisq}(df= n-1).$ Because you have not given clues about the context of this problem, I don't know if you are expected to use this result to solve (2). In any case, the degrees of freedom in (2) must be $n - 1 = 9,$ which gives a probability of about $0.968.$
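
One way to see where the lost degree of freedom goes is the following decomposition. This is only a sketch: it takes for granted the standard fact that $\bar X$ and $S^2$ are independent for normal data, which is not proved here.

$$\sum_{i=1}^{n}\frac{(X_i-\mu)^2}{\sigma^2} = \sum_{i=1}^{n}\frac{(X_i-\bar X)^2}{\sigma^2} + \frac{n(\bar X-\mu)^2}{\sigma^2}.$$

The left side is $\mathsf{Chisq}(df=n)$ as in (1), and the last term is the square of the standard normal variable $\sqrt{n}(\bar X - \mu)/\sigma,$ so it is $\mathsf{Chisq}(df=1).$ With independence of the two terms on the right, the MGFs multiply, and for $t < 1/2$:

$$(1-2t)^{-n/2} = M_{(n-1)S^2/\sigma^2}(t)\,(1-2t)^{-1/2} \quad\Longrightarrow\quad M_{(n-1)S^2/\sigma^2}(t) = (1-2t)^{-(n-1)/2},$$

which is the MGF of $\mathsf{Chisq}(df=n-1).$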


Simulation note: The following simulation in R statistical software illustrates (2) with $\mu = 100, \sigma = 15,$ and $Q = (n-1)S^2/\sigma^2 = \sum_{i=1}^{10}\frac{(X_i - \bar X)^2}{\sigma^2}.$ The histogram of $Q$'s from 100,000 samples of size $n=10$ is well matched by the density of $\mathsf{Chisq}(9)$ (solid purple curve), but not by the density of $\mathsf{Chisq}(10)$ (dotted brown curve). The dashed red vertical line marks the cutoff $18.31.$

set.seed(422);  m = 10^5;  n = 10;  mu = 100;  sg = 15
v = replicate(m, var(rnorm(n, mu, sg)))

q = (n-1)*v/sg^2;  mean(q);  sd(q)
## 9.009462  # aprx E(Q) = 9
## 4.252015  # aprx SD(Q) = sqrt(18) = 4.2426

mean(q < 18.31);  pchisq(18.31, 9)
## 0.96788    # aprx P(Q < 18.31)
## 0.9682574  # exact P(Q < 18.31)

mh = "Simulated Values of Q with CHISQ(9) Density"
hist(q, prob=T, br=20, col="skyblue2", main = mh)
curve(dchisq(x, 9), add=T, lwd=2, col="purple")
curve(dchisq(x, 10), add=T, lwd=2, col="brown", lty="dotted")
abline(v = 18.31, lwd=2, col="red", lty="dashed")
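
A similar simulation can be used to check (1), where the known mean $\mu$ is used instead of $\bar X.$ This is only a sketch: the seed and the variable name `q1` are arbitrary choices for illustration, and the values of `mu` and `sg` are the same illustrative ones used for (2).

```r
# Simulation check for (1): sum of squared z-scores using the known mean mu.
set.seed(2024)                         # arbitrary seed, for reproducibility only
m <- 10^5;  n <- 10;  mu <- 100;  sg <- 15

# Each replicate draws a sample of size n and sums its squared z-scores.
q1 <- replicate(m, sum(((rnorm(n, mu, sg) - mu) / sg)^2))

mean(q1)             # should be close to E(Q) = 10 for Chisq(10)
mean(q1 < 18.31)     # simulated P(Q < 18.31)
pchisq(18.31, 10)    # exact value under Chisq(10), for comparison
```

If the simulated proportion agrees with `pchisq(18.31, 10)` to two or three decimal places, that supports using 10 degrees of freedom in (1), in contrast to the 9 degrees of freedom in (2).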
