Why do we have to standardize first?


I was working on a problem in which we assumed $Y\sim G(\bar{y}=2.014,s=4.3047)$ and had to find $1-P(-10<Y<10)$.

At first, I was simplifying it: $$\begin{align} 1-P(-10<Y<10)&=1-(P(Y<10)-P(Y<-10)) &&(1)\\ &=1-(P(Y<10)-1+P(Y<10))\\ &=2-2P(Y<10)&&(2)\end{align}$$

Then I would standardize. But to get the correct answer, I would have had to standardize after $(1)$.

But am I not working with the same equation? Why does it make a difference whether I standardize after equation $(1)$ or after equation $(2)$?

On BEST ANSWER

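$P(-10 < Y < 10) = P(Y < 10) - P(Y < -10).$ The mistake is in the second line of your derivation, which replaces $P(Y < -10)$ with $1 - P(Y < 10)$. That substitution would hold only if the distribution were symmetric about 0, but here the mean is about 2. The order of standardization is not the issue.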
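For reference, here is a sketch of the standardization carried out on line $(1)$, assuming the mean $2.04$ and standard deviation $4.3047$ used in the R calls in this answer. The $z$-values are rounded to three places, so table lookups may differ in the last digit. The last line shows which upper tail actually matches $P(Y < -10)$, by symmetry about the mean rather than about 0.

$$\begin{align} P(-10<Y<10) &= \Phi\!\left(\frac{10-2.04}{4.3047}\right)-\Phi\!\left(\frac{-10-2.04}{4.3047}\right)\\ &\approx \Phi(1.849)-\Phi(-2.797)\\ &\approx 0.9678-0.0026 = 0.9652,\\ P(Y<-10) &= P(Y>2(2.04)+10) = P(Y>14.08) \ne P(Y>10). \end{align}$$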

If $Y \sim \mathsf{Norm}(2.04, 4.3047),$ then you can find $Q = P(-10 < Y < 10) = 0.9652$ in R without standardizing, so that $1 - Q = 0.0348:$

q = diff(pnorm(c(-10,10), 2.04, 4.3047)); q
[1] 0.9652019
1 - q
[1] 0.03479811

In R, pnorm is a normal CDF. Using printed normal tables often involves some rounding error, so an answer from tables might be slightly different.
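If you do want to standardize, a minimal sketch of the check looks like this. It converts the two cutoffs to $z$-scores by hand and compares the result with the direct call. The variable names (`mu`, `sigma`, `z`, `by_hand`, `direct`) are only illustrative, and the mean 2.04 is the value used in the R calls in this answer.

```r
# Assumed parameters, taken from the pnorm calls in this answer
mu <- 2.04
sigma <- 4.3047

# Standardize the two cutoffs: z = (y - mu) / sigma
z <- (c(-10, 10) - mu) / sigma   # roughly -2.80 and 1.85

# Probability from the standard normal CDF (what a printed table gives)
by_hand <- diff(pnorm(z))

# Probability from pnorm with mean and sd supplied directly
direct <- diff(pnorm(c(-10, 10), mu, sigma))

# The two routes agree up to floating-point error
all.equal(by_hand, direct)
```

Both routes apply the same CDF to the same two cutoffs, so standardizing changes nothing as long as each cutoff is standardized separately.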

Also, we have $P(-10 < Y < 10) = P(Y < 10) - P(Y < -10) = 0.9652$ as follows:

pnorm(10, 2.04, 4.3047) - pnorm(-10, 2.04, 4.3047)
[1] 0.9652019 

Finally, $P(Y < -10) = 0.0026 \ne P(Y > 10) = 0.0322,$ so the two 'tail probabilities' are not equal.

pnorm(-10, 2.04, 4.3047)
[1] 0.002579433
1 - pnorm(10, 2.04, 4.3047)
[1] 0.03221868
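
To see the effect on the final answer, this small sketch compares the correct expression from line $(1)$ of the question with the shortcut $2 - 2P(Y<10)$ from line $(2)$. The names `correct` and `shortcut` are just labels for this illustration, and the parameters are the ones used in the calls above.

```r
# Assumed parameters, taken from the pnorm calls in this answer
mu <- 2.04
sigma <- 4.3047

# Line (1) of the question: 1 - (P(Y < 10) - P(Y < -10))
correct <- 1 - (pnorm(10, mu, sigma) - pnorm(-10, mu, sigma))

# Line (2) of the question: 2 - 2 P(Y < 10), i.e. twice the upper tail
shortcut <- 2 - 2 * pnorm(10, mu, sigma)

# correct is about 0.0348; shortcut is about 0.0644 (twice 0.0322)
c(correct = correct, shortcut = shortcut)
```

The shortcut counts the upper tail twice instead of adding the much smaller lower tail, which is why it overshoots.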

Here is a graph of the density function of $\mathsf{Norm}(2.04, 4.3047),$ with vertical red lines separating the areas of interest. The respective probabilities (left to right) are 0.0026, 0.9652, and 0.0322.

[Figure: density of $\mathsf{Norm}(2.04, 4.3047)$ with vertical red lines at $-10$ and $10$]
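
The plotting code for the figure is not shown. As a rough sketch, a similar picture could be drawn in base R as follows. The axis range, labels, and variable names are arbitrary choices for illustration, not the original code.

```r
# Assumed parameters, taken from the pnorm calls in this answer
mu <- 2.04
sigma <- 4.3047
cuts <- c(-10, 10)

# Density curve over a range wide enough to show both tails
curve(dnorm(x, mean = mu, sd = sigma), from = -15, to = 20,
      xlab = "y", ylab = "density")

# Vertical red lines separating the three areas of interest
abline(v = cuts, col = "red")

# Areas left of -10, between the cuts, and right of 10
areas <- diff(pnorm(c(-Inf, cuts, Inf), mu, sigma))
round(areas, 4)
```

The three areas should match the probabilities quoted above (0.0026, 0.9652, 0.0322) and sum to 1.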

Note: I can't show you exactly how to get these answers from printed tables of the standard normal distribution because these tables are made in many different formats. I think printed normal tables may soon suffer richly deserved obsolescence.