To find $P(a < X < b)$ for $X \sim N(\mu, \sigma^2)$, why do we use the $Z$-table instead of integrating the PDF?

I want to understand deeply what the $PDF$ and $CDF$ mean. For a binomial distribution with $n$ trials and parameter $p$, we use the probability mass function (the discrete counterpart of the $PDF$) to find, say, $P(X = 3)$, where $X$ is a random variable representing the number of heads in $10$ trials.

For the continuous case, the $PDF$ is used to find $P[a\leq X\leq b]$ as $\int _{a}^{b}f_{X}(x)\,dx$, where $f_{X}(x)$ is the $PDF$.

So I saw a problem: $X$ is a normally distributed variable with mean $\mu = 30$ and variance $\sigma^2 = 4$ (so standard deviation $\sigma = 2$). Find $P(30 < X < 35)$.

I know how to use the $Z$-table, but why do we not find the answer using: $\int _{30}^{35}{\displaystyle {\frac {1}{\sqrt {2\pi \sigma ^{2}}}}e^{-{\frac {(x-\mu )^{2}}{2\sigma ^{2}}}}}\,dx$? Or do we almost always use the $PDF$ approach to solve such problems, and I've just never encountered it because the $Z$-table approach makes it a lot easier? Or does this have something to do with the fact that integrating this particular $PDF$ for the normal distribution is hard? I don't have an extremely deep background in mathematics, so I don't know.

Also I apologize for usage of any wrong notations and symbols.

There are 4 best solutions below

0
On BEST ANSWER

I know how to use the $Z$-table, but why do we not find the answer using: $\int _{30}^{35}{\displaystyle {\frac {1}{\sqrt {2\pi \sigma ^{2}}}}e^{-{\frac {(x-\mu )^{2}}{2\sigma ^{2}}}}}\,dx$?

Well, have you ever tried finding the integral?   Look at it.   It is rather horrible, isn't it?   It turns out that:

$${\displaystyle \int _{30}^{35}{\frac {1}{\sqrt {2\pi \sigma ^{2}}}}e^{-{\frac {(x-\mu )^{2}}{2\sigma ^{2}}}}}\,\mathsf dx =\dfrac 12\operatorname{erf}\left(\dfrac{5}{2\surd 2}\right) $$

But what the heck is this error function thing?   Well, it is defined as: $$\operatorname{erf}(x)=\dfrac{1}{\surd\pi}\int_{-x}^x e^{-t^2}\,\mathsf dt = \dfrac{2}{\surd\pi}\sum_{n=0}^{\infty}\dfrac{x}{2n+1}\prod_{k=1}^{n}\dfrac{-x^{2}}{k}$$
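To get a feel for what "evaluating erf" involves, here is a minimal Python sketch that sums the first few terms of the Maclaurin series $\operatorname{erf}(x)=\frac{2}{\sqrt\pi}\sum_{n\ge 0}\frac{x}{2n+1}\prod_{k=1}^{n}\frac{-x^{2}}{k}$ and compares the result with the standard library's `math.erf`. The function name `erf_series` and the choice of 30 terms are illustrative assumptions, not anything from the question.

```python
import math

def erf_series(x: float, terms: int = 30) -> float:
    """Partial sum of (2/sqrt(pi)) * sum_n x/(2n+1) * prod_{k=1..n} (-x^2/k)."""
    total = 0.0
    product = 1.0  # running value of prod_{k=1..n} (-x^2/k); empty product when n = 0
    for n in range(terms):
        if n > 0:
            product *= -x * x / n
        total += x / (2 * n + 1) * product
    return 2.0 / math.sqrt(math.pi) * total

# Argument that appears in the answer to this problem: 5 / (2 * sqrt(2))
x = 5 / (2 * math.sqrt(2))
print(erf_series(x), math.erf(x))
```

The point is that there is no shortcut here: the value comes from adding up terms until they stop mattering, which is a numerical procedure rather than a closed formula.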

What is the closed form of this thing?   Well... it doesn't appear to have one.

So, the cumulative distribution function for a normal distribution is simply not expressible in terms of elementary functions.   It has to be evaluated through numerical means.

So we use precompiled tables (or software) to find the answer when we need it.
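As a rough illustration of "numerical means", the following Python sketch integrates the PDF from 30 to 35 with composite Simpson's rule and compares it with $\frac12\operatorname{erf}\left(\frac{5}{2\sqrt2}\right)$. It assumes $\mu = 30$ and $\sigma = 2$ as in the problem; the helper names `normal_pdf` and `simpson` and the 1000 subintervals are my own choices for the example.

```python
import math

MU, SIGMA = 30.0, 2.0  # assumed from the problem: mean 30, variance 4

def normal_pdf(x: float, mu: float = MU, sigma: float = SIGMA) -> float:
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def simpson(f, a: float, b: float, n: int = 1000) -> float:
    """Composite Simpson's rule on [a, b] with n subintervals (n forced to be even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

numeric = simpson(normal_pdf, 30.0, 35.0)
exact = 0.5 * math.erf(5 / (2 * math.sqrt(2)))
print(numeric, exact)
```

Both numbers should agree to many decimal places, which is exactly the kind of computation that sits behind a table entry.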

0
On

Yes, when dealing with normal distributions, it is much faster to convert to a standard normal and use the $z$-table. This is the more common approach if you don't have access to software. The integration is very messy. There are many steps to reach the desired probability by integrating, but software gives

$$\int _{30}^{35}{\displaystyle {\frac {1}{\sqrt {2\pi \sigma ^{2}}}}e^{-{\frac {(x-\mu )^{2}}{2\sigma ^{2}}}}}\,dx=\frac{1}{2}\operatorname{erf}\left(\frac{5}{2\sqrt{2}}\right)\approx0.4938$$

where $\operatorname{erf}$ is the error function.

It's much easier to just note that

$$\begin{align*} P(30\lt X\lt35) &=\Phi\left(\frac{35-30}{2}\right)-\Phi\left(\frac{30-30}{2}\right)\\\\ &=\Phi(2.5)-\Phi(0)\\\\ &=0.9938-0.5\\\\ &=0.4938 \end{align*}$$
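If software is available, the same calculation can be checked in a few lines. This is a small sketch using `statistics.NormalDist` from the Python standard library (Python 3.8+), assuming $\mu = 30$ and $\sigma = 2$; the variable names are just for illustration.

```python
from statistics import NormalDist

X = NormalDist(mu=30, sigma=2)  # the distribution in the problem (variance 4)
Z = NormalDist()                # standard normal, mean 0 and sigma 1

# Directly from the CDF of X
direct = X.cdf(35) - X.cdf(30)

# After standardizing, which is what the z-table lookup does
via_z = Z.cdf((35 - 30) / 2) - Z.cdf((30 - 30) / 2)

print(round(direct, 4), round(via_z, 4))
```

Both routes give the same probability, since standardizing only relabels the horizontal axis.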

0
On

The Z-table is the integral, it's just that someone has gone and numerically calculated all the values rather than writing it in terms of functions.

The reason why we do that is because the integral is not just difficult, it cannot be written in terms of a finite number of elementary functions, so the best we can do is to just define a new function $\Phi(x)$ to represent the integral and work with that.

And since we can't write down the values for every possible normal distribution, we pick one - the standard normal distribution - for the tables of values, and note how you transform any other normal distribution to use the same table.
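The transformation is just a change of variable in the integral. Writing $z = \frac{x-\mu}{\sigma}$ (so $dx = \sigma\,dz$, with $\sigma > 0$), a sketch of the step is:

$$\begin{align*} P(a\lt X\lt b) &=\int_{a}^{b}\frac{1}{\sqrt{2\pi\sigma^{2}}}\,e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}\,dx\\\\ &=\int_{(a-\mu)/\sigma}^{(b-\mu)/\sigma}\frac{1}{\sqrt{2\pi}}\,e^{-\frac{z^{2}}{2}}\,dz\\\\ &=\Phi\left(\frac{b-\mu}{\sigma}\right)-\Phi\left(\frac{a-\mu}{\sigma}\right) \end{align*}$$

So every normal probability reduces to two values of the single function $\Phi$, which is why one table is enough.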

0
On

Short answer: integrating the PDF is "hard" (as you mentioned). While you have explicit formulas for integral of polynomials, basic trigonometric functions and so forth, for this PDF, there is no explicit formula that is "clean and simple". The best you have, as the answer, is the "error function" which is more of a notation rather than a formula you can type into a basic calculator.

The Z-table is simply a table that someone produced by integrating the PDF numerically. Indeed, we probably would not need the table if we could easily compute the integral ourselves in the first place!
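To make that concrete, here is a hedged Python sketch of how a few rows of such a table could be produced. It uses the composite trapezoid rule on the standard normal PDF for $z \ge 0$; the function names, the 2000 subintervals and the selection of rows are assumptions for the example, and real tables may well have been computed by other methods.

```python
import math

def std_normal_pdf(z: float) -> float:
    """Density of the standard normal N(0, 1) at z."""
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def phi_numeric(z: float, n: int = 2000) -> float:
    """Phi(z) for z >= 0: 0.5 plus a trapezoid-rule estimate of the area from 0 to z."""
    if z == 0:
        return 0.5
    h = z / n
    area = 0.5 * (std_normal_pdf(0.0) + std_normal_pdf(z))
    area += sum(std_normal_pdf(i * h) for i in range(1, n))
    return 0.5 + area * h

def z_table_row(row_start: float) -> list:
    """One table row: Phi(row_start + 0.00), ..., Phi(row_start + 0.09), to 4 decimals."""
    return [round(phi_numeric(row_start + j / 100), 4) for j in range(10)]

# Print a few rows in the usual layout: row label, then the ten column entries
for start in (0.0, 1.0, 2.0, 2.5):
    print(f"{start:.1f}", *(f"{v:.4f}" for v in z_table_row(start)))
```

The first entry of the 2.5 row is the $\Phi(2.5)$ value one would look up for this problem.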