Normal Probability Distribution compared to Normal Cumulative Probability Distribution


This is likely a duplicate, but I can't find it on MSE.

Let's say I have a normally distributed population with $\mu=2.75$ and $\sigma=0.25$. If $x$ is a value in the population of interest, using the normal probability distribution function I find that

$$P(x=3)\approx 0.96788,$$

and using the normal cumulative probability distribution I find that

$$P(x\le3)\approx 0.84134.$$

At first glance, it seems counterintuitive that a value being equal to 3 is more likely to be chosen from the population than a value being less than or equal to 3.

I do not have a strong statistics background, but I understand that the normal probability density curve is continuous (while the population must be finite and hence discrete), and I suspect this may be the issue with this seeming paradox.

Now my question is:

Can someone give an elementary explanation for why this occurs using simple (practical) language?


There are 2 best solutions below

1
On

It is unlikely, but could it be the probability of being within 3 standard deviations of the mean? If you computed the statistics from only a few samples, outliers may carry more weight than they should, i.e. 1-96.8% instead of 1-99.7%.

How did you find 96.788%?

1
On

I will assume upvotes validate the accuracy of the following answer:

The first value is not a probability, but a density value, $D_p$ given by

$$D_p=\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}.$$

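The density at a value $x=a$ is meant for comparison: it tells you whether a randomly chosen value $x_i$ is more likely to land near $a$ (within a specified $\varepsilon$ of $a$) than near some other value $b\ne a$. For small $\varepsilon$, the probability of landing within $\varepsilon$ of $a$ is approximately $2\varepsilon$ times the density at $a$.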

In this sense, the probability is defined as an accumulation of these densities. Since the densities are distributed on a continuous curve, the accumulation is done by an integral, so that $$P(x_i\le a)=\int_{-\infty}^aD_p\,dx.$$
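To make the distinction concrete, here is a minimal Python sketch using the question's $\mu=2.75$ and $\sigma=0.25$. The helper names (`normal_pdf`, `normal_cdf`, `integrate_pdf`) and the choice of a midpoint rule with a lower cutoff of $\mu-10\sigma$ are my own assumptions for illustration, not anything from the question.

```python
import math

MU, SIGMA = 2.75, 0.25  # parameters from the question


def normal_pdf(x, mu=MU, sigma=SIGMA):
    """Density D_p at x (a height of the curve, not a probability)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu=MU, sigma=SIGMA):
    """P(X <= x), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def integrate_pdf(lower, upper, steps=100_000):
    """Midpoint-rule approximation of the integral of the density."""
    width = (upper - lower) / steps
    total = sum(normal_pdf(lower + (i + 0.5) * width) for i in range(steps))
    return total * width


density_at_3 = normal_pdf(3.0)   # the 0.96788 from the question
prob_up_to_3 = normal_cdf(3.0)   # the 0.84134 from the question
# Assumption: the area below MU - 10*SIGMA is negligible, so start there.
approx_integral = integrate_pdf(MU - 10 * SIGMA, 3.0)

print(round(density_at_3, 5))
print(round(prob_up_to_3, 5))
print(round(approx_integral, 5))
```

The first printed number is the height of the curve at $x=3$; the other two are the area under the curve up to $x=3$, computed two different ways. A height can exceed an area (and can even exceed 1 when $\sigma$ is small), so there is no contradiction.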

So for this question, a value near $x=3$ is more likely to be chosen from the population than, say, a value near $x=2$, and values near the mean $x=2.75$ are the most likely to be chosen (the density is symmetric about the mean and has its peak there).
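As a rough check of that comparison, the sketch below computes the probability of landing within $\varepsilon$ of a point $a$ and sets it beside the estimate "density times interval width". The half-width `EPS = 0.01` and the helper names are assumptions chosen only for illustration.

```python
import math

MU, SIGMA = 2.75, 0.25  # parameters from the question
EPS = 0.01              # assumed half-width of the "near a" window


def normal_pdf(x):
    """Density of the normal curve at x."""
    z = (x - MU) / SIGMA
    return math.exp(-0.5 * z * z) / (SIGMA * math.sqrt(2.0 * math.pi))


def normal_cdf(x):
    """P(X <= x) via the error function."""
    return 0.5 * (1.0 + math.erf((x - MU) / (SIGMA * math.sqrt(2.0))))


def prob_within(a, eps=EPS):
    """Exact P(a - eps <= X <= a + eps)."""
    return normal_cdf(a + eps) - normal_cdf(a - eps)


for a in (2.0, 2.75, 3.0):
    exact = prob_within(a)
    estimate = normal_pdf(a) * 2 * EPS  # height times width
    print(f"a={a}: exact={exact:.6f}, density*width={estimate:.6f}")
```

The ordering of the three probabilities matches the ordering of the three densities, which is exactly what the density is for.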

It is also worth noting, in relation to this question, that when the normal distribution is used to approximate a discrete distribution, then as the sample size $n\to\infty$, a value $x_i$ near the mean remains the most likely to be chosen at random from the population, while $P(x_i)\to 0$ for every individual $x_i$ in the population.
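One way to see this limiting behaviour is with a concrete discrete distribution. The sketch below assumes a Binomial$(n, 1/2)$ population (my choice of example, not something stated in the question) and compares the exact probability of its most likely value, $n/2$, with the height of the approximating normal density at the mean.

```python
import math


def prob_of_mode(n):
    """Exact P(X = n/2) for X ~ Binomial(n, 1/2), with n even."""
    return math.comb(n, n // 2) / 2 ** n


def normal_density_at_mean(n):
    """Peak height of the normal curve with sigma = sqrt(n)/2."""
    sigma = math.sqrt(n) / 2.0
    return 1.0 / (sigma * math.sqrt(2.0 * math.pi))


for n in (10, 100, 1000, 10000):
    print(n, prob_of_mode(n), normal_density_at_mean(n))
```

For every $n$ the value $n/2$ stays the single most likely outcome, yet its probability keeps shrinking as $n$ grows, and the normal density at the mean tracks it closely because neighbouring outcomes are spaced one unit apart.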