Setting a limit for the continuous normal variable in normal distribution

Suppose a test has a maximum score of $200$ and a minimum score of $0$. Can one eliminate the infinite bounds of the normal distribution?

Let's say my $\mu$ is $100$, and my $\sigma$ is $50$. Integrating the normal density over $X \geq 200$, I get:

$$\int^\infty_{200} \frac{e^{-\frac{(x-100)^2}{2\times50^2}}}{50\sqrt{2\pi}}dx \approx 0.023$$

There is a 2.3% chance that one will score over the limit of 200. Is there any way to make scoring over 200 impossible, by changing the upper bound from $\infty$ to 200 and the lower bound from $-\infty$ to 0, so that the integral below holds:

$$\int^{200}_{0} \frac{e^{-\frac{(x-100)^2}{2\times50^2}}}{50\sqrt{2\pi}}dx = 1$$

Best answer

You are close to the right idea, but I think you have to take into account that you are truncating both tails of the normal distribution.

If $X \sim Norm(100, 50)$ (mean 100, standard deviation 50), then $P(0 < X \le 200) = 0.9545.$

    # P(0 < X <= 200) for X ~ Norm(mean = 100, sd = 50)
    diff(pnorm(c(0,200), 100, 50))
    ## 0.9544997
    # Normalizing constant C = 1 / P(0 < X <= 200)
    C = 1 / diff(pnorm(c(0,200), 100, 50));  C
    ## 1.047669

Denote the density function of $X$ by $f_X.$ Then the density function of the desired truncated normal distribution with support $(0,200)$ is $f_Y(y) = C\,f_X(y) \approx 1.0477\,f_X(y)$ for $0 < y < 200,$ and $f_Y(y) = 0$ elsewhere. This ensures that $\int_0^{200} f_Y(y)\,dy = C \cdot P(0 < X \le 200) = 1,$ as required.
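As a quick numerical check of this normalization, the sketch below defines the truncated density and integrates it over $(0, 200)$. The function name `dtrunc_norm` and its argument names are illustrative choices, not part of base R or of any package.

```r
# Truncated normal density on (lower, upper): C * f_X inside, 0 outside.
# dtrunc_norm is a hypothetical helper name used only for illustration.
dtrunc_norm <- function(y, mean = 100, sd = 50, lower = 0, upper = 200) {
  C <- 1 / diff(pnorm(c(lower, upper), mean, sd))  # normalizing constant
  ifelse(y > lower & y < upper, C * dnorm(y, mean, sd), 0)
}

# Numerical integration over the support; the result should equal 1
# up to the tolerance of the integrator.
total <- integrate(dtrunc_norm, lower = 0, upper = 200)$value
total
```

Computing $C$ inside the function keeps the example self-contained, so it works for other means, standard deviations, and truncation points as well.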

For example, to find $P(50 < Y \le 150),$ you can find the product $C\cdot P(50 < X \le 150) = 0.7152.$

    # C * P(50 < X <= 150), with C as computed above
    C*diff(pnorm(c(50,150), 100, 50))
    ## 0.7152328
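The same calculation can be wrapped in a small function and cross-checked by simulation. In the sketch below, `ptrunc_interval` is a hypothetical name, and $Y$ is simulated by drawing from $Norm(100, 50)$ and discarding values outside $(0, 200)$. The seed and sample size are arbitrary choices.

```r
# P(a < Y <= b) for a normal truncated to (lower, upper), assuming
# lower <= a <= b <= upper. ptrunc_interval is an illustrative name.
ptrunc_interval <- function(a, b, mean = 100, sd = 50,
                            lower = 0, upper = 200) {
  diff(pnorm(c(a, b), mean, sd)) / diff(pnorm(c(lower, upper), mean, sd))
}
exact <- ptrunc_interval(50, 150)
exact

# Simulation check: keep only the draws that land inside (0, 200).
set.seed(1)                          # arbitrary seed, for reproducibility
x <- rnorm(1e6, mean = 100, sd = 50)
y <- x[x > 0 & x < 200]
simulated <- mean(y > 50 & y <= 150)
simulated
```

The simulated proportion should agree with the exact value to about two or three decimal places, which illustrates that truncation amounts to conditioning $X$ on the event $0 < X < 200.$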

Note: In cases where $C$ is very nearly $1,$ it is customary to ignore the adjustment. For example, one often says something like 'the heights of 20-year-old US men are normal with mean 69 inches and standard deviation 3.5 inches.' Logically, that normal distribution must be truncated at 0, because there can be no negative heights. In reality, the distribution is truncated both above and below. (Would you believe a man 2 feet tall? 15?) But everyone understands the normal model is only approximate, and no one worries much if the truncation points are more than 3 or 4 standard deviations away from the mean.
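To get a feel for how close to $1$ the constant is in such cases, here is a rough sketch using the height figures quoted above. The variable names are illustrative, and the symmetric truncation at $k$ standard deviations is an assumption made only for comparison.

```r
# Probability that a Norm(69, 3.5) model assigns to heights below 0 inches
p_neg <- pnorm(0, mean = 69, sd = 3.5)
# Normalizing constant if the model were truncated at 0 only
C_height <- 1 / (1 - p_neg)
c(p_neg = p_neg, C_height = C_height)

# For comparison: truncation k standard deviations from the mean
# on both sides, for k = 1, 2, 3, 4
k <- 1:4
C_k <- 1 / (pnorm(k) - pnorm(-k))
round(C_k, 5)
```

The case $k = 2$ corresponds to the test-score example, where the bounds $0$ and $200$ lie two standard deviations from the mean, so it reproduces $C \approx 1.0477.$ For larger $k$ the constant moves quickly toward $1.$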

You can look at the Wikipedia article on 'truncated normal distribution', but it may be much more detailed and technical than what you need.

In the plot below, the dotted red curve is the density of $Norm(100, 50);$ the solid blue curve shows the 'inflation' by $C$ that makes the truncated PDF integrate to unity over $(0, 200).$

[Plot: density of $Norm(100, 50)$ (dotted red) and the truncated density on $(0, 200)$ (solid blue)]
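Since the image itself is not reproduced here, a plot along these lines can be regenerated with base R graphics. This is only a sketch: the plotting range, grid size, and line widths are guesses, not the settings used for the original figure.

```r
# Grid of x values extending a little beyond the support (0, 200)
x <- seq(-50, 250, length.out = 601)

# Normalizing constant and the two densities
C   <- 1 / diff(pnorm(c(0, 200), 100, 50))
f_X <- dnorm(x, 100, 50)                     # untruncated Norm(100, 50)
f_Y <- ifelse(x > 0 & x < 200, C * f_X, 0)   # truncated to (0, 200)

# Solid blue: truncated density; dotted red: original normal density
plot(x, f_Y, type = "l", col = "blue", lwd = 2,
     xlab = "x", ylab = "Density")
lines(x, f_X, col = "red", lty = "dotted", lwd = 2)
```

Inside $(0, 200)$ the blue curve lies above the red one by the factor $C,$ and outside that interval it drops to zero.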