Mean of a portion of a normal distribution?

How do I calculate the mean of a portion of a normal distribution? In other words, say I have a normal distribution of the heights of adult males. The mean is 70" and the standard deviation is 4". What is the average height of males above the 95th percentile? What is the average height of all males below the 95th percentile? How do I calculate this?

Someone asked a similar question here which was never answered clearly:

Mean value from part of normal distribution

This question has a practical application for my work, where I am running a power plant at a mean operating point of 10.3 MW with a standard deviation of 0.2 MW. I would like to know the average power when I am above 10 MW, i.e. the mean of all points above 10 MW.

There are 4 best solutions below

1
On

I think you just need to take the data points that are above 10 MW, sum them up, and divide the sum by their count. That's all, no? Whether the distribution is normal or not isn't relevant here, I think.
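If the recorded data points are actually available (rather than only the mean and standard deviation), the sum-and-count idea can be sketched as below. The function name `mean_above` and the sample readings are made up for illustration.

```python
def mean_above(values, threshold):
    """Average of the values strictly above `threshold` (empirical conditional mean)."""
    selected = [v for v in values if v > threshold]
    if not selected:
        raise ValueError("no values above the threshold")
    return sum(selected) / len(selected)

# Hypothetical power readings in MW, for illustration only.
readings = [9.8, 10.1, 10.5, 10.3, 9.9]
average_above_10 = mean_above(readings, 10.0)
```

This needs no distributional assumption, but it only estimates the conditional mean from the sample at hand.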

0
On

The "mean" of a continuous probability distribution with density $P(x)$ is, by definition, the integral of $xP(x)$. To restrict a normal distribution, $P(x)= Ae^{-\frac{(x- \mu)^2}{2\sigma^2}}$, to $a \le x \le b$ with $a< b$, we have to divide by the probability that $x$ is between $a$ and $b$, which is the integral of $P(x)$ from $a$ to $b$. The constant $A$ cancels, so the mean is $\frac{\int_a^b xe^{-\frac{(x- \mu)^2}{2\sigma^2}}dx}{\int_a^b e^{-\frac{(x- \mu)^2}{2\sigma^2}}dx}$
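As a rough numerical check, the ratio of the two integrals can be approximated with a midpoint rule, taking the normal density to be proportional to $e^{-(x-\mu)^2/(2\sigma^2)}$. This is only a sketch: the function name, the number of slices `n`, and the use of $\mu + 10\sigma$ as a stand-in for an infinite upper limit are all assumptions made here.

```python
import math

def truncated_mean_numeric(mu, sigma, a, b, n=100_000):
    """Midpoint-rule estimate of E[X | a <= X <= b] for X ~ Normal(mu, sigma)."""
    h = (b - a) / n
    num = 0.0  # approximates the integral of x * exp(...)
    den = 0.0  # approximates the integral of exp(...)
    for i in range(n):
        x = a + (i + 0.5) * h
        w = math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
        num += x * w
        den += w
    # The slice width h and the normalizing constant cancel in the ratio.
    return num / den

# Power-plant numbers from the question; 10.3 + 10 * 0.2 stands in for infinity.
plant_estimate = truncated_mean_numeric(10.3, 0.2, 10.0, 10.3 + 10 * 0.2)
```

For the power-plant numbers this gives roughly 10.33 MW.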

15
On

Suppose for simplicity that you have a standard normal $X$ with pdf $f$. One of the main properties of $f$ is that it satisfies $f'=-xf$ which implies $\int_a^b xf\,dx=-\int_a^b df=f(a)-f(b)$.

It follows that $$E(X|X\in[a,b])=\frac {f(a)-f(b)}{\Phi(b)-\Phi(a)}$$

Set $b=\infty$ ($\Phi(\infty)=1, f(\infty)=0$) if you want a one-sided bound.
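A minimal Python sketch of this formula, using only the standard library. The helper names `phi`, `Phi`, and `std_truncated_mean` are chosen here for illustration; passing `math.inf` or `-math.inf` gives the one-sided cases.

```python
import math

def phi(x):
    """Standard normal pdf f(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def std_truncated_mean(a, b):
    """E(X | a <= X <= b) for a standard normal X."""
    return (phi(a) - phi(b)) / (Phi(b) - Phi(a))

# One-sided example: mean of the positive half of a standard normal.
half_normal_mean = std_truncated_mean(0.0, math.inf)
```

The one-sided example evaluates to $\sqrt{2/\pi}\approx 0.798$. Far out in the tails the denominator becomes a difference of nearly equal numbers, so a more careful implementation would probably be needed there.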

1
On

Suppose $X$ is normal with mean $\mu$ and standard deviation $\sigma$. Then $Z=\frac{X-\mu}{\sigma}$ is normal with mean $0$ and standard deviation $1$, and $X=\sigma Z + \mu$. Then

$$\begin{aligned}
E[X \mid X \in [a,b]] &= E[\sigma Z + \mu \mid X \in [a,b]] \\
&= \mu + \sigma E[Z \mid X \in [a,b]] \\
&= \mu + \sigma E \left [Z \left | Z \in \left [ \frac{a-\mu}{\sigma},\frac{b-\mu}{\sigma} \right ] \right. \right ] \\
&= \mu + \sigma \frac{E \left [ Z 1_{[\frac{a-\mu}{\sigma},\frac{b-\mu}{\sigma}]}\right ]}{P \left ( Z \in \left [\frac{a-\mu}{\sigma},\frac{b-\mu}{\sigma} \right ] \right )} \\
&= \mu + \sigma \frac{\int_{\frac{a-\mu}{\sigma}}^{\frac{b-\mu}{\sigma}} x e^{-\frac{x^2}{2}} dx}{\int_{\frac{a-\mu}{\sigma}}^{\frac{b-\mu}{\sigma}} e^{-\frac{x^2}{2}} dx}.
\end{aligned}$$

The numerator can be calculated in terms of elementary functions using the FTC, since $-x e^{-x^2/2}$ is the derivative of $e^{-x^2/2}$.

The denominator is an integral of the standard normal density, which cannot be calculated in terms of elementary functions, but can be easily evaluated using software, using functions generally called either "erf" (for "error function") or "normcdf".

Putting the two together:

$$E[X \mid X \in [a,b]] = \mu + \sigma \frac{ e^{-\frac{\left ( a-\mu\right ) ^2}{2\sigma^2}} - e^{-\frac{\left ( b-\mu\right ) ^2}{2\sigma^2}} }{ \sqrt{\frac{\pi}{2}}\left ( \text{erf}\left ( \frac{b-\mu}{\sqrt{2}\sigma}\right ) - \text{erf}\left ( \frac{a-\mu}{\sqrt{2}\sigma}\right ) \right ) }$$
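
The closed form above can be sketched in Python and applied to the numbers in the question. The function name `truncated_normal_mean` and the default infinite bounds are choices made here; `math.erf` and `statistics.NormalDist.inv_cdf` are standard-library calls (the latter needs Python 3.8 or newer).

```python
import math
from statistics import NormalDist

def truncated_normal_mean(mu, sigma, a=-math.inf, b=math.inf):
    """E[X | a <= X <= b] for X ~ Normal(mu, sigma), via the erf formula."""
    alpha = (a - mu) / sigma  # standardized lower bound
    beta = (b - mu) / sigma   # standardized upper bound
    numerator = math.exp(-0.5 * alpha * alpha) - math.exp(-0.5 * beta * beta)
    denominator = math.sqrt(math.pi / 2.0) * (
        math.erf(beta / math.sqrt(2.0)) - math.erf(alpha / math.sqrt(2.0))
    )
    return mu + sigma * numerator / denominator

# Power plant: mean 10.3 MW, sd 0.2 MW, average output while above 10 MW.
plant_mean = truncated_normal_mean(10.3, 0.2, a=10.0)

# Heights: mean 70", sd 4", split at the 95th percentile.
cutoff = NormalDist(70.0, 4.0).inv_cdf(0.95)
tall_mean = truncated_normal_mean(70.0, 4.0, a=cutoff)
rest_mean = truncated_normal_mean(70.0, 4.0, b=cutoff)
```

With these inputs the results come out to roughly 10.33 MW for the plant, about 78.25" for males above the 95th percentile, and about 69.57" for those below it. As a sanity check, weighting the two height averages by 0.05 and 0.95 recovers the overall mean of 70".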