Extracting probabilities from a normal distribution


I have plotted the PDF of a normal distribution for 8000 data points in Mathematica using

`Plot[PDF[NormalDistribution[64, 8.5333333], x], {x, 20, 100}]`

Normal distribution of 8000 data points

I want to find a range of values of $A_i$, about 5 MHz wide, that has a high probability of containing exactly one data point. That is, if I look at a 5 MHz section of the distribution, I should find only one data point in it. I would like this section to lie at high values of $A_i$. I realise that I need to look at the edge of the distribution, but is this done by integrating the area under the curve, and if so, how does this work?

Best answer

Let $N$ be the total number of data points and $P$ the PDF. Assuming the data points are independent draws from $P$, the probability that the interval $[A_1,A_2]$ contains precisely $n$ data points is $$\binom{N}{n}\left(\int_{A_1}^{A_2} P(a)\,da\right)^n \left(1-\int_{A_1}^{A_2} P(a)\,da\right)^{N-n}$$ This is a binomial distribution with $N$ trials and success probability $\int_{A_1}^{A_2} P(a)\,da$.
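As a minimal sketch of the formula above, the interval integral can be evaluated as a difference of CDF values. The function name `prob_exactly_n` and the example interval `[30, 35]` are illustrative assumptions, not part of the question; the mean, standard deviation and $N=8000$ are taken from the question.

```python
from math import comb
from statistics import NormalDist


def prob_exactly_n(dist, a1, a2, n_total, n):
    """Binomial probability that exactly n of n_total independent draws fall in [a1, a2]."""
    # Probability mass of the interval: the integral of the PDF from a1 to a2
    p = dist.cdf(a2) - dist.cdf(a1)
    # Fine for small n; for very large n the binomial coefficient
    # becomes too big to multiply by a float.
    return comb(n_total, n) * p**n * (1 - p) ** (n_total - n)


# Parameters from the question; the interval [30, 35] is an arbitrary example
dist = NormalDist(mu=64, sigma=8.5333333)
N = 8000
print(prob_exactly_n(dist, 30.0, 35.0, N, 1))
```

Summing the result over all $n$ from $0$ to $N$ should give 1, which is a quick sanity check for small $N$.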

How high can we make the probability that the interval contains precisely 1 data point? Setting $n=1$ gives $$\binom{N}{1}\left(\int_{A_1}^{A_2} P(a)\,da\right) \left(1-\int_{A_1}^{A_2} P(a)\,da\right)^{N-1} = N\left(\int_{A_1}^{A_2} P(a)\,da\right) \left(1-\int_{A_1}^{A_2} P(a)\,da\right)^{N-1}$$

Let us call $p=\int_{A_1}^{A_2} P(a)\,da$, so that the probability is $$Np \left(1-p\right)^{N-1}$$

To maximize over $p$, set the derivative to zero: $$\frac{d}{dp}\left[Np \left(1-p\right)^{N-1}\right] = -N (1 - p)^{N-2} (N p-1)=0$$ which, for $0<p<1$, gives $$N p=1$$

You maximize the probability that the interval contains precisely 1 data point if $$N\int_{A_1}^{A_2} P(a)\,da=1$$ in which case the probability that the interval contains precisely 1 data point is $$\left(1-1/N\right)^{N-1}$$ For large $N$, this approaches $1/e \approx 0.37$, so no choice of interval can push the probability much above 37%.
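The maximization can be checked numerically. The sketch below, with the hypothetical helper `prob_exactly_one`, evaluates $Np(1-p)^{N-1}$ at $p=1/N$ and at a few nearby values for $N=8000$, and compares the maximum with $1/e$.

```python
from math import exp


def prob_exactly_one(p, n_total):
    """N * p * (1 - p)**(N - 1): chance that exactly one of N points lies in the interval."""
    return n_total * p * (1 - p) ** (n_total - 1)


N = 8000
p_star = 1 / N  # interval mass that the derivation identifies as optimal
best = prob_exactly_one(p_star, N)

# Interval masses below and above 1/N should give a smaller probability
for factor in (0.5, 0.9, 1.0, 1.1, 2.0):
    print(factor, prob_exactly_one(factor * p_star, N))

# The maximum should be close to 1/e for large N
print(best, exp(-1))
```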

How do you pick $A_1,A_2$ such that $\int_{A_1}^{A_2} P(a)\,da=1/N$? You will have to do this numerically. For example, you can choose an interval symmetric around 5 MHz and solve for $x$ in $$\int_{5\,\text{MHz}-x}^{5\,\text{MHz}+x} P(a)\,da=1/N$$
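One way to do the numerical step is plain bisection on the interval mass, sketched below under some assumptions. It uses the distribution parameters from the question, and the plot axis is assumed to be in MHz. Case (a) solves the symmetric equation above. Case (b) is a hypothetical alternative reading of the question: a window of fixed width 5 in the upper tail, whose lower edge `a` is the unknown. The helper names `bisect` and `mass` are illustrative.

```python
from statistics import NormalDist


def bisect(f, lo, hi, tol=1e-10):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid  # root lies in the upper half
        else:
            hi = mid  # root lies in the lower half
    return 0.5 * (lo + hi)


dist = NormalDist(mu=64, sigma=8.5333333)
N = 8000
target = 1 / N  # interval mass that maximizes P(exactly one point)


def mass(a1, a2):
    """Integral of the PDF over [a1, a2], via the CDF."""
    return dist.cdf(a2) - dist.cdf(a1)


# (a) Symmetric interval [c - x, c + x] around c = 5, as in the equation above
c = 5.0
x = bisect(lambda h: mass(c - h, c + h) - target, 0.0, 60.0)
print("half-width x:", x)

# (b) Assumed alternative: window [a, a + 5] of fixed width in the upper tail
width = 5.0
a = bisect(lambda s: mass(s, s + width) - target, 64.0, 150.0)
print("window start a:", a)
```

In case (b) the bracket starts at the mean because the mass of a fixed-width window decreases monotonically as the window moves further into the upper tail, so the root is unique there.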