How do we prove that the tails of a Gaussian are the interesting region?

I have a programming problem that I solve using a Gaussian distribution. The problem is outlier detection. I take the uncertainty of the data, calculated from the classifier confidence, and plot its histogram. The outliers mostly reside in the tails of the Gaussian distribution: programmatically, I found that the outliers for my problem lie in the (-3 to -2) and (2 to 3) regions, far from the mean (only a few are near the mean). How do I prove this mathematically? Thank you for your help.

There is 1 best solution below

An indicator of whether a sample is in the tail or near the centre is its $z$ value, calculated as follows:

$$z(x) = \frac{x-\bar{x}}{\sigma}$$

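Generally, for values of $z$ close to 0, the sample is near the mean, and for large positive or negative values ($z > 2$ or $z < -2$), the sample is far from it. For a Gaussian, about 95% of samples fall within $|z| < 2$ and about 99.7% within $|z| < 3$, so samples beyond these values are rare. What you call the tail depends entirely on you, your application, and what you would consider outlier data.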
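
As a rough sketch of how this could look in code, the snippet below computes $z$ for each sample and flags those with $|z| > 2$. The function names, the threshold of 2, and the `uncertainties` list are all assumptions made up for illustration; they are not taken from the question. It also evaluates the two-sided tail probability $P(|Z| > t)$ of a standard normal, which shows how little mass sits in the tails.

```python
import math
import statistics


def z_scores(values):
    """Return z = (x - mean) / sigma for every sample in values."""
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation
    return [(x - mean) / sigma for x in values]


def flag_outliers(values, threshold=2.0):
    """Return the samples whose |z| exceeds threshold (hypothetical cut-off)."""
    return [x for x, z in zip(values, z_scores(values)) if abs(z) > threshold]


def two_sided_tail_probability(threshold):
    """P(|Z| > threshold) for a standard normal Z, via the complementary error function."""
    return math.erfc(threshold / math.sqrt(2))


# Invented uncertainty values, for illustration only.
uncertainties = [0.10, 0.12, 0.11, 0.09, 0.10, 0.13, 0.11, 0.10, 0.12, 0.45]

print(flag_outliers(uncertainties))
print(two_sided_tail_probability(2.0))
print(two_sided_tail_probability(3.0))
```

If the data really are close to Gaussian, only about 4.6% of samples should land beyond $|z| = 2$ and about 0.27% beyond $|z| = 3$. Note that a single extreme value inflates $\sigma$ in a small sample, so the threshold may need tuning for your application.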