Can I use mean and standard deviation to spot outliers?

Question


984 Views Asked by Bumbble Comm At 31 Mar 2026 - 11:35

I have a list of measured numbers (e.g. lengths of products). From these I can easily compute the mean and the standard deviation.

Now, when a new measured number arrives, I'd like to state the probability that this number belongs to the list, or that it is an outlier which does not belong to it. Is such a statement possible given only the mean and the standard deviation?

Can I compute the probability with which this new value is part of the list? I'd like to have a probability as a result.


There are 3 best solutions below

Bumbble Comm On 27 Jun 2016 - 1:22

Yes. You can use your Standard Deviation to tell you this. Think about what Standard Deviation is telling you.

Bumbble Comm On 27 Jun 2016 - 5:37

It is best to use a boxplot to find outliers. The problem with using the sample mean $\bar X$ and the sample SD $S$ is that an outlier seriously affects the values of $\bar X$ and $S$.

By contrast, the boxplot uses the median and the interquartile range to detect outliers. These measures of location and dispersion, respectively, are not much affected by outliers.
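As a rough sketch of the boxplot idea, the snippet below flags a value that falls outside the usual boxplot "fences". The multiplier `k=1.5` is the conventional boxplot choice and is an assumption here, not something stated in this answer; the function names and the sample `lengths` are made up for illustration.

```python
import statistics

def iqr_fences(data, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def is_boxplot_outlier(x, data, k=1.5):
    """True if x lies outside the fences computed from data."""
    lower, upper = iqr_fences(data, k)
    return x < lower or x > upper

# Hypothetical product lengths, for illustration only.
lengths = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
print(is_boxplot_outlier(15.0, lengths))
print(is_boxplot_outlier(10.05, lengths))
```

Different software computes quartiles slightly differently, so the exact fences may vary a little between tools.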

If you feel you must use $\bar X$ and $S$, then here is how to test observations one at a time for outliers: Omit the suspected outlier. Find $\bar X^*$ and $S^*$ from the remaining $n - 1$ observations. Then see if the omitted point is in some interval such as $(\bar X^* - 2.5S^*, \bar X^* + 2.5S^*)$. If so, the suspected observation is not judged an outlier. If it is outside the interval, then consider it an outlier. The disadvantage of this method is that you have to recompute $\bar X^*$ and $S^*$ afresh for each suspected outlier.
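The leave-one-out check could look roughly like this in Python. This is a minimal sketch: the function name and the sample data are invented for illustration, and `k=2.5` is just the example multiplier used in the interval.

```python
import statistics

def is_outlier_leave_one_out(data, index, k=2.5):
    """Test data[index] against the mean and SD of the other n - 1 points."""
    # Omit the suspected outlier before computing the statistics.
    rest = data[:index] + data[index + 1:]
    mean_rest = statistics.mean(rest)
    sd_rest = statistics.stdev(rest)  # sample SD of the remaining points
    # Outlier if the omitted point is farther than k SDs from the mean.
    return abs(data[index] - mean_rest) > k * sd_rest

# Hypothetical measurements; the last value is the suspect.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 14.0]
print(is_outlier_leave_one_out(data, 6))
print(is_outlier_leave_one_out(data, 2))
```

Note that the mean and SD are recomputed on every call, which is exactly the cost mentioned above.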

**Bumbble Comm** · Accepted Answer

Absolutely. For a sufficiently long list of approximately normally distributed values (denoting the mean by $\mu$ and the standard deviation by $\sigma$), the range $[\mu-3\sigma,\mu+3\sigma]$ contains about $99.73\%$ of the data points. So if the new value falls outside this range, it is very unlikely (about a $0.27\%$ chance) to have come from the same distribution as the list.

You can also use the concept of a $p$-value here. Assuming the values follow a Gaussian distribution (since we don't know the true one), evaluate $\Phi(x)$, the CDF of $N(\mu,\sigma^2)$ at $x = \text{new value}$. For a value above the mean, the one-sided $p$-value is $1-\Phi(x)$ (for a value below the mean, use $\Phi(x)$ instead). If the $p$-value is less than some significance level (say $0.05$), consider the value an outlier; otherwise consider it consistent with the list.
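A small sketch of this tail-probability idea, using only Python's standard library. It assumes the data are roughly Gaussian and uses a two-sided tail probability, so that values far below the mean are treated the same way as values far above it; that two-sided choice, the function name, and the sample `lengths` are my own assumptions.

```python
from statistics import NormalDist, mean, stdev

def two_sided_p_value(x, data):
    """Tail probability of a value at least as extreme as x under N(mu, sigma^2)."""
    mu = mean(data)
    sigma = stdev(data)  # sample standard deviation of the list
    z = (x - mu) / sigma
    # Probability mass in both tails beyond |z| of the standard normal.
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical product lengths, for illustration only.
lengths = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
alpha = 0.05  # significance level

for new_value in (10.1, 11.0):
    p = two_sided_p_value(new_value, lengths)
    verdict = "outlier" if p < alpha else "consistent with the list"
    print(new_value, round(p, 4), verdict)
```

The result is the probability of seeing a value this extreme if it did come from the fitted normal distribution, which is not quite the same thing as the probability that the value belongs to the list.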
