Tolerance limit (interval) in Python

I am trying to calculate the tolerance limits (interval) in Python. As a reminder, the tolerance limits are:

$$ \bar{x} \pm k s $$

where $\bar{x}$ is the sample mean and $s$ is the sample standard deviation. The book I am reading uses a table to get $k$, but I would like to calculate $k$ myself in Python. I found the formula for $k$ on page 12 of this document, which suggests:

[Image: formula for $k$]

I am familiar with the scipy.stats package, but I cannot figure out how to get $k$. I've tried a few permutations, for example:

from scipy import stats as s
import numpy as np

n = 30
p = 0.995
z = s.norm.ppf((1-p)/2)

g = 0.99
c = s.chi2.ppf(g, df=n-1)
k = z * np.sqrt(((n-1)*(1 + 1/n))/c * (1 + (n-3 - c)/(2*(n+1)**2)))
print(k)

The statistical table for $k$ from the book is:

[Image: table of $k$ values]

There are 1 best solutions below

The issue is that the quantiles are not being calculated correctly. This isn't your fault: the presentation is not clear about this, and fails to provide a numeric example of the formula against which you can confirm the calculation.

As used in the presentation, $z_{(1-p)/2}$ denotes the number satisfying $$\Pr[Z > z_{(1-p)/2}] = \frac{1-p}{2}.$$ Note the direction of the inequality: the author is using an upper quantile rather than a lower one. In Python, norm.ppf(q) returns the lower quantile, that is, the value $x$ for which $\Pr[Z \le x] = q$. To compensate, you can simply write z = -s.norm.ppf((1-p)/2) or, equivalently, z = s.norm.ppf((1+p)/2).
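To see the difference between the two conventions, here is a small sketch. It assumes SciPy is installed; the variable names are mine, and norm.isf (the inverse survival function) is used only as a cross-check on the upper quantile.

```python
from scipy import stats

p = 0.995
tail = (1 - p) / 2  # probability in each tail

# Lower quantile: the value x with Pr[Z <= x] = tail (negative here).
z_lower = stats.norm.ppf(tail)

# Upper quantile: the value x with Pr[Z > x] = tail (what the formula wants).
z_upper = stats.norm.isf(tail)

# By symmetry of the normal distribution, these are the same upper quantile.
z_upper_alt = stats.norm.ppf((1 + p) / 2)

print(z_lower, z_upper, z_upper_alt)
```

The first value is the negative of the other two, which is why the original code produced a $k$ with the wrong sign.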

The other problem is that you are using $g = 0.99$ where the formula expects $g = 0.01$. In other words, you should either write c = s.chi2.ppf(1-g, df=n-1) or define g = 0.01. Again, this is the author's sloppiness and is essentially the same problem as above: the presentation does not specify how the quantile is to be calculated.
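The same check can be done for the chi-squared quantile. This is a sketch assuming SciPy is available; the names c_lower and c_upper are mine, and chi2.isf serves only to confirm that the value exceeded with probability $g$ equals the lower $1-g$ quantile.

```python
from scipy import stats

n = 30
g = 0.99
df = n - 1

# What the original code computed: Pr[X <= c_lower] = 0.99.
c_lower = stats.chi2.ppf(g, df=df)

# What the formula expects: Pr[X > c_upper] = 0.99,
# i.e. the lower (1 - g) quantile.
c_upper = stats.chi2.ppf(1 - g, df=df)

# Cross-check with the inverse survival function.
c_upper_alt = stats.chi2.isf(g, df=df)

print(c_lower, c_upper, c_upper_alt)
```

The value the formula needs is the much smaller one, about 14.26 for 29 degrees of freedom.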

Here is my calculation for $n = 30$, $p = 0.995$, $g = 0.99$ (using your inputs). I get $$z = 2.80703, \quad c = 14.2565, \quad k = 4.08316.$$
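For reference, here is one way to put both corrections together. It is a sketch that reuses the expression from your code unchanged apart from the two quantiles; the function name tolerance_factor is my own and not something from the presentation or from SciPy.

```python
import numpy as np
from scipy import stats


def tolerance_factor(n, p, g):
    """Two-sided tolerance factor k for coverage p and confidence g.

    Uses the same expression as the code in the question, with both
    quantiles taken as upper-tail quantiles.
    """
    # Upper (1 - p) / 2 quantile of the standard normal.
    z = stats.norm.ppf((1 + p) / 2)
    # Chi-squared value exceeded with probability g.
    c = stats.chi2.ppf(1 - g, df=n - 1)
    return z * np.sqrt(
        ((n - 1) * (1 + 1 / n)) / c * (1 + (n - 3 - c) / (2 * (n + 1) ** 2))
    )


k = tolerance_factor(n=30, p=0.995, g=0.99)
print(round(k, 3))
```

With k in hand, the limits are mean - k * sd and mean + k * sd, where sd is the sample standard deviation (for example np.std(x, ddof=1)).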