test a distribution with two bins: one bin is at most x% of the other bin

Our industrial process spits out items of normal and small size. At most $x\%$ of the items are allowed to be small.

I would like to validate the output of the process by taking a sample of the output and running some kind of test.

I guess I have two bins (normal and small) but I have no clue on how to proceed. I don't think I can do a chi-square test since the cumulative distribution is not known (due to the 'at most $x\%$' part). Some hints/pointers to get me started would be appreciated.

Note: the problem can be simplified (if needed/useful/easier to get me started/...) by assuming that the small bin is exactly $x\%$

Thanks in advance!

EDIT

I assume the problem can be reformulated as

  • suppose we have a population with red (too large items), green (normal size) and blue (too small size) balls
  • we take a sample ($N$ balls) and compute the sample percentage $P$ of blue (or red) balls
  • I assume we can say something like 'the population percentage of blue balls lies in the interval $[P-\delta, P+\delta]$ with an accuracy of $95\%$' but what is the relation between $\delta$, the sample size $N$ and the $95\%$?
  • can we also say something like 'the population percentage of blue balls is lower than $X$ with an accuracy of $95\%$'

Some pointers/formulas to get me started would be appreciated.

There is 1 best solution below

There are different ways to think about your problem. Here is one.

Suppose your process produces correctly sized items with probability $p$ and incorrectly sized items with probability $1-p$. You want a $w$-confidence interval for $p$, given the sample fraction $\hat{p}$ of correctly sized items in a sample of size $n$. The interval is $\hat{p}\pm z\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$, where $z$ is the $1-\frac{1}{2}(1-w)$ quantile of the standard normal distribution (for $w=0.95$, $z\approx 1.96$).
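
To make the formula concrete, here is a minimal Python sketch using only the standard library. The function names (`wald_interval`, `wald_upper_bound`, `sample_size_for_margin`) are made up for illustration. The one-sided bound and the sample-size rule are not part of the formula above; they follow from the same normal approximation and are meant to address the $\delta$ versus $N$ question and the 'lower than $X$' question, so treat them as assumptions that inherit the approximation's limits.

```python
from math import ceil, sqrt
from statistics import NormalDist


def wald_interval(p_hat, n, w=0.95):
    """Two-sided normal-approximation interval for a proportion.

    p_hat: observed sample fraction, n: sample size, w: confidence level.
    """
    # z is the 1 - (1 - w)/2 quantile of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - w) / 2)
    delta = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - delta, p_hat + delta


def wald_upper_bound(p_hat, n, w=0.95):
    """One-sided upper bound: all of 1 - w goes into a single tail."""
    z = NormalDist().inv_cdf(w)
    return p_hat + z * sqrt(p_hat * (1 - p_hat) / n)


def sample_size_for_margin(delta, w=0.95, p_guess=0.5):
    """Smallest n with half-width at most delta, solved from
    delta = z * sqrt(p(1-p)/n). p_guess = 0.5 is the worst case."""
    z = NormalDist().inv_cdf(1 - (1 - w) / 2)
    return ceil(z ** 2 * p_guess * (1 - p_guess) / delta ** 2)


if __name__ == "__main__":
    # Hypothetical sample: 12 small items out of 400.
    p_small = 12 / 400
    print(wald_interval(p_small, 400))
    print(wald_upper_bound(p_small, 400))
    print(sample_size_for_margin(0.05))
```

Applied to the fraction of small items, `wald_upper_bound` gives a value $X$ that can be compared with the allowed $x\%$. With the worst-case guess $p=0.5$, a margin of $\delta=0.05$ at $95\%$ confidence needs $n=385$ under this approximation.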

This confidence interval is based on approximating the binomial distribution by the normal one. As you might expect, the approximation works well if $n$ is large and $p$ is not close to zero or one. There is an entire wiki page describing how to improve the interval when these conditions fail.
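
As an illustration of what such an improvement can look like, the sketch below implements the Wilson score interval, one commonly used alternative for proportions near zero or one. It is offered as an example only, not necessarily the method the answer has in mind, and the function name `wilson_interval` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(p_hat, n, w=0.95):
    """Wilson score interval for a binomial proportion.

    p_hat: observed sample fraction, n: sample size, w: confidence level.
    """
    z = NormalDist().inv_cdf(1 - (1 - w) / 2)
    denom = 1 + z ** 2 / n
    # The centre is pulled from p_hat towards 1/2.
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half = (z / denom) * sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2))
    return centre - half, centre + half


if __name__ == "__main__":
    # Hypothetical sample: no small items observed among 50.
    print(wilson_interval(0.0, 50))
```

When the sample contains no small items at all, the plain normal-approximation interval has zero width, whereas this interval still has a positive upper end, which is the more useful statement when the allowed fraction is small.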