Normal approximation of a binomial distribution


A car assembly line produces 1920 cars per shift. A defect rate of 3% is considered acceptable. From the production of one recent shift, 65 cars were found to be defective. What is the probability of this occurrence? My calculations are as follows:

$$\text{mean}=np=1920\cdot 0.97=1862.4$$

$$\text{standard deviation}=\sqrt{npq}=\sqrt{1920\cdot 0.03\cdot 0.97}\approx 7.475$$

This is where my answer differs from the book's. I interpreted the occurrence as exactly 65 cars being defective. The normal approximation then gives the area under the curve between 1854.5 and 1855.5, because 1920-65=1855 cars are not defective and I applied the continuity correction. This yields .0327. The answer in my textbook is .178, which can be obtained by calculating the area for x less than 1855.5. I believe this answer to be wrong, because then you are also accounting for all the cases between 65 cars being defective and all the cars being defective. But maybe this is the proper interpretation of the question. If it says "65 cars were found to be defective," does that mean there can be more defective cars that weren't found, or that exactly 65 cars are defective?
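For reference, both figures can be reproduced with a short Python sketch. The helper `normal_cdf` and the variable names are illustrative choices of my own, not anything from the textbook; only the standard library is used.

```python
import math

def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution, computed from the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

n = 1920   # cars per shift
q = 0.97   # probability that a single car is NOT defective

mu = n * q                            # expected non-defective cars: 1862.4
sigma = math.sqrt(n * q * (1.0 - q))  # about 7.475

# "Exactly 65 defective" means exactly 1855 non-defective;
# the continuity correction turns that into the interval (1854.5, 1855.5).
p_exactly_65 = normal_cdf(1855.5, mu, sigma) - normal_cdf(1854.5, mu, sigma)

# "At least 65 defective" means at most 1855 non-defective.
p_at_least_65 = normal_cdf(1855.5, mu, sigma)

print(f"exactly 65 defective:  {p_exactly_65:.4f}")   # about 0.0327
print(f"at least 65 defective: {p_at_least_65:.3f}")  # about 0.178
```

The first number matches my calculation and the second matches the book, so the disagreement comes entirely from which event is being measured.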

Thanks for your help.


There are 2 best solutions below

0
On BEST ANSWER

The question of interest, as it pertains to the context of the problem, is not "what is the probability that exactly 65 cars were found to be defective," but rather, "what is the conditional probability that we observed at least 65 defective cars out of 1920 cars, assuming that the actual defect rate is 3 percent?" This is how we construct a meaningful test of a statistical hypothesis, and this is what the question is really asking. It is basically saying, "if the actual defect rate is only 3 percent, how likely would it be to observe a defect rate as extreme as 65/1920?" If this probability is very small, then we can reason that the original premise is likely to be flawed; i.e., that the true defect rate is not merely 3 percent, but probably higher.

But if we were to calculate a probability of seeing exactly 65 defects in 1920 cars, then that tells us very little of interest, for such a probability does not capture the idea that any observed proportion of defects exceeding 65/1920 also lends evidence to suggest that the true defect rate is higher than the hypothesized rate of 3 percent.
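To make the contrast concrete, here is a sketch of the two calculations written in terms of the defective count. The symbol $D$ (number of defective cars) is notation introduced here for illustration; under a 3 percent defect rate it has mean $1920\cdot 0.03=57.6$ and the same standard deviation $\sqrt{1920\cdot 0.03\cdot 0.97}\approx 7.475$ used in the question, and $Z$ denotes a standard normal variable.

$$P(D=65)\approx P\!\left(\frac{64.5-57.6}{7.475}\le Z\le\frac{65.5-57.6}{7.475}\right)=P(0.923\le Z\le 1.057)\approx 0.0327$$

$$P(D\ge 65)\approx P\!\left(Z\ge\frac{64.5-57.6}{7.475}\right)=P(Z\ge 0.923)\approx 0.178$$

The second quantity is the one that measures how surprising the shift's result is under the assumed 3 percent rate.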

Now, we can argue that the wording of the question is vague and not sufficiently precise from a mathematical or statistical standpoint, and that would be a reasonable criticism. However, because such types of hypotheses are so well-known and familiar in statistical practice, and because the interpretation of the probability as counting the event of exactly 65 defects is not particularly meaningful in an applied context, the intention of the question, especially to those who are familiar with statistical hypothesis testing, is clear.

0
On

It's a badly phrased question unless some larger context alters the meaning. There is a suggestion of such a broader context where it says this:

A defect rate of 3% is considered acceptable.

A null hypothesis might say that the defect rate is not more than 3%. (That strikes me as being unreasonably high, but I don't really know anything about that.) In that case evidence against the null hypothesis would consist of a high defect rate in a sample, and the "p-value" is the probability that the evidence against the null hypothesis would be at least as strong as what was actually observed, given that the null hypothesis is true. Finding the p-value seems to be what was intended. Still, I would not have written "this occurrence"; rather I might just ask for the p-value, or perhaps ask what the probability is that the evidence against the null hypothesis is at least as strong as what was observed, given that the null hypothesis is true.
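As a rough check on that reading, the one-sided p-value can also be computed from the binomial distribution directly instead of the normal approximation. This is only a sketch: the function names `binom_pmf` and `upper_tail_p_value` are hypothetical, and it assumes the p-value is evaluated at the boundary of the null hypothesis, a defect rate of exactly 3%.

```python
import math

def binom_pmf(k, n, p):
    """Binomial pmf, evaluated in log space so large n does not overflow."""
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_coeff + k * math.log(p) + (n - k) * math.log(1.0 - p))

def upper_tail_p_value(observed, n, p0):
    """P(D >= observed) for D ~ Binomial(n, p0): the one-sided p-value."""
    return sum(binom_pmf(k, n, p0) for k in range(observed, n + 1))

# Assumed setup from the question: 1920 cars, 65 defective, null rate 3%.
p_value = upper_tail_p_value(65, 1920, 0.03)
print(f"exact one-sided p-value: {p_value:.4f}")
```

The result should land in the same neighborhood as the textbook's .178, which supports the view that the book was asking for this tail probability.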