Hypothesis testing of binomially distributed data


I haven't done any hypothesis testing for years since I left school and I just wanted to refresh my memory of it.

The hypothesis is stated as follows: assume that 70% of high school students stay in school, i.e. the retention rate is 70%. The alternative hypothesis is that the retention rate is less than 70%. Since a student can only either stay in school or leave, we can model the number who stay in a sample using the binomial distribution.

Thus, we have:

Suppose $\theta$ is the probability that a student stays in school. Then

$$H_0 : \theta = 0.7 \quad \text{vs.} \quad H_a : \theta < 0.7.$$

The test statistic we will use is based on the binomial distribution. If $X$ is the number of students, out of $n$ sampled, who stayed in school, then $$X \mid H_0 \sim \operatorname{Binomial}(n, \theta = 0.7).$$

Then I sample, say, $100$ students and count how many of them actually stayed in school. Suppose we observe that $57$ of them stayed. Then the $p$-value would be $$p = \Pr[X \le 57 \mid H_0] = 0.00396779.$$ Can we use a normal approximation in this case? Also, what type of test would I need to use then? Left-sided?
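For reference, the exact left-tail probability can be checked with a few lines of Python. This is only a minimal sketch using the standard library; the helper name `binom_cdf` and the variable names are illustrative assumptions, not part of the original question. It sums the binomial probabilities term by term and should give a value close to the one quoted above.

```python
from math import comb


def binom_cdf(x: int, n: int, theta: float) -> float:
    """Return Pr[X <= x] for X ~ Binomial(n, theta) by direct summation."""
    return sum(
        comb(n, k) * theta**k * (1 - theta) ** (n - k)
        for k in range(x + 1)
    )


# Values taken from the question: n = 100 students, 57 stayed, theta = 0.7 under H_0.
n, theta0, observed = 100, 0.7, 57

# Left-tailed p-value: probability of seeing 57 or fewer "stays" if H_0 is true.
p_exact = binom_cdf(observed, n, theta0)
print(f"exact left-tailed p-value: {p_exact:.8f}")
```

Direct summation is fine at this sample size; for much larger $n$ a library routine for the binomial CDF would be the more robust choice.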

Thanks!

Best answer

If we were to apply a normal approximation, we would model $X$ as

$$X|H_{0}\sim\mathcal{N}\left(n\theta,n\theta\left(1-\theta\right)\right)$$

$$X|H_{0}\sim\mathcal{N}(70,21)$$

$$\sigma=\sqrt{21}$$
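To spell out the arithmetic behind these parameters, with $n = 100$ and $\theta = 0.7$ taken from the question:

$$n\theta = 100 \times 0.7 = 70, \qquad n\theta\left(1-\theta\right) = 100 \times 0.7 \times 0.3 = 21, \qquad \sigma = \sqrt{21} \approx 4.583.$$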

Thus, our p-value, under normal approximation, would be $\mathbb{P}(Z<\frac{\left(57-70\right)}{\sqrt{21}})$

This is a one-sided (left-tailed) Z-test, and it gives a p-value of about 0.0023, which as you can see is quite a bit lower than yours (but is a nice quick order-of-magnitude approximation if you're away from a computer). It is well known that normal approximations to the binomial give p-values that are smaller than they should be when we are deep into the tails. With only 100 people, you really don't need to apply a normal approximation; the exact binomial tail can be computed directly.
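The comparison can be sketched in Python as follows. This is a rough illustration using only the standard library, not a definitive implementation: the variable names are assumptions, and the normal CDF is evaluated through the identity $\Phi(z) = \tfrac{1}{2}\operatorname{erfc}(-z/\sqrt{2})$ with no continuity correction, matching the calculation above.

```python
from math import comb, erfc, sqrt

# Values from the question: n = 100, observed 57 stays, theta = 0.7 under H_0.
n, theta0, observed = 100, 0.7, 57

# Normal approximation: mean n*theta, variance n*theta*(1 - theta).
mean = n * theta0
sd = sqrt(n * theta0 * (1 - theta0))
z = (observed - mean) / sd

# Standard normal CDF via the complementary error function.
p_normal = 0.5 * erfc(-z / sqrt(2))

# Exact binomial left tail, summed term by term, for comparison.
p_exact = sum(
    comb(n, k) * theta0**k * (1 - theta0) ** (n - k)
    for k in range(observed + 1)
)

print(f"z statistic:           {z:.4f}")
print(f"normal approx p-value: {p_normal:.6f}")
print(f"exact binomial p-value: {p_exact:.6f}")
```

Running both side by side shows how far the approximation drifts from the exact tail at this cutoff, which is the point made above about not needing the approximation for $n = 100$.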