Choosing H0 and Ha in hypothesis testing


There seems to be some ambiguity or contradiction in how to correctly choose the null and alternative hypotheses, both online and in my instructor's notes. I'm trying to figure out whether this stems merely from my lack of understanding or whether there actually is a disagreement in the scientific community at large. I've seen the following two ideas on choosing H0 and Ha:

  1. The null hypothesis is the status quo, the state of things already accepted and/or shown to be true by previous data. We assume it to be true and need convincing evidence to reject it. The alternative hypothesis is the one being proposed based on data from the experiment in question, and is assumed to be false unless the data supporting it can convincingly show otherwise.

  2. The null hypothesis is always the one that includes the equality, and the alternative hypothesis is the complement to it. It doesn't matter whether the equality is the status quo or is being claimed by the researcher, it is always H0.

Here is an example I made up for demonstration purposes. I'm not looking for an actual solution, only the hypotheses:

A researcher believes that children in economically disadvantaged areas are more likely to be raised in single-parent homes. He surveys 1000 children from such an area and finds that 317 of them are raised in a single-parent home. Can we conclude with 95% confidence that 30% or more of the children in economically disadvantaged areas are raised in single-parent homes?

What would be the H0 and Ha in this case and why?
My professor gave the following as the correct answer (for an equivalent question with different numbers):

H0 : p >= 0.3; Ha : p < 0.3

With the rationale that H0 must include the equality, which in this case is greater or equal to 30%. Her solution then failed to reject the null hypothesis and concluded that the researcher's claim is therefore correct.
To me this seems like assuming the claim to be true to begin with and giving it the benefit of the doubt, which is the opposite of what I thought was the correct approach.

A professor in this related question Difference between "at least" and "more than" in hypothesis testing? seemingly took the same approach.

I wish I could talk to my professor about this, but unfortunately there's a significant language barrier.


There are 3 best solutions below

5
On

Your null hypothesis is $H_0:p=0.3$

The alternative hypothesis is $H_1:p>0.3$

You need to calculate $$P(X\geq 317)$$ using $X\sim \operatorname{Bin}(1000,0.3)$.

Can you finish?

Just to clarify:

  1. The null hypothesis always has an equal sign and never an inequality symbol
  2. In this particular example we conclude that $317$ is not in the critical region.

Since we fail to reject the null hypothesis, we conclude that there is insufficient evidence that the proportion is more than $30$%.
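The critical region mentioned in point 2 can be found directly. A minimal sketch in R, assuming a one-sided test at $\alpha=0.05$ based on the exact binomial distribution (the variable names are my own, not part of the original answer):

```r
# Assumed setup: n = 1000 children, p0 = 0.3 under H0, one-sided alpha = 0.05
n <- 1000
p0 <- 0.3
alpha <- 0.05

# qbinom() returns the smallest x with P(X <= x) >= 1 - alpha,
# so the smallest c with P(X >= c) <= alpha is one more than that.
crit <- qbinom(1 - alpha, n, p0) + 1

# Actual size of the test; at most alpha because X is discrete
size <- pbinom(crit - 1, n, p0, lower.tail = FALSE)

# TRUE would mean the observed count lies in the critical region
in_region <- 317 >= crit

print(c(critical_value = crit, size = size))
print(in_region)
```

The critical region is then every count from `crit` upward, and the observed count of 317 falls short of it.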

1
On

Both ideas about the null and alternative hypotheses are true. The null hypothesis must always include an equals sign, whether it be $\geq$, $\leq$, or just $=$. Usually, however, it's just $=$. The alternative hypothesis is what we wish to show.

The null hypothesis in this case is that the proportion of children in economically disadvantaged areas raised in single-parent homes is $30$%.

The alternative hypothesis is that the proportion of children in economically disadvantaged areas raised in single-parent homes is greater than $30$%.

More formally

$$H_0 : p=0.3$$

$$H_a : p \gt 0.3$$

There are two ways you can test this hypothesis if you so wish. Letting $X$ be the number of children raised in single-parent homes, you can use the normal approximation to the binomial:

$$P(X\geq317)=1-P(X\lt317)=1-\Phi\left(\frac{316.5-300}{\sqrt{1000\cdot0.3\cdot0.7}}\right)$$

where I used a continuity correction ($316.5$ in place of $317$).

In R:

> 1-pnorm((316.5-300)/sqrt(1000*.3*.7))
[1] 0.1274333

You could also use software to find the exact probability from the binomial distribution:

$$P(X\geq317)=\sum_{k=317}^{1000} {1000 \choose k}\cdot0.3^k\cdot0.7^{1000-k}$$

> sum(dbinom(317:1000,1000,.3))
[1] 0.1277011

Since $n$ is large, the normal approximation does very well.
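To see how much the continuity correction contributes to that agreement, here is a rough sketch comparing the exact tail probability with the normal approximation, with and without the correction. The variable names are my own, and the uncorrected version is included only for comparison:

```r
n <- 1000
p0 <- 0.3
x <- 317

mu <- n * p0                      # mean of X under H0
sigma <- sqrt(n * p0 * (1 - p0))  # standard deviation of X under H0

# Exact P(X >= 317), same quantity as sum(dbinom(317:1000, 1000, .3))
exact <- pbinom(x - 1, n, p0, lower.tail = FALSE)

# Normal approximation with the continuity correction (uses 316.5)
with_cc <- pnorm((x - 0.5 - mu) / sigma, lower.tail = FALSE)

# Normal approximation without the correction (uses 317)
without_cc <- pnorm((x - mu) / sigma, lower.tail = FALSE)

print(c(exact = exact, with_cc = with_cc, without_cc = without_cc))
```

The corrected value should sit noticeably closer to the exact one than the uncorrected value does.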

At $\alpha=0.05$ we fail to reject the null hypothesis, since the p-value of about $0.128$ exceeds $0.05$.
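The same decision can be reached in one call with R's built-in `binom.test`, which performs the exact binomial test. A short sketch, assuming the one-sided alternative $p>0.3$ and a significance level of $0.05$:

```r
# One-sided exact binomial test of H0: p = 0.3 against Ha: p > 0.3
result <- binom.test(x = 317, n = 1000, p = 0.3, alternative = "greater")

p_value <- result$p.value  # equals P(X >= 317) under H0
reject <- p_value < 0.05   # decision at the 5% level

print(p_value)
print(reject)
```

The p-value here is the same upper-tail sum computed by hand above, so `reject` comes out `FALSE`.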

1
On

You always have to choose $H_a$ so that the sample estimate satisfies $H_a$.

The reason is that otherwise the rejection rule will always favor $H_0$, as with your professor's incorrect choice.
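A sketch of why that happens, using the numbers from the question and an exact binomial calculation (this is my own illustration, not the professor's working). With $H_a\colon p<0.3$ the p-value is a lower-tail probability, and an observed count above the expected $300$ makes that probability at least one half:

```r
n <- 1000
p0 <- 0.3
x <- 317

# Lower-tail p-value for H0: p >= 0.3 against Ha: p < 0.3
p_lower <- pbinom(x, n, p0)

# Upper-tail p-value for H0: p = 0.3 against Ha: p > 0.3
p_upper <- pbinom(x - 1, n, p0, lower.tail = FALSE)

print(c(lower_tail = p_lower, upper_tail = p_upper))
```

With the lower-tail setup, no sample with a proportion above $0.3$ can ever lead to rejection, so "failing to reject" says nothing in favor of the researcher's claim.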

In your case you want to test a proportion against $0.3$, and the sample estimate was $0.317$, hence $H_a\colon p>0.3$ since $0.317>0.3$. It does not matter at all where the equals sign occurs, as long as you’re dealing with continuous random variables.
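One way to see why the placement of the equality changes little in practice: if the null is taken to be the composite $H_0\colon p\leq 0.3$, the probability of rejecting is largest at the boundary $p=0.3$, so a critical value calibrated at $p=0.3$ also controls the error rate for every smaller $p$. A rough sketch in R, where the grid of $p$ values and the variable names are arbitrary choices of mine:

```r
n <- 1000
alpha <- 0.05

# Critical value calibrated at the boundary p = 0.3: reject when X >= crit
crit <- qbinom(1 - alpha, n, 0.3) + 1

# Probability of rejecting for several true values of p inside H0: p <= 0.3
p_grid <- c(0.25, 0.28, 0.29, 0.30)
reject_prob <- pbinom(crit - 1, n, p_grid, lower.tail = FALSE)

print(data.frame(p = p_grid, reject_prob = reject_prob))
```

The rejection probability rises with $p$ and stays at or below $\alpha$ across the grid, which is why the test for $p\leq 0.3$ is carried out as if the null were $p=0.3$.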