I am confused by the book's answer to the following question:
Two examiners are marking an examination paper, and it is believed that examiner A is more strict than examiner B. The results from several papers are added together for each examiner and presented in the following table:
Examiner A: n = 16, sum = 689
Examiner B: n = 12, sum = 636
Test the claim at the 5% significance level, assuming the marks are normally distributed with a standard deviation of 15.
I'm happy that the answer uses z-scores rather than a t-test, because we know the parent population's variance. What confuses me is that the answer uses z(0.975) = 1.96 as the critical value, whereas I think this is a one-tailed test, so it ought to use z(0.95) = 1.645.
Is the book wrong, or am I missing something here?
A screenshot of the answer is here. The answer book is not well formatted.
That was an awful solution. Hard to read, poorly phrased, and ultimately, incorrect.
You are correct. Even by the solution's admission, the hypothesis is one-sided: $$H_0 : \mu_A - \mu_B = 0 \quad \text{vs.} \quad H_1 : \mu_A - \mu_B < 0.$$ Therefore, the critical value $z_\alpha$ at a significance level of $\alpha = 0.05$ must satisfy $\Pr[Z < z_\alpha] = 0.05$, which corresponds to $z_\alpha \approx -1.645$. This also happens to change the conclusion of the test, since the test statistic is $$Z \mid H_0 = \frac{\frac{689}{16} - \frac{636}{12}}{15 \sqrt{\frac{1}{16} + \frac{1}{12}}} = -\frac{53}{20} \sqrt{\frac{3}{7}} \approx -1.73483.$$ This value lies below the one-sided critical value $-1.645$, but it does not lie below $-1.96$, which is why the choice of critical value matters here. Therefore, the correct conclusion is to reject $H_0$ at the $\alpha = 0.05$ level: the data furnish evidence that $A$ is more strict than $B$.
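For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation, using only the standard library. The variable names are my own and purely illustrative. It assumes, as the question does, that both sets of marks are normal with a known standard deviation of 15.

```python
from math import sqrt
from statistics import NormalDist

# Data from the question (variable names are illustrative, not from the book).
n_a, sum_a = 16, 689   # Examiner A: number of papers, total marks
n_b, sum_b = 12, 636   # Examiner B: number of papers, total marks
sigma = 15.0           # known population standard deviation (given)
alpha = 0.05           # significance level

# Sample means and the standard error of their difference under H0.
mean_a = sum_a / n_a
mean_b = sum_b / n_b
se = sigma * sqrt(1 / n_a + 1 / n_b)

# Test statistic for H0: mu_A - mu_B = 0 against H1: mu_A - mu_B < 0.
z = (mean_a - mean_b) / se

std_normal = NormalDist()                    # standard normal, mean 0 and sd 1
z_crit_one = std_normal.inv_cdf(alpha)       # lower-tail critical value, about -1.645
z_crit_two = std_normal.inv_cdf(alpha / 2)   # two-sided critical value, about -1.96
p_one = std_normal.cdf(z)                    # one-sided (lower-tail) p-value

reject_one_sided = z < z_crit_one            # decision with the one-tailed test
reject_two_sided = abs(z) > abs(z_crit_two)  # decision if 1.96 is used instead

print(f"z = {z:.5f}")
print(f"one-sided critical value = {z_crit_one:.3f}, reject H0: {reject_one_sided}")
print(f"two-sided critical value = {z_crit_two:.3f}, reject H0: {reject_two_sided}")
print(f"one-sided p-value = {p_one:.4f}")
```

The two decisions disagree, which is exactly the discrepancy between the one-tailed test and the book's use of 1.96.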
Finally, even if it were the case that $H_0$ is not rejected, the appropriate conclusion is not to say "from the given data, Examiner A is not stricter than Examiner B." This is incorrect because it asserts that $H_0$ is true, when in fact, $H_0$ was assumed to be true in order to perform the test. That is to say, the test is performed under the assumption that the null hypothesis is true, hence it can never claim that the null is true. To do so would be like saying, "given that there is no difference in the mean scores, the data shows that there is no difference in the mean scores."
The correct interpretation of a failure to reject the null hypothesis is to state that there is insufficient evidence to reject the null. The test is inconclusive. This means the null may still be false, but the observed data are not able to show that it is false. You cannot "accept" $H_0$ because you assumed it was true in order to calculate the test statistic. Any resulting inference is conditioned on that assumption.