Statistics A Level Errors in Hypothesis Testing

Question


98 Views Asked by Bumbble Comm At 10 May 2026 - 10:28

In a large company the time taken for an employee to carry out a certain task is a normally distributed random variable with mean $78.0s$ and unknown variance. A new training scheme is introduced and after its introduction the times taken by a random sample of $120$ employees are recorded. The mean time for the sample is $76.4s$ and an unbiased estimate of the population variance is $68.9s$.

(i) Test, at the $1$% significance level, whether the mean time taken for the task has changed.

(ii) It is required to redesign the test so that the probability of making a Type I error is less than $0.01$ when the sample mean is $77.0s$. Calculate an estimate of the smallest sample size needed, and explain why your answer is only an estimate.

This question is taken from a past A level stats $2$ paper. I'm having trouble understanding part ii) of this (part i included in case it's relevant). The mark scheme says to standardise $1$ with $n$ and $2.576$, which I understand is the critical level of the test, but I really don't understand what's going on here so any explanation would be greatly appreciated!


There are 2 best solutions below

Bumbble Comm On 19 May 2018 - 10:51

For part (ii), you are being asked for the minimum sample size at which a difference of only $1$ s between the sample mean and the hypothesised mean is significant at the $1\%$ significance level (loosely, "the $99\%$ level"). The probability of making a Type I error is $1\%$ at that level, so the wording is just an indirect way of specifying it. In other respects the question is somewhat backwards, because in practice you do not know the sample mean until you have collected a sample of known size $n$.

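The $t$ and $Z$ statistics have the same form; the only difference is whether the standard deviation is the sample value $s$ or the population value $\sigma$. Taking the magnitude of the difference, $|\bar x - \mu_0| = |77 - 78| = 1$: $$|t| \text{ (or } |Z|) = \frac{|\bar x - \mu_0|}{s \text{ (or } \sigma)/\sqrt{n}}=\frac{1\sqrt n}{\sqrt{68.9}} = \frac{\sqrt n}{8.3}$$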

In a $t$ table, the bottom entry of the two-tailed $1\%$ column (infinite degrees of freedom) is $2.576$, which is the standard normal value, so $Z = 2.576$ at the $99\%$ level. $$2.576 = \frac{\sqrt n}{8.3}$$ $$n = (2.576\cdot 8.3)^2 = 457.14$$ Since $n$ must be a whole number, round up to $n = 458$.
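As a quick check of this arithmetic, here is a minimal R sketch. The helper name `min_n` and its arguments are illustrative (they do not come from the mark scheme), and the code assumes the sample variance $68.9$ can stand in for the unknown population variance.

```r
# Smallest n for which a given shift in the sample mean reaches the
# two-sided normal critical value. Assumption: sigma^2 is replaced by
# the sample estimate, so the result is only an estimate.
min_n <- function(shift, var_est, alpha = 0.01) {
  z_crit  <- qnorm(1 - alpha / 2)                     # about 2.576 for alpha = 0.01
  n_exact <- (z_crit * sqrt(var_est) / abs(shift))^2  # solve |shift| / (sd / sqrt(n)) = z_crit
  ceiling(n_exact)                                    # n must be a whole number, so round up
}

n_needed <- min_n(shift = 77.0 - 78.0, var_est = 68.9)
print(n_needed)
```

Rounding up rather than to the nearest integer matters here: with $n = 457$ the statistic falls just short of the critical value.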

This is only an estimate because we used the sample standard deviation in place of the unknown population standard deviation in a $Z$ score calculation of $n$.

**Bumbble Comm** · Accepted Answer

The problem, as posted, seems flawed in several ways, so I will answer step by step.

First, an unbiased estimate of $\sigma^2$ must be in squared units, here $S^2 = 68.9\,\mathrm{s}^2.$ Then the (very slightly biased) estimate of $\sigma$ is $S = 8.3\,\mathrm{s}.$

In (i) it seems that you must assume that the historical population mean is $\mu_0 = 78s.$ Presumably, this is based on a large enough sample that this population mean for the old training scheme is known with negligible error.

Then, to test $H_0: \mu = 78$ against $H_1: \mu \ne 78,$ based on a sample of size $n = 120$ with sample mean $\bar X = 76.4$ and $S = 8.3,$ one would obtain the test statistic $$T =\frac{76.4 - 78}{8.3/\sqrt{120}} = -2.111701.$$ Under $H_0,$ the test statistic has Student's t distribution with 119 degrees of freedom. The value that cuts probability $0.005$ from the upper tail of this distribution is $t^* = 2.618$ (sometimes sloppily approximated by a standard normal distribution as $t^* \approx 2.576).$ Because $|T| = 2.112 < 2.618$ (or 2.576, if you insist), $H_0$ is not rejected at the 1% level of significance, but is rejected at the 5% level. [The advice to approximate t by z when $n > 30$ is reasonably good for tests at the 5% level, but not so good at the 1% level--as we have just seen.]

The P-value 0.0368 of this two-sided t test is the sum of the areas under the density curve of $\mathsf{T}(df = 119)$ to the left of -2.112 and above 2.112.

Relevant computations in R statistical software for some values given above are shown below:

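qt(.005, 119)        # lower 0.5% point of Student's t with 119 df
[1] -2.617776
qnorm(.005)          # lower 0.5% point of the standard normal
[1] -2.575829
pt(-2.112, 119)*2    # two-sided P-value for T = -2.112
[1] 0.03677776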
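For readers who want the whole of part (i) in one place, the following R sketch assembles the test from the summary statistics alone. The variable names are illustrative, and $S$ is taken as $\sqrt{68.9}$ rather than the rounded $8.3$, so the statistic differs from the value quoted above in the fourth decimal place.

```r
# Summary statistics from the question
xbar <- 76.4; mu0 <- 78; n <- 120
s <- sqrt(68.9)                            # unrounded sample standard deviation

t_stat <- (xbar - mu0) / (s / sqrt(n))     # one-sample t statistic
crit_1 <- qt(1 - 0.01 / 2, df = n - 1)     # two-sided 1% critical value
crit_5 <- qt(1 - 0.05 / 2, df = n - 1)     # two-sided 5% critical value
p_val  <- 2 * pt(-abs(t_stat), df = n - 1) # two-sided P-value

reject_1 <- abs(t_stat) > crit_1           # decision at the 1% level
reject_5 <- abs(t_stat) > crit_5           # decision at the 5% level
print(c(t = t_stat, p = p_val))
print(c(reject_at_1pct = reject_1, reject_at_5pct = reject_5))
```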

Relevant output from Minitab statistical software is as follows:

One-Sample T 

Test of μ = 78 vs ≠ 78

  N    Mean  StDev  SE Mean       99% CI           T      P
120  76.400  8.300    0.758  (74.417, 78.383)  -2.11  0.037
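The "SE Mean" and "99% CI" columns can be reproduced in R along the following lines. This is a sketch using the same inputs as the Minitab run (with $S$ rounded to $8.3$); the variable names are illustrative.

```r
n <- 120; xbar <- 76.4; s <- 8.3            # values as entered in the Minitab run
se <- s / sqrt(n)                           # standard error of the mean ("SE Mean")
half_width <- qt(0.995, df = n - 1) * se    # margin for a two-sided 99% interval
ci <- c(lower = xbar - half_width, upper = xbar + half_width)
print(round(se, 3))
print(round(ci, 3))
```

The interval contains $78$, which agrees with not rejecting $H_0$ at the 1% level.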

For (ii) it seems that someone wants to increase the sample size so that if $\bar X = 77$ for the larger sample, then one would (just barely) reject $H_0$ against the two-sided alternative. The unwarranted assumption here is that $\sigma$ for the population trained under the new scheme would match the sample standard deviation $S = 8.3$ of the current experiment. In that case, we would have a z test with test statistic $|Z| = \frac{|77 - 78|}{8.3/\sqrt{n}} = 2.576.$ So $n = 458$ should suffice.
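Spelling out the rearrangement, under the same assumption that $\sigma$ can be replaced by $S = 8.3$: rejection at the 1% level requires $$\frac{|77 - 78|}{8.3/\sqrt{n}} \ge 2.576 \iff \sqrt{n} \ge 2.576 \times 8.3 = 21.38 \iff n \ge 457.14,$$ and the smallest whole number satisfying this is $n = 458.$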

In a real application, a more worthwhile problem would be to find the sample size required to detect, with probability $95\%,$ a mean difference of 1 second if $\sigma = 8.3$ in a test of size $\alpha = 1\%$ or $5\%.$ That is, we seek $n$ that would give power 0.95 under such conditions. For a test at the 5% level, Minitab answers this question as $n = 898,$ using an exact computation based on t distributions. [This method requires use of noncentral t distributions; for moderately large sample sizes, a useful approximation can be found using the standard normal distribution.] For a test at the 1% level the answer is $n = 1231$ (output not shown).
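A comparable calculation can be sketched in R with the built-in `power.t.test`, which also works with the noncentral t distribution. The wrapper `n_for_power` is an illustrative name, and exact agreement with Minitab is an assumption rather than something checked here, although the results should land close to the sample sizes quoted above.

```r
# Power analysis for a one-sample, two-sided t test.
# Assumptions from the text: sigma = 8.3, true shift of 1 second, power 0.95.
n_for_power <- function(alpha) {
  res <- power.t.test(delta = 1, sd = 8.3, sig.level = alpha,
                      power = 0.95, type = "one.sample",
                      alternative = "two.sided")
  ceiling(res$n)   # round up to a whole number of employees
}

n_5pct <- n_for_power(0.05)
n_1pct <- n_for_power(0.01)
print(c(alpha_5pct = n_5pct, alpha_1pct = n_1pct))
```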

Note: Because the company likely wants to reduce the time it takes employees to do the task under discussion, it seems to me that the appropriate hypotheses would be $H_0: \mu = 78$ vs. $H_a: \mu < 78,$ but the statement of the problem clearly calls for a two-sided alternative. If you are studying for A level exams, you might try working the one-sided version on your own. Maybe the exam you take will have been prepared with more attention to practical concerns.
