In a large company the time taken for an employee to carry out a certain task is a normally distributed random variable with mean $78.0s$ and unknown variance. A new training scheme is introduced and after its introduction the times taken by a random sample of $120$ employees are recorded. The mean time for the sample is $76.4s$ and an unbiased estimate of the population variance is $68.9s$.
(i) Test, at the $1$% significance level, whether the mean time taken for the task has changed.
(ii) It is required to redesign the test so that the probability of making a Type I error is less than $0.01$ when the sample mean is $77.0s$. Calculate an estimate of the smallest sample size needed, and explain why your answer is only an estimate.
This question is taken from a past A-level Stats $2$ paper. I'm having trouble understanding part (ii) of this (part (i) is included in case it's relevant). The mark scheme says to standardise $1$ with $n$ and $2.576$, which I understand is the critical value of the test, but I really don't understand what's going on here, so any explanation would be greatly appreciated!
The problem as posted seems flawed in several ways, so I will answer step by step.
First, an unbiased estimate of $\sigma^2$ must be in squared units, here $S^2 = 68.9\,\text{s}^2.$ Then the (very slightly biased) estimate of $\sigma$ is $S = \sqrt{68.9} \approx 8.3\,\text{s}.$
In (i) it seems that you must assume that the historical population mean is $\mu_0 = 78s.$ Presumably, this is based on a large enough sample that this population mean for the old training scheme is known with negligible error.
Then, to test $H_0: \mu = 78$ against $H_1: \mu \ne 78,$ based on a sample of size $n = 120$ with sample mean $\bar X = 76.4$ and $S = 8.3,$ one would obtain the test statistic $$T =\frac{76.4 - 78}{8.3/\sqrt{120}} = -2.111701.$$ Under $H_0,$ the test statistic has Student's t distribution with $n - 1 = 119$ degrees of freedom. The value that cuts probability $0.005$ from the upper tail of this distribution is $t^* = 2.618$ (sometimes sloppily approximated by the standard normal value, $t^* \approx 2.576).$ Because $|T| = 2.112 < 2.618$ (or 2.576, if you insist), $H_0$ is not rejected at the 1% level of significance. It is, however, rejected at the 5% level, where the critical value is about 1.980. [The advice to approximate t by z when $n > 30$ is reasonably good for tests at the 5% level, but not so good at the 1% level, as we have just seen.]
The P-value of this two-sided t test, 0.0368, is the sum of the areas under the density curve of $\mathsf{T}(df = 119)$ to the left of -2.112 and to the right of 2.112.
Relevant computations in R statistical software for some values given above are shown below:
Relevant output from Minitab statistical software is as follows:
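As a cross-check that does not depend on either package, the figures for part (i) can be recomputed with a minimal standard-library Python sketch. The helper names `t_pdf`, `t_cdf` and `t_quantile` are hypothetical, introduced only for this illustration; the t probabilities come from straightforward numerical integration rather than a statistics library, and $S = 8.3$ is used in rounded form as in the text.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """CDF by composite Simpson's rule on [0, |x|], using symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_quantile(p, df):
    """Upper quantile (0.5 < p < 1) by bisection on the CDF."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Values from the question; s = 8.3 is the rounded square root of 68.9.
n, xbar, mu0, s = 120, 76.4, 78.0, 8.3
df = n - 1

t_stat = (xbar - mu0) / (s / math.sqrt(n))      # test statistic T
p_value = 2 * (1 - t_cdf(abs(t_stat), df))      # two-sided P-value
t_crit = t_quantile(0.995, df)                  # exact 1% two-sided critical value
z_crit = NormalDist().inv_cdf(0.995)            # normal approximation to it

print(f"T = {t_stat:.4f}, P-value = {p_value:.4f}")
print(f"t* = {t_crit:.3f}, z* = {z_crit:.3f}")
```

The printed values should agree, to the precision quoted, with the test statistic, P-value and critical values given in the answer.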
For (ii) it seems that someone wants to increase the sample size so that if $\bar X = 77$ for the larger sample, then one would (just barely) reject $H_0$ against the two-sided alternative. The unwarranted assumption here is that $\sigma$ for the population trained under the new scheme would match the sample standard deviation $S = 8.3$ of the current experiment. In that case, we would have a z test with test statistic $|Z| = \frac{|77 - 78|}{8.3/\sqrt{n}} = 2.576.$ So $n = 458$ should suffice.
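Spelling out the algebra behind that sample size, still under the assumption that $\sigma$ can be replaced by $S = 8.3$: rejection requires $|Z| \ge 2.576,$ so

$$\sqrt{n} \ge \frac{2.576 \times 8.3}{|77 - 78|} \approx 21.38, \qquad n \ge 21.38^2 \approx 457.1,$$

and the smallest integer satisfying this is $n = 458.$ The answer is only an estimate because the value 8.3 is itself a sample estimate of $\sigma,$ not the true population value.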
In a real application, a more worthwhile problem would be to find the sample size required to detect, with probability $95\%,$ a mean difference of 1 second if $\sigma = 8.3$ in a test of size $\alpha = 1\%$ or $5\%.$ That is, we seek $n$ that would give power 0.95 under such conditions. For a test at the 5% level, Minitab answers this question as $n = 898,$ using an exact computation based on t distributions. [This method requires use of noncentral t distributions; for moderately large sample sizes, a useful approximation can be found using the standard normal distribution.] For a test at the 1% level the answer is $n = 1231$ (output not shown).
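The standard normal approximation mentioned in the bracketed remark can be sketched in a few lines of standard-library Python. This is not the exact noncentral t computation that Minitab performs; it uses the textbook formula $n \approx \big((z_{\alpha/2} + z_{\beta})\,\sigma/\delta\big)^2,$ ignores the negligible chance of rejecting in the wrong tail, and the function name `sample_size_normal_approx` is hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_normal_approx(delta, sigma, alpha, power):
    """Approximate n for a two-sided z test to detect a shift of delta.

    Uses n = ((z_{1-alpha/2} + z_{power}) * sigma / delta)^2, rounded up.
    This is a normal approximation, not the exact noncentral t calculation.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return math.ceil(((z_alpha + z_power) * sigma / delta) ** 2)

# Detect a 1-second difference with sigma = 8.3 and power 0.95.
for alpha in (0.05, 0.01):
    n = sample_size_normal_approx(delta=1.0, sigma=8.3, alpha=alpha, power=0.95)
    print(f"alpha = {alpha}: n is about {n}")
```

The approximation gives 896 and 1228, slightly below the exact values 898 and 1231 quoted above, which is the expected direction because the t-based test has heavier tails than the normal.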
Note: Because the company likely wants to reduce the time it takes employees to do the task under discussion, it seems to me that the appropriate hypotheses would be $H_0: \mu = 78$ vs. $H_a: \mu < 78,$ but the statement of the problem clearly calls for a two-sided alternative. If you are studying for A level exams, you might try working the one-sided version on your own. Maybe the exam you take will have been prepared with more attention to practical concerns.