On a math test at a certain large school, a random sample of $61$ women had a mean score of $64$ and a standard deviation of $14$, and a random sample of $61$ men had a mean score of $60$ and a standard deviation of $12$. Can we conclude that the women at this school are better in math than the men at the $5\%$ level of significance?
Which test should be used?
Comment: I assume those are sample standard deviations, and also that the populations of test scores are normal.
There are two kinds of two-sample t tests: 'pooled' and 'separate variances (Welch)'. Unless you have strong reason (aside from the data) to believe the population variances are equal, the Welch test is preferred. Output from Minitab statistical software for the Welch test, based on the data summaries provided, is shown below:
The crucial bit of output is nearly the same for the pooled test:
T-Value = 1.69  P-Value = 0.046  DF = 120

The null hypothesis of equal means is rejected (just barely) at the 5% level in favor of the one-sided alternative that the women's mean is higher. I will leave it to you to match computations from formulas in your text with those of Minitab.
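If it helps with matching the formulas by hand, here is a minimal Python sketch of both tests computed from the summary statistics alone. It is not Minitab's implementation: the function names `t_upper_tail` and `two_sample_t` are made up for this illustration, and the one-sided P-value is obtained by numerically integrating the t density (Simpson's rule) rather than by a library routine. It assumes the alternative is that the first group's mean is larger.

```python
import math

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for Student's t with df degrees of freedom.

    Hypothetical helper: integrates the t density over [0, |t|] with
    Simpson's rule (steps must be even), then uses symmetry about 0.
    """
    if t < 0:
        return 1.0 - t_upper_tail(-t, df, steps)
    # Log of the normalizing constant of the t density.
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3

def two_sample_t(m1, s1, n1, m2, s2, n2, pooled=False):
    """Return (t, df, one-sided P-value) for H0: mu1 = mu2 vs Ha: mu1 > mu2."""
    if pooled:
        # Pooled test: common variance estimate, df = n1 + n2 - 2.
        df = n1 + n2 - 2
        sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
        se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    else:
        # Welch test: separate variances, Satterthwaite approximation for df.
        v1, v2 = s1**2 / n1, s2**2 / n2
        se = math.sqrt(v1 + v2)
        df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    t = (m1 - m2) / se
    return t, df, t_upper_tail(t, df)

# Women: mean 64, SD 14, n = 61; men: mean 60, SD 12, n = 61.
for label, pooled in (("Welch", False), ("Pooled", True)):
    t, df, p = two_sample_t(64, 14, 61, 60, 12, 61, pooled=pooled)
    print(f"{label}: T = {t:.2f}, DF = {df:.1f}, one-sided P = {p:.3f}")
```

Because the two sample sizes are equal, the standard error (and so the T-value) is identical for the two tests; only the degrees of freedom differ. The Welch formula gives a non-integer DF of roughly 117.3 here, which software may round down when reporting.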