Situation: A t-test is being used; the population mean is known, as well as the sample mean, the sample standard deviation, and n = 31. The statement is "Is the true mean goals per game µ for soccer players in the 2014 - 2015 season still 12.5?"
Is the null hypothesis H0: µ = 12.5?
How do I formulate a null hypothesis? Is the above correct? What question do I ask myself?
Then would the alternative hypothesis be Ha: µ ≠ 12.5?
Follow up question, when I am conducting a t-test, and if my t-value comes out to be 0.13, would I have to double this when finding the p-value because I am conducting a two-sided t-test?
Thanks, hope this post is clear!
Yes. $H_0: \mu = 12.5$ against $H_a: \mu \ne 12.5.$ You would use the test statistic $T = \frac{\bar X - \mu_0}{s/\sqrt{n}},$ where $\mu_0 = 12.5.$ And you would reject $H_0$ at level $\alpha = 5\%$ if $|T| > t^*,$ where $t^*$ cuts probability $0.025$ from the upper tail of Student's t distribution with $n-1$ degrees of freedom. With $n = 31,$ you would have $t^* = 2.042.$ I got this value using R statistical software (`qt(0.975, 30)`), but you could get it by looking at row $df = n-1 = 30$ of a printed table of the t distribution.
Of course you can't expect $\bar X$ to be exactly 12.5 this year. The question is whether it differs by enough to make you doubt that the population mean is still 12.5. If $|\bar X - 12.5|$ is large enough that $|T| > 2.042,$ then one says that the difference is 'statistically significant' and rejects $H_0.$
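For readers who want to see the decision rule as arithmetic, here is a minimal Python sketch. The question does not give the sample mean or standard deviation, so `XBAR` and `S` below are made-up values for illustration only; the critical value is the one for 30 degrees of freedom (R's `qt(0.975, 30)`, rounded).

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """One-sample t statistic: (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))

MU0 = 12.5      # hypothesized mean from H0
N = 31          # sample size from the question
T_CRIT = 2.042  # upper 0.025 quantile of t with N - 1 = 30 df (rounded)

# Hypothetical sample summaries, NOT from the question.
XBAR = 12.9
S = 1.8

t_obs = one_sample_t(XBAR, MU0, S, N)
reject = abs(t_obs) > T_CRIT  # two-sided test at the 5% level
print(f"T = {t_obs:.3f}, reject H0: {reject}")
```

With these invented numbers $|T|$ falls below the critical value, so $H_0$ would not be rejected; a different sample could of course give a different answer.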
If $T = 0.13$ then the P-value is $$P(|T| \ge 0.13) = P(T \le -0.13) + P(T \ge 0.13) = 2P(T \le -0.13),$$ where the last equality follows from the symmetry of the t distribution about $0$. So yes, you would need to double the probability in one tail to get the two-sided P-value.
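The doubling can be checked numerically. The sketch below is one possible way to do it with only the Python standard library: it integrates the t density with Simpson's rule and uses symmetry about $0$. The helper names `t_pdf`, `t_cdf`, and `two_sided_p_value` are my own, not part of any library, and a statistics package would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """P(T <= x) via Simpson's rule on [0, |x|] plus symmetry about 0.

    steps must be even for Simpson's rule.
    """
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area

def two_sided_p_value(t_obs, df):
    """Two-sided P-value: twice the area in one tail beyond |t_obs|."""
    return 2 * t_cdf(-abs(t_obs), df)

p = two_sided_p_value(0.13, 30)
print(f"two-sided P-value: {p:.3f}")
```

For $T = 0.13$ with 30 degrees of freedom this gives a P-value of about 0.90, far above 5%.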
You may be able to get a rough idea of the P-value by clever use of a printed t table, but P-values are largely a product of the computer age. In R, `2 * pt(-0.13, 30)` gives the two-sided P-value, which is about 0.90.
Because the P-value is larger than 5%, you would not reject $H_0$ upon obtaining this value of $T.$ In the graph below, the P-value is the sum of two areas under the $\mathsf{T}(df=30)$ density curve: to the left of the left-hand red dotted line (at $-0.13$) and to the right of the right-hand line (at $0.13$).