How can the $t$-statistic be used to test a hypothesis?

I have the following question: A random sample of size 25 from a normal distribution has mean 47 and standard deviation 7. Based on $t$-statistics, can we say that the given information supports the conjecture that the mean of the population is 42?

I'm really confused about how the $t$-statistic is used to reject or fail to reject a hypothesis. An explanation would be really helpful. Thanks!

There are 1 best solutions below

Two-Sided One-Sample t Test

I just happened to have a normal dataset with $n=25, \bar X = 47, S = 7$ in my R Session window.

Are data appropriate for a t test? Here is a summary of the data, computed by R:

summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  35.18   40.78   44.83   47.00   52.35   61.34 
length(x); sd(x)
[1] 25   # sample size n = 25
[1] 7    # sample standard deviation S = 7.0

stripchart(x, pch="|")

[Figure: strip chart of the 25 observations]

The data are approximately symmetrical with no far outliers, and they pass the Shapiro-Wilk normality test with a P-value above $0.05 = 5\%.$

shapiro.test(x)

        Shapiro-Wilk normality test

data:  x
W = 0.96136, p-value = 0.4423

Data are close enough to normal for a t test to be valid.
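The 25 observations themselves are not listed, so rerunning the commands requires stand-in data. Here is a minimal sketch, assuming that any normal sample rescaled to have mean exactly 47 and SD exactly 7 is acceptable. This is not necessarily how the dataset above was produced, and the seed is arbitrary.

```r
set.seed(2024)                        # arbitrary seed, only for reproducibility
z <- rnorm(25)                        # 25 standard normal draws
x <- 47 + 7 * (z - mean(z)) / sd(z)   # rescale so mean(x) = 47 and sd(x) = 7
c(n = length(x), mean = mean(x), sd = sd(x))
```

The individual values (and therefore the quartiles and the Shapiro-Wilk result) will differ from those shown, but a one-sample t test depends only on $n,$ $\bar X,$ and $S,$ so `t.test(x, mu=42)` should give the same t statistic, P-value, and CI.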

R printout for the t test. Here is output from R for a one-sample t test of $H_0: \mu = 42$ against $H_a: \mu \ne 42.$

t.test(x, mu=42)

        One Sample t-test

data:  x
t = 3.5714, df = 24, p-value = 0.001543
alternative hypothesis: true mean is not equal to 42
95 percent confidence interval:
  44.11054 49.88946
sample estimates:
mean of x 
       47 

Interpretation of output. The P-value is $0.0015 < 0.05 = 5\%,$ so you would reject $H_0$ at the 5% level of significance. You could also reject at the 1% level.
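The t statistic and P-value in the printout can be checked from the summary statistics alone, using $T = (\bar X - \mu_0)/(S/\sqrt{n}).$ A short sketch, with illustrative variable names:

```r
n <- 25; xbar <- 47; s <- 7; mu0 <- 42    # summary statistics and null value

se     <- s / sqrt(n)                     # standard error of the mean: 7/5 = 1.4
t_stat <- (xbar - mu0) / se               # observed T statistic: 5/1.4
p_val  <- 2 * pt(-abs(t_stat), df = n - 1)  # two-sided P-value, DF = 24

c(t = t_stat, p = p_val)
p_val < 0.05                              # TRUE means reject H0 at the 5% level
```

Both numbers should agree with the `t.test` printout up to rounding.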

The output also gives a 95% confidence interval (CI) $(44.11, 49.89),$ so we can be 95% confident that the true value of $\mu$ lies in that interval, which does not contain $\mu = 42.$

One interpretation of this CI is that it is an interval of "non-rejectable" null hypotheses, based on your data.
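To make the "non-rejectable" reading concrete, here is a rough sketch that rebuilds the 95% CI as $\bar X \pm t^* S/\sqrt{n}$ and then tests several hypothetical null values. The helper `p_two_sided` and the trial values are illustrative, not part of the R printout.

```r
n <- 25; xbar <- 47; s <- 7

t_crit <- qt(0.975, df = n - 1)                # cuts 2.5% from the upper tail
ci <- xbar + c(-1, 1) * t_crit * s / sqrt(n)   # 95% CI for mu
ci

# Hypothetical helper: two-sided P-value for H0: mu = mu0
p_two_sided <- function(mu0) {
  2 * pt(-abs((xbar - mu0) / (s / sqrt(n))), df = n - 1)
}

# Null values inside the CI give P > 0.05; values outside give P < 0.05
mu0_try <- c(42, 44.2, 47, 49.8, 50)
data.frame(mu0 = mu0_try, p = p_two_sided(mu0_try))
```

Of the trial values, 44.2, 47, and 49.8 lie inside the interval and would not be rejected at the 5% level, while 42 and 50 lie outside and would be.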

Details you should know about the test. @PeterForeman has shown you how to compute the T-statistic. Except for the P-value, you should be able to reproduce everything else in the output by hand computation.

  • Exact P-values are given in computer printouts. By looking at a printed table of t, you should be able to 'bracket' the P-value. For example, my table has values 3.467 and 3.745 on line DF = 24, which bracket the T-statistic 3.5714. Looking at the top margin of my table, I see that the P-value must be between $2(0.001) = 0.002$ and $2(0.0005) = 0.001,$ which agrees with the value from R. [The 2s are because this is a 2-sided t test.]

  • You can get the exact P-value of this 2-sided test in R or other statistical software. It is the probability of a T statistic farther from $0$ than the observed $T =3.5714.$ In R, where pt is a CDF of Student's t distribution, the following computation gets you very close to the P-value in the printout. (If the value of the reported T statistic is rounded, then the P-value may not match exactly, but only the first couple of decimal places matter for decision making.)

2 * (1 - pt(3.5714, 24))
[1] 0.001543522

  • To answer one of your questions in comments: From the printed t table, you can say that a critical value for rejecting at the 5% level is $c = 2.064.$ That is, you would reject at the 5% level if $|T| > 2.064,$ which it is. The critical value cuts probability $0.025 = 2.5\% $ from the upper tail of Student's t distribution with DF = 24. In R, where qt is a quantile function (inverse CDF), you can get the 5% critical value as shown below. What is the critical value for a test at the 1% level of significance?

qt(.975, 24)
[1] 2.063899
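The table lookups in the bullets above can also be reproduced with `qt`. A small sketch: the helper `crit` is a hypothetical convenience function, not something from the printout.

```r
df    <- 24
t_obs <- 3.5714

# Table entries for upper-tail areas 0.001 and 0.0005 on line DF = 24
bracket <- qt(1 - c(0.001, 0.0005), df)
round(bracket, 3)

# TRUE means the two-sided P-value lies between 0.001 and 0.002
bracket[1] < t_obs && t_obs < bracket[2]

# Hypothetical helper: two-sided critical value at significance level alpha
crit <- function(alpha, df) qt(1 - alpha / 2, df)
abs(t_obs) > crit(0.05, df)    # TRUE means reject at the 5% level
```

Calling `crit(0.01, df)` gives the corresponding critical value for a test at the 1% level.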

Graphical summary. The figure below shows the density function of Student's t distribution with 24 DF. The vertical blue line shows the observed value of the T-statistic. The P-value is twice the area under the curve to the right of this line. Lower and upper critical values for a test at the 5% level are shown by vertical dotted orange lines; red lines (farther out) for a test at the 1% level.

[Figure: density of Student's t distribution with 24 DF, showing the observed T-statistic and the 5% and 1% critical values]
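A figure along these lines can be drawn with base R graphics. This is only a sketch of one way to do it: the axis range, labels, and line widths are assumptions, and only the colors and dotted style follow the description above.

```r
df    <- 24
t_obs <- 3.5714
crit5 <- qt(0.975, df)   # 5% two-sided critical value
crit1 <- qt(0.995, df)   # 1% two-sided critical value

# Density of Student's t distribution with 24 DF
curve(dt(x, df), from = -5, to = 5, lwd = 2,
      xlab = "t", ylab = "Density",
      main = "Student's t density, DF = 24")

abline(v = t_obs, col = "blue", lwd = 2)                            # observed T
abline(v = c(-1, 1) * crit5, col = "orange", lty = "dotted", lwd = 2)  # 5% level
abline(v = c(-1, 1) * crit1, col = "red", lty = "dotted", lwd = 2)     # 1% level
```

The blue line falls to the right of both upper critical values, which is the graphical counterpart of rejecting at both the 5% and 1% levels.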