Proper hypothesis test setup


Suppose a lower $95\%$ confidence bound for $X$ gives the interval $(40,\infty)$.

Given $X=40$, design a hypothesis test, i.e., just show $H_0$ and $H_1$ using the information given above.

I have tried this:

$H_0: X = 40$
$H_1: X < 40$

I designed it this way because we know that, across randomized trials, $X > 40$ about 95 percent of the time. So a reasonable alternative hypothesis would test whether $X < 40$ (similar to a left-tailed test with $\alpha=0.05$), since we have a lower confidence bound.

Is this right?

There are 1 best solutions below

Null and alternative hypotheses need to be stated in terms of an unknown population parameter. Here I suppose that is $\mu.$

Typically, data are a sample of size $n$ from a population: $X_1, X_2, \dots, X_n$ and one would use $\bar X =\frac 1n \sum_{i=1}^n X_i$ to estimate $\mu$ and to test a hypothesis such as $H_0: \mu = 40$ against the alternative $H_1: \mu < 40.$

There would be no point in having a null hypothesis involving $\bar X$ because we know the value of $\bar X$ from the data.

I can't quite make sense of your problem and know nothing about the level of your course, but here is my best attempt to show you an example of a test that may be helpful.

Suppose we have $n = 10$ observations from a population with unknown population mean $\mu.$ Displayed and summarized in R, data may be as follows:

x
[1] 38.5 38.2 32.6 32.5 25.8 39.9 40.8 36.1 44.0 37.5
summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  25.80   33.48   37.85   36.59   39.55   44.00 
sd(x)
[1] 5.163644
stripchart(x, pch="|")

[Figure: strip chart of the ten observations in x]

From the above, we know that $\bar X = 36.59$ is smaller than $40$ and that most of the observations are below $40.$ The question is whether we have enough evidence with these ten observations to claim that $\bar X$ is 'significantly' less than $40$ in a statistical sense, so that we can reject $H_0$ and act as if $\mu < 40.$

Here is a t test from R, which also shows a one-sided confidence interval, such as the one you mentioned.

t.test(x, mu=40, alt="less")

        One Sample t-test

data:  x
t = -2.0883, df = 9, p-value = 0.03318
alternative hypothesis: true mean is less than 40
95 percent confidence interval:
     -Inf 39.58327
sample estimates:
mean of x 
    36.59 

In this case, with a left-sided alternative $H_1: \mu < 40,$ the 95% confidence interval for $\mu$ takes the form of an upper bound: $\mu \le 39.58.$ You could also write it as $(-\infty, 39.58).$

Because the P-value $0.033$ of the test is less than $0.05 = 5\%,$ we can say that the null hypothesis is rejected at the 5% level of significance.
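For anyone who wants to see where these numbers come from, the test statistic and the upper bound can be reproduced by hand from the summary values above. This is only a sketch using the standard one-sample t formulas, with $\mu_0 = 40$ the hypothesized mean and $S$ the sample standard deviation. The quantile $t_{0.95,\,9} \approx 1.833$ is taken from a t table; it does not appear in the output above.

$$T = \frac{\bar X - \mu_0}{S/\sqrt{n}} = \frac{36.59 - 40}{5.1636/\sqrt{10}} \approx \frac{-3.41}{1.6329} \approx -2.088$$

$$\bar X + t_{0.95,\,9}\,\frac{S}{\sqrt{n}} \approx 36.59 + 1.833 \times 1.6329 \approx 39.58$$

Both values agree with the R output, up to rounding.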


Notes: (1) Notice that my alternative $H_1: \mu < 40$ leads to a one-sided confidence interval that gives an upper bound on $\mu.$ So if you want a one-sided CI that gives a lower bound, you would need an alternative stating that $\mu$ exceeds some value. [I used a left-sided alternative because that seemed to be what you were suggesting in your question, and I wanted you to see where that leads.]

If you change to $H_1: \mu > 30$ and change the syntax of the t test to

t.test(x, mu=30, alt="greater")

then you get p-value = 0.001474 and the 95% CI $(33.59673, \infty).$ [You are far from being the first student to get this turned backwards on the first try.]
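As a rough check on how the direction of the alternative ties to the direction of the bound, one might recompute both one-sided bounds by hand and compare them with right-sided tests. The sketch below re-enters the ten observations listed earlier. The variable names and the second null value of 35 are my own choices for illustration, not part of the original problem.

```r
# Data copied from the listing earlier in this answer.
x <- c(38.5, 38.2, 32.6, 32.5, 25.8, 39.9, 40.8, 36.1, 44.0, 37.5)

n      <- length(x)
se     <- sd(x) / sqrt(n)          # estimated standard error of the mean
t_crit <- qt(0.95, df = n - 1)     # 95th percentile of the t distribution, 9 df

lower <- mean(x) - t_crit * se     # lower 95% bound (pairs with alt = "greater")
upper <- mean(x) + t_crit * se     # upper 95% bound (pairs with alt = "less")

# Right-sided tests of H0: mu = mu0 against H1: mu > mu0.
# mu0 = 30 is the value used above; mu0 = 35 is a hypothetical extra value.
p_30 <- t.test(x, mu = 30, alternative = "greater")$p.value
p_35 <- t.test(x, mu = 35, alternative = "greater")$p.value

print(c(lower = lower, upper = upper))
print(c(p_30 = p_30, p_35 = p_35))
```

A right-sided test at the 5% level rejects $H_0: \mu = \mu_0$ exactly when $\mu_0$ falls below the lower bound. So with these data the test with $\mu_0 = 30$ rejects, while the one with $\mu_0 = 35$ does not.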

(2) In using a t test I have assumed that the population from which the data were sampled is normally distributed. Also, I am assuming that the population standard deviation $\sigma$ is unknown and is estimated by the sample standard deviation $S = \sqrt{\frac{1}{n-1}\sum_{i=1}^n(X_i-\bar X)^2}.$ In this example $S = 5.1636,$ as shown above.

(3) The data I show above were sampled using R as follows.

set.seed(2020) 
x = round(rnorm(10, 37, 4), 1)

You can see that the population from which the data were sampled is normal with mean $\mu=37$ and standard deviation $\sigma=4.$ Of course, in an actual application, you would never know the true value of $\mu.$