Meaning of the p-value


Suppose that we have a null hypothesis $H_0: \ \theta=\theta_0$. Our alternative hypothesis could be, for example, $H_1: \ \theta\ne \theta_0$. We want to test the null hypothesis, so we construct a test statistic $T$ which has some probability distribution. Based on the data, we compute the numerical value of the test statistic to be $t$.

For some reason, we define $p:=2 \min\left\{\mathbb{P}\left(T\ge t\mid H_0 \right),\mathbb{P}(T \le t\mid H_0)\right\}$, and if the p-value is very small, we abandon the null hypothesis. Could someone please clarify how the p-value implies that the null hypothesis is incorrect? And what about the cases where we have $H_1: \ \theta>\theta_0$ or $H_1: \ \theta < \theta_0$? I think it would be more convenient to check the probability of having $T\in \left(t-\varepsilon, t+\varepsilon \right)$, assuming that the null hypothesis holds.


There are 2 best solutions below

0
On BEST ANSWER

To clarify the problem, look at the following example (it is only an example).

[Figure: standard Gaussian density with the critical value $z$, the observed test statistic, and the p-value shaded in purple]

Suppose you have a distribution like the one in the picture (a standard Gaussian) and suppose that your critical value is $z$. The area in the tail to the left of $z$ is your significance level, say $\alpha$.

If your test statistic is the one I showed, it is clear that you are in the rejection region, very far from the center of the distribution, and the p-value is the probability represented by the purple area. It is evident that in this situation

$$\text{p-value}<\alpha$$

and this means: abandon the null hypothesis.

Now I think it is clear that "the smaller the p-value, the less compatible the data are with $H_0$".


If the test is two-sided, the p-value must be multiplied by 2: it is then the area of both extreme tails.
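As a rough numerical sketch of the picture above, the snippet below computes the left-tail, right-tail, and two-sided p-values for a test statistic that is assumed to be standard Gaussian under $H_0$. The observed value `t_observed`, the level `alpha`, and the helper names are illustrative assumptions, not part of the original example.

```python
import math


def normal_cdf(x: float) -> float:
    """CDF of the standard normal, written with the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def p_values(t: float) -> dict:
    """Tail areas for an observed statistic t, assuming T ~ N(0, 1) under H0."""
    left = normal_cdf(t)      # P(T <= t | H0), used for H1: theta < theta_0
    right = normal_cdf(-t)    # P(T >= t | H0), by symmetry of the Gaussian
    return {
        "left": left,
        "right": right,
        "two_sided": 2.0 * min(left, right),  # area of both extreme tails
    }


alpha = 0.05        # assumed significance level
t_observed = -2.5   # hypothetical observed test statistic

for name, p in p_values(t_observed).items():
    print(f"{name}: p = {p:.4f}, reject H0 at level {alpha}: {p < alpha}")
```

With a statistic far out in the left tail, the left-tail and two-sided p-values are small while the right-tail p-value is close to 1, which is why the choice of $H_1$ decides which tail area is used.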

1
On

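The observed $t$ is a realization of a random variable $T$, which we would expect to take values in a certain region if $H_0$ holds. To be precise: we fix a significance level $\alpha$ and define an interval such that $T$ takes values in this interval with probability $1-\alpha$ if the null hypothesis $H_0$ is indeed true. For a continuous $T$ and an equal-tailed interval, the observed $t$ falls outside this interval exactly when $p<\alpha$.

So if we observe that $t$ does not lie in this interval, we have seen something that would be very unlikely if $H_0$ were true, and we take this as evidence against $H_0$.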

As you noticed, this is a conditional probability concerning the observed data and not a probability of the hypothesis $H_0$ itself (if you want to assign probabilities to hypotheses you need the framework of Bayesian statistics).
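A small simulation can make the interval argument concrete. The sketch below assumes $T$ is standard Gaussian under $H_0$ and uses a hypothetical level `alpha = 0.05`. It draws many values of $T$ under $H_0$ and counts how often they fall outside the central interval, and how often the two-sided p-value is below `alpha`. The function names and the number of trials are illustrative choices.

```python
import random
from statistics import NormalDist

std_normal = NormalDist(0.0, 1.0)
alpha = 0.05                              # assumed significance level
z = std_normal.inv_cdf(1.0 - alpha / 2)   # central interval is (-z, z)


def two_sided_p(t: float) -> float:
    """p = 2 * min(P(T <= t | H0), P(T >= t | H0)) for T ~ N(0, 1)."""
    lower = std_normal.cdf(t)
    return 2.0 * min(lower, 1.0 - lower)


def simulate(n_trials: int, seed: int = 0) -> tuple:
    """Draw T under H0 and count rejections by each of the two criteria."""
    rng = random.Random(seed)
    outside_interval = 0
    small_p = 0
    for _ in range(n_trials):
        t = rng.gauss(0.0, 1.0)           # data generated with H0 true
        outside_interval += abs(t) > z
        small_p += two_sided_p(t) < alpha
    return outside_interval, small_p


n = 20_000
outside, small = simulate(n)
print(f"outside interval: {outside / n:.4f}, p < alpha: {small / n:.4f}")
```

Both frequencies should come out close to `alpha`: when $H_0$ is true, a rejection happens only about 5% of the time. This is a statement about the data given $H_0$, not about the probability of $H_0$.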