Two different cases of uniform hypothesis testing

I have two p-value problems involving uniform distributions. I know the definition of the p-value: the probability, under $H_0$, of observing a new $X$ at least as extreme as the initial $X$.

Problem I:

$X$ has a uniform distribution on the interval $[0, z]$. We test $H_0: z=3$ against $H_1: z>3$, and as the test statistic we take $X$. We observe $x=1$. What is the p-value?

Problem II:

We have a collection of tanks numbered from $1$ to $K$, and $20$ of them are chosen as a sample with replacement. We want to test $H_0: K = 100000$ versus $H_1: K<100000$. The test statistic is the maximum number in the sample, $M$. Assume $M = 81115$. What is the p-value?

The first p-value is the region to the right of the observed value, which gives $2/3$. The second p-value is the region to the left of the observed value, which gives $(81115/100000)^{20}$. Initially I thought I had to look at the sign in $H_1$, but that 'assumption' doesn't seem to hold with the theory.

I see how the definition of the p-value holds in the first one. You want to know the chance of observing a value $x$ greater than or equal to $1$, so you take the right region.

I don't see how the definition holds in the case of problem 2. I think I'm confused by the test statistic $\max\{\}$. Following the same line of thought: what is the probability of getting the same or a more extreme result? Here I get stuck; I don't see why the left region is calculated. I do see why it is raised to the power $20$: the $20$ observations are independent.

Question 1: Why is the left region taken in problem 2?

Question 2: Here the test statistic is $\max\{x_1,x_2,\ldots,x_n\}$, but what should I do when the test statistic is $\operatorname{MEAN}\{x_1,x_2,\ldots,x_n\}$ or $\min\{x_1,x_2,\ldots,x_n\}$? Is there a derivation somewhere I can look into?

There are 1 best solutions below

On BEST ANSWER

The $p$-value is the probability of observing a value of the test statistic that is at least as extreme as what you observed, given that the null hypothesis is true. In the case of the second problem, this means observing a maximum tank number that is $81115$ or smaller, given that there are $K = 100000$ tanks. The reason is that if $M$ is the maximum tank number observed in the sample, and the alternative hypothesis is that there are fewer than $100000$ tanks, then the smaller the value of $M$ you observe, the more evidence you have in favor of rejecting $H_0$. Consequently, smaller $M$ values are considered "more extreme" than larger ones. To illustrate, if it were indeed true that there are $100000$ tanks, and you observed $M = 37$, is that very likely? You'd have to pick, in each of $20$ tries, a tank with a number not exceeding $37$. The probability of such an event is $(37/100000)^{20} \approx 2.31225 \times 10^{-69}$.
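Written out as a short derivation, the step from "every draw is at most $m$" to the $20$th power looks like this. The symbols $X_1, \ldots, X_n$ (the individual tank numbers drawn) and $m$ (the observed maximum) are notation introduced here for illustration; the only assumptions are those of the problem, namely that under $H_0$ each draw is uniform on $\{1, \ldots, K\}$ and that the draws are independent because sampling is with replacement.

$$\Pr(M \le m \mid H_0) = \Pr(X_1 \le m, \ldots, X_n \le m) = \prod_{i=1}^{n} \Pr(X_i \le m) = \left(\frac{m}{K}\right)^{n}.$$

The same independence argument applied to the minimum gives $\Pr(\min_i X_i > m \mid H_0) = \left(1 - \frac{m}{K}\right)^{n}$, since the minimum exceeds $m$ exactly when every draw does.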

This is why your $p$-value for the second problem is $(81115/100000)^{20} \approx 0.0152063$: this is the probability that the maximum tank number in a sample of size $n = 20$ is at most $81115$, assuming there are $100000$ tanks. It's not impossible, but somewhat unlikely: each tank in your sample had only a $0.81115$ probability of having a number not exceeding the value of your statistic.
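If you want to check these numbers yourself, here is a minimal Python sketch. It computes both $p$-values from the question in closed form and then estimates the second one by simulation. The function names, the number of trials, and the seed are my own choices for illustration, not part of the problem.

```python
import random


def p_value_uniform_right(x, z0):
    """Problem I: P(X >= x) when X is uniform on [0, z0] (right tail)."""
    return 1 - x / z0


def p_value_max_left(m, K, n):
    """Problem II: P(M <= m) for the max of n draws with replacement from 1..K."""
    return (m / K) ** n


def simulate_p_value_max_left(m, K, n, trials=50_000, seed=0):
    """Monte Carlo estimate of P(M <= m) under H0 (trials and seed are arbitrary)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw n tank numbers with replacement and record the largest one.
        sample_max = max(rng.randint(1, K) for _ in range(n))
        if sample_max <= m:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print("Problem I, exact:     ", p_value_uniform_right(1, 3))
    print("Problem II, exact:    ", p_value_max_left(81115, 100000, 20))
    print("Problem II, simulated:", simulate_p_value_max_left(81115, 100000, 20))
```

The simulated value should land close to the exact one, up to Monte Carlo noise. Which tail you sum over is the only thing that changes between the two problems, and it is decided by which values of the statistic count as evidence for $H_1$.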