Suppose we have a null hypothesis $H_0$ and an alternative hypothesis $H_1$, a test statistic $T$, and a data set $x_1, \dots, x_n$ realized from a random sample $X_1, \dots, X_n$. We use $T(x_1, \dots, x_n) = t$ to decide whether or not to reject $H_0$. My thought was that if we choose a suitable $T$, we can make the probability of observing an event at least as extreme as $t$ (for example, $P(T > t)$) low, which counts in favor of $H_1$. So by choosing an appropriate $T$ we can reject $H_0$. My question is: is this an easy job, is it reasonable to search for such a $T$, and how can we say that a test statistic is efficient?
Good test statistic
Simple example. Suppose you suspect a die may give 1s with probability less than $1/6.$ So you test $H_0: p = 1/6$ vs $H_a: p < 1/6.$ You roll the die $n = 120$ times and count the number $X$ of 1s. This test statistic is the sum of 120 Bernoulli random variables. Under $H_0$ (that is, assuming $H_0$ to be true), we have $X \sim \mathsf{Binom}(120, 1/6).$
By intuition, if $X$ is very small (much smaller than $E(X) = np = 20,$ anyhow) we would reject $H_0$ in favor of $H_a.$ We want the significance level of the test to be something like 5%. That means we'd reject $H_0$ about 5% of the time when the die is fine and $p = 1/6.$
With some experimentation in R (where pbinom is the binomial CDF and qbinom is the quantile function, that is, the inverse CDF), we find that we can have a test at significance level $0.0501$ or at level $0.0275.$
qbinom(.05, 120, 1/6)
[1] 13
pbinom(12:13, 120, 1/6)
[1] 0.02753237 0.05013085
Suppose we choose level $0.0501$ so that the 'critical value' of the test is $c = 13.$ That is, we reject $H_0,$ if $X \le c = 13,$ otherwise not.
The consequence is that we will make a Type I Error (rejecting $H_0$ when it is true) with probability very nearly 5%.
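The Type I error rate of this rule can also be checked by simulation. The following is a minimal sketch, not part of the original computation: the seed and the number of simulated experiments (`n.sim`) are arbitrary choices made here for illustration.

```r
set.seed(2024)                    # arbitrary seed (assumption), for reproducibility only
n.sim = 100000                    # number of simulated experiments (assumption)
x = rbinom(n.sim, 120, 1/6)       # number of 1s in 120 rolls of a fair die, per experiment
rej.rate = mean(x <= 13)          # fraction of experiments in which H0 is rejected
exact = pbinom(13, 120, 1/6)      # exact rejection probability under H0
c(simulated = rej.rate, exact = exact)
```

The simulated rejection rate should land close to the exact value from pbinom, differing only by simulation error.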
Now, one way to judge whether this is a 'good' test is to look at its 'power'. In particular, we may ask what is the rejection probability if the die has $p = 1/9.$ Formally, we seek $P(X \le 13 | p = 1/9) = 0.5344$ --- more than half the time. We say the power against the alternative $p = 1/9$ is $\pi(1/9) = 0.5344.$ The power is the probability of not making a Type II error.
pbinom(13, 120, 1/9)
[1] 0.5343601
Similarly $\pi(1/12) = 0.8741.$
pbinom(13, 120, 1/12)
[1] 0.8740746
One can make a 'power curve' showing the power of a test at level 5% against various alternative values. For alternative values of $p$ in $(0, 1/6),$ the rejection probabilities can be graphed as follows.
p = seq(.001, .166, by=.001)
pwr = pbinom(13, 120, p)
plot(p, pwr, type="l", ylim=c(0,1), main="Power Curve: Level 0.05")
abline(v=c(0,1/6), col="green2")
abline(h = .05, col="red")
points(c(1/9, 1/12), c(.5344, .8741), col="blue", pch=19)
In this figure the horizontal red line shows the significance level $\alpha = 0.05.$ The two heavy blue dots represent the specific power values (for alternatives $p = 1/9$ and $p = 1/12),$ computed above.
Notes: (1) For a test at the 5% level, the only way to get better power against a particular alternative is to increase the sample size $n$.
(2) Some texts frame the hypothesis and alternative as $H_0: p \ge 1/6$ vs $H_a: p < 1/6.$ But then $p = 1/6$ is still used for the null distribution.
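As a rough illustration of Note (1), the sketch below recomputes the critical value and the power against $p = 1/9$ for a few sample sizes. The helper `power.at` and the sample sizes 240 and 480 are hypothetical, introduced here only for illustration. One further assumption: the helper keeps the level at or below 5%, so for $n = 120$ it picks $c = 12$ (level $0.0275$) rather than the $c = 13$ (level $0.0501$) used above.

```r
# Hypothetical helper: left-tailed binomial test of H0: p = p0 vs Ha: p < p0.
power.at = function(n, p.alt, p0 = 1/6, alpha = 0.05) {
  c.val = qbinom(alpha, n, p0)                # smallest c with P(X <= c | p0) >= alpha
  if (pbinom(c.val, n, p0) > alpha) {
    c.val = c.val - 1                         # step down so the level does not exceed alpha
  }
  c(n = n, crit = c.val,
    level = pbinom(c.val, n, p0),             # actual significance level
    power = pbinom(c.val, n, p.alt))          # rejection probability at the alternative
}

# One row per sample size; columns: n, crit, level, power
res = t(sapply(c(120, 240, 480), power.at, p.alt = 1/9))
res
```

Because the binomial distribution is discrete, the attained level changes with $n,$ so the comparison is between tests at nearly, not exactly, the same level.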

You can never accept $H_0$; you can only reject it. So the hypothesis we want to establish is the one we call $H_1$. And to say whether a test statistic is efficient, you need to fix the significance level.