Is it normal for the $p$-values to all be $1$ in multiple testing?


I am performing hypothesis testing on a set of data $X$ with the corresponding response $y$.

The null hypothesis is: the set of data $X$ is irrelevant to the response $y$, which means $y = \epsilon$, where $\epsilon$ is some Gaussian noise.

The alternative hypothesis is: there is a linear relationship between $X$ and $y$, which means $y = X\beta + \epsilon$.

If you are familiar with the term "variance component estimation", that is exactly what I am doing. Suppose the variance of $y$ that can be explained by the data is $\sigma_g^2$, and the variance that can be explained by the noise is $\sigma_{\epsilon}^2$. I am testing whether $\sigma_g^2 = 0$.

Suppose the null hypothesis is true and I run the experiment on the same $y$ with multiple sets of data $X$. Then I expect the p-values to be uniformly distributed between 0 and 1 (the QQ-plot is a diagonal line), which is also what I have observed.
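A minimal sketch of this null experiment, for concreteness. It is not the variance component test itself: as a stand-in it uses a permutation test on the fraction of $\|y\|^2$ explained by a least-squares fit on $X$, with no intercept, matching $y = X\beta + \epsilon$. The helper names `explained_fraction` and `permutation_pvalue`, the sizes `n`, `p`, `n_datasets`, and the Gaussian design are all assumptions made for illustration.

```python
import numpy as np

def explained_fraction(X, y):
    """Fraction of ||y||^2 captured by the least-squares fit of y on X (no intercept)."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta_hat
    return float(fitted @ fitted / (y @ y))

def permutation_pvalue(X, y, rng, n_perm=99):
    """Permutation p-value: how often a shuffled y is explained at least as well as the real y."""
    observed = explained_fraction(X, y)
    permuted = np.array(
        [explained_fraction(X, rng.permutation(y)) for _ in range(n_perm)]
    )
    return (1 + np.sum(permuted >= observed)) / (n_perm + 1)

rng = np.random.default_rng(0)
n, p, n_datasets = 100, 5, 100   # hypothetical sizes

# Null hypothesis: y is pure Gaussian noise, unrelated to any X.
y = rng.normal(size=n)

# Same y, many independent data sets X.
pvals = np.array(
    [permutation_pvalue(rng.normal(size=(n, p)), y, rng) for _ in range(n_datasets)]
)

# Under the null, the sorted p-values should track the uniform quantiles
# (the diagonal of the QQ-plot).
uniform_quantiles = (np.arange(1, n_datasets + 1) - 0.5) / n_datasets
print(np.column_stack([uniform_quantiles, np.sort(pvals)])[::10])
```

Plotting `np.sort(pvals)` against `uniform_quantiles` gives the QQ-plot mentioned above.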

However, if the alternative hypothesis is true and I first regress out the linear effect, so that I use the residual $res = y - X\hat{\beta}$ to perform the hypothesis testing, should I still expect to see uniformly distributed p-values? My original guess was yes, but then I ran the experiment and found that all of the p-values are very close to 1. Is this expected, and why?
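A minimal sketch of this second experiment, under the assumption that the $X$ being tested is the same $X$ that was regressed out, and that $\hat{\beta}$ is the ordinary least-squares estimate. As a stand-in for the variance component test it uses a permutation test on the fraction of $\|res\|^2$ explained by $X$; the helper name `explained_fraction` and the simulated sizes are hypothetical.

```python
import numpy as np

def explained_fraction(X, y):
    """Fraction of ||y||^2 captured by the least-squares fit of y on X (no intercept)."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta_hat
    return float(fitted @ fitted / (y @ y))

rng = np.random.default_rng(1)
n, p, n_perm = 100, 5, 199       # hypothetical sizes

# Alternative hypothesis: y = X beta + epsilon.
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)
y = X @ beta + rng.normal(size=n)

# Regress out the linear effect with ordinary least squares.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
res = y - X @ beta_hat

# Least-squares residuals are orthogonal to every column of X,
# so X explains none of res, up to floating-point error.
max_abs_inner_product = np.abs(X.T @ res).max()
observed = explained_fraction(X, res)

# Shuffling res breaks that exact orthogonality, so every permuted
# statistic is larger than the observed one.
permuted = np.array(
    [explained_fraction(X, rng.permutation(res)) for _ in range(n_perm)]
)
pval = (1 + np.sum(permuted >= observed)) / (n_perm + 1)

print("max |X^T res|:", max_abs_inner_product)
print("explained fraction of res:", observed)
print("permutation p-value:", pval)
```

In this sketch the residual is, by construction, less associated with $X$ than pure noise would be, so the p-value sits at its maximum rather than being uniform. Whether the same mechanism explains the p-values near 1 in the variance component test depends on how that test is computed and on whether the tested $X$ overlaps with the one regressed out.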

Thank you!