When to use likelihood ratio test?


I have a few questions regarding the use of the likelihood-ratio test in a logistic regression model. Suppose we have a logistic regression model like this:

$P(Y=1)=\frac{\text{exp}\left(b_1+b_2X_2\right)}{1+\text{exp}\left(b_1+b_2X_2\right)}$

We now add two explanatory variables to the model, $X_3$ and $X_4$, so the model looks like this:

$P(Y=1)=\frac{\text{exp}\left(b_1+b_2X_2+b_3X_3+b_4X_4\right)}{1+\text{exp}\left(b_1+b_2X_2+b_3X_3+b_4X_4\right)}$

The LR test statistic is compared to the chi-squared distribution with degrees of freedom equal to the difference in the number of parameters. So, for example, if two variables are added, the LR statistic is compared to the chi-squared distribution with two degrees of freedom.
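As a concrete illustration of this comparison, here is a minimal sketch of the computation. The log-likelihood values below are made up for illustration; in practice they would come from fitting the two nested models by maximum likelihood.

```python
from scipy.stats import chi2

# Hypothetical maximized log-likelihoods from two nested logistic fits
ll_reduced = -210.4   # model with X2 only (2 parameters: b1, b2)
ll_full = -205.1      # model with X2, X3, X4 (4 parameters)

# LR statistic: twice the difference in maximized log-likelihoods
lr_stat = 2 * (ll_full - ll_reduced)

# Degrees of freedom = difference in number of parameters (b3 and b4 added)
df = 2
p_value = chi2.sf(lr_stat, df)

print(f"LR = {lr_stat:.2f}, p = {p_value:.4f}")
```

If `p_value` is below the chosen significance level, the reduced model is rejected in favor of the full one.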

The question now is: when performing maximum likelihood estimation of the parameters, both variables $X_3$ and $X_4$ prove insignificant. Should I perform a likelihood-ratio test between the two models to confirm that the parameters do not add explanatory power to the model? Or is it possible to conclude that if both parameters are insignificant, there is no need to perform an LR test at all?

The next question concerns the addition of only one variable. Is it possible to add one variable to the model that proves insignificant, and still get a significant LR value? The only situation I can think of is when the added explanatory variable makes the other variables much more significant - but is that even possible when performing maximum likelihood estimation?

I hope that someone out there has an idea of how to approach these questions. Thank you in advance :)

Best answer:

Yes, it is possible that both $X_3$ and $X_4$ contribute something meaningful but mask each other because they are correlated. The individual tests done at the coefficient level test whether to include $b_j$ given all the other $b$'s, so if $X_3$ explains the same thing $X_4$ does, then neither will add anything when the other is already in the model. You could use the LRT to compare the model containing only $X_2$ to the model containing $(X_2, X_3, X_4)$ to test whether $(X_3, X_4)$ jointly contribute to the model, or else fit the models containing $(X_2, X_3)$ and $(X_2, X_4)$ and see if either $X_3$ or $X_4$ is significant in those models. A "backward selection" algorithm would remove the less significant of $X_3$ and $X_4$ and then fit the reduced model to decide what to do next.
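The joint LRT described above can be sketched in a small simulation. This is a minimal sketch with made-up data: two nearly collinear predictors carry the same signal, and the two nested logistic models are fit by direct maximization of the log-likelihood (pure NumPy/SciPy rather than a regression package).

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

rng = np.random.default_rng(0)
n = 500
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
x4 = x3 + 0.1 * rng.normal(size=n)     # X4 is nearly collinear with X3
eta = -0.5 + 1.0 * x2 + 0.8 * x3       # the shared signal enters through X3
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))

def neg_loglik(beta, X, y):
    """Negative log-likelihood of a logistic regression, written stably."""
    eta = X @ beta
    return np.sum(np.logaddexp(0.0, eta)) - y @ eta

def max_loglik(X, y):
    """Maximized log-likelihood from a numerical ML fit."""
    res = minimize(neg_loglik, np.zeros(X.shape[1]), args=(X, y), method="BFGS")
    return -res.fun

ones = np.ones(n)
ll_reduced = max_loglik(np.column_stack([ones, x2]), y)
ll_full = max_loglik(np.column_stack([ones, x2, x3, x4]), y)

# Joint 2-df LRT for (X3, X4): significant even though, within the full
# model, the collinear pair can mask each other's individual Wald tests
lr = 2 * (ll_full - ll_reduced)
print(f"LR = {lr:.2f}, p = {chi2.sf(lr, df=2):.4g}")
```

The joint test recovers the signal that the two correlated predictors share, which is exactly the situation where coefficient-level tests can be misleading.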

To your second question, yes, it is possible that adding a new predictor can greatly improve the usefulness of another predictor. In fact, this is usually the point behind "blocking" in design of experiments. The inference procedure you use (ML, Bayesian estimation, and so forth) doesn't really play into this.

I feel obliged to say that I don't particularly like these sorts of variable selection strategies. Depending on what my inference goal is, I might retain all the predictors regardless of whether some test says they are significant.