When independent variables covary (not correlate) in a regression, why does this happen?

Question

38 Views Asked by Bumbble Comm At 02 Apr 2026 - 10:09

This is a minimal reproducible example. My real simulation work for research includes error variance, but I built this example WITHOUT error variance so that I can see clearly what is happening.

The R code is below.

x1<-rnorm(999,0,1)
x2<-rnorm(999,0,1)
y <- x1+x2
iv1<-999999*x1
iv2<-999999*x2

cov(x1,x2) # nearly 0
cor(x1,x2) # nearly 0
cov(iv1,iv2) # very big
cor(iv1,iv2) # nearly 0

summary(lm(y~x1+x2+x1*x2)) # interaction p=0.11

summary(lm(y~iv1+iv2+iv1:iv2)) #interaction significant.


Call:
lm(formula = y ~ x1 + x2 + x1 * x2)

Residuals:
       Min         1Q     Median         3Q        Max 
-3.959e-14 -1.150e-16  1.400e-17  9.100e-17  4.756e-14 

Coefficients:
              Estimate Std. Error    t value Pr(>|t|)    
(Intercept) -7.728e-17  6.269e-17 -1.233e+00    0.218    
x1           1.000e+00  6.090e-17  1.642e+16   <2e-16 ***
x2           1.000e+00  6.474e-17  1.545e+16   <2e-16 ***
x1:x2       -1.027e-16  6.630e-17 -1.550e+00    0.122    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.976e-15 on 995 degrees of freedom
Multiple R-squared:      1, Adjusted R-squared:      1 
F-statistic: 1.631e+32 on 3 and 995 DF,  p-value: < 2.2e-16

> summary(lm(y~iv1+iv2+iv1:iv2))

Call:
lm(formula = y ~ iv1 + iv2 + iv1:iv2)

Residuals:
       Min         1Q     Median         3Q        Max 
-4.137e-14 -1.010e-16  2.100e-17  1.050e-16  3.623e-14 

Coefficients:
              Estimate Std. Error    t value Pr(>|t|)    
(Intercept) -7.728e-17  5.581e-17 -1.385e+00   0.1665    
iv1          1.000e-06  5.422e-23  1.844e+16   <2e-16 ***
iv2          1.000e-06  5.764e-23  1.735e+16   <2e-16 ***
iv1:iv2     -1.217e-28  5.902e-29 -2.062e+00   0.0395 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.759e-15 on 995 degrees of freedom
Multiple R-squared:      1, Adjusted R-squared:      1 
F-statistic: 2.058e+32 on 3 and 995 DF,  p-value: < 2.2e-16

When I constructed y, I did not include an interaction effect. So why is the x1:x2 interaction not significant (p=0.11), while the iv1:iv2 interaction is significant?

Why does covariance matter?

duplicated from [https://stats.stackexchange.com/questions/579271/when-independent-variables-covary-not-correlate-in-the-regression]

Original Q&A

**Bumbble Comm** · Accepted Answer

It is because of limited floating-point precision: R cannot reliably handle decimals that small next to numbers that large. The model should be

$y = \frac{1}{999999}\,iv1 + \frac{1}{999999}\,iv2$ for model 2. Note the scales. y is of order 1 (a sum of two standard normal draws, so mostly within a few units of 0), while iv1 and iv2 are of order $10^{6}$ in absolute value. R correctly recovers the coefficients for iv1 and iv2, $1/999999 \approx 10^{-6}$. But for iv1:iv2 it sometimes returns a coefficient of order $10^{-28}$ or some similarly tiny number. Multiplied by iv1 $\times$ iv2, which is of order $10^{12}$, that contributes only about $10^{-16}$ to the fitted values, which is the size of double-precision rounding error. Because y is an exact linear function of the predictors, the residuals (about $10^{-15}$ in the output above) are also pure rounding error, so the test for the interaction compares one rounding artifact against another and sometimes reports an interaction that is not really there. If you instead use iv1 = 99999*x1 and iv2 = 99999*x2, you will get a significant interaction much less often.
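
The explanation above can be checked with a short sketch in R. It reuses the variable names from the question; the fixed seed, the constant `k`, and the summary names (`cov_ratio`, `cor_diff`, `max_resid`, `max_interaction`) are my own additions for illustration. It shows two things: rescaling multiplies the covariance by the product of the scale factors while leaving the correlation alone, and in the noise-free fit both the residuals and the contribution of the interaction term should sit near machine precision.

```r
set.seed(1)                      # assumption: fixed seed, not in the original question
n  <- 999
k  <- 999999                     # scale factor used in the question
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- x1 + x2                    # exact linear function, no error term
iv1 <- k * x1
iv2 <- k * x2

# Rescaling multiplies the covariance by k^2 but leaves the correlation unchanged.
cov_ratio <- cov(iv1, iv2) / cov(x1, x2)
cor_diff  <- cor(iv1, iv2) - cor(x1, x2)

# Same model as in the question.
fit <- lm(y ~ iv1 + iv2 + iv1:iv2)

# Largest residual, and largest contribution of the interaction term to the fit.
max_resid       <- max(abs(resid(fit)))
max_interaction <- max(abs(coef(fit)[["iv1:iv2"]] * iv1 * iv2))

print(c(cov_ratio = cov_ratio, k_squared = k^2, cor_diff = cor_diff))
print(c(max_resid = max_resid,
        max_interaction = max_interaction,
        machine_eps = .Machine$double.eps))
```

If `max_resid` and `max_interaction` both come out within a few orders of magnitude of `.Machine$double.eps`, the p-value for iv1:iv2 is a statement about rounding noise rather than about the data, and the large covariance is only a change of units.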
