Regression without linearity


Given two independent, standard-normally distributed random variables $X, Y \sim \mathcal{N}(0,1)$, I would like to fit a univariate linear regression without intercept, $Y = X \beta + \epsilon$. R gives me the estimate $\hat{\beta} \approx 0$:

n <- 10000
x <- rnorm(n)
y <- rnorm(n)
plot(x, y)
fit <- lm(y ~ 0 + x)
summary(fit)

but I feel the problem is not well-defined: by rotating the coordinate system, any $\beta \in \mathbb{R}$ appears to minimize the expected squared error. Any thoughts on why $\hat{y} = 0$ minimizes the least-squares criterion, rather than $\hat{y} = \hat{\beta} x$ for some other $\hat{\beta} \in \mathbb{R}$?



Best answer

As @hardmath mentioned in a comment, the results are perfectly logical. If $X$ and $Y$ are independent and each is $\mathcal{N}(0,1)$, then $\operatorname{cov}(X,Y)=0$, so the population slope is $\beta = \operatorname{cov}(X,Y)/\operatorname{var}(X) = 0$; likewise the true intercept is $0$, since the regression line passes through $(\mathbb{E}X, \mathbb{E}Y)=(0,0)$. Hence the true regression line is simply $y = 0 + 0\cdot x + \epsilon = \epsilon$, where $\epsilon \sim \mathcal{N}(0,1)$, which coincides with the OLS results. The rotation argument does not apply here: least squares minimizes the vertical distances $y_i - \beta x_i$, and vertical distances are not preserved under rotations of the $(x,y)$-plane (an orthogonal-distance fit, i.e. total least squares, is the rotation-invariant notion).
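This can be checked numerically with the closed-form no-intercept OLS slope, $\hat{\beta} = \sum_i x_i y_i / \sum_i x_i^2$; a minimal sketch (the seed and sample size are arbitrary choices):

```r
set.seed(1)                       # for reproducibility
n <- 10000
x <- rnorm(n)
y <- rnorm(n)

# Closed-form OLS slope without intercept: beta_hat = sum(x*y) / sum(x^2)
beta_hat <- sum(x * y) / sum(x^2)

# It agrees with lm(y ~ 0 + x) and is close to the population value 0
fit <- lm(y ~ 0 + x)
c(beta_hat, unname(coef(fit)))
```

Because $X$ and $Y$ are independent, $\sum_i x_i y_i$ concentrates near $0$ while $\sum_i x_i^2$ grows like $n$, so $\hat{\beta}$ shrinks toward $0$ as $n$ grows.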


If the regression data are symmetric with respect to changing the sign of $y$ (every point $(x, a)$ is matched by a point $(x, -a)$), the least-squares fit is the line $y=0$: the error splits into a sum over such pairs, $(\hat{y}-a)^2 + (\hat{y}+a)^2 = 2\hat{y}^2 + 2a^2$, and each pair's contribution is minimized at $\hat{y}=0$.
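A small sketch of the pairing argument, using illustratively named variables: the data below are exactly $y$-symmetric, and the no-intercept fit is $0$ up to floating-point error.

```r
set.seed(3)
# Exactly y-symmetric data: every point (x0_i, a_i) is paired with (x0_i, -a_i)
x0 <- rnorm(50)
a  <- rnorm(50)
xs <- c(x0, x0)
ys <- c(a, -a)

# SSE(beta) sums, over pairs, (a - beta*x0)^2 + (-a - beta*x0)^2
#                           = 2*a^2 + 2*beta^2*x0^2, minimized exactly at beta = 0
sse <- function(beta) sum((ys - beta * xs)^2)
sse(0) < sse(0.1)            # beta = 0 beats any nonzero slope
slope <- unname(coef(lm(ys ~ 0 + xs)))
slope                        # 0 up to floating point
```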

If the data are samples from a distribution that is symmetric in $y$, then $y=0$ is the expected regression line, and the fitted line will be a small random perturbation of it.
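The size of that perturbation can be seen by repeating the regression many times; a sketch (sample size, seed, and replication count are arbitrary), where the fitted slope should fluctuate around $0$ with standard deviation roughly $1/\sqrt{n}$:

```r
set.seed(2)
n <- 1000
# Fit the no-intercept regression on 500 fresh samples of independent data
slopes <- replicate(500, {
  x <- rnorm(n)
  y <- rnorm(n)
  sum(x * y) / sum(x^2)   # closed-form OLS slope without intercept
})
mean(slopes)   # close to 0
sd(slopes)     # close to 1/sqrt(n) ~ 0.032
```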