Can linearly dependent rows be removed in linear regression?

59 Views Asked by Bumbble Comm At 10 May 2026 - 12:51

Concerning multiple linear regression: if we have n features and m inputs (rows) with m > n, so that the design matrix is $A_{m \times n}$, is it possible to exclude some linearly dependent inputs without compromising the model? Why?

Is it a problem to have dependent inputs?


Bumbble Comm On 05 Apr 2018 - 5:09 BEST ANSWER

No. Linearly dependent columns mean that you simply recorded the same information twice (one variable is a linear transformation of the others), whereas identical rows are identical observations. For example, if you observe a periodic phenomenon, the same observations may recur regularly every period. Discarding them is the same as discarding the main "signal" in the data. A more technical aspect of identical observations is that, under the alternative hypothesis, the model significance measures ($p.value$ and $F_{stat}$) are monotonically non-increasing and non-decreasing functions (respectively) of the number of repetitions. Namely, $$ F_{stat} = \frac{\sum_{i=1}^{n_1} ( \hat{y}_i - \bar{y})^2/p }{\sum_{i=1}^{n_1} ( \hat{y}_i - y_i)^2/(n_1-p-1)} = \frac{MSReg(n_1)}{MSE(n_1)}, $$
hence, if for some $N \in \mathbb{N}$ and $n_2 > n_1 > N$ the additional $n_2 - n_1$ observations are duplicates of existing ones, then $$ MSReg(n_2) =\sum_{i=1}^{n_2} ( \hat{y}_i - \bar{y})^2/p \ge \sum_{i=1}^{n_1} ( \hat{y}_i - \bar{y})^2/p = MSReg(n_1) , $$ while $$ MSE(n_2) = \hat{\sigma}^2_{n_2} \approx \sum_{i=1}^{n_1} ( \hat{y}_i - y_i)^2/(n_1-p-1) = \hat{\sigma}^2_{n_1} = MSE(n_1). $$ The numerator of $F_{stat}$ therefore grows with the number of repetitions while the denominator stays roughly constant, so $F_{stat}$ does not decrease and the $p.value$ does not increase.
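The effect of repeated observations on $F_{stat}$ can be checked numerically. The following is a minimal sketch, not part of the original answer: the data are synthetic, the helper name `ols_f_stat` is hypothetical, and the whole sample is duplicated once as the simplest case of "adding duplicates of existing observations". It uses only NumPy's least-squares solver and the $MSReg/MSE$ definition given above.

```python
import numpy as np

def ols_f_stat(X, y):
    """Fit OLS with an intercept; return (coefficients, F statistic).

    F = MSReg / MSE, with MSReg = sum((y_hat - y_bar)^2) / p and
    MSE = sum((y_hat - y)^2) / (n - p - 1), as in the formulas above.
    """
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])          # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares coefficients
    y_hat = A @ beta
    ms_reg = np.sum((y_hat - y.mean()) ** 2) / p
    mse = np.sum((y_hat - y) ** 2) / (n - p - 1)
    return beta, ms_reg / mse

# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
n, p = 30, 2
X = rng.normal(size=(n, p))
y = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(size=n)

beta_1, f_1 = ols_f_stat(X, y)

# Append an exact copy of every observation.
X_dup = np.vstack([X, X])
y_dup = np.concatenate([y, y])
beta_2, f_2 = ols_f_stat(X_dup, y_dup)

print("coefficients unchanged:", np.allclose(beta_1, beta_2))
print("F before:", f_1, "F after:", f_2)
```

In this particular setup the fitted coefficients are unchanged by the duplication, because both sides of the normal equations are scaled by the same factor, while both sums of squares double and the residual degrees of freedom go from $n-p-1$ to $2n-p-1$. The statistic is therefore inflated by the factor $(2n-p-1)/(n-p-1)$, which is the non-decreasing behaviour described above.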
