A version of PCA which chooses regressors based on covariance between responses and X

81 Views Asked by Bumbble Comm At 28 Mar 2026 - 10:36

The PCA I've seen achieves dimensionality reduction by keeping a chosen number of principal components, each pointing in a direction that maximizes the variance of our design matrix $X$; that is, for centered $X$, a unit vector $v$ that maximizes $v^T X^T X v$ (subject to being orthogonal to the earlier components).
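As a point of reference, the variance-maximizing description of ordinary PCA can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the helper name `first_pc` and the synthetic data are made up for the example, and the columns of $X$ are centered before the scatter matrix is formed.

```python
import numpy as np

def first_pc(X):
    """Return (v, lam): the unit vector v maximizing v^T Xc^T Xc v and the maximum lam.

    Xc is the column-centered version of X. The name `first_pc` is hypothetical.
    """
    Xc = X - X.mean(axis=0)                # center each column
    S = Xc.T @ Xc                          # p x p scatter matrix
    eigvals, eigvecs = np.linalg.eigh(S)   # eigenvalues in ascending order
    return eigvecs[:, -1], eigvals[-1]     # top eigenvector and eigenvalue

# Synthetic data (an assumption for illustration): columns with different spreads.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([3.0, 1.0, 0.5])

v1, lam1 = first_pc(X)
Xc = X - X.mean(axis=0)
explained = v1 @ Xc.T @ Xc @ v1            # objective value at v1, equals lam1
```

Note that the response $y$ never enters this computation, which is the gap the rest of the question is about.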

I'm thinking of a way to choose the most significant regressors by incorporating the response vector $y$. Let $x_{(.),1},\dots, x_{(.),p}$ be the columns of $X$, and $\bar{x}_{(.),1},\dots, \bar{x}_{(.),p}$ be their centered (mean $0$) versions. I would want to choose a first principal component $v_1$ that lies in the column space of $X$ and maximizes $$v_1^T \frac{ \sum_{j=1}^p (y - \hat y)\bar x_{(.),j}^T }{\sum_{j=1}^p \| \bar x_{(.),j}\|} v_1~.$$

The reason I want something like this is to find a linear transformation of the data such that the coefficients $\beta_1, \dots, \beta_p$ of the transformed regressors are ordered by significance. In other words, $\mathbb{P}(\beta_1 = 0) \leq \mathbb{P}(\beta_2 = 0) \leq \dots \leq \mathbb{P}(\beta_p = 0)$.

I don't know if something like this exists, or if this is a wrong angle to approach dimensionality reduction from.
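For comparison, here is a minimal sketch of a simpler criterion in the same spirit. It is not the objective written above; it swaps in a plain covariance criterion, choosing the unit vector $w$ that maximizes the sample covariance between the projected data $\bar{X} w$ and $y$. By the Cauchy-Schwarz inequality the maximizer is proportional to $\bar{X}^T \bar{y}$, which is also the first weight vector used in partial least squares. The function name `covariance_direction` and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def covariance_direction(X, y):
    """Unit vector w maximizing the sample covariance between Xc @ w and y.

    Xc is the column-centered X. The name `covariance_direction` is hypothetical.
    """
    Xc = X - X.mean(axis=0)        # center the regressors
    yc = y - y.mean()              # center the response
    w = Xc.T @ yc                  # proportional to the covariances of each column with y
    norm = np.linalg.norm(w)
    if norm == 0:
        raise ValueError("y has zero sample covariance with every column of X")
    return w / norm

# Synthetic data (an assumption): y depends on columns 0 and 2 only.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = 2.0 * X[:, 0] - 1.0 * X[:, 2] + rng.normal(scale=0.1, size=300)

w1 = covariance_direction(X, y)
score = (X - X.mean(axis=0)) @ w1  # projection of the centered data onto w1
cov_w1 = np.cov(score, y)[0, 1]    # sample covariance achieved by w1
```

This orders directions by covariance with $y$, not by the p-values of the fitted coefficients, so it does not by itself guarantee the ordering of $\mathbb{P}(\beta_k = 0)$ asked for above.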
