Does PCA always have to reduce dimensionality?


I came across this paper where the authors implement a regularized learning model to estimate the covariance matrix of a dataset. The authors say they "...propose a regularized form of Principal Component Analysis (PCA)..."

However, from what I can gather from the paper, the model ultimately predicts the full covariance matrix of S&P 500 price data (a time series); i.e., the model is trained on data up to day $T$ and then attempts to predict the covariance matrix of the data up to day $T+10$. No dimensionality reduction has taken place.

Granted, the paper does show how a factor model was learned to be able to predict future covariances between stock prices. What I don't understand though is how this is related to PCA - I always thought PCA involved reducing the dimensionality of a dataset / matrix.

Perhaps I am getting confused with terminology. Does PCA always involve reducing the dimensionality of a dataset?


PCA is typically used for dimensionality reduction, but nothing forces you to discard components: if you keep all of them, PCA is just an orthogonal change of basis and no reduction takes place. In the paper you mention, on page 3 you have the following:

> We consider the problem of learning a factor model without knowledge of the number of factors. Specifically, we want to estimate an $M \times M$ covariance matrix $\Sigma_*$ from samples $x^{(1)}, \ldots, x^{(N)} \sim \mathcal{N}(0, \Sigma_*)$, where $\Sigma_*$ is the sum of a symmetric matrix $F_* \succeq 0$ and a diagonal matrix $R_* \succeq 0$. These samples can be thought of as generated by a factor model of the form $x^{(n)} = F_*^{1/2} z^{(n)} + w^{(n)}$, where $z^{(n)} \sim \mathcal{N}(0, I)$ represents a set of common factors and $w^{(n)} \sim \mathcal{N}(0, R_*)$ represents residual noise. The number of factors is represented by $\operatorname{rank}(F_*)$, and it is usually assumed to be much smaller than the dimension $M$.
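To make that generative model concrete, here is an illustrative NumPy sketch (the dimensions $M = 10$, $K = 3$ and the sample count are made up, not from the paper): it builds a low-rank $F_*$ plus diagonal $R_*$, draws samples $x^{(n)} = F_*^{1/2} z^{(n)} + w^{(n)}$, and checks that the sample covariance approaches $\Sigma_* = F_* + R_*$.

```python
import numpy as np

rng = np.random.default_rng(0)
M, K, N = 10, 3, 5000  # dimension, number of factors, sample count (illustrative)

# Low-rank PSD factor part F* = B B^T with rank K, plus diagonal noise R*
B = rng.standard_normal((M, K))
F = B @ B.T
R = np.diag(rng.uniform(0.5, 1.5, size=M))
Sigma = F + R  # the true covariance Sigma*

# Matrix square root of F via its eigendecomposition
vals, vecs = np.linalg.eigh(F)
F_half = vecs @ np.diag(np.sqrt(np.clip(vals, 0, None))) @ vecs.T

# x(n) = F^{1/2} z(n) + w(n), with z ~ N(0, I) and w ~ N(0, R*)
Z = rng.standard_normal((N, M))
W = rng.standard_normal((N, M)) @ np.sqrt(R)  # rows have covariance R*
X = Z @ F_half + W

# The sample covariance converges to Sigma* as N grows
Sigma_hat = X.T @ X / N
print(np.linalg.matrix_rank(F))  # 3 -- the number of factors
```

Note that the "number of factors" is just the rank of $F_*$; the covariance matrix being estimated is still full ($M \times M$) and full rank, because of the diagonal $R_*$ term.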

> Granted, the paper does show how a factor model was learned to be able to predict future covariances between stock prices. What I don't understand though is how this is related to PCA - I always thought PCA involved reducing the dimensionality of a dataset / matrix.

PCA itself is computed from the covariance matrix: you eigendecompose it, and then typically keep the $K$ leading eigenvectors (principal components) that together explain, say, $90\%$ of the variance. The dimensionality reduction is that truncation step, not PCA itself. (There are also Krylov-subspace methods that compute the leading components without ever constructing the full covariance matrix.)
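A minimal NumPy sketch of that procedure on toy data (the dimensions and the per-coordinate scales are made up for illustration): eigendecompose the sample covariance, pick the smallest $K$ explaining at least $90\%$ of the variance, and note that keeping all components is a lossless rotation.

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy data: 200 samples in 5 dims; most variance lies in the first two coordinates
X = rng.standard_normal((200, 5)) * np.array([5.0, 4.0, 1.0, 0.5, 0.2])
X -= X.mean(axis=0)  # center before PCA

# PCA: eigendecompose the sample covariance matrix
cov = X.T @ X / (len(X) - 1)
vals, vecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
vals, vecs = vals[::-1], vecs[:, ::-1]  # sort descending

# Smallest K whose components explain >= 90% of the total variance
ratio = np.cumsum(vals) / vals.sum()
K = int(np.searchsorted(ratio, 0.90) + 1)
print(K)  # 2 for this toy data

# Keeping all 5 components is just a rotation -- no reduction, fully invertible
X_full = X @ vecs            # still 5-dimensional
X_reduced = X @ vecs[:, :K]  # dimensionality actually reduced to K
```

The truncation to `K` columns is where information is discarded; `X_full @ vecs.T` recovers `X` exactly, which is the sense in which full-rank PCA performs no reduction at all.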