Using PCA, if we reduce the dimension of a mean-zero dataset $x_1, \dots, x_n \in \mathbb{R}^d$, then we obtain a dimension-reduced dataset $y_1, \dots , y_n \in \mathbb{R}^k$, for some $1\leq k \leq d$.
How can we show that PCA does not increase the distances between data points? And how can we show that the new features are uncorrelated?
My idea:
For the distance claim: we want to show that $\|y_i - y_j\| \leq \|x_i - x_j\|$ for all $i, j$.
And for the feature vectors $f^{(i)} = (y_{1,i}, y_{2,i}, \dots , y_{n,i}) \in \mathbb{R}^n$, where $i \in \{1,\dots,k\}$, we want to show that $f^{(i)} \perp f^{(j)}$ for all $i \neq j$.
We know that
$y_{i,j} = \langle x_i,v_j \rangle$ for $j = 1, \dots, k$, where $v_j$ is the $j$-th column of $V_{1:k}$ defined below,

$\|y_i - y_j\| =\sqrt{\sum_{\ell=1}^k(y_{i,\ell}-y_{j,\ell})^2}$
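One way to make the distance claim precise (a sketch, writing $v_1, \dots, v_d$ for a full orthonormal eigenbasis of $X^\top X$, of which PCA keeps the first $k$):

$$\|y_i - y_j\|^2 = \sum_{\ell=1}^{k} \langle x_i - x_j, v_\ell \rangle^2 \;\leq\; \sum_{\ell=1}^{d} \langle x_i - x_j, v_\ell \rangle^2 = \|x_i - x_j\|^2,$$

where the last equality is Parseval's identity for the orthonormal basis $v_1, \dots, v_d$: dropping $d-k$ nonnegative terms can only shrink the sum.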
The $n \times k$ matrix $Y$ (whose rows are $y_1, \ldots, y_n$) is $Y = XV_{1:k}$ where the columns of $V_{1:k} \in \mathbb{R}^{d \times k}$ are the orthonormal eigenvectors of $X^\top X$ corresponding to the largest $k$ eigenvalues.
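For the orthogonality claim, note that $Y^\top Y = V_{1:k}^\top (X^\top X) V_{1:k}$ is diagonal because the columns of $V_{1:k}$ are orthonormal eigenvectors of $X^\top X$; its off-diagonal entries are exactly the inner products $f^{(i)\top} f^{(j)}$. A quick numerical sanity check of both claims (a sketch with synthetic data; the dimensions and indices are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 200, 5, 2

# Mean-zero data matrix X whose rows are x_1, ..., x_n.
X = rng.normal(size=(n, d))
X -= X.mean(axis=0)

# Orthonormal eigenvectors of X^T X; eigh returns eigenvalues
# in ascending order, so flip to take the largest k.
eigvals, V = np.linalg.eigh(X.T @ X)
V_k = V[:, ::-1][:, :k]
Y = X @ V_k  # reduced dataset, rows are y_1, ..., y_n

# 1) Distances never grow: ||y_i - y_j|| <= ||x_i - x_j||.
i, j = 3, 17
print(np.linalg.norm(Y[i] - Y[j]) <= np.linalg.norm(X[i] - X[j]))

# 2) New features are pairwise orthogonal: Y^T Y is
#    (numerically) diagonal, so f^(i) . f^(j) = 0 for i != j.
G = Y.T @ Y
print(np.allclose(G - np.diag(np.diag(G)), 0.0))
```

Both checks print `True`; since the data is mean-zero, the diagonal $Y^\top Y$ also means the sample covariance of the new features is diagonal, i.e. the features are uncorrelated.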