Understanding the proof of PCA


I am currently watching the Statistical Machine Learning course by the University of Tübingen on YouTube, and I saw the following slide. Could someone please explain how to get from step 2 to step 3, and also clarify the dimensions of the matrix $X$ and the vectors $x_i$? Thank you very much! The proof goes as follows:

$\max\limits_{a \in \mathbb{R}^d, \Vert a \Vert = 1} \operatorname{Var}(\pi_a(X))$

-> $\max\limits_{a \in \mathbb{R}^d} \sum\limits^n_{i = 1} \pi_a(x_i)^2$ subject to $a^t a = 1$

-> $\max\limits_{a \in \mathbb{R}^d} \sum\limits^n_{i = 1} (a^t x_i)^2$ subject to $a^t a = 1$

-> (how does this step work?)

$\max\limits_{a \in \mathbb{R}^d} \Vert Xa \Vert ^2$ subject to $a^t a = 1$
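(For anyone who wants to sanity-check the equivalence between step 3 and step 4 numerically, here is a small NumPy sketch; it is not part of the slide, and the sizes chosen are arbitrary.)

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 3                        # arbitrary example sizes
X = rng.standard_normal((n, d))    # rows of X are the data points x_i^t
a = rng.standard_normal(d)
a /= np.linalg.norm(a)             # enforce the constraint a^t a = 1

# sum_i (a^t x_i)^2, computed row by row
sum_of_squares = sum((a @ x_i) ** 2 for x_i in X)
# ||Xa||^2, computed as a single matrix-vector product
norm_sq = np.linalg.norm(X @ a) ** 2

print(np.isclose(sum_of_squares, norm_sq))  # True
```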


Accepted answer:

Dimensions: $X$ is $(n, d)$, with the rows being the transposed data points $x_i^t$; each $x_i$ is $(d, 1)$; $a$ is $(d, 1)$; $a^t$ is $(1, d)$.

$\left\lVert Xa\right\rVert^2 = (Xa)^t (Xa) = (a^t X^t)(Xa) = a^t X^t X a = a^t \Big(\sum\limits^n_{i = 1} x_i x_i^t\Big) a = \sum\limits^n_{i = 1} (a^t x_i)(x_i^t a) = \sum\limits^n_{i = 1} (a^t x_i)^2,$

where the last equality holds because each $a^t x_i$ is a scalar, so $(a^t x_i)(x_i^t a) = (a^t x_i)^2$.

What remains is the matrix identity $X^t X = \sum\limits^n_{i = 1} x_i x_i^t$, which follows from block matrix multiplication:

$X^t= \begin{pmatrix} x_1 & x_2 & \cdots & x_n \end{pmatrix} $, $X= \begin{pmatrix} x_1^t \\ x_2^t \\ \vdots \\ x_n^t \\ \end{pmatrix}$

Key step: $X^t X$ is the sum of the outer products of the columns of $X^t$ with the corresponding rows of $X$, i.e. of each $x_i$ with $x_i^t$:

$X^t X=\sum\limits^n_{i = 1} x_i x_i^t$
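This identity is also easy to check numerically with NumPy (a small sketch with arbitrary sizes, not part of the answer above):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 6, 4                        # arbitrary example sizes
X = rng.standard_normal((n, d))    # rows of X are the x_i^t

# X^t X as the sum of outer products x_i x_i^t over the rows of X
outer_sum = sum(np.outer(x_i, x_i) for x_i in X)

print(np.allclose(X.T @ X, outer_sum))  # True
```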