Orthonormal Basis assumption in PCA derivation


I'm doing the Mathematics for Machine Learning course on Coursera (Course 3, Week 4). I am trying to understand the derivation of PCA.

Specifically, I can't see how we get from

$J =\frac{1}{N} \sum_{n=1}^{N}\Vert \sum_{j=M+1}^{D}(b_j^TX_n)b_j\Vert^2$ to

$J =\frac{1}{N} \sum_{n=1}^{N} \sum_{j=M+1}^{D}(b_j^TX_n)^2$

Why does the trailing $b_j$ disappear?

Apparently it's because the $b_j$ form an orthonormal basis, but I don't see the intermediate step. Can anyone help me understand? TIA.

Additional info:

$b_j$ - vectors forming an orthonormal basis.

$J$ - average reconstruction error.

$X_n$ - a data point.
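For what it's worth, here is a quick numerical check I tried (a sketch, not from the course; it builds a random orthonormal basis with a QR decomposition and uses made-up dimensions $D=5$, $M=2$), confirming that the squared norm of the sum equals the sum of squared coefficients:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random orthonormal basis b_1, ..., b_D (columns of B) via QR decomposition.
D = 5
B, _ = np.linalg.qr(rng.standard_normal((D, D)))

x = rng.standard_normal(D)  # a data point X_n

# Coefficients b_j^T x for the discarded directions j = M+1, ..., D.
M = 2
coeffs = B[:, M:].T @ x

# Left-hand side: || sum_j (b_j^T x) b_j ||^2
lhs = np.linalg.norm(B[:, M:] @ coeffs) ** 2

# Right-hand side: sum_j (b_j^T x)^2 -- the cross terms b_i^T b_j (i != j)
# vanish by orthonormality, and b_j^T b_j = 1.
rhs = np.sum(coeffs ** 2)

print(np.isclose(lhs, rhs))
```

The two quantities agree to floating-point precision, which is what the orthonormality argument predicts.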
