In a book I am reading, the following is written during an overview of principal component analysis:
"Given a set of observations $x_i \in \mathbb{R}^n$, $i = 1,...,m$ which are centred, $\sum x_i = 0$, PCA finds the principle axes by diagonalizing the covariance matrix $C = \frac{1}{m} \sum_{j = 1}^m x_j x_j^T$..."
That is, it looks to me like they are saying $C = \frac{1}{m} \sum_{j} \langle x_j, x_j \rangle$, which I do not understand: the RHS would then be a scalar quantity, not a matrix. Does anyone know what is meant by this notation? The book is "Learning with Kernels".
Here $x_j$ is a column vector, so $x_j x_j^T$ is an outer product: an $n \times n$ matrix. It is not the same as the inner product $\langle x_j, x_j \rangle = x_j^T x_j$, which is a scalar. The order of the factors matters: $x_j^T x_j$ is $(1 \times n)(n \times 1)$, a scalar, while $x_j x_j^T$ is $(n \times 1)(1 \times n)$, a matrix.
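A quick numerical sketch may make the distinction concrete. Below, a small hypothetical centred data set in $\mathbb{R}^2$ (the values are made up for illustration) is used to build $C$ from outer products, and to contrast that with the scalar inner product:

```python
import numpy as np

# Toy centred data: m = 3 observations in R^2 (hypothetical values).
# Rows are the observations x_j; each column sums to zero.
X = np.array([[ 1.0,  2.0],
              [-2.0,  0.0],
              [ 1.0, -2.0]])
m, n = X.shape

# Outer product x_j x_j^T is an n x n matrix; averaging them gives C.
C = sum(np.outer(x, x) for x in X) / m
print(C.shape)        # (2, 2) -- a matrix, as required

# Inner product <x_1, x_1> = x_1^T x_1 is a scalar.
s = X[0] @ X[0]
print(s)              # 5.0 -- just a number

# Equivalent vectorized form of the same sum of outer products:
assert np.allclose(C, X.T @ X / m)
```

Note that `X.T @ X / m` computes exactly $\frac{1}{m}\sum_j x_j x_j^T$ when the observations are stored as rows of `X`.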