I'm trying to learn about kernel PCA by reading the paper by (I assume) its creators: "Nonlinear Component Analysis as a Kernel Eigenvalue Problem", Bernhard Schölkopf, Alexander Smola, Klaus-Robert Müller, Technical Report No. 44, 1996.
I don't understand the step (page 3 of the above PDF) where combining equations (7) and (8) yields (9), that is:
if
$$\lambda (\Phi(x_k)\cdot \mathbf V)=(\Phi(x_k)\cdot \bar C\mathbf V)\; \text{for all $k=1,\ldots, M $}$$
and
$$ \mathbf V = \sum_{i=1}^M a_i\Phi(x_i) $$
we get
$$ \lambda \sum_{i=1}^M a_i(\Phi(x_k)\cdot \Phi(x_i)) = \frac 1 M \sum_{i=1}^M a_i(\Phi(x_k)\cdot \sum_{j=1}^M \Phi(x_j))(\Phi(x_j) \cdot \Phi(x_i)) $$
using the covariance matrix $\bar C$ in the feature space for our $M$ centered observations: $$ \bar C = \frac 1 M \sum_{j=1}^M \Phi(x_j) \Phi(x_j)^\mathsf T $$
What happened to the transposed factor $\Phi(x_j)^\mathsf T$ inside the sum defining the covariance matrix?
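For what it's worth, I did check numerically that (9) follows from (7) and (8). Here is a small NumPy sketch of my own (not from the paper), using the identity feature map $\Phi(x)=x$ so that everything is an ordinary matrix: I build $\bar C$, take one of its eigenpairs $(\lambda, \mathbf V)$, recover the coefficients $a_i$ from (8), and compare both sides of (9). They agree, so I must just be missing the algebraic step.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 5, 3                          # M observations in an N-dimensional feature space
Phi = rng.standard_normal((M, N))    # row k is Phi(x_k); identity feature map assumed
Phi -= Phi.mean(axis=0)              # center the mapped observations

# Covariance matrix: C = (1/M) * sum_j Phi(x_j) Phi(x_j)^T  (an N x N matrix)
C = (Phi.T @ Phi) / M

# Take an eigenpair (lambda, V) of C; by (8), V = sum_i a_i Phi(x_i),
# so the coefficients a_i solve Phi^T a = V (exactly, since V lies in
# the span of the centered observations).
eigvals, eigvecs = np.linalg.eigh(C)
lam, V = eigvals[-1], eigvecs[:, -1]
a, *_ = np.linalg.lstsq(Phi.T, V, rcond=None)

# Gram matrix of dot products: K[k, i] = Phi(x_k) . Phi(x_i)
K = Phi @ Phi.T

# Left-hand side of (9): lambda * sum_i a_i (Phi(x_k) . Phi(x_i)), for every k
lhs = lam * (K @ a)

# Right-hand side of (9):
# (1/M) * sum_i a_i * sum_j (Phi(x_k) . Phi(x_j)) (Phi(x_j) . Phi(x_i))
rhs = (K @ K @ a) / M

print(np.allclose(lhs, rhs))         # prints True
```

So the two sides of (9) match for every $k$; it's only the disappearance of $\Phi(x_j)^\mathsf T$ in the symbolic substitution that I can't follow.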