Given a matrix $A = (X_1, X_2, \dots, X_n)$,
how is it that $\frac{1}{n}A^TA$ is the correlation matrix, with $\frac{1}{n}(A^TA)_{ij} = Corr(X_i,X_j)$?
I am confused because $\frac{1}{n}(A^TA)_{ij} = \frac{1}{n}\sum_{k=1}^nA_{ki}A_{kj}$
and $\frac{1}{n}\sum_{k=1}^nA_{ki}A_{kj} = E[X_iX_j]$
but how does $Corr(X_i,X_j) = E[X_iX_j]$ ?
I know that $Cov(X_i,X_j) = E[X_iX_j] - \mu_i\mu_j$
but $Corr(X_i,X_j) = \frac{Cov(X_i,X_j)}{(Var(X_i)Var(X_j))^{\frac{1}{2}}}$
Wouldn't this imply the following?
$$Cov(X_i,X_j) + \mu_i\mu_j = \frac{Cov(X_i,X_j)}{(Var(X_i)Var(X_j))^{\frac{1}{2}}}$$
Unless I am skipping some serious algebra, I am not sure what I'm missing...
This is because two different quantities are both being called "correlation". The correlation coefficient
\begin{equation} C_{X_i X_j} = \frac{Cov(X_i,X_j)}{(Var(X_i)Var(X_j))^{\frac{1}{2}}} \end{equation}
is obtained by dividing the covariance by the product of the variables' standard deviations, while the (uncentered) correlation is
\begin{equation} Corr(X_i,X_j)= E[X_i X_j] \end{equation}
If $X_i$ and $X_j$ have zero mean, this is the same as the covariance, which is defined as
\begin{equation} Cov(X_i,X_j)= E[(X_i-\mu_{X_i}) (X_j-\mu_{X_j})] \end{equation}
with $\mu_{X_i}$ and $\mu_{X_j}$ the means of ${X_i}$ and ${X_j}$, respectively. If the variables are additionally scaled to unit variance, $E[X_i X_j]$ also equals the correlation coefficient $C_{X_i X_j}$.
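
A quick numerical check may make the distinction concrete. The sketch below uses NumPy on made-up data (the sample size, means, and scales are arbitrary assumptions, not from the question) and treats the rows of $A$ as observations and the columns as the variables $X_i$. It compares $\frac{1}{n}A^TA$ on raw, centered, and standardized columns against `np.cov` and `np.corrcoef`.

```python
import numpy as np

# Hypothetical data: n observations (rows) of 3 variables (columns),
# deliberately given nonzero means and unequal variances.
rng = np.random.default_rng(0)
n = 1000
A = rng.normal(loc=[1.0, -2.0, 0.5], scale=[1.0, 2.0, 0.5], size=(n, 3))
A[:, 1] += 0.8 * A[:, 0]  # make X_2 depend on X_1

mu = A.mean(axis=0)

# (1/n) A^T A on the raw columns estimates E[X_i X_j].
raw = A.T @ A / n

# After centering, the same product is the (population) covariance matrix.
centered = A - mu
cov = centered.T @ centered / n

# After centering AND scaling to unit variance, it is the matrix of
# correlation coefficients.
standardized = centered / centered.std(axis=0)
corr = standardized.T @ standardized / n

# E[X_i X_j] = Cov(X_i, X_j) + mu_i * mu_j, so raw != cov when means are nonzero.
raw_matches_cov = np.allclose(raw, cov)
identity_holds = np.allclose(raw, cov + np.outer(mu, mu))
cov_matches_numpy = np.allclose(cov, np.cov(A, rowvar=False, bias=True))
corr_matches_numpy = np.allclose(corr, np.corrcoef(A, rowvar=False))

print(raw_matches_cov, identity_holds, cov_matches_numpy, corr_matches_numpy)
```

So $\frac{1}{n}A^TA$ only reproduces the correlation coefficients when the columns of $A$ have been centered and normalized beforehand; on raw data it gives $E[X_iX_j]$, which differs from the covariance by $\mu_i\mu_j$.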