I'm trying to wrap my head around Gaussian random vectors, in particular their covariance matrices. I've seen two different definitions of the covariance matrix, depending on whether it is computed from observed data or from a mathematical model. Let a Gaussian random vector be defined as follows: $$X = \mu + AZ$$ where $\mu \in \mathbb{R}^n$, $A \in M_{n,k}$, and $Z = (Z_0, \dots, Z_{k-1})^T$ with the $Z_i$ independent and identically distributed standard normal random variables.
I've seen that one definition of the covariance matrix $\Sigma$ is $AA^T$, and another is $\Sigma_{i,j} = \operatorname{Cov}[X_i, X_j] = E[(X_i-\mu_i)(X_j-\mu_j)]$. I attempted to check that these are equivalent with an example: $$X = \begin{bmatrix}1 \\ 2\end{bmatrix} + \begin{bmatrix}1 & 2 \\ 3 & 4\end{bmatrix}\begin{bmatrix}Z_0 \\ Z_1\end{bmatrix}$$ I found $$\Sigma = AA^T = \begin{bmatrix}5 & 11 \\ 11 & 25\end{bmatrix}$$ and $$\Sigma_{0,1} = E[(X_0- \mu_0)(X_1 - \mu_1)] = E[(Z_0 + 2Z_1)(3Z_0+4Z_1)] = E[3Z_0^2 + 10Z_0Z_1 + 8Z_1^2].$$ I have no idea whether this makes sense or why it should be true, but I noticed that if I let $E[Z_iZ_j] = \delta_{i,j}$ (the Kronecker delta), then I do recover $\Sigma_{0,1} = 3 + 8 = 11$.
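As a sanity check on the example, here is a small numerical experiment (a sketch using NumPy; the seed and sample size are arbitrary choices of mine) that compares the model covariance $AA^T$ with the sample covariance of many simulated draws of $X = \mu + AZ$:

```python
import numpy as np

rng = np.random.default_rng(0)

mu = np.array([1.0, 2.0])
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# Model-based covariance: Sigma = A A^T.
sigma_model = A @ A.T

# Empirical check: draw many samples of X = mu + A Z with Z ~ N(0, I),
# then compute the sample covariance matrix of the draws.
n_samples = 1_000_000
Z = rng.standard_normal((2, n_samples))   # each column is one draw of Z
X = mu[:, None] + A @ Z                   # each column is one draw of X
sigma_sample = np.cov(X)                  # rows = variables, columns = observations

print(sigma_model)   # [[ 5. 11.]
                     #  [11. 25.]]
print(sigma_sample)  # close to sigma_model, up to Monte Carlo error
```

The two matrices agree to within sampling noise, which is what made me suspect the $E[Z_iZ_j] = \delta_{i,j}$ identity above.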
Could someone please provide more clarity on what is actually happening here?