From All of Statistics by Wasserman, 2nd Edition:
There are two ways I can interpret how $\mu$ is defined:
The $\mu$ defined is the expected value of a single one of the $X_i$ vectors. But then $\overline{X} - \mu$ is subtracting the mean of one specific random vector from the sample mean of all the random vectors, which makes no sense.
The $\mu$ defined actually collects the expected values of each of the different $X_i$ vectors, but then I'd expect it to be defined as $\mathbb{E}(X_1), \ldots, \mathbb{E}(X_n)$ instead of $\mathbb{E}(X_{1i}), \ldots, \mathbb{E}(X_{ni})$. Even if I assume that's what Wasserman meant, though, $\mathbb{E}(X_1)$ is itself a vector, and $\overline{X} - \mu$ would be undefined, because you can't subtract a vector of vectors from a vector.
Am I missing something? Basically, I don't see how this type-checks.


The definition of $\mu$ is correct. It is a vector and equal to $\mathbb{E}[X_1]$ (note that $X_1$ is a vector of length $k$).
Perhaps you are misunderstanding the definition of $\bar{X}$. It is also a vector of length $k$, obtained by summing the vectors $X_1, \ldots, X_n$ and dividing by $n$.
Since $\bar{X}$ and $\mu$ are both vectors of length $k$, the subtraction makes sense.
The $j$th entry of $\bar{X} - \mu$ is $\bar{X}_j - \mu_j = \frac{1}{n} \sum_{i=1}^n X_{ji} - \mathbb{E}[X_{j1}]$, where $X_{ji}$ denotes the $j$th component of the vector $X_i$.
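To see concretely why the subtraction type-checks, here is a small NumPy sketch with made-up data (the sample size $n = 5$, dimension $k = 3$, and the mean vector are all hypothetical): both $\bar{X}$ and $\mu$ come out as arrays of length $k$, so $\bar{X} - \mu$ is an ordinary elementwise difference.

```python
import numpy as np

# Hypothetical example: n = 5 observations of a k = 3 dimensional random vector.
rng = np.random.default_rng(0)
mu = np.array([1.0, 2.0, 3.0])          # mean vector mu = E[X_1], length k
X = rng.normal(loc=mu, size=(5, 3))     # row i of X is the observed vector X_i

# Sample mean: sum the n row vectors and divide by n -> a vector of length k.
X_bar = X.mean(axis=0)

# Both X_bar and mu have shape (k,), so this subtraction is well-defined.
diff = X_bar - mu

print(X_bar.shape, mu.shape, diff.shape)  # all three are (3,)
```

The key point the code illustrates: averaging over the $n$ observations (`axis=0`) collapses the $n \times k$ data matrix to a single length-$k$ vector, matching the shape of $\mu$.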