Should the data matrix associated with a random vector be transposed?

24 Views Asked by Bumbble Comm At 31 Mar 2026 - 5:39

Given a $p$-dimensional zero mean random (column) vector $X \in \mathbb{R}^{p}$, the auto-covariance is given by $$\Sigma = Cov(X) = \mathbb{E}[XX^T].$$ Now given $n$ observations of this column vector, it feels natural to stack each $p$-dimensional observation as new column vectors, i.e. create the data matrix $M \in \mathbb{R}^{p\times n}$, such that the unbiased sample covariance is $\hat{\Sigma} = \tfrac{1}{n-1}MM^T$.

However, in the literature, it is common to arrange variables as columns and observations as rows, i.e. $M \in \mathbb{R}^{n\times p}$, such that $\hat{\Sigma} = \tfrac{1}{n-1}M^TM$. The benefit of this is that it conforms well with the design matrix used in regression analysis. However, it seems slightly inconsistent with the random vector notation.

While I do understand it is a choice of application, what would you say is the ideal way of arranging the matrix $M$, observations as columns or as rows in order to keep things consistent?

Original Q&A

Should the data matrix associated with a random vector be transposed?

Related Questions in LINEAR-ALGEBRA

Related Questions in PROBABILITY

Related Questions in MATRICES

Related Questions in NOTATION

Related Questions in REGRESSION

Trending Questions

Popular # Hahtags

Popular Questions