Linear Algebra Wizardry

96 Views Asked by Bumbble Comm At 04 Apr 2026 - 1:26

I am reading a textbook which features results on the multivariate normal distribution and the author writes out results with no justification, which take me a page to verify, and they end up being right. These claims don't seem obvious to me. I'll give an example.

Here's the setup:

Let $y_1,y_2,\ldots,y_n, \mu \in \mathbb{R}^d$, and let $A \in \mathbb{R}^{d \times d}$.

Let $Y$ be the $n \times d$ matrix whose $i$-th row is $y_i^T$.

Let $e \in \mathbb{R}^n$ be a vector of 1's.

An example of a claim:

$\sum_{i}(y_i-\mu)^{T}A(y_i-\mu)=\text{trace}(A(Y-e\mu^{T})^{T}(Y-e\mu^{T}))$

This claim took me a long time to verify, but the author included zero justification and just followed it on from his last line of work. There are more examples of claims he makes which would take me a while to verify, but I feel like I might be missing a trick given the frequency of this happening.

Can anyone explain how either I can get better at doing these, or at least some verification that I'm not crazy and it's not super trivial?

Many thanks.

Bumbble Comm On 03 Mar 2020 - 9:21

> Can anyone explain how either I can get better at doing these

The same way you get better at doing anything else - by practice.

> or at least some verification that I'm not crazy and it's not super trivial?

It's not trivial in the slightest, but that's not the criterion. The question is not "is it trivial" but "is it interesting enough".

Let me explain.

The book you are reading is not about linear algebra; it is about something else, which in your case sounds like probability. When writing a book about one subject, it is usually best if the author includes as little of other subjects as possible while still keeping the book understandable. The derivation of the claim you cite is a purely algebraic manipulation that the author did not consider interesting enough to include.

Bumbble Comm On 03 Mar 2020 - 9:23

Being familiar with some properties of trace might help.

We know that $Tr(AB)=Tr(BA)$ and $Tr(A+B)=Tr(A)+Tr(B)$, and that a scalar is equal to its own trace.

\begin{align} \sum_i (y_i - \mu)^TA(y_i - \mu) &= \sum_i Tr((y_i - \mu)^TA(y_i - \mu)) \\ &=\sum_iTr(A(y_i - \mu)(y_i - \mu)^T) \\ &=Tr(\sum_iA(y_i - \mu)(y_i - \mu)^T) \\ &=Tr(A\sum_i(y_i - \mu)(y_i - \mu)^T)\\ &=Tr(A\left[y_1-\mu, \ldots, y_n-\mu \right]\left[y_1-\mu, \ldots, y_n-\mu \right]^T)\\ &=Tr(A(Y-e\mu^T)^T(Y-e\mu^T)) \end{align}
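If you want a quick sanity check before working through the algebra, an identity like this can be tested numerically. The following is only a sketch, assuming NumPy is available; the sizes `n`, `d`, the random seed, and the name `C` for the centered matrix $Y - e\mu^T$ are arbitrary choices made for illustration, with $e$ taken to be the all-ones vector of length $n$.

```python
import numpy as np

# Illustrative sizes and seed (assumptions, not from the textbook).
rng = np.random.default_rng(0)
n, d = 5, 3

Y = rng.normal(size=(n, d))   # row i of Y plays the role of y_i
mu = rng.normal(size=d)
A = rng.normal(size=(d, d))   # A is not assumed to be symmetric
e = np.ones(n)                # all-ones vector of length n

# Centered matrix Y - e mu^T: row i is (y_i - mu)^T.
C = Y - np.outer(e, mu)

# Left-hand side: sum of the n quadratic forms.
lhs = sum((y - mu) @ A @ (y - mu) for y in Y)

# Right-hand side: trace(A (Y - e mu^T)^T (Y - e mu^T)).
rhs = np.trace(A @ C.T @ C)

# Cyclic property: trace(A C^T C) == trace(C A C^T).
# The diagonal of C A C^T holds exactly the individual quadratic forms.
cyclic_ok = np.isclose(np.trace(A @ C.T @ C), np.trace(C @ A @ C.T))
identity_ok = np.isclose(lhs, rhs)

print(identity_ok, cyclic_ok)
```

A passing check on random inputs is not a proof, but it is a cheap way to catch a misplaced transpose before spending a page on the derivation.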

**Bumbble Comm** · Accepted Answer

Here is a short derivation of the identity in question.

$$ \begin{aligned} \sum_{i}(y_i-\mu)^{T}A(y_i-\mu) &= \sum_{i}\operatorname{tr}\left[(y_i-\mu)^{T}A(y_i-\mu)\right]\\ &= \sum_i \operatorname{tr}\left[A(y_i-\mu)(y_i-\mu)^{T}\right] = \operatorname{tr}\left[ A\sum_i(y_i-\mu)(y_i-\mu)^{T}\right]. \end{aligned} $$

Now, note that

$$ \sum_i(y_i-\mu)(y_i-\mu)^{T} = \pmatrix{y_1 - \mu & \cdots & y_n - \mu} \pmatrix{(y_1 - \mu)^T \\ \vdots \\ (y_n - \mu)^T} = (Y - e\mu^T)^T(Y - e\mu^T). $$
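The last equality is an instance of a general fact that is worth remembering. As a sketch, with $M$ and $m_i$ introduced here purely for illustration: if $M$ is an $n \times d$ matrix whose $i$-th row is $m_i^T$, then entry by entry

$$ (M^TM)_{jk} = \sum_{i=1}^{n} M_{ij}M_{ik} = \sum_{i=1}^{n} (m_i)_j (m_i)_k = \left(\sum_{i=1}^{n} m_i m_i^T\right)_{jk}, $$

so $M^TM = \sum_i m_i m_i^T$. Every row of $e\mu^T$ is $\mu^T$ (taking $e$ to be the all-ones vector of length $n$), so the $i$-th row of $Y - e\mu^T$ is $(y_i-\mu)^T$, and setting $M = Y - e\mu^T$ gives the identity above.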
