Help in understanding Derivation of Posterior in Gaussian Process


According to the textbook Gaussian Process in Machine Learning, it is given that \begin{align*} p(w\mid X,y) &\propto \exp\left(-\frac{1}{2\sigma_n^2}(y-X^Tw)(y-X^Tw)\right)\exp\left(-\frac{1}{2}w^T\Sigma_{p}^{-1}w\right) \\ &\propto \exp\left(-\frac{1}{2}(w-\bar{w})^T\left(\frac{1}{\sigma_n^2}XX^T + \Sigma_p^{-1}\right)(w-\bar{w})\right) \end{align*} where $\bar{w} = \sigma_n^{-2}(\sigma_n^{-2}XX^T + \Sigma_p^{-1})^{-1}Xy$.

I can't really understand how the first step leads to the second step. Can someone kindly show me how the derivation is done? Thanks


There are 2 answers below.

Accepted answer:

You need to show \begin{align} & \frac1{\sigma_n^2}(y-X^Tw)^T(y-X^Tw) + w^T\Sigma_{p}^{-1}w \\[10pt] = {} & (w-\bar{w})^T\left(\frac{1}{\sigma_n^2}XX^T + \Sigma_p^{-1}\right)(w-\bar{w}) + \text{constant} \end{align} and bear in mind that "constant" means not depending on $w.$

You had a typographical error: $(y-X^Tw)^T$ was needed where you have $y-X^Tw$.

You need this: \begin{align} & \frac1{\sigma_n^2}(y-X^Tw)^T(y-X^Tw) + w^T\Sigma_{p}^{-1}w \\[10pt] = {} & \frac 1 {\sigma_n^2} \left( y^Ty - y^T X^T w - w^T Xy + w^TXX^T w \right) + w^T\Sigma_p^{-1} w \\[10pt] = {} & w^T A w - b^T w - w^T b + \text{constant} \tag 1 \\[10pt] \overset{\Large\text{?}}= {} & (w-\bar{w})^T\left(\frac{1}{\sigma_n^2}XX^T + \Sigma_p^{-1}\right)(w-\bar{w}) + \text{constant} \end{align} where $$ A = \frac 1 {\sigma_n^2} XX^T + \Sigma_p^{-1} \quad \text{and} \quad b = \frac 1 {\sigma_n^2} Xy. $$
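A quick NumPy sanity check of the identity $(1)$ may help here (a sketch, not from the text; the variable names and the random test data are mine). With $A = \sigma_n^{-2}XX^T + \Sigma_p^{-1}$ and $b = \sigma_n^{-2}Xy$, the expanded exponent and $w^TAw - b^Tw - w^Tb$ should differ only by a $w$-independent constant, namely $\sigma_n^{-2}y^Ty$:

```python
# Sanity check of (1): lhs(w) - rhs(w) is constant in w (= y^T y / sigma^2).
import numpy as np

rng = np.random.default_rng(0)
D, n, sigma2 = 3, 5, 0.5
X = rng.standard_normal((D, n))          # columns are inputs, as in GPML
y = rng.standard_normal(n)
Sigma_p_inv = np.eye(D)                  # any positive-definite prior works

A = X @ X.T / sigma2 + Sigma_p_inv
b = X @ y / sigma2

def lhs(w):  # (1/sigma^2)(y - X^T w)^T (y - X^T w) + w^T Sigma_p^{-1} w
    r = y - X.T @ w
    return r @ r / sigma2 + w @ Sigma_p_inv @ w

def rhs(w):  # w^T A w - b^T w - w^T b  (b^T w = w^T b, both scalars)
    return w @ A @ w - 2 * b @ w

w1, w2 = rng.standard_normal(D), rng.standard_normal(D)
assert np.isclose(lhs(w1) - rhs(w1), lhs(w2) - rhs(w2))
assert np.isclose(lhs(w1) - rhs(w1), y @ y / sigma2)
```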

So the question is: How do you complete the square in an expression like $(1)$?

Here we need the fact that the matrix $A$ is a nonnegative-definite symmetric matrix with real entries, and that such matrices can be diagonalized by orthogonal matrices, and the diagonal entries (which are the eigenvalues) are nonnegative, and by taking square roots of the diagonal entries one can find a nonnegative-definite symmetric square root of $A$, which let us call $A^{1/2}$.

Here I will assume $X$ is a matrix with linearly independent rows (and of course it then has at least as many columns as rows). It follows that $A$ and $A^{1/2}$ are invertible, so we may speak of $A^{-1/2}$, which is also a positive-definite symmetric matrix.

Then we have \begin{align} & w^T A w -b^T w - w^T b \\[10pt] = {} & (A^{1/2} w)^T (A^{1/2} w) - (A^{-1/2}b)^T (A^{1/2}w) - (A^{1/2} w)^T (A^{-1/2} b) \\[15pt] \text{and so } & (A^{1/2} w)^T (A^{1/2} w) - (A^{-1/2}b)^T (A^{1/2}w) - (A^{1/2} w)^T (A^{-1/2} b) + b^T A^{-1} b \\[10pt] = {} & (A^{1/2} w - A^{-1/2} b)^T (A^{1/2} w - A^{-1/2} b). \end{align} The last line equals $(w - A^{-1}b)^T A (w - A^{-1}b)$, so the center of the completed square is $\bar{w} = A^{-1}b = \sigma_n^{-2}\left(\sigma_n^{-2}XX^T + \Sigma_p^{-1}\right)^{-1}Xy$, matching the formula in the question.
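The completed square can be checked numerically (a sketch; the matrices are random and the names are mine). The symmetric square root $A^{1/2}$ is built exactly as described above, from an orthogonal diagonalization of $A$:

```python
# Numeric check of the completing-the-square identity:
#   w^T A w - b^T w - w^T b + b^T A^{-1} b
#     == (A^{1/2} w - A^{-1/2} b)^T (A^{1/2} w - A^{-1/2} b)
#     == (w - A^{-1} b)^T A (w - A^{-1} b)
import numpy as np

rng = np.random.default_rng(1)
D = 4
M = rng.standard_normal((D, D))
A = M @ M.T + np.eye(D)                  # positive-definite symmetric A
b = rng.standard_normal(D)
w = rng.standard_normal(D)

vals, vecs = np.linalg.eigh(A)           # A = Q diag(vals) Q^T
A_half = vecs @ np.diag(np.sqrt(vals)) @ vecs.T          # A^{1/2}
A_inv_half = vecs @ np.diag(1 / np.sqrt(vals)) @ vecs.T  # A^{-1/2}

left = w @ A @ w - 2 * b @ w + b @ np.linalg.solve(A, b)
diff = A_half @ w - A_inv_half @ b
assert np.isclose(left, diff @ diff)

wbar = np.linalg.solve(A, b)             # the completed-square center A^{-1} b
assert np.isclose(left, (w - wbar) @ A @ (w - wbar))
```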

Thereafter proceed according to the incomplete answer by Ulfgard.

Another answer:

What did you try yourself? The steps involved are:

  1. Expand the quadratic term (easiest in the log domain, to get rid of the exp).
  2. Gather all terms that involve $y$ (they will form $\bar{w}$).
  3. Create the quadratic term by completing the square.
  4. Afterwards you are left with a superfluous term that does not depend on $w$; it is absorbed by the $\propto$.

You can also take a look at the standard example of multiplying two Gaussian densities, as this is essentially the same calculation.
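The four steps above can be checked end to end with a small numeric sketch (random data; the variable names are mine, not from either answer). Gathering the $y$-terms gives $\bar{w} = \sigma_n^{-2}A^{-1}Xy$, and the gradient of the exponent vanishes there, confirming it is the center of the completed square:

```python
# End-to-end check: wbar = sigma^{-2} A^{-1} X y makes the gradient of
#   (1/sigma^2)(y - X^T w)^T (y - X^T w) + w^T Sigma_p^{-1} w
# vanish, i.e. sigma^{-2} X (y - X^T wbar) = Sigma_p^{-1} wbar.
import numpy as np

rng = np.random.default_rng(2)
D, n, sigma2 = 3, 6, 0.25
X = rng.standard_normal((D, n))          # columns are inputs, as in GPML
y = rng.standard_normal(n)
Sigma_p_inv = 2.0 * np.eye(D)            # an arbitrary positive-definite prior

A = X @ X.T / sigma2 + Sigma_p_inv
wbar = np.linalg.solve(A, X @ y) / sigma2   # sigma^{-2} A^{-1} X y

assert np.allclose(X @ (y - X.T @ wbar) / sigma2, Sigma_p_inv @ wbar)
```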