Need assistance with this lecture!

I need help with equation 2.7 on page 9 of Gaussian Processes for Machine Learning.

I understand the first part: $p(\mathbf{w}|\mathbf X,\mathbf y) \propto \exp\left(-\frac{1}{2\sigma^2_n}(\mathbf y-\mathbf X^T\mathbf w)^T(\mathbf y-\mathbf X^T\mathbf w)\right)\exp\left(-\frac{1}{2}\mathbf w^T \mathbf \Sigma^{-1}_p \mathbf w\right)$, but I can't see how to get from this to $p(\mathbf w|\mathbf X,\mathbf y) \propto \exp\left(-\frac{1}{2}(\mathbf w-\bar{\mathbf w})^T\left(\frac{1}{\sigma^2_n}\mathbf X\mathbf X^T+ \mathbf \Sigma^{-1}_p\right)(\mathbf w-\bar{\mathbf w})\right)$, where $\bar{\mathbf w}=\sigma^{-2}_n(\sigma^{-2}_n\mathbf X\mathbf X^T+\mathbf \Sigma^{-1}_p)^{-1}\mathbf X\mathbf y$.

Best answer

\begin{align} \mathbf{P}(\mathbf{w}|\mathbf{X},\mathbf{y}) &\propto \exp\left(-\frac{1}{2\sigma^2_n}(\mathbf{y}-\mathbf{X}^T\mathbf{w})^T(\mathbf{y}-\mathbf{X}^T\mathbf{w})\right) \cdot \exp\left(-\frac{1}{2}\mathbf{w}^T\mathbf{\Sigma}^{-1}_p\mathbf{w}\right) \\ &\propto \exp\left(-\frac{1}{2\sigma^2_n}\left(\mathbf{y}^T\mathbf{y}-2\mathbf{y}^T\mathbf{X}^T\mathbf{w}+(\mathbf{X}^T\mathbf{w})^T\mathbf{X}^T\mathbf{w}\right)-\frac{1}{2}\mathbf{w}^T\mathbf{\Sigma}^{-1}_p\mathbf{w}\right) \\ &\propto \exp\left(\frac{1}{\sigma^2_n}\mathbf{y}^T\mathbf{X}^T\mathbf{w}-\frac{1}{2\sigma^2_n}\mathbf{w}^T\mathbf{X}\mathbf{X}^T\mathbf{w}-\frac{1}{2}\mathbf{w}^T\mathbf{\Sigma}^{-1}_p\mathbf{w}\right) \\ &\propto \exp\left(\frac{1}{\sigma^2_n}\mathbf{y}^T\mathbf{X}^T\mathbf{w}-\frac{1}{2}\mathbf{w}^T\left(\frac{\mathbf{X}\mathbf{X}^T}{\sigma^2_n}+\mathbf{\Sigma}^{-1}_p\right)\mathbf{w}\right) \end{align}

Between the second and third lines, the term $\mathbf{y}^T\mathbf{y}$ is dropped because it does not depend on $\mathbf{w}$ and is absorbed into the proportionality constant.

Then differentiate the log-posterior with respect to $\mathbf{w}$ (written here as a row vector):

\begin{align} \frac{\partial \log \mathbf{P}(\mathbf{w}|\mathbf{X},\mathbf{y})}{\partial \mathbf{w}} = \frac{\mathbf{y}^T\mathbf{X}^T}{\sigma^2_n}-\mathbf{w}^T\left(\frac{\mathbf{X}\mathbf{X}^T}{\sigma^2_n}+\mathbf{\Sigma}^{-1}_p\right) \end{align}

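Setting this derivative to zero, transposing, and solving for $\mathbf{w}$ gives the posterior mode, which for a Gaussian is also the mean $\bar{\mathbf{w}}$:

\begin{align} \bar{\mathbf{w}}=\frac{1}{\sigma^2_n}\left(\frac{\mathbf{X}\mathbf{X}^T}{\sigma^2_n}+\mathbf{\Sigma}^{-1}_p\right)^{-1}\mathbf{X}\mathbf{y} \end{align}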
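
The form asked about in the question can also be reached directly by completing the square. As a sketch, introduce the shorthand $\mathbf{A}=\sigma^{-2}_n\mathbf{X}\mathbf{X}^T+\mathbf{\Sigma}^{-1}_p$ (a symbol added here for brevity, not used elsewhere in this answer) and take $\bar{\mathbf{w}}=\sigma^{-2}_n\mathbf{A}^{-1}\mathbf{X}\mathbf{y}$. Since $\mathbf{A}$ is symmetric, $\bar{\mathbf{w}}^T\mathbf{A}=\sigma^{-2}_n\mathbf{y}^T\mathbf{X}^T$, so expanding the quadratic form gives:

\begin{align} -\frac{1}{2}(\mathbf{w}-\bar{\mathbf{w}})^T\mathbf{A}(\mathbf{w}-\bar{\mathbf{w}}) &= -\frac{1}{2}\mathbf{w}^T\mathbf{A}\mathbf{w}+\bar{\mathbf{w}}^T\mathbf{A}\mathbf{w}-\frac{1}{2}\bar{\mathbf{w}}^T\mathbf{A}\bar{\mathbf{w}} \\ &= \frac{1}{\sigma^2_n}\mathbf{y}^T\mathbf{X}^T\mathbf{w}-\frac{1}{2}\mathbf{w}^T\mathbf{A}\mathbf{w}-\frac{1}{2}\bar{\mathbf{w}}^T\mathbf{A}\bar{\mathbf{w}} \end{align}

The last term does not depend on $\mathbf{w}$, so the exponent $\frac{1}{\sigma^2_n}\mathbf{y}^T\mathbf{X}^T\mathbf{w}-\frac{1}{2}\mathbf{w}^T\mathbf{A}\mathbf{w}$ and the completed square $-\frac{1}{2}(\mathbf{w}-\bar{\mathbf{w}})^T\mathbf{A}(\mathbf{w}-\bar{\mathbf{w}})$ differ only by a constant, which is absorbed into the proportionality sign.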

Finally, we obtain the posterior distribution of $\mathbf{w}$. Its covariance is the inverse of the matrix in the quadratic term on the last line of the first derivation:

\begin{align} \mathbf{P}(\mathbf{w}|\mathbf{X},\mathbf{y}) \sim \mathcal{N}\left(\sigma^{-2}_n(\sigma^{-2}_n\mathbf{X}\mathbf{X}^T+\mathbf{\Sigma}^{-1}_p)^{-1}\mathbf{X}\mathbf{y},\ (\sigma^{-2}_n\mathbf{X}\mathbf{X}^T+\mathbf{\Sigma}^{-1}_p)^{-1}\right) \end{align}

This can also be expressed as:

\begin{align} \mathbf{P}(\mathbf{w}|\mathbf{X},\mathbf{y}) \sim \mathcal{N}\left((\mathbf{X}\mathbf{X}^T+\sigma^2_n\mathbf{\Sigma}^{-1}_p)^{-1}\mathbf{X}\mathbf{y},\ \sigma^{2}_n(\mathbf{X}\mathbf{X}^T+\sigma^{2}_n\mathbf{\Sigma}^{-1}_p)^{-1}\right) \end{align}
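
If it helps to see this numerically, here is a minimal sketch in Python with NumPy. It assumes the book's convention that $\mathbf{X}$ is $D \times n$ with one column per input. The dimensions, noise level, prior covariance and all variable names are arbitrary choices made for illustration. It checks that the log of the unnormalized posterior and the completed-square exponent differ only by a constant in $\mathbf{w}$, and that the two ways of writing the mean and covariance agree.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative problem sizes and parameters (assumptions, not from the book).
D, n = 3, 20                          # input dimension, number of observations
sigma_n = 0.5                         # noise standard deviation
X = rng.normal(size=(D, n))           # design matrix, one column per input
Sigma_p = np.diag([1.0, 2.0, 0.5])    # prior covariance of w
w_true = rng.normal(size=D)
y = X.T @ w_true + sigma_n * rng.normal(size=n)

Sigma_p_inv = np.linalg.inv(Sigma_p)

# Posterior precision A and mean w_bar in the sigma_n^{-2} form.
A = X @ X.T / sigma_n**2 + Sigma_p_inv
w_bar = np.linalg.solve(A, X @ y) / sigma_n**2

# The same quantities in the alternative form with sigma_n^2 moved inside.
B = X @ X.T + sigma_n**2 * Sigma_p_inv
w_alt = np.linalg.solve(B, X @ y)
cov_alt = sigma_n**2 * np.linalg.inv(B)


def log_unnormalized_posterior(w):
    """Log-likelihood plus log-prior, without normalizing constants."""
    r = y - X.T @ w
    return -0.5 * (r @ r) / sigma_n**2 - 0.5 * (w @ Sigma_p_inv @ w)


def log_completed_square(w):
    """Exponent of the completed-square form around w_bar."""
    d = w - w_bar
    return -0.5 * (d @ A @ d)


# The gap between the two exponents should be the same for every w.
ws = rng.normal(size=(5, D))
gaps = np.array(
    [log_unnormalized_posterior(w) - log_completed_square(w) for w in ws]
)
print("spread of the gap across test points:", gaps.max() - gaps.min())
print("largest difference between the two mean formulas:",
      np.abs(w_bar - w_alt).max())
```

Both printed numbers should be zero up to floating-point rounding. Using `np.linalg.solve` instead of forming the inverse explicitly for the mean is a design choice for numerical stability and does not change the result.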

I hope this helps.