Bayesian curve fitting vectorized mean calculation


In the book "Pattern Recognition and Machine Learning" by Bishop, there is a section on Bayesian curve fitting where he gives some results in equations (1.69), (1.70), (1.71), and (1.72) which I found hard to follow without a bit more explanation.

Similarly, the integration in (1.68) can also be performed analytically with the result that the predictive distribution is given by a Gaussian of the form ...

$$ p(t \mid x, \pmb{x}, \pmb{t}) = \mathcal{N}\big(t \mid m(x), s^2(x)\big) $$

where the mean and variance are given by

$$ \begin{aligned} m(x) &= \beta \, \phi(x)^T S \sum_{n=1}^N \phi(x_n)\,t_n \\ s^2(x) &= \beta^{-1} + \phi(x)^T S \, \phi(x) \end{aligned} $$

Here the matrix $S$ is given by

$$ S^{-1} = \alpha\pmb{I} + \beta \sum_{n=1}^N \phi(x_n)\phi(x_n)^T $$

where $\pmb{I}$ is the unit matrix, and we have defined the vector $\phi(x)$ with elements $\phi_i(x) = x^i$ for $i = 0, \dots, M$.
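As an illustration (not from the book), the formulas above can be sketched directly in NumPy. The polynomial basis $\phi_i(x) = x^i$ follows the definition above; the values of $M$, $\alpha$, $\beta$ and the toy sinusoidal data set are illustrative choices of mine, not Bishop's:

```python
import numpy as np

# Illustrative hyperparameters and toy data (my choices, not from PRML).
M, alpha, beta = 3, 5e-3, 11.1
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 10)
t_train = np.sin(2 * np.pi * x_train) + 0.1 * rng.normal(size=10)

def phi(x):
    """Polynomial feature vector of length M+1: (1, x, x^2, ..., x^M)."""
    return np.power(x, np.arange(M + 1))

# Design matrix: row n is phi(x_n)^T, so Phi.T @ Phi = sum_n phi(x_n) phi(x_n)^T.
Phi = np.array([phi(xn) for xn in x_train])

# S^{-1} = alpha*I + beta * sum_n phi(x_n) phi(x_n)^T
S = np.linalg.inv(alpha * np.eye(M + 1) + beta * Phi.T @ Phi)

def predict(x):
    """Predictive mean m(x) and variance s^2(x) of the Gaussian p(t|x, x, t)."""
    px = phi(x)
    m = beta * px @ S @ Phi.T @ t_train          # m(x)
    s2 = 1.0 / beta + px @ S @ px                # s^2(x)
    return m, s2

m, s2 = predict(0.5)
```

Note that $s^2(x)$ always exceeds the noise floor $\beta^{-1}$, since $S$ is positive definite; the extra term $\phi(x)^T S \phi(x)$ is the posterior uncertainty about the weights.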

How can I intuitively understand why these represent the mean and variance? I am having some trouble coming up with the proper search topics to get me there.

1 Answer


The main technique used in Gaussian computations is completing the square. The idea is simple: if something is the exponential of a quadratic form in $x$, then it is (proportional to) a Gaussian density.

Consider an expression such as $$\exp\left\{-\tfrac{1}{2}\left(x^T A x - 2\, x^T b\right)\right\},$$ where $A$ is a symmetric positive-definite matrix.

We can complete the square by adding and subtracting the term in $b$ that is missing from a full quadratic form: $$\exp\left\{-\tfrac{1}{2}\left(x^T A x - 2\, x^T A A^{-1} b + b^T A^{-1} b - b^T A^{-1} b\right)\right\},$$ from which we get $$\propto \exp\left\{-\tfrac{1}{2}\,(x - A^{-1}b)^T A\, (x - A^{-1}b)\right\},$$ where we recognize a Gaussian density with precision matrix $A$ (that is, covariance matrix $A^{-1}$) and mean $A^{-1}b$.
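As a quick numerical sanity check (a hypothetical $3\times 3$ example of mine, not from the book), writing the exponent as $-\tfrac{1}{2}x^T A x + x^T b$ with $A$ the precision matrix, we can verify that it equals the completed square $-\tfrac{1}{2}(x-\mu)^T A (x-\mu)$ plus a constant, where $\mu = A^{-1}b$:

```python
import numpy as np

rng = np.random.default_rng(1)
Q = rng.normal(size=(3, 3))
A = Q @ Q.T + 3.0 * np.eye(3)   # symmetric positive-definite precision matrix
b = rng.normal(size=3)
mu = np.linalg.solve(A, b)      # mean mu = A^{-1} b

def log_quadratic(x):
    # Exponent before completing the square: -1/2 x^T A x + x^T b
    return -0.5 * x @ A @ x + x @ b

def log_completed(x):
    # After completing the square: -1/2 (x-mu)^T A (x-mu) + 1/2 b^T A^{-1} b
    return -0.5 * (x - mu) @ A @ (x - mu) + 0.5 * b @ mu

# The two expressions agree for arbitrary x, confirming the identity.
for _ in range(5):
    x = rng.normal(size=3)
    assert np.isclose(log_quadratic(x), log_completed(x))
```

The constant $\tfrac{1}{2} b^T A^{-1} b$ does not depend on $x$, which is why it can be dropped under the proportionality sign: it is absorbed into the normalizing constant of the Gaussian.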

This fundamental technique is used in almost all Gaussian computations in PRML.