Noise-free Gaussian process likelihood


I am learning about Gaussian processes by reading GPML (Rasmussen and Williams, *Gaussian Processes for Machine Learning*), and I am a bit confused about the Bayesian analysis.

Let us consider the standard linear regression model with "Gaussian noise", i.e., $$ f(\textbf{x}) = \textbf{x}^T\textbf{w}, \qquad y = f(\textbf{x}) + \epsilon, \qquad \epsilon \sim \mathcal{N}(0,\sigma_n^2) $$ where $\textbf{w}, \textbf{x} \in \mathbb{R}^n$.

Note that $$ \epsilon = y - f(\textbf{x}) = y - \textbf{x}^T\textbf{w} \sim \mathcal{N}(0,\sigma_n^2) $$ and the p.d.f. of $\epsilon$ is $$ p_\epsilon(z) = \frac{1}{\sqrt{2\pi}\sigma_n}\exp\left(-\frac{z^2}{2\sigma_n^2}\right). $$ Therefore, given $\textbf{x}$ and $\textbf{w}$, the p.d.f. of $y$ is $$ p_{y\mid \textbf{x}, \textbf{w}}(z) = \frac{1}{\sqrt{2\pi}\sigma_n}\exp\left(-\frac{(z-\textbf{x}^T\textbf{w})^2}{2\sigma_n^2}\right). $$ It then follows from Bayes' rule that $$ p_{\textbf{w}\mid y, \textbf{x}}(w) = \frac{p_{y\mid \textbf{x},\textbf{w}}(z)\cdot p_{\textbf{w}}(w)}{p_{y\mid \textbf{x}}(z)}, $$ where $z$ denotes the observed value of $y$ and $p_{\textbf{w}}$ is the prior on $\textbf{w}$.
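To make the noisy case concrete for myself, here is a minimal NumPy sketch of the posterior over $\textbf{w}$. It assumes a zero-mean Gaussian prior $\textbf{w} \sim \mathcal{N}(0, \Sigma_p)$, which is not specified above, and stacks $N$ inputs as the rows of a matrix `X`. The function name `weight_posterior`, the data, and all parameter values are made up for illustration.

```python
import numpy as np

def weight_posterior(X, y, sigma_n, Sigma_p):
    """Gaussian posterior N(mean, cov) over w for y = X w + eps.

    Assumptions (not from the question): eps ~ N(0, sigma_n^2 I) i.i.d.,
    prior w ~ N(0, Sigma_p), and each row of X is one input x^T.
    """
    # Posterior precision: likelihood term plus prior precision.
    A = X.T @ X / sigma_n**2 + np.linalg.inv(Sigma_p)
    cov = np.linalg.inv(A)
    mean = cov @ X.T @ y / sigma_n**2
    return mean, cov

# Synthetic data, purely illustrative.
rng = np.random.default_rng(0)
n, N = 3, 20                       # n = dimension of w, N = number of observations
w_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(N, n))
sigma_n = 0.1
y = X @ w_true + sigma_n * rng.normal(size=N)

mean, cov = weight_posterior(X, y, sigma_n, np.eye(n))
print("posterior mean:", mean)
print("posterior std :", np.sqrt(np.diag(cov)))
```

With a Gaussian prior and Gaussian likelihood the posterior is again Gaussian, so the normalizing constant $p_{y\mid \textbf{x}}(z)$ never has to be computed explicitly in this sketch.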

Question: In the noise-free setup, how can one derive the posterior distribution $p_{\textbf{w}\mid y, \textbf{x}}(w)$? It seems that in the noise-free setup, $y$ is completely determined by $\textbf{x}$ and $\textbf{w}$. Thus $$p_{y\mid \textbf{x},\textbf{w}}(z) = \delta_{\{z=\textbf{x}^T\textbf{w}\}}(z), $$ a point distribution. Then this gives $$ p_{\textbf{w}\mid y, \textbf{x}}(w) = \frac{\delta_{\{z=\textbf{x}^T\textbf{w}\}}(z)}{P_{\textbf{w}}(\textbf{x}^T\textbf{w}=z \mid \textbf{x})}p_{\textbf{w}}(w) ? $$ (I am not 100% sure about this.)

At this point, I don't see how the Bayesian approach is useful in the noise-free scenario. Or is the Bayesian approach based on the assumption that noise is present?
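For reference, here is a tentative numerical sketch of what I think the noise-free posterior would look like if the prior were Gaussian, $\textbf{w} \sim \mathcal{N}(0, \Sigma_p)$ (my assumption). It conditions the prior on the exact linear constraint $X\textbf{w} = \textbf{y}$ using the standard Gaussian conditioning formulas, and it requires $X \Sigma_p X^T$ to be invertible (for example, fewer observations than weights, with linearly independent inputs). The function name `noise_free_posterior` and the data are hypothetical.

```python
import numpy as np

def noise_free_posterior(X, y, Sigma_p):
    """Condition the assumed prior w ~ N(0, Sigma_p) on the exact constraint X w = y.

    Each row of X is one input x^T. Requires X Sigma_p X^T to be invertible.
    """
    K = X @ Sigma_p @ X.T                   # prior covariance of f = X w
    G = np.linalg.solve(K, X @ Sigma_p).T   # Sigma_p X^T K^{-1}
    mean = G @ y
    cov = Sigma_p - G @ X @ Sigma_p         # singular along the constrained directions
    return mean, cov

# Synthetic noise-free data, purely illustrative.
rng = np.random.default_rng(1)
n, N = 3, 2                        # n = dimension of w, N = number of observations
X = rng.normal(size=(N, n))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true                     # no noise added

mean, cov = noise_free_posterior(X, y, np.eye(n))
print("X @ mean - y:", X @ mean - y)
print("rank of cov :", np.linalg.matrix_rank(cov, tol=1e-8))
```

If this is right, the posterior is a degenerate Gaussian: its mean reproduces the observations exactly and its covariance has rank $n - N$, so uncertainty remains only in the directions of $\textbf{w}$ that the data do not pin down.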

Any comments, answers, or suggestions would be greatly appreciated. Thank you in advance.