$L^2$ Approximation error in Gaussian Process Regression (finite data setting)


I am learning about Gaussian Process Regression. I would like some references or results on the distribution of the error between a given function and the posterior obtained from Gaussian Process Regression with a finite dataset. I am not sure whether the question fits best here or on Cross Validated; sorry if this is the wrong place!

To set some notation, simply and a bit informally: I consider the setting where we have a function $f$ over an input space $D$ and a Gaussian Process $(Z_x)_{x\in D}$ with kernel $k$ and mean $m$. I know that if I have $n$ datapoints stemming from the model (i.e. pairs of locations/observations $\mathcal D_n := \{(x_i, z_i), 1\leq i \leq n\}$), I can perform Gaussian Process Regression (with variations depending on whether the observations are noisy, and whether the mean is assumed known: simple vs. universal kriging). If I do so, I end up with a probabilistic predictor $\hat Z := Z \mid \mathcal D_n$ that is also Gaussian.
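For concreteness, here is a minimal sketch of this setup in Python using scikit-learn's `GaussianProcessRegressor`. The target function `f`, the RBF kernel, its length scale, and the training locations are all toy choices of mine, not part of the question; `alpha` is a small nugget for numerical stability in the noise-free case.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Toy setup: f is the target function, D = [0, 1].
f = lambda x: np.sin(2 * np.pi * x)
rng = np.random.default_rng(0)
x_train = rng.uniform(0, 1, size=8).reshape(-1, 1)
z_train = f(x_train).ravel()

# Simple kriging with a squared-exponential kernel and known zero mean;
# alpha adds a tiny nugget for numerical stability (noise-free limit).
gpr = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-10)
gpr.fit(x_train, z_train)

# Posterior mean and pointwise standard deviation of Z | D_n on a grid.
x_grid = np.linspace(0, 1, 200).reshape(-1, 1)
mean, std = gpr.predict(x_grid, return_std=True)
```

In the noise-free case the posterior mean interpolates the training data, and the posterior standard deviation vanishes at the training locations.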

I want to learn more about the following quantities (assuming that they are well defined):

$$\Vert f - \hat Z\Vert_\infty :=\sup_{x\in D} \vert f(x)-\hat Z_x \vert$$

$$\Vert f - \hat Z\Vert_{2, P}^2 :=\int_D ( f(x)-\hat Z_x )^2 \,dP(x)$$ (for $P$ a given Borel measure)

In particular, I want to know more about what happens when $n$ is fixed, rather than focusing on the asymptotic setting:

  • Do you have some papers or books I can refer to?

  • I am tempted to say that the distribution of the $L^2$ error should be related to a generalized chi-squared distribution, but I have trouble finding a reference for this.

  • In practice, is there a Monte Carlo approach to estimating moments/quantiles of these errors?
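To illustrate the Monte Carlo idea from the last bullet, here is a sketch of what I have in mind: discretize $D$, draw posterior sample paths $\hat Z$, and compute the two norms for each path, so that moments and quantiles can be read off the empirical samples. The function `f`, the kernel, the uniform choice of $P$, and the grid/sample sizes are all assumptions of mine for the example.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Toy model: true function f, noise-free data on D = [0, 1].
f = lambda x: np.sin(2 * np.pi * x)
rng = np.random.default_rng(1)
x_train = rng.uniform(0, 1, size=8).reshape(-1, 1)
gpr = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-10)
gpr.fit(x_train, f(x_train).ravel())

# Discretize D; with P uniform on [0, 1], the integral defining the
# L^2 norm becomes an average over the grid.
x_grid = np.linspace(0, 1, 500).reshape(-1, 1)
f_grid = f(x_grid).ravel()

# Draw posterior sample paths of Z | D_n on the grid and compute both
# error norms path by path.
paths = gpr.sample_y(x_grid, n_samples=2000, random_state=0)  # (500, 2000)
err = f_grid[:, None] - paths
sup_norms = np.abs(err).max(axis=0)   # samples of ||f - Z-hat||_inf
l2_sq = (err ** 2).mean(axis=0)       # samples of ||f - Z-hat||_{2,P}^2

# Empirical moments/quantiles of the error distributions.
print(sup_norms.mean(), np.quantile(sup_norms, 0.95), l2_sq.mean())
```

The grid discretization introduces a bias for the sup norm (the supremum over a finite grid underestimates the true supremum), so in practice one would refine the grid until the estimates stabilize.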

Thank you very much in advance!