Inverse of kernel integral operator of Gaussian squared exponential kernel


I am doing research related to Gaussian processes and Gaussian process regression. What I would like to know is the inverse of the integral operator of the one-dimensional squared exponential kernel $K : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$, where the finite measure is $d\mu(x) = p(x)\,dx$ with $p(x)$ the Gaussian density of $\mathcal{N}(0,\sigma_p^2)$: $$ K(x,y) = \sigma^2 \exp\left(-\frac{(x-y)^2}{2\ell^2}\right), \quad p(x) = \frac{1}{\sqrt{2\pi \sigma_p^2}} \exp\left(-\frac{x^2}{2\sigma_p^2}\right), $$ where $\ell > 0$ and $\sigma > 0$ are parameters of the squared exponential kernel.

Associated with this kernel is the integral operator defined as $$ (T_K f)(x) = \int_{-\infty}^\infty K(x,y) f(y) p(y) dy. $$

What I would like to calculate is the following quantity: $$ K^{-1}(x,y) = \sum_{k=0}^\infty \frac{1}{\lambda_k} \phi_k(x) \phi_k(y) $$ for any $(x,y) \in \mathbb{R}^2$, where $\phi_k$ and $\lambda_k$ are the $k$th eigenfunction and eigenvalue of $T_K$, respectively, with $\int_{-\infty}^\infty \phi_j(x) \phi_k(x) p(x) dx = \delta_{jk}$ for all $j,k$ and $\lambda_k > 0$.

(Note: I believe that the above should be the inverse of this kernel (see Integral Transform and Zhu et al), that is, the kernel $K^{-1}$ such that $T_{K^{-1}} T_K = I$: the eigenvalues of the inverse of an operator are the reciprocals of the eigenvalues of the original operator, while the eigenfunctions are the same, so the expansion follows from Mercer's theorem. It therefore suffices to compute $K^{-1}(x,y)$ directly.)
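To make this argument explicit (a formal computation, assuming the series may be integrated term by term): writing $f = \sum_j c_j \phi_j$ with $c_j = \int f \phi_j \, p \, dx$, orthonormality gives $$ (T_{K^{-1}} T_K f)(x) = \sum_{k=0}^\infty \frac{\phi_k(x)}{\lambda_k} \int_{-\infty}^\infty \phi_k(y) \left( \sum_{j=0}^\infty \lambda_j c_j \phi_j(y) \right) p(y) \, dy = \sum_{k=0}^\infty \frac{\lambda_k c_k}{\lambda_k} \phi_k(x) = f(x), $$ so $T_{K^{-1}} T_K = I$ whenever the series defining $K^{-1}$ makes sense.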

I already know $\phi_k$ and $\lambda_k$: as shown in the Gaussian processes for machine learning book, page 97 (which the authors slightly adapted from the result by Zhu et al), the eigenfunctions and eigenvalues of $T_K$ with this measure are as follows (starting from $k=0$): $$ \lambda_k = \sigma^2 \sqrt{\frac{2a}{A}} B^k, \quad \phi_k(x) = \left(\frac{c}{a}\right)^{1/4} \frac{1}{\sqrt{2^k k!}} \exp\left(-(c-a)x^2\right) H_k(\sqrt{2c}\,x), $$ where $H_k$ is the $k$th physicist's Hermite polynomial, and $$ a = \frac{1}{4\sigma_p^2}, \quad b = \frac{1}{2\ell^2}, \quad c = \sqrt{a^2+2ab}, \quad A = a+b+c, \quad B = b/A. $$ My eigenfunctions and eigenvalues differ from those in the book in two ways. First, I added the $\sigma^2$ factor to the eigenvalues because I added the scale parameter $\sigma^2$ to the kernel $K$ for the sake of generalizability (and hence had to rename their $\sigma$ to $\sigma_p$). Second, since the given eigenfunctions are not normalized, I divided each $\phi_k$ by its norm, which is easily calculated from a standard integral identity for Hermite polynomials: it yields $\int_{-\infty}^\infty \phi_m(x) \phi_n(x) p(x) dx = \sqrt{\frac{a}{c}}\, 2^n n!\, \delta_{mn}$, where the $\phi$ here are those given in the book.
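As a sanity check, both the orthonormality and the eigenrelation $T_K \phi_k = \lambda_k \phi_k$ can be verified numerically. Here is a sketch in Python; the parameter values $\sigma = \ell = \sigma_p = 1$ are an arbitrary choice for illustration:

```python
import numpy as np
from math import factorial
from scipy.special import eval_hermite  # physicist's Hermite H_k
from scipy.integrate import quad

# Illustrative parameter values (an arbitrary choice, not from the question).
sigma, ell, sigma_p = 1.0, 1.0, 1.0

a = 1.0 / (4.0 * sigma_p**2)
b = 1.0 / (2.0 * ell**2)
c = np.sqrt(a**2 + 2.0 * a * b)
A = a + b + c
B = b / A

def p(x):
    # Gaussian density N(0, sigma_p^2) written as sqrt(2a/pi) exp(-2a x^2)
    return np.sqrt(2.0 * a / np.pi) * np.exp(-2.0 * a * x**2)

def K(x, y):
    return sigma**2 * np.exp(-b * (x - y)**2)

def lam(k):
    return sigma**2 * np.sqrt(2.0 * a / A) * B**k

def phi(k, x):
    # Normalized eigenfunction: (c/a)^{1/4} (2^k k!)^{-1/2} e^{-(c-a)x^2} H_k(sqrt(2c) x)
    return ((c / a)**0.25 / np.sqrt(2.0**k * factorial(k))
            * np.exp(-(c - a) * x**2) * eval_hermite(k, np.sqrt(2.0 * c) * x))

# Orthonormality: ∫ phi_j phi_k p dx = delta_{jk}
for j in range(4):
    for k in range(4):
        val, _ = quad(lambda x: phi(j, x) * phi(k, x) * p(x), -np.inf, np.inf)
        assert abs(val - (j == k)) < 1e-7

# Eigenrelation: (T_K phi_k)(x0) = lam(k) phi_k(x0)
for k in range(4):
    for x0 in (-1.0, 0.0, 0.7):
        val, _ = quad(lambda y: K(x0, y) * phi(k, y) * p(y), -np.inf, np.inf)
        assert abs(val - lam(k) * phi(k, x0)) < 1e-7

print("orthonormality and eigenrelation verified")
```

The checks only probe small $k$ and a few points $x_0$, but they agree with the normalized expressions above to quadrature accuracy.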

We then have, $$ p(x) = \sqrt{\frac{2a}{\pi}} \exp(-2ax^2), \quad K(x,y) = \sigma^2 \exp\left(-b(x-y)^2\right). $$

By Mercer's theorem, we have the following: $$ K(x,y) = \sum_{k=0}^\infty \lambda_k \phi_k(x) \phi_k(y). $$ I was able to verify that this does indeed hold by using the "completeness relation" (Mehler's formula), which says that $$ \sum _{n=0}^{\infty }{\frac {H_{n}(x)H_{n}(y)}{n!}}\left({\frac {u}{2}}\right)^{n}={\frac {1}{\sqrt {1-u^{2}}}}e^{{\frac {2u}{1+u}}xy-{\frac {u^{2}}{1-u^{2}}}(x-y)^{2}}, \quad \forall u \in (-1,1). $$ It applies here because, when the sum is written out, $B$ plays the role of $u$, and $0 < B < 1$, so the condition $-1 < u < 1$ is satisfied. (The resulting expression is complicated, but I was able to simplify it using Mathematica and verify that Mercer's theorem holds.)
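The Mercer expansion can also be checked numerically: since $\lambda_k \propto B^k$ with $0 < B < 1$, the partial sums converge geometrically to $K(x,y)$. A short sketch, again with the illustrative choice $\sigma = \ell = \sigma_p = 1$ and an arbitrary test point:

```python
import numpy as np
from math import factorial
from scipy.special import eval_hermite

sigma, ell, sigma_p = 1.0, 1.0, 1.0  # illustrative values
a = 1.0 / (4.0 * sigma_p**2)
b = 1.0 / (2.0 * ell**2)
c = np.sqrt(a**2 + 2.0 * a * b)
A = a + b + c
B = b / A

def lam(k):
    return sigma**2 * np.sqrt(2.0 * a / A) * B**k

def phi(k, x):
    return ((c / a)**0.25 / np.sqrt(2.0**k * factorial(k))
            * np.exp(-(c - a) * x**2) * eval_hermite(k, np.sqrt(2.0 * c) * x))

x, y = 0.3, -0.8  # arbitrary test point
K_exact = sigma**2 * np.exp(-b * (x - y)**2)
for N in (5, 10, 20, 40):
    partial = sum(lam(k) * phi(k, x) * phi(k, y) for k in range(N))
    print(N, abs(partial - K_exact))  # error shrinks roughly like B^N
```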

But when I repeated the same procedure for the inverse (the desired quantity), I get $(1/B)^k$ instead of $B^k$ (i.e., $u = 1/B$ instead of $u = B$), and since $1/B > 1$, the completeness relation no longer applies. So what happens now? Does the series fail to converge? I did some numerical computations, and the partial sums rapidly increase in magnitude while periodically alternating in sign, so the series appears to diverge. Or can results similar to those here be used to compute the inverse directly?
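For concreteness, here is a sketch of that numerical observation, again under the illustrative choice $\sigma = \ell = \sigma_p = 1$ (for which $1/B \approx 2.618$): the terms $\phi_k(x)\phi_k(y)/\lambda_k$ grow roughly like $(1/B)^k$, so the partial sums blow up:

```python
import numpy as np
from math import factorial
from scipy.special import eval_hermite

sigma, ell, sigma_p = 1.0, 1.0, 1.0  # illustrative values
a = 1.0 / (4.0 * sigma_p**2)
b = 1.0 / (2.0 * ell**2)
c = np.sqrt(a**2 + 2.0 * a * b)
A = a + b + c
B = b / A  # here B ≈ 0.382, so 1/B ≈ 2.618 > 1

def lam(k):
    return sigma**2 * np.sqrt(2.0 * a / A) * B**k

def phi(k, x):
    return ((c / a)**0.25 / np.sqrt(2.0**k * factorial(k))
            * np.exp(-(c - a) * x**2) * eval_hermite(k, np.sqrt(2.0 * c) * x))

x, y = 0.3, -0.8  # arbitrary test point
terms = [phi(k, x) * phi(k, y) / lam(k) for k in range(30)]
partials = np.cumsum(terms)
for N in (5, 10, 20, 30):
    print(N, partials[N - 1])  # magnitudes grow exponentially; signs alternate
```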


Thank you in advance!!