Help understanding Gaussian processes on graphs


I am reading a paper (link) describing how to define a Gaussian process on a graph. I am seeking help understanding some of the mathematical derivations. A lot of this math is over my head.

For brevity, I am not defining every last bit of notation.

Background:

Given a weighted, undirected graph $G$, we have its Laplacian $\mathbf{\Delta}$, which is a symmetric real matrix. As such, it has the usual eigenvalue decomposition $\mathbf{\Delta} = \mathbf{U \Lambda U}^\intercal$.

A Gaussian process with Matérn kernel on $\mathbb R^d$ satisfies the stochastic PDE

\begin{equation} \left( \frac{2\nu}{\kappa^2} - \Delta\right)^{\frac{\nu}{2} + \frac{d}{4}}f = \mathcal W \tag{1} \label{eq:1} \end{equation}

where $\mathcal W$ is Gaussian white noise.

On the other hand, if we use the squared exponential kernel, then we have the SPDE

\begin{equation} e^{-\frac{\kappa^2}{4}\Delta}f = \mathcal W \tag{2} \label{eq:2} \end{equation}

where the exponential is the heat semigroup.

We want analogues of (\ref{eq:1}) and (\ref{eq:2}) for a Gaussian process on the graph $G$. So what the paper does is consider some (presumably smooth) function $\Phi \colon \mathbb R \to \mathbb R$, and then define

\begin{equation} \Phi(\mathbf{\Delta}) := \mathbf{U} \Phi(\mathbf{\Lambda}) \mathbf{U}^\intercal, \tag{3} \label{eq:3} \end{equation}

where $\Phi(\mathbf{\Lambda})$ is the application of $\Phi$ to the diagonal elements of $\mathbf{\Lambda}$.
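For concreteness, definition (3) is easy to implement numerically. Below is a small sketch in numpy; the 4-node path graph and the parameter values $\nu, \kappa$ are made-up examples for illustration, not taken from the paper.

```python
import numpy as np

# Hypothetical example: Laplacian of a 4-node path graph with unit
# edge weights (the graph and parameters below are made up).
A = np.array([[0., 1., 0., 0.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [0., 0., 1., 0.]])
L = np.diag(A.sum(axis=1)) - A          # graph Laplacian, symmetric PSD

# Eigendecomposition L = U Lam U^T; eigh is for symmetric matrices.
lam, U = np.linalg.eigh(L)

def phi_of_laplacian(phi):
    """Definition (3): apply the scalar function phi to the eigenvalues,
    then conjugate back by the orthogonal matrix U."""
    return U @ np.diag(phi(lam)) @ U.T

# The Matern-type choice Phi(lambda) = (2 nu / kappa^2 + lambda)^(nu/2).
nu, kappa = 3 / 2, 1.0
Phi_L = phi_of_laplacian(lambda x: (2 * nu / kappa**2 + x) ** (nu / 2))
```

Since the graph Laplacian is positive semidefinite and $2\nu/\kappa^2 > 0$, the resulting matrix $\Phi(\mathbf{\Delta})$ is symmetric positive definite.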

Question:

The paper then says (modulo equation numbering):

Although such a definition may seem arbitrary, this generalization of $\Phi$ to square matrices can be interpreted intuitively as plugging $\mathbf{\Delta}$ into the Taylor expansion of $\Phi$, which immediately boils down to (\ref{eq:3}) due to the orthogonality of $\mathbf U$. Taking $\Phi$ to be one of \begin{equation} \Phi(\lambda) = \left( \frac{2\nu}{\kappa^2} + \lambda \right)^{\frac{\nu}{2}} \qquad \Phi(\lambda) = e^{\frac{\kappa^2}{4}\lambda} \end{equation}

gives the operators on the left-hand side of (\ref{eq:1}) and (\ref{eq:2}), respectively.

What's going on here? Can somebody please walk through this derivation? I don't know what it means to plug a matrix into a Taylor series, or how any other part of the derivation works.

Answer:

This is possibly easier than it looks. Since the graph $G$ is a finite set, associated operators such as the Laplacian $\mathbf\Delta$ are just finite-dimensional matrices.

The quoted part of the paper is giving a sensible/reasonable definition of $\Phi(\mathbf\Delta)$ for a smooth function $\Phi \colon \mathbb R \to \mathbb R$. What might such a 'reasonable' definition look like? Well, if $\Phi$ were a polynomial, say $\Phi(x)=x^3+x$, then we would certainly want $\Phi(\mathbf\Delta)=\mathbf\Delta^3+\mathbf\Delta$. Now, using the eigenvalue decomposition and the fact that $\mathbf{U}$ is orthogonal (so $\mathbf{U^\intercal U = I}$, which makes the inner factors telescope away), the right-hand side of this equation equals $$\mathbf{U\Lambda U^\intercal U\Lambda U^\intercal U\Lambda U^\intercal+U\Lambda U^\intercal}=\mathbf{U\Lambda^3 U^\intercal+U\Lambda U^\intercal}=\mathbf{U(\Lambda^3+\Lambda)U^\intercal}=\mathbf{U}\Phi(\mathbf\Lambda)\mathbf{U^\intercal}.$$ This clearly matches the definition proposed in the paper: we have 'plugged' $\mathbf\Delta$ into the (finite) Taylor expansion of $\Phi$, namely $\Phi(x)=x^3+x$. The same computation works for any polynomial $\Phi$. It also applies to more general $\Phi$ (non-polynomial, but with a convergent Taylor expansion): the telescoping goes through term by term, and we just have to handle infinite sums rather than finite ones.