This question is motivated by a problem I'm facing in vector-valued kernel methods (also known as Gaussian processes and co-kriging).
Suppose I have $N$ data points $X := \{x_n\}_{n=1}^N$, where each $x_n \in \mathbb{R}^D$. My question: under what conditions, or for which choices of kernel function, does the following hold?
$$k(x, X) \; k(X, X)^{-1} \; k(X, x) = k(x, x)$$
For example, I think the following is true: if we choose the kernel $k(\cdot, \cdot)$ to be the vanilla inner (i.e. dot) product, and if I slightly abuse notation by also writing $X$ for the matrix in $\mathbb{R}^{N \times D}$ whose rows are the $x_n$, then we have:
$$k(x, X) \; k(X, X)^{-1} \; k(X, x) = x^T X^T (X X^T)^{-1} X x$$
and, assuming $X X^T$ is invertible (i.e. the rows of $X$ are linearly independent), $X^T (X X^T)^{-1} X$ is the orthogonal projection onto the row space of $X$, so this simplifies to $x^T x = k(x, x)$ iff $x$ lies in the row space of $X$.
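As a quick numerical sanity check of this claim, here is a sketch with random data (the function and variable names are mine, purely illustrative):

```python
import numpy as np

# Sanity check of the linear-kernel claim with random data.
rng = np.random.default_rng(0)
N, D = 3, 5
X = rng.standard_normal((N, D))         # rows are the x_n; almost surely linearly independent

def quad(x, X):
    """k(x, X) k(X, X)^{-1} k(X, x) for the linear kernel k(x, x') = x^T x'."""
    K = X @ X.T                         # k(X, X), shape (N, N)
    kxX = X @ x                         # k(X, x), shape (N,)
    return kxX @ np.linalg.solve(K, kxX)

x_in = X.T @ rng.standard_normal(N)     # a point in the row space of X
x_out = rng.standard_normal(D)          # a generic point, almost surely outside it

print(np.isclose(quad(x_in, X), x_in @ x_in))    # True: equality holds
print(np.isclose(quad(x_out, X), x_out @ x_out)) # False: strict inequality
```

For the out-of-subspace point the quadratic form is $\|Px\|^2 < \|x\|^2$, where $P$ is the projection, so the two sides differ.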
Is this correct?
Are more general versions of this result possible?
I'd be happy to take clarifying questions!
In general, you can think of kernels in terms of feature maps. Using the canonical feature map
$$
\begin{split}
\phi: \mathbb{R}^D &\to \mathcal{H}\\
x &\mapsto k(\cdot, x)\,,
\end{split}
$$
we can write both sides of your equation in terms of feature maps as
$$
\begin{align}
k(x,X)k(X,X)^{-1}k(X,x) &= \phi(x)^T\Phi(\Phi^T\Phi)^{-1}\Phi^T\phi(x)\\
k(x,x) &= \phi(x)^T\phi(x)
\end{align}
$$
where $\phi(x)^T\phi(x')$ denotes the inner product $\langle \phi(x), \phi(x')\rangle$ and $\Phi := [\phi(x_1), \dots, \phi(x_N)]$.

So what happens in the linear-kernel case $k(x,x') = x^Tx'$ also happens in the general case $k(x,x') = \phi(x)^T\phi(x')$: for $x_i \in X$, we have
$$\phi(x_i)^T\Phi(\Phi^T\Phi)^{-1}\Phi^T\phi(x_i) = \phi(x_i)^T\phi(x_i),$$
since $\phi(x_i)$ is one of the columns of $\Phi$.
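You can check this numerically. The sketch below assumes an RBF kernel (my choice; any kernel with invertible $k(X,X)$ behaves the same way) and verifies the identity at each training point:

```python
import numpy as np

# Sketch: with an RBF kernel, the identity holds exactly at every training point,
# since k(x_i, X) is the i-th row of k(X, X).
rng = np.random.default_rng(1)
N, D = 6, 2
X = rng.standard_normal((N, D))

def rbf(A, B, ell=1.0):
    """k(a, b) = exp(-||a - b||^2 / (2 ell^2)) for all pairs of rows of A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq / ell**2)

K = rbf(X, X)                           # k(X, X) = Phi^T Phi
for i in range(N):
    kxX = K[i]                          # k(x_i, X) is just the i-th row of K
    quad = kxX @ np.linalg.solve(K, kxX)
    print(np.isclose(quad, 1.0))        # True: k(x_i, x_i) = 1 for the RBF kernel
```

The key step is that $K K^{-1} K = K$, so the quadratic form at $x_i$ reduces to the diagonal entry $k(x_i, x_i)$.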
For a general $x \in \mathbb{R}^D$, we can follow some intuition based on Gaussian processes. The posterior variance of a noise-free Gaussian process is given by
$$ \sigma^2(x) = k(x,x) - k(x,X)k(X,X)^{-1}k(X,x)\,. $$
So the posted equation is satisfied exactly for those $x\in \mathbb{R}^D$ with $\sigma^2(x) = 0$. If $k$ corresponds to a stationary covariance function, then $\sigma^2(x) = 0$ at every point in the dataset $X$. Beyond that, it depends on the type of kernel: periodic kernels, for example, have $\sigma^2(x) = 0$ at the points in $X$ and at their translates repeating periodically throughout the domain.
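Here is a small 1-D illustration of this (my own sketch, using a noise-free GP with an RBF kernel and three hand-picked training inputs):

```python
import numpy as np

# Posterior variance of a noise-free GP with an RBF kernel (illustrative sketch).
X = np.array([[-1.0], [0.0], [2.0]])    # training inputs

def rbf(A, B):
    """1-D RBF kernel with unit lengthscale, evaluated on all pairs."""
    return np.exp(-0.5 * (A - B.T) ** 2)

K = rbf(X, X)                           # k(X, X)

def posterior_var(x):
    x = np.atleast_2d(x)
    kxX = rbf(x, X)                     # k(x, X), shape (1, N)
    return (rbf(x, x) - kxX @ np.linalg.solve(K, kxX.T)).item()

print(np.isclose(posterior_var(0.0), 0.0))  # True: zero variance at a training input
print(posterior_var(1.0) > 0)               # True: positive variance away from the data
```

So the set where your equation holds is exactly the zero set of $\sigma^2(\cdot)$, which always contains $X$ and, depending on the kernel, possibly more.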