Matrix input (multi-dimensional) Gaussian Process Regression


In the book GPML (http://www.gaussianprocess.org/gpml/chapters/RW.pdf), input is $\mathbf{x}_i \in \mathbb{R}^d$. Stacking $N$ such inputs, i.e. $X = \begin{pmatrix} x_1 & \dots & x_N \end{pmatrix}^T$ results in $X \in \mathbb{R}^{N \times d}$. Given the training output $Y$ of appropriate dimension,

$$Y \sim \mathcal{GP}(\mathbf{0}, K)$$

where $K \in \mathbb{R}^{N \times N}$ is the covariance matrix, built entry-wise as $K_{ij} = k(x_i, x_j)$ from a kernel function $k(p, q)$. Whatever the choice of kernel, it is essentially a mapping

$$k: \mathbb{R}^d \times \mathbb{R}^d \rightarrow \mathbb{R}$$

where the arguments $p, q$ are a pair of input vectors.
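For concreteness, the vector-input construction can be sketched in numpy as follows (the squared-exponential kernel is just an illustrative choice):

```python
import numpy as np

# Vector-input case: N inputs in R^d stacked as the rows of X.
N, d = 5, 3
rng = np.random.default_rng(0)
X = rng.standard_normal((N, d))

def rbf(p, q, ell=1.0):
    """Squared-exponential kernel k: R^d x R^d -> R (an illustrative choice)."""
    return np.exp(-0.5 * np.sum((p - q) ** 2) / ell**2)

# K[i, j] = k(x_i, x_j), so K is N x N.
K = np.array([[rbf(X[i], X[j]) for j in range(N)] for i in range(N)])
print(K.shape)  # (5, 5)
```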

My question: for $N$ matrix inputs $x_i \in \mathbb{R}^{d \times n}$, I would stack them as block matrices, i.e. $X = \begin{pmatrix} x_1 & \dots & x_N \end{pmatrix}^T$, which results in $X \in \mathbb{R}^{Nn \times d}$. Again, following the same theme,

$$Y \sim \mathcal{GP}(\mathbf{0}, K)$$

What dimension should $K$ be?

  • If the kernel function is about relating inputs to each other, then the covariance matrix is given by $K \in \mathbb{R}^{N \times N}$ and the kernel function $k(p,q)$ defines a mapping

$$k: \mathbb{R}^{d \times n} \times \mathbb{R}^{d \times n} \rightarrow \mathbb{R}$$

relating the $N$ inputs to each other.

  • If the kernel function is about relating indices to each other, then $K \in \mathbb{R}^{Nn \times Nn}$ and the kernel function $k(p,q)$ defines a mapping

$$k: \mathbb{R}^{d} \times \mathbb{R}^{d} \rightarrow \mathbb{R}$$

relating the $Nn$ rows of $X$, each in $\mathbb{R}^d$, to each other (similar to the vector-input case).

  • Or should I approach the stacking in $X$ differently, i.e. $X \in \mathbb{R}^{N \times d \times n}$, a third-order tensor?
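To make the first two candidates concrete, here is a minimal numpy sketch; the squared-exponential kernel (with the Frobenius norm in the matrix case) is my own illustrative choice, not the only possibility. The first construction applies a kernel directly to the matrix inputs and yields an $N \times N$ covariance, while the second treats every column of every input as a separate $\mathbb{R}^d$ point and yields an $Nn \times Nn$ covariance:

```python
import numpy as np

N, d, n = 4, 3, 2
rng = np.random.default_rng(0)
xs = rng.standard_normal((N, d, n))  # N matrix inputs, each d x n

def rbf(p, q, ell=1.0):
    """Squared-exponential kernel; works for vectors (Euclidean norm)
    or matrices (Frobenius norm), since both sum over all entries."""
    return np.exp(-0.5 * np.sum((p - q) ** 2) / ell**2)

# Candidate 1: kernel acts on whole matrix inputs -> K1 is N x N.
K1 = np.array([[rbf(xs[i], xs[j]) for j in range(N)] for i in range(N)])

# Candidate 2: every column of every input is its own R^d point -> K2 is Nn x Nn.
X = xs.transpose(0, 2, 1).reshape(N * n, d)  # stack all columns as rows, (Nn, d)
K2 = np.array([[rbf(X[i], X[j]) for j in range(N * n)] for i in range(N * n)])

print(K1.shape, K2.shape)  # (4, 4) (8, 8)
```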

If my question is unclear, please let me know and I will add the necessary information. Please feel free to point out any obvious mistakes.