I am reading Chapter 4 of *Gaussian Processes for Machine Learning* (Rasmussen and Williams). It states that a matrix $K$ with entries $k_{ij} = k(x_i, x_j)$, where $k$ is a covariance function, is positive semidefinite.
Consider the exponential kernel
$$ k(x_i, x_j) = k(|x_i - x_j|) = k(d_{ij}) = \exp\left(-\frac{d_{ij}}{10}\right) $$
and the matrix
$$ D = (d_{ij}) = \left(\begin{array}{cccc} 0 & 1 & 3 & 1\\ 1 & 0 & 2 & 3\\ 3 & 2 & 0 & 1\\ 1 & 3 & 1 & 0\\ \end{array}\right). $$
I compute my potential covariance matrix $K$ by evaluating the kernel at the corresponding entries of $D$, that is,
$$K = (k_{ij}) = (k(d_{ij})).$$
The resulting matrix, however, appears to have a negative eigenvalue, as the following MATLAB snippet shows:
D = [ 0, 1, 3, 1;
1, 0, 2, 3;
3, 2, 0, 1;
1, 3, 1, 0 ];
eig(exp(-D/10))
ans =
-0.0314
0.2156
0.3078
3.5080
I assume I have misunderstood the material in that chapter, and I would be grateful if somebody could point out my mistake. Thank you!
Best wishes, Ivan
The answer I had been looking for was given in the comments by @Did:
> Is $D$ supposed to be a matrix of distances? Because here, $d_{24}=3>1+1=d_{21}+d_{14}$, hence it is not. If you are referring to (4.18) (for $\gamma=1$), indeed this yields a covariance function when $|r|$ in the formula is a distance, since $r$ stands for $x_i−x_j$, but not for every set of “distances” $|r|$. This is why your matrix $D$ fails, I believe.
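To illustrate the point numerically (a quick sketch in Python/NumPy rather than MATLAB; the points `x` below are hypothetical inputs I chose for the comparison, not from the question): the matrix $D$ violates the triangle inequality, so no set of points $x_1,\dots,x_4$ can realize it as a distance matrix, and the PSD guarantee for the kernel matrix does not apply. Distances computed from actual points, by contrast, always yield a PSD kernel matrix.

```python
import numpy as np

# The matrix of purported "distances" from the question.
D = np.array([[0, 1, 3, 1],
              [1, 0, 2, 3],
              [3, 2, 0, 1],
              [1, 3, 1, 0]], dtype=float)

# Check the triangle inequality d_ik <= d_ij + d_jk for all triples.
# Nonempty: e.g. (0-based) d_24 = 3 > d_21 + d_14 = 1 + 1 = 2.
violations = [(i, j, k)
              for i in range(4) for j in range(4) for k in range(4)
              if D[i, k] > D[i, j] + D[j, k] + 1e-12]
print(violations)

# The kernel matrix built from D has a negative eigenvalue,
# matching the MATLAB output above.
ev_bad = np.linalg.eigvalsh(np.exp(-D / 10))
print(ev_bad)

# Distances derived from actual points x_i (hypothetical 1-D inputs)
# give a genuine distance matrix, and the kernel matrix is PSD.
x = np.array([0.0, 1.0, 3.0, 1.5])
D2 = np.abs(x[:, None] - x[None, :])
ev_good = np.linalg.eigvalsh(np.exp(-D2 / 10))
print(ev_good)
```

The first eigenvalue list reproduces the negative eigenvalue from the question; the second, built from real points, is entirely nonnegative.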