How to avoid ill-conditioned covariance systems?


A well-known issue with linear systems defined by covariance functions is ill-conditioning [Mohammadi et al. 2017, Ababou et al. 1992]. The usual remedy is preconditioning (e.g. adding noise to the diagonal, or something more sophisticated). However, I would like to better understand the root cause of the ill-conditioning in such systems, using a simple toy example.

Suppose we are given a set of 10 points in $\mathbb{R}^2$, stacked as the columns of a matrix $X \in \mathbb{R}^{2\times 10}$:

$$ X = \begin{bmatrix} 93.0 & 90.0 & 89.0 & 94.0 & 93.0 & 97.0 & 95.0 & 88.0 & 96.0 & 98.0 \\ 40.0 & 33.0 & 34.0 & 36.0 & 30.0 & 39.0 & 39.0 & 28.0 & 25.0 & 35.0 \end{bmatrix} $$

and a target location $x_o = \begin{bmatrix}97.0 \\ 32.0\end{bmatrix} \in \mathbb{R}^2$. Visually, the matrix and target location encode a spatial arrangement of points in the plane:

[Figure: spatial arrangement of points]

Given a covariance function such as $\operatorname{cov}(x,y) = \exp\left(-\left(\frac{\lVert x-y\rVert}{20}\right)^2\right)$, we can evaluate it pairwise between all points in $X$ to produce a covariance matrix $C \in S_+^{10\times 10}$:

$$ C = \begin{bmatrix} 1.0 & 0.865022 & 0.878095 & 0.95839 & 0.778801 & 0.95839 & 0.987578 & 0.655406 & 0.557106 & 0.882497 \\ 0.865022 & 1.0 & 0.995012 & 0.939413 & 0.955997 & 0.80856 & 0.858559 & 0.930066 & 0.778801 & 0.843665 \\ 0.878095 & 0.995012 & 1.0 & 0.930066 & 0.923116 & 0.800515 & 0.858559 & 0.911649 & 0.722527 & 0.814647 \\ 0.95839 & 0.939413 & 0.930066 & 1.0 & 0.911649 & 0.955997 & 0.97531 & 0.778801 & 0.731616 & 0.95839 \\ 0.778801 & 0.955997 & 0.923116 & 0.911649 & 1.0 & 0.784664 & 0.80856 & 0.930066 & 0.918512 & 0.882497 \\ 0.95839 & 0.80856 & 0.800515 & 0.955997 & 0.784664 & 1.0 & 0.99005 & 0.603506 & 0.611097 & 0.95839 \\ 0.987578 & 0.858559 & 0.858559 & 0.97531 & 0.80856 & 0.99005 & 1.0 & 0.65377 & 0.611097 & 0.939413 \\ 0.655406 & 0.930066 & 0.911649 & 0.778801 & 0.930066 & 0.603506 & 0.65377 & 1.0 & 0.833185 & 0.68901 \\ 0.557106 & 0.778801 & 0.722527 & 0.731616 & 0.918512 & 0.611097 & 0.611097 & 0.833185 & 1.0 & 0.771052 \\ 0.882497 & 0.843665 & 0.814647 & 0.95839 & 0.882497 & 0.95839 & 0.939413 & 0.68901 & 0.771052 & 1.0 \end{bmatrix} $$

The condition number of this covariance matrix is $1.175329314060857 \times 10^{6}$.
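For readers who want to reproduce the setup, here is a minimal NumPy sketch that builds $C$ from $X$ with the Gaussian covariance above and evaluates its condition number. The helper name `gaussian_cov` and the `length_scale` parameter are my own naming for illustration, and `np.linalg.cond` uses the 2-norm by default, which I assume matches how the number above was obtained.

```python
import numpy as np

# The 10 points from the question, stored as columns of X (shape 2 x 10).
X = np.array([
    [93.0, 90.0, 89.0, 94.0, 93.0, 97.0, 95.0, 88.0, 96.0, 98.0],
    [40.0, 33.0, 34.0, 36.0, 30.0, 39.0, 39.0, 28.0, 25.0, 35.0],
])


def gaussian_cov(A, B, length_scale=20.0):
    """Pairwise cov(x, y) = exp(-(||x - y|| / length_scale)^2).

    A has shape (d, m) and B has shape (d, n); points are columns.
    Returns an (m, n) matrix. Hypothetical helper, not from any library.
    """
    diff = A[:, :, None] - B[:, None, :]   # shape (d, m, n)
    sq_dist = np.sum(diff ** 2, axis=0)    # squared Euclidean distances
    return np.exp(-sq_dist / length_scale ** 2)


C = gaussian_cov(X, X)

# 2-norm condition number: ratio of largest to smallest singular value.
kappa = np.linalg.cond(C)

# Eigenvalues in ascending order; the smallest ones drive the conditioning.
eigvals = np.linalg.eigvalsh(C)

print("condition number:", kappa)
print("smallest eigenvalues:", eigvals[:3])
```

Inspecting `eigvals` alongside `kappa` makes it easier to see how many directions are nearly degenerate, rather than relying on a single number.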

  1. Is there any study or strategy for improving the condition number by looking at the spatial arrangement of the points in $X$? Suppose that we are allowed to discard some of the points.
  2. How can we identify the points in $X$ that cause the system to be ill-conditioned?

If we attempt to solve the system $Cw = c$, where $c$ is the vector of covariances between each point in $X$ and $x_o$, the computed solution can be quite inaccurate, purely because of the ill-conditioning described above.
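As a point of comparison, the sketch below solves $Cw = c$ directly and again with a small value added to the diagonal, the simple remedy mentioned at the top. The helper `gaussian_cov`, the variable names, and the value `nugget = 1e-6` are illustrative assumptions on my part, not a recommended setting.

```python
import numpy as np

# Points as columns of X (2 x 10) and the target location x_o.
X = np.array([
    [93.0, 90.0, 89.0, 94.0, 93.0, 97.0, 95.0, 88.0, 96.0, 98.0],
    [40.0, 33.0, 34.0, 36.0, 30.0, 39.0, 39.0, 28.0, 25.0, 35.0],
])
x_o = np.array([97.0, 32.0])


def gaussian_cov(A, B, length_scale=20.0):
    """Pairwise exp(-(||x - y|| / length_scale)^2); points are columns."""
    diff = A[:, :, None] - B[:, None, :]
    return np.exp(-np.sum(diff ** 2, axis=0) / length_scale ** 2)


C = gaussian_cov(X, X)                      # 10 x 10 covariance matrix
c = gaussian_cov(X, x_o[:, None])[:, 0]     # covariances with x_o, shape (10,)

# Direct solve of the ill-conditioned system.
w_plain = np.linalg.solve(C, c)

# Same solve with a small diagonal term (arbitrary illustrative value).
nugget = 1e-6
C_reg = C + nugget * np.eye(C.shape[0])
w_reg = np.linalg.solve(C_reg, c)

print("cond(C)      :", np.linalg.cond(C))
print("cond(C + eI) :", np.linalg.cond(C_reg))
print("max |w_plain - w_reg|:", np.max(np.abs(w_plain - w_reg)))
```

Adding the diagonal term shifts every eigenvalue up by `nugget`, so the condition number can only decrease, but it also changes the system being solved; what I am after is a way to improve conditioning through the choice of points instead.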

I would appreciate any recommendations or publications addressing this issue.