I have a set of points $\{X_i\}_{i=1..N}$ in $\mathbb{R}^n$, and for each pair of points a value $d_{ij}$ (not a distance in the mathematical sense, but some cost function that is $0$ for $i=j$ and that increases as points are deemed farther apart).
$n$ is large (around a million), while $N$ is moderate (on the order of hundreds).
I would like to compute a low-rank quadratic form $H$ that best approximates these distances: a rank-$k$ matrix $H$, with $k \ll n$, such that $(X_i - X_j)^T H (X_i - X_j) \approx d_{ij}$. I am more interested in approximating the small distances well than the large ones.
We have experimented with Gaussian processes, but that did not work well. Is this problem known? Are there numerical solutions? It seems somewhat related to PCA and MDS, but I can't make the relation precise.
Thanks!
I don't think there is a good approximation in the general case. However, here is a practical attempt that may help.
Assuming the $\{X_i\}$ are pairwise distinct, let's consider, for $i \neq j$: $$ G(i,j) = d_{ij}\frac{(X_i-X_j)(X_i-X_j)^T}{((X_i-X_j)^T(X_i-X_j))^2}. $$
Note that each matrix $G(i,j)$ satisfies $(X_i - X_j)^T G(i,j) (X_i - X_j) = d_{ij}$ exactly.
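As a quick sanity check, here is a minimal sketch of the construction (the function name `G` and the toy data are my own; in your setting, with $n$ around a million, you would store only the vector $X_i - X_j$ rather than the $n \times n$ outer product):

```python
import numpy as np

def G(Xi, Xj, dij):
    """Rank-1 matrix d_ij * (Xi-Xj)(Xi-Xj)^T / ((Xi-Xj)^T (Xi-Xj))^2."""
    delta = Xi - Xj
    return dij * np.outer(delta, delta) / np.dot(delta, delta) ** 2

# Sanity check on toy data: (Xi-Xj)^T G(i,j) (Xi-Xj) recovers d_ij exactly.
rng = np.random.default_rng(0)
Xi, Xj = rng.normal(size=5), rng.normal(size=5)
delta = Xi - Xj
print(np.isclose(delta @ G(Xi, Xj, 2.5) @ delta, 2.5))  # True
```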
The mean of the $\{G(i,j)\}$ is the best single matrix in the least-squares (Frobenius) sense, but it is likely still a poor estimate of the distances.
Since $N$ is small compared to $n$, one may be tempted to pick the $k$ pairs $(i_p, j_p)$ with the lowest $d_{ij}$ (matching your preference for small distances) and use: $$ \hat{H} =\frac{1}{k} \sum_{p=1}^k G(i_p,j_p) $$
which has rank at most $k$ (as a sum of $k$ matrices of rank $1$).
This could give a good estimate for some datasets.
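A sketch of this estimator (function name and toy data are mine): since each $G(i,j) = w w^T$ with $w = \sqrt{d_{ij}}\,(X_i - X_j)/\|X_i - X_j\|^2$, it suffices to return the $k \times n$ factor $W$ with $\hat{H} = W^T W / k$, so the $n \times n$ matrix is never formed explicitly:

```python
import numpy as np

def low_rank_factor(X, d, k):
    """Return W of shape (k, n) such that H_hat = W.T @ W / k, where H_hat
    averages G(i_p, j_p) over the k pairs (i < j) with smallest d_ij."""
    N = X.shape[0]
    # All i < j pairs, sorted by increasing d_ij.
    pairs = sorted((d[i, j], i, j) for i in range(N) for j in range(i + 1, N))
    rows = []
    for dij, i, j in pairs[:k]:
        delta = X[i] - X[j]
        # G(i,j) = w w^T with w = sqrt(d_ij) * delta / ||delta||^2.
        rows.append(np.sqrt(dij) * delta / np.dot(delta, delta))
    return np.stack(rows)

# Toy usage: apply H_hat to a vector in O(k n), without forming H_hat.
rng = np.random.default_rng(1)
X = rng.normal(size=(6, 50))
D = np.abs(rng.normal(size=(6, 6)))
D = D + D.T
np.fill_diagonal(D, 0.0)
W = low_rank_factor(X, D, k=3)
v = rng.normal(size=50)
Hv = W.T @ (W @ v) / 3  # = H_hat @ v
```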
A better fit could be obtained with the parametric estimate $$ \hat{H}(\alpha) =\frac{1}{k} \sum_{p=1}^k \alpha_p G(i_p,j_p), $$ fitting the weights $(\alpha_p)$ by gradient descent; since the model is linear in $\alpha$, this is an ordinary least-squares problem.
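A sketch of that fit (function name, step-size rule, and toy data are my own choices): precompute the design matrix $A_{qp} = \frac{1}{k}\,\delta_q^T G(i_p,j_p)\,\delta_q$ over all pairs $q$, then run gradient descent on $\|A\alpha - b\|^2$ with the standard $1/L$ step size, which guarantees convergence for a quadratic loss:

```python
import numpy as np

def fit_alpha(X, d, pairs, steps=500):
    """Fit weights alpha_p of H(alpha) = (1/k) sum_p alpha_p G(i_p, j_p)
    by gradient descent on the squared error over all pairs i < j."""
    N, k = X.shape[0], len(pairs)
    all_pairs = [(i, j) for i in range(N) for j in range(i + 1, N)]
    b = np.array([d[i, j] for i, j in all_pairs])
    # Design matrix: A[q, p] = (1/k) * delta_q^T G(i_p, j_p) delta_q.
    A = np.zeros((len(all_pairs), k))
    for p, (ip, jp) in enumerate(pairs):
        dp = X[ip] - X[jp]
        w = np.sqrt(d[ip, jp]) * dp / np.dot(dp, dp)  # G(i_p,j_p) = w w^T
        A[:, p] = [(X[i] - X[j]) @ w for i, j in all_pairs]
    A = A ** 2 / k
    # Gradient descent with step 1/L, L = Lipschitz constant of the gradient.
    alpha = np.ones(k)
    L = 2.0 * np.linalg.norm(A, 2) ** 2
    for _ in range(steps):
        alpha -= (1.0 / L) * 2.0 * (A.T @ (A @ alpha - b))
    return alpha

# Toy usage on small random data.
rng = np.random.default_rng(2)
X = rng.normal(size=(5, 20))
D = np.abs(rng.normal(size=(5, 5)))
D = D + D.T
np.fill_diagonal(D, 0.0)
alpha = fit_alpha(X, D, pairs=[(0, 1), (0, 2), (1, 2)])
```

In practice one could also solve this directly with `np.linalg.lstsq`, but gradient descent extends naturally if you later add weights favoring the small $d_{ij}$.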