Lower Dimensional Embedding with Arbitrarily Low Error *and* Dimensions?

73 Views Asked by At

I would like to produce an isomap of all the points in a simplex mesh. After producing this isomap, I still need the approximate all-pairs geodesic distances from the original simplex mesh. To save memory, I would like the distances in the isomap to be within a bounded error of those in the original graph-distance matrix. Then I can dispense with the matrix and look up a pairwise geodesic distance directly from the isomap.

I have produced an isomap using MDS (to $\mathbb{R}^{3}$), and though the errors are not terrible, there are severe outliers which would confound my application. Other constraints require the dimensionality of the isomap to be kept relatively low, though it could go higher than three, it mustn't do by much.

Is there another technique or embedding like MDS or PCA, but that which has bounded error with an arbitrary dimension?

Or, is it possible to produce an isomap such that the original pairwise distances can be recovered with no additional data (save for the function to do so)?

(Related? MDS and low distortion embeddings)

1

There are 1 best solutions below

1
On

If I understand what you are asking, at least in a broad sense, you want to approximate Euclidean distance by distance in a tree. (If this is not what you are asking, perhaps you can further clarify your question).

This is generally speaking not possible, for multiple reasons. Here is one reason why, using a particular example.

Let me imagine that your mesh is the set of points with integer coordinates $(m,n)$, and these points are connected by the horizontal and vertical segments of length 1. Then the number of mesh points which is within Euclidean radius $r$ of $(0,0)$ is approximately $\pi r^2$ for large values of $r$. If you prefer instead to measure distance in the mesh itself using the taxicab metric, the outcome is not changed: the number of mesh points within taxicab radius $r$ of $(0,0)$ is still a quadratic polynomial in $r$.

Consider instead a tree, which I will think of as having four segments meeting at each vertex (as is true in the Euclidean mesh described above). Measuring distance within the tree itself, the number of points which is within tree radius $r$ of a single base point grows as an exponential function of the radius, approximately $C \cdot 3^r$ for some constant $C$.

This means that if you are attempting to use tree distance as an approximation to Euclidean distance, you are bound to have terrible distortion: as $r$ increases, you have exponentially many tree points representing quadratically many mesh points, forcing many many tree points (with high tree distance) to represent the same mesh point (with zero Euclidean distance).