Topological data analysis employs topology to study discrete multidimensional data sets. One often treats these data as point clouds embedded in $\mathbb{R}^n$. And in practice, it may be hard to extract useful information about our data using topological ideas, since the methods ask us to analyze structures that get quite complicated when you have MANY data points. The 'method' I specifically have in mind is persistent homology, the 'structure' being the simplicial complex.
My question is pretty simple. Suppose we expect our point cloud not just to embed into $\mathbb{R}^n$, but really in (or in some $\epsilon$-neighborhood) of some subspace, e.g. a smooth embedded submanifold, e.g. $S^{n-1}$.
In any of the following cases:
- We expect the data to exist in an arbitrary subspace
- We expect the data to exist in an arbitrary submanifold
- We expect the data to exist in a specific submanifold (e.g. $S^{n-1}$)
Are there algorithms to let us compute persistent homology more efficiently than had we not had prior knowledge of the shape of the range?
Of course, one could point out that this information means we can rule out certain proximity values when creating the complex. I'm looking for something less trivial.
I'd expect the answers in the case of (1) and (2) to be the same. And I should mention that I'm not too familiar with persistent homology or TDA in general.
There are several different ways to build a simplicial complex on top of point cloud data in $\mathbb{R}^n$, including alpha complexes, Cech complexes, and Vietoris-Rips complexes.
Alpha complexes are very fast to compute in $\mathbb{R}^n$ for $n$ not too large; certainly $n=2,3,4$ are not a problem. So if your data happens to lie in a 2-, 3-, or 4-dimensional subspace of Euclidean space, and you know this in advance, then you can just compute an alpha complex in $\mathbb{R}^n$ for $n=2,3,4$ very efficiently. But rarely would you know this in advance, perhaps.
Alpha complexes are subcomplexes of both Delaunay triangulations and Cech complexes. I know folks have studied alpha complexes built on point sampled from low-dimensional manifolds in high-dimensional space, and similarly for Delaunay triangulations and Cech complexes. Many such papers study the mathematical theory; I don't know of as many that study the computational efficiency in this context. Alpha complexes and Cech complexes are both homotopy equivalent to the union of the balls, so the homology that one obtains is the same as the homology of the union of the balls, which is nice.
My expertise is on Vietoris-Rips complexes. It is counterintuitive, but sometimes you can build a Vietoris-Rips complex on a very low-dimensional space but obtain high-dimensional homology. For example, homology in any dimension is possible from Vietoris-Rips complexes of finite point sets in the plane $\mathbb{R}^2$. As explained at the paper https://arxiv.org/pdf/1812.03374.pdf, if one knows that their points are sampled from a circle or from a curve that is sufficiently close to a circle, then one can compute persistent homology of the Vietoris-Rips complex in running time near-quadratic in the number of vertices, which is much faster than typical persistent homology algorithms that are cubic in the number of simplices. Of course, the number of simplices can be exponential in the number of vertices. But this situation of having points on or near a circle is rare. See Question 6 at that arXiv paper for a nice open question about efficiency of computing persistent homology of Vietoris-Rips complexes of finite point sets in the plane. Some is known here, but not too much. The case of $\mathbb{R}^3$ for Vietoris-Rips complexes (see Question 7) is wide open.