I'm puzzled about the practical use of the topological fingerprints produced by persistent homology. Once I've obtained persistence diagrams, how do I actually use them to train machine learning models? Do I apply some vectorization scheme to the $(\text{birth},\, \text{death})$ intervals, or something similar?
Many Thanks,
James
This is a pretty broad question, but I'll give some insight. If you simply feed these topological features into a neural net, the net could very likely have learned equivalent features on its own. That's why you are unlikely to get much better classification (if any improvement at all) than with a standard neural net.
One approach that will probably be more useful is using the topology/geometry of the data to speed up training. The idea is to add a 'topological' layer that vastly reduces the amount of data you need. You do have to worry about noise (and all the usual arbitrary thresholds). For example, if you're training a neural net on MNIST, you could first add a topology layer that distinguishes, say, the sets $\{0,4,6,9\}, \{2,3,5,7\}, \{8\}$ by their number of loops (though in practice a handwritten 6 might land with 8, or 3 with 5). That way you exploit some inherent shape of the data before dealing with all the arbitrary features the neural net comes up with.
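To make the digit-grouping idea concrete, here is a minimal sketch (not the answerer's actual method) of the kind of crude topological signal involved: counting the number of enclosed holes (Betti-1) in a binarized image by flood-filling the background. The function name `count_holes` and the tiny test grids are my own illustration; a real pipeline would use a TDA library and handle noise, connectivity conventions, and thresholding much more carefully.

```python
from collections import deque

def count_holes(grid):
    """Count enclosed background regions (1-dimensional holes) in a
    binary image. grid: list of lists, 1 = ink, 0 = background.
    Uses 4-connectivity for the background, which is a simplification."""
    h, w = len(grid), len(grid[0])
    # Pad with a background border so all exterior background
    # forms a single connected component.
    padded = [[0] * (w + 2)] + [[0] + row + [0] for row in grid] + [[0] * (w + 2)]
    H, W = h + 2, w + 2
    seen = [[False] * W for _ in range(H)]
    components = 0
    for sy in range(H):
        for sx in range(W):
            if padded[sy][sx] == 0 and not seen[sy][sx]:
                components += 1
                queue = deque([(sy, sx)])
                seen[sy][sx] = True
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < H and 0 <= nx < W \
                                and padded[ny][nx] == 0 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    # Every background component except the exterior one is a hole.
    return components - 1
```

On idealized digits this already separates the three groups: an '8'-like shape has 2 holes, a '0'/'4'/'6'/'9'-like shape has 1, and the rest have 0.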
A very basic approach to get started would be exactly what you said - or you could count the number of 'significant' generators and add that as a feature. You can be creative in how you use the information, as there isn't really a 'wrong' thing you can do. For example, you could build a vector recording how many significant generators persist for longer than time "x", longer than time "y", and so on.
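That last suggestion can be sketched in a few lines. Here a persistence diagram is assumed to be a list of $(\text{birth}, \text{death})$ pairs; the function name `lifetime_features` and the example thresholds are my own, purely for illustration:

```python
def lifetime_features(diagram, thresholds):
    """Vectorize a persistence diagram by counting, for each threshold,
    the generators whose lifetime (death - birth) exceeds it.

    diagram: list of (birth, death) pairs
    thresholds: list of lifetime cutoffs, e.g. the "x", "y", ... above
    Returns a list of counts, one per threshold."""
    lifetimes = [death - birth for birth, death in diagram]
    return [sum(1 for t in lifetimes if t > cutoff) for cutoff in thresholds]

# Example: three generators with lifetimes 0.1, 0.9, and 0.4.
diagram = [(0.0, 0.1), (0.0, 0.9), (0.2, 0.6)]
features = lifetime_features(diagram, [0.05, 0.3, 0.5])
```

The resulting vector (here `[3, 2, 1]`) can be concatenated onto whatever other features you feed the model. Established vectorizations like persistence landscapes or persistence images are more principled versions of the same idea.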