I need some help potentially applying machine learning or curve fitting techniques to a numerical linear algebra problem I am working on.
I have a parameterized symmetric matrix $M(s)$ whose eigenspectrum I want to study numerically as a function of $s$. The matrix is specifically the Hamiltonian of some physical system - the details aren't too important. Numerically, I'll pick some finite step size for $s$ (e.g. in NumPy via np.linspace(0, 1, 101)) and then diagonalize $M(s)$ at each value using np.linalg.eigh(). I thus get an ordered list of real eigenvalues $\lambda_k(s)$ indexed by $k$ and their corresponding eigenvectors $V_k(s)$.
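For concreteness, the sweep plus overlap computation can be sketched as follows. The `build_M` callable and the toy 2×2 matrix are hypothetical placeholders standing in for the actual Hamiltonian:

```python
import numpy as np

def sweep_spectrum(build_M, s_values):
    """Diagonalize the symmetric matrix M(s) at each parameter value.

    build_M is assumed to be a user-supplied function returning the matrix
    for a given s (a placeholder here). Returns eigenvalues of shape
    (len(s_values), dim) and eigenvectors of shape (len(s_values), dim, dim),
    with eigenvector k in column [:, k], eigenvalues in ascending order.
    """
    eigvals, eigvecs = [], []
    for s in s_values:
        w, v = np.linalg.eigh(build_M(s))  # real symmetric -> eigh
        eigvals.append(w)
        eigvecs.append(v)
    return np.array(eigvals), np.array(eigvecs)

# Toy 2x2 symmetric matrix standing in for the Hamiltonian:
s_grid = np.linspace(0, 1, 101)
toy_M = lambda s: np.array([[1 - s, 0.05], [0.05, s]])
lam, V = sweep_spectrum(toy_M, s_grid)

# Overlaps O_k(s) = |V_star^T V_k(s)| with the target V_star = V_0(0):
V_star = V[0][:, 0]
O = np.abs(np.einsum('i,sik->sk', V_star, V))
```

By construction `O[0, 0] = 1`, since the target is the lowest eigenvector at $s=0$.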
Now, I have a given vector $V_\star = V_0(0)$ and I want to study overlaps of the parameterized eigenvectors with this target vector. I do this by plotting the overlaps $O_k(s) = |V_\star^T V_k(s)|$ of each vector with the target $V_\star$, i.e. for all $k$. I generically get something that looks like this:
The colors on this plot specifically denote the index $k$, which as a reminder just indexes the eigenvalues in the order they're returned from NumPy. As eigenvalues vary with $s$, they will cross each other and so the color changes.
Now, here's the challenge: I essentially want to track the parameterized eigenvectors $V_k(s)$ that are "continuously connected" to $V_\star$. As a human, I can stare at the left plot and identify the "continuous" curve that starts at 1 and goes down to about 0.2. As it does this, the color changes from purple to brown to purple, etc., but I "know" that that's the one I want. I've highlighted this curve as a black solid line on the right plot in case it's not clear.
Essentially, at $s=0$, we have just a single eigenvector $V_0(0)$ with nonzero overlap to $V_\star$, but as $s$ increases, we start to have many vectors all with significant overlap. What I want is the indices of all of the eigenvectors along the black solid line path.
Obviously, I can manually go in and select the data points by hand. But I'm looking for an automated or algorithmic way to approach this problem. Some ideas I have include some type of iterative polynomial/leastsq curve fitting, or maybe there is a machine learning / computer vision approach to essentially train my computer to identify the desired curve from the unconnected data points. Any suggestions on ways to solve this would be greatly appreciated!
Edit: As a comment points out, if there was a way to automatically detect "crossings" in the eigenvalues, this would be fairly solvable. However, I don't have a great way to do this. As I show below in a subset of the spectrum, we actually have anti-crossings in the eigenvalues, not crossings. Presumably, if I could detect every such anti-crossing and then swap colors (e.g. purple $\leftrightarrow$ green below), then that would also help towards solving my problem. But, I don't see a great way to do that...
*(Figure: $O_k(s)$ vs. $s$)*
One can use the following definition of an avoided crossing (anticrossing) between neighboring, ordered, eigenvalues at $s=s_c$:
$$\lambda'_{k+1}(s_c)-\lambda'_{k}(s_c)=0, ~~~\lambda''_{k+1}(s_c)-\lambda''_{k}(s_c)>0 $$
which is basically the definition of a local minimum of the vertical distance between two neighboring eigenvalues. This definition does not cover true crossings, where the minimum distance is zero, but numerically it's hard to distinguish between the two anyway. If one is not limited by computational resources and can go to very fine resolutions, it can make sense to numerically approximate the first derivative and test it for a change of sign; this is guaranteed to detect any crossing wider than the resolution, and can also detect narrower crossings whose asymptotic behavior falls within the resolution limit.
An algorithm that can detect all minima of a function can therefore detect every crossing between the eigenvalues of a matrix, by being applied to every consecutive pair of eigenvalues. To my current knowledge, no cheap algorithm for this problem exists, and since diagonalizations are usually expensive resource-wise, I will suggest a quick-and-dirty solution that catches most crossings, which I have used to detect crossings in my own work.
Suppose your computational resolution in the interval $(0,s_0)$ is $\Delta s=\frac{s_0}{N}$. Then it is easy to put a lower bound on the number of crossings of width $W\ll \Delta s$: simply compute the quantity
$$\Delta^2D_n=D_n((k+2)\Delta s)-2D_{n}((k+1)\Delta s)+D_n(k\Delta s)\\ ~\\D_n(s)=\lambda_{n+1}(s)-\lambda_n(s)$$
This quantity should detect sharp changes in the direction of the eigenvalue curves as long as
$$\max_{(0,s_0)}|D_n''(s)|(\Delta s)^2\ll 1$$
If this condition is fulfilled, you can declare that a crossing happens whenever you detect a large second difference, greater than some cutoff:
$$\text{crossing at } (k+1)\Delta s\iff\Delta^2 D_n>\alpha(\Delta s)^2$$
where $\alpha$ is a number that depends on what your typical data looks like. This should detect all crossings with width smaller than a few times $\Delta s$, because then your resolution is bad enough that you can't see a smooth representation of the crossing, just its asymptotic behavior.
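A minimal NumPy sketch of this second-difference detector, assuming the eigenvalues have already been computed on the grid. The array layout (`lam[j, n]` = $\lambda_n(j\,\Delta s)$, sorted at each step) and the cutoff value `alpha` are illustrative choices to be tuned to your data:

```python
import numpy as np

def detect_crossings(lam, ds, alpha):
    """Flag candidate (anti)crossings between consecutive ordered eigenvalues.

    lam   : array of shape (num_steps, dim), lam[j, n] = lambda_n(j * ds),
            sorted in ascending order at each step.
    ds    : grid spacing Delta s.
    alpha : cutoff; a crossing is declared where the second difference of
            the gap D_n = lambda_{n+1} - lambda_n exceeds alpha * ds**2.
    Returns (step_index, n) pairs marking a crossing of pair n at
    s = step_index * ds.
    """
    D = lam[:, 1:] - lam[:, :-1]            # gaps D_n(s), shape (steps, dim-1)
    d2 = D[2:] - 2 * D[1:-1] + D[:-2]       # second differences Delta^2 D_n
    hits = np.argwhere(d2 > alpha * ds**2)  # large positive curvature of the gap
    return [(k + 1, n) for k, n in hits]    # crossing flagged at (k+1) * ds

# Toy narrow anticrossing: two levels +/- sqrt((s - 0.5)^2 + g^2)
s = np.linspace(0, 1, 201)
g = 1e-3                                    # gap parameter << Delta s
branch = np.sqrt((s - 0.5)**2 + g**2)
lam = np.stack([-branch, branch], axis=1)
found = detect_crossings(lam, ds=s[1] - s[0], alpha=10.0)
```

On this toy data the detector flags the grid points around $s=0.5$, where the gap dips sharply.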
My suggestion is to run the algorithm for various resolutions, refining your grid at each step, and keeping track of the second differences to detect crossings that are getting resolved as you increase the resolution, until the number of "detected" crossings starts converging. In this way you will detect crossings across a large hierarchy of widths.
Additionally, if you want to be able to tell apart crossings from anticrossings, you can use the following: locally around $s=s_c$, all four parameters describing an isolated crossing can be extracted by fitting the following two functions to the data around the crossing point:
$$ (\lambda_{n+1}(s)-\lambda_n(s))^2=A^2+B^2(s-s_c)^2\\~\\ \lambda_{n+1}(s)+\lambda_n(s)=C+D(s-s_c)$$
A crossing will have $A=0$, while $A\neq 0$ describes an anticrossing.
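This fit can be sketched with `np.polyfit` on a window of grid points around a detected crossing. The window size and the toy two-level data are assumptions for illustration:

```python
import numpy as np

def fit_crossing(s, lam_lo, lam_hi, s_c, half_window=10):
    """Fit the local two-level model around a crossing detected at s_c:

        (lam_hi - lam_lo)^2 = A^2 + B^2 (s - s_c)^2
        (lam_hi + lam_lo)   = C + D (s - s_c)

    A ~ 0 indicates a true crossing, A != 0 an anticrossing. The window
    half-width (in grid points) is an assumption to tune to your data.
    """
    i = np.argmin(np.abs(s - s_c))
    sl = slice(max(i - half_window, 0), i + half_window + 1)
    x = s[sl] - s_c
    gap2 = (lam_hi[sl] - lam_lo[sl])**2
    total = lam_hi[sl] + lam_lo[sl]
    b2, _, a2 = np.polyfit(x, gap2, 2)   # gap^2 ~ B^2 x^2 + (linear) + A^2
    d, c = np.polyfit(x, total, 1)       # sum   ~ D x + C
    A = np.sqrt(max(a2, 0.0))            # clip tiny negative fit residue
    B = np.sqrt(max(b2, 0.0))
    return A, B, c, d

# Toy anticrossing with A = 0.02, B = 2.0, C = 1.0, D = 0.5:
s = np.linspace(0, 1, 201)
x = s - 0.5
gap = np.sqrt(0.02**2 + 4.0 * x**2)
lam_lo = (1.0 + 0.5 * x - gap) / 2
lam_hi = (1.0 + 0.5 * x + gap) / 2
A, B, C, D = fit_crossing(s, lam_lo, lam_hi, s_c=0.5)
```

Since the toy data follows the model exactly, the fit recovers the four parameters; the nonzero $A$ identifies it as an anticrossing.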
Some downsides of this algorithm: if more than two eigenvalues are involved in a crossing, you may miss it without sufficient resolution, but 3-state crossings are generally extremely rare. Relatedly, extremely small crossings will not be detected if the resolution is so coarse that it completely misses the asymptotic behavior of the crossing. This effect is difficult to quantify, but it depends on the size of the matrix element $\langle\lambda_n(s)|H|\lambda_{n+1}(s)\rangle$ relative to the diagonal and the other off-diagonal matrix elements. Finally, even though extremely wide crossings could in principle be detected by performing simulations at a very coarse resolution, it may be difficult to distinguish a very wide crossing from background noise at low resolutions, since then the crossing parameter $B\simeq \Delta^2 D_n/2$ is usually extremely small, and we can only detect crossings that obey
$$2B>\alpha (\Delta s)^2$$
which means that the width of crossings $W\propto A/B$ that we can detect is limited to the range
$$W\lesssim \min(\frac{2A}{\alpha (\Delta s)^2}, \Delta s)$$
This formula shows that when the resolution is coarse the maximum detectable width is limited by noise, and when the grid spacing is small it is limited by the fact that wide crossings look smooth. Of course, when the grid spacing is small, wide crossings can be detected by other means; for example, one can numerically approximate the derivative $D'_n(s)$ and detect a change of sign, which is a test based on the definition itself. With limited computational resources, one may have to resort to a clever combination of both algorithms to efficiently and automatically detect crossings across all scales.
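The sign-change test for wide crossings can be sketched as follows, assuming the eigenvalues are stored as an array `lam` of shape `(num_steps, dim)`, sorted at each step; the toy data with a wide anticrossing is illustrative:

```python
import numpy as np

def sign_change_crossings(lam, ds):
    """Detect wide (anti)crossings via a sign change of the numerical
    derivative D_n'(s) of the gap D_n = lambda_{n+1} - lambda_n.

    Returns (step_index, n) pairs where D_n' changes from negative to
    positive, i.e. where the gap passes through a local minimum.
    """
    D = lam[:, 1:] - lam[:, :-1]               # gaps, shape (steps, dim-1)
    dD = np.diff(D, axis=0) / ds               # forward-difference derivative
    neg_to_pos = (dD[:-1] < 0) & (dD[1:] > 0)  # local minimum of the gap
    return [(k + 1, n) for k, n in np.argwhere(neg_to_pos)]

# Wide anticrossing whose width is comparable to the whole interval,
# so its second differences are tiny at this resolution:
s = np.linspace(0, 1, 101)
gap = np.sqrt(0.3**2 + (s - 0.5)**2)
lam = np.stack([-gap / 2, gap / 2], axis=1)
found = sign_change_crossings(lam, ds=s[1] - s[0])
```

On this toy data the single gap minimum at $s=0.5$ is flagged, even though the gap varies too smoothly for a second-difference cutoff to fire.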