I have a matrix of samples vs. features and would like to permute the features so that the sum of the first derivative along the feature axis is minimized across all samples in the matrix.
The goal is an algorithm that, given sufficient samples, recovers a continuous x-axis ordering after the x-axis has been permuted, by minimizing the derivative across all samples.
I suppose this could be stated more formally as:
Given an $n \times p$ matrix of $n$ samples and $p$ features,
$$ \bf{X} \in \mathbb{R}^{n \times p} $$
where each sample has previously been described as a continuously differentiable function:
$$S = f(p)$$
but is then permuted along the $p$ axis to give $\bf{X'}$, so that the samples are now discontinuous along $p$.
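Find the correct permutation matrix, $\bf{\sigma} \in \{0,1\}^{p \times p}$, such that
$$ \bf{\sigma} = \arg\min_{\bf{\sigma}} \sum_{n}\sum_{p}\left|\frac{dS}{dp}\right| $$
$$ \bf{X'}\bf{\sigma} = \bf{X} $$
where the derivative is taken along the feature axis of the reordered matrix $\bf{X'}\bf{\sigma}$. The absolute value keeps positive and negative slopes from cancelling, and $\bf{\sigma}$ multiplies from the right because it reorders the columns (features) of $\bf{X'}$.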
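The objective can be checked numerically for any candidate ordering. The following is a minimal sketch, assuming the derivative is approximated by finite differences between adjacent features and that absolute values are summed (a signed sum would telescope to the last feature minus the first). The name `roughness` and the toy matrix are hypothetical and only serve as illustration.

```python
import numpy as np

def roughness(X, order=None):
    """Sum of absolute differences between adjacent features, over all samples.

    X     : (n_samples, n_features) array
    order : optional sequence of column indices to apply before scoring
    """
    X = np.asarray(X, dtype=float)
    if order is not None:
        X = X[:, order]  # reorder the features (columns)
    # np.diff along axis=1 is the finite-difference stand-in for dS/dp
    return float(np.abs(np.diff(X, axis=1)).sum())

# Toy example: two smooth (here linear) samples over four features.
X = np.array([[0.0, 1.0, 2.0, 3.0],
              [0.0, 2.0, 4.0, 6.0]])

smooth_score = roughness(X)                  # original ordering
shuffled_score = roughness(X, [2, 0, 3, 1])  # a permuted ordering
```

A permuted ordering scores higher than the original one, which is what the minimization is meant to exploit.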
For my problem, I assume that the samples were originally generated as a sum of Gaussian (normal) or similar distributions.
Here is an example of $\bf{X}$, wherein the samples were a linear combination of Gaussian distributions:
$$ \bf{X} = $$
Now, if the features are permuted, we obtain the following matrix:
$$ \bf{X'} = $$

So, my main question is: given this $\bf{X'}$, how can I find the correct ordering of $p$ to recover the original matrix $\bf{X}$?
Attempts:
I have tried using argsort or distance matrices as inputs to Sinkhorn normalization, to predict which row each feature $p$ belongs in. After Sinkhorn normalization [1], I expected the output to resemble the permutation matrix $\sigma$ I am seeking. However, the results have not been very fruitful.
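For reference, plain Sinkhorn normalization alternately normalizes the rows and columns of a positive matrix until it is approximately doubly stochastic. The sketch below is not necessarily the setup tried above: the `scores` input, the temperature `tau`, and the function name `sinkhorn` are assumptions, and the Gumbel-Sinkhorn variant additionally adds Gumbel noise to the scores before normalizing.

```python
import numpy as np

def _logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis, keeping dimensions."""
    m = a.max(axis=axis, keepdims=True)
    return m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))

def sinkhorn(scores, n_iters=50, tau=1.0):
    """Alternately normalize rows and columns of exp(scores / tau) in log space."""
    log_P = np.asarray(scores, dtype=float) / tau
    for _ in range(n_iters):
        log_P = log_P - _logsumexp(log_P, axis=1)  # rows sum to 1
        log_P = log_P - _logsumexp(log_P, axis=0)  # columns sum to 1
    return np.exp(log_P)

# Hypothetical square score matrix: entry (i, j) scores feature i for position j.
rng = np.random.default_rng(0)
scores = rng.normal(size=(5, 5))
P = sinkhorn(scores, n_iters=200)

# Naive rounding; this is not guaranteed to yield a valid permutation.
hard_assignment = P.argmax(axis=1)
```

The result is a soft assignment rather than a hard permutation, so a rounding step is still needed, and the quality of the answer depends entirely on how the score matrix is built.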
While this does not depend on Gumbel-Sinkhorn, I found the answer in a very familiar place: this is a traveling salesman problem. Treat each feature as a point in $n$-dimensional space and solve the traveling salesman problem over those points; the "list of cities" returned is simply the list of indices that sorts the matrix back into its original, unpermuted form.
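As a rough illustration of this idea, here is a minimal sketch that treats each column as a city and improves an open path with the 2-opt heuristic. It makes several assumptions not stated above: Euclidean distance between features, an open path rather than a closed tour, and 2-opt instead of an exact solver. The synthetic Gaussian data and the names `order_features_2opt`, `X_perm`, and `X_rec` are hypothetical. A heuristic like this only guarantees a path no longer than the starting one, and any recovered ordering is at best correct up to reversal, since a path and its mirror image have the same length.

```python
import numpy as np

def order_features_2opt(X_perm, max_sweeps=50):
    """Heuristically order the columns of X_perm as an open TSP path (2-opt)."""
    p = X_perm.shape[1]
    cols = X_perm.T  # each feature becomes a point in n-dimensional space
    # Pairwise Euclidean distances between features (the "cities").
    D = np.linalg.norm(cols[:, None, :] - cols[None, :, :], axis=-1)
    order = list(range(p))
    for _ in range(max_sweeps):
        improved = False
        for i in range(p - 1):
            for j in range(i + 1, p):
                # Reversing order[i..j] only changes the two boundary edges.
                before = after = 0.0
                if i > 0:
                    before += D[order[i - 1], order[i]]
                    after += D[order[i - 1], order[j]]
                if j < p - 1:
                    before += D[order[j], order[j + 1]]
                    after += D[order[i], order[j + 1]]
                if after < before - 1e-12:  # accept strict improvements only
                    order[i:j + 1] = order[i:j + 1][::-1]
                    improved = True
        if not improved:
            break
    return np.array(order)

# Synthetic stand-in data: 8 samples, each a sum of 3 Gaussians over 30 features.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
centers = rng.uniform(0.1, 0.9, size=(8, 3))
X = np.exp(-((x[None, None, :] - centers[:, :, None]) ** 2) / (2 * 0.1 ** 2)).sum(axis=1)

perm = rng.permutation(X.shape[1])
X_perm = X[:, perm]                   # features shuffled
order = order_features_2opt(X_perm)   # the "list of cities"
X_rec = X_perm[:, order]              # columns reordered along the path
```

For larger problems a dedicated TSP solver would likely be a better choice than this quadratic-per-sweep loop.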
TLDR: It's the traveling salesman problem