In optimal transport, the Wasserstein distance is often regularized with an entropy term because the regularized version is differentiable, unlike its unregularized counterpart. Being differentiable, it can be used as a loss function that is compatible with common gradient-based optimization algorithms.
The entropy-regularized Wasserstein distance (also known as the Sinkhorn distance) is: $$ \inf_{\gamma \in \Pi} \sum_{x,y} \|x - y\| \, \gamma(x,y) - \epsilon H(\gamma) $$ where $H(\gamma) = -\sum_{x,y} \gamma(x,y) \log \gamma(x,y)$ is the Shannon entropy of the transport plan $\gamma$, and $\epsilon$ is the regularization parameter.
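To make the objective above concrete, here is a minimal sketch of how I understand it can be evaluated for two discrete histograms using plain Sinkhorn matrix scaling. The names `sinkhorn_objective`, `a`, `b`, and `C`, and the toy data, are my own illustrative assumptions and are not taken from the paper linked below.

```python
import numpy as np

def sinkhorn_objective(a, b, C, eps=0.5, n_iters=500):
    """Approximate min over plans of <C, gamma> - eps * H(gamma).

    a, b : histograms (positive entries, each summing to 1)
    C    : cost matrix with C[i, j] = |x_i - y_j|
    """
    K = np.exp(-C / eps)              # Gibbs kernel, strictly positive
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):          # alternate scaling to match both marginals
        u = a / (K @ v)
        v = b / (K.T @ u)
    gamma = u[:, None] * K * v[None, :]          # transport plan diag(u) K diag(v)
    entropy = -np.sum(gamma * np.log(gamma))     # Shannon entropy H(gamma)
    value = np.sum(C * gamma) - eps * entropy
    return value, gamma

# Toy example (assumed data): two histograms supported on points of the real line
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.5, 1.5])
a = np.array([0.2, 0.5, 0.3])
b = np.array([0.6, 0.4])
C = np.abs(x[:, None] - y[None, :])

value, gamma = sinkhorn_objective(a, b, C)
print(value)
print(gamma.sum(axis=1), gamma.sum(axis=0))  # should be close to a and b
```

The fixed number of iterations is a simplification; in practice one would presumably stop once the marginals of `gamma` are close enough to `a` and `b`.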
What, then, is the derivative of the above formula? Could someone show how to derive it?
This paper discusses the derivative of Sinkhorn distances: https://papers.nips.cc/paper/2018/file/3fc2c60b5782f641f76bcefc39fb2392-Paper.pdf