I have some data, and therefore its empirical distribution. I want to fit a theoretical distribution to it. For example, my data look approximately Pareto distributed, so I fit a Pareto distribution. The main task is then to estimate the Pareto parameters, so I tried maximum likelihood estimation (MLE). I also tried minimizing the Kullback-Leibler divergence (KLD) from the empirical distribution to the model, which can be shown to be equivalent to MLE. My question is: if I minimize some other statistical distance instead, does the resulting estimator have any nice properties? I found a list of such distances here: https://en.wikipedia.org/wiki/Statistical_distance.
For instance, is there a distance whose minimizer is asymptotically unbiased, i.e. an estimator whose expectation converges to the true parameter as the sample size grows?
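To make the question concrete, here is a minimal sketch of the two approaches side by side. It rests on several assumptions that are mine and only for illustration: the Pareto scale `x_m` is treated as known, the data are simulated with a true shape of 3, the alternative distance is the Cramér-von Mises statistic, and the minimization uses a hand-written golden-section search over a fixed range. The function names are hypothetical.

```python
import math
import random


def pareto_cdf(x, alpha, x_m):
    """CDF of a Pareto(alpha, x_m) distribution, valid for x >= x_m."""
    return 1.0 - (x_m / x) ** alpha


def fit_mle(sample, x_m):
    """Closed-form MLE of the shape alpha when the scale x_m is known."""
    return len(sample) / sum(math.log(x / x_m) for x in sample)


def cvm_distance(alpha, sorted_sample, x_m):
    """Cramér-von Mises statistic between the empirical and Pareto CDFs."""
    n = len(sorted_sample)
    total = 1.0 / (12 * n)
    for i, x in enumerate(sorted_sample, start=1):
        total += (pareto_cdf(x, alpha, x_m) - (2 * i - 1) / (2 * n)) ** 2
    return total


def fit_min_cvm(sample, x_m, lo=0.01, hi=20.0, tol=1e-6):
    """Minimum-distance estimate of alpha via golden-section search.

    Assumes the distance is unimodal in alpha on [lo, hi].
    """
    xs = sorted(sample)
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = cvm_distance(c, xs, x_m), cvm_distance(d, xs, x_m)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; reuse c as the new right probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = cvm_distance(c, xs, x_m)
        else:
            # Minimum lies in [c, b]; reuse d as the new left probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = cvm_distance(d, xs, x_m)
    return (a + b) / 2


# Simulated data (assumption: true shape 3, known scale 1).
rng = random.Random(0)
true_alpha, x_m = 3.0, 1.0
sample = [x_m * rng.paretovariate(true_alpha) for _ in range(2000)]

alpha_mle = fit_mle(sample, x_m)
alpha_cvm = fit_min_cvm(sample, x_m)
print(f"MLE estimate:              {alpha_mle:.3f}")
print(f"Minimum-CvM estimate:      {alpha_cvm:.3f}")
```

Both estimates should land near the true shape on a sample of this size. What I would like to understand is which guarantees (consistency, asymptotic unbiasedness, efficiency) carry over from MLE to estimators like the second one.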