KL Divergence and Maximum Likelihood Estimate


We can show that minimizing $KL(p_{\theta^*}||p_\theta)$ w.r.t. $\theta$ is equivalent to maximum likelihood estimation of $\theta$, where $\theta^*$ is the true parameter. I am wondering what minimizing the reversed divergence $KL(p_{\theta}||p_{\theta^*})$ w.r.t. $\theta$ results in. Is there a known estimator that does this? Examples showcasing the difference between the two directions are most welcome.
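To make the asymmetry concrete, here is a minimal numerical sketch (a hypothetical example, not taken from any particular source): fit a single Gaussian $q = N(\mu, \sigma^2)$ to a bimodal mixture $p$ by grid search, once minimizing the forward divergence $KL(p||q)$ (the MLE direction) and once the reverse divergence $KL(q||p)$. The forward direction is "mode-covering" (it places $q$ between the modes with a large variance), while the reverse direction is "mode-seeking" (it locks onto a single mode).

```python
import numpy as np

# Target p: an equal mixture of N(-2, 0.5^2) and N(2, 0.5^2) (illustrative choice)
x = np.linspace(-8, 8, 2001)
dx = x[1] - x[0]

def gauss(x, mu, s):
    return np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))

p = 0.5 * gauss(x, -2, 0.5) + 0.5 * gauss(x, 2, 0.5)

def kl(a, b):
    # KL(a||b) approximated on the grid; skip points where a is ~0,
    # and floor b to avoid log-of-zero when densities underflow.
    mask = a > 1e-12
    return np.sum(a[mask] * np.log(a[mask] / np.maximum(b[mask], 1e-300))) * dx

# Grid search over a single Gaussian q = N(mu, sigma^2)
mus = np.linspace(-4, 4, 81)
sigmas = np.linspace(0.2, 4, 77)
best_fwd = best_rev = None
for mu in mus:
    for s in sigmas:
        q = gauss(x, mu, s)
        f = kl(p, q)   # forward KL(p||q): the MLE direction
        r = kl(q, p)   # reverse KL(q||p)
        if best_fwd is None or f < best_fwd[0]:
            best_fwd = (f, mu, s)
        if best_rev is None or r < best_rev[0]:
            best_rev = (r, mu, s)

print("forward KL argmin (mu, sigma):", best_fwd[1:])  # mode-covering: mu near 0, large sigma
print("reverse KL argmin (mu, sigma):", best_rev[1:])  # mode-seeking: mu near one mode, small sigma
```

For the forward direction the optimum roughly moment-matches $p$ (here $\mu \approx 0$, $\sigma \approx \sqrt{4.25} \approx 2.06$), whereas the reverse direction concentrates on one component ($\mu \approx \pm 2$, $\sigma \approx 0.5$), illustrating why the two objectives generally give different estimators.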