I am interested in the conditions under which frequentist consistency of a maximum-likelihood estimator is enough to give consistency of a maximum-a-posteriori point estimate, with the further restriction to i.i.d. models.
That is, consider $y_1,\ldots,y_n \overset{\text{i.i.d.}}{\sim} p(y\mid \theta_0)$, and suppose that a sequence of maximum likelihood estimators $\hat{\theta}_n := \underset{\theta}{\operatorname{argmax}}\ M_n(\theta)$, where $M_n(\theta) := \frac{1}{n} \sum_i \ln p(y_i\mid\theta)$, exists and is consistent, that is, $\hat{\theta}_n \overset{\mathbb{P}_{\theta_0}}{\rightarrow} \theta_0$.
Now consider some prior $p(\theta)$ and define the posterior mode estimate as $\hat{\theta}^\text{MAP}_n := \underset{\theta}{\arg \max}\ \left( M_n(\theta) + \frac{1}{n} \ln p(\theta) \right)$. The question is: which conditions are sufficient so that, together with the consistency of $\hat{\theta}_n$, they imply the consistency of the MAP estimate, $\hat{\theta}^\text{MAP}_n \overset{\mathbb{P}_{\theta_0}}{\rightarrow} \theta_0$?
Naturally, one should assume that the prior puts positive mass around $\theta_0$, i.e. $p(\theta) > 0$ in some neighborhood of $\theta_0$.
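As a concrete (purely illustrative) sanity check on the two estimators, consider a Beta-Bernoulli model, where both have closed forms and the prior's influence on the mode decays at rate $1/n$; the particular prior and true parameter below are arbitrary choices of mine, not part of the question:

```python
import numpy as np

rng = np.random.default_rng(0)
theta0 = 0.3        # true parameter (arbitrary choice for the illustration)
a, b = 2.0, 2.0     # Beta(2, 2) prior: continuous and positive around theta0

for n in (10, 100, 10_000):
    y = rng.binomial(1, theta0, size=n)
    theta_ml = y.mean()                               # argmax_theta M_n(theta)
    theta_map = (y.sum() + a - 1) / (n + a + b - 2)   # mode of the Beta posterior
    # here |theta_ml - theta_map| <= 1/(n + 2), so the MAP estimate
    # inherits consistency from the ML estimate in this particular model
    print(n, theta_ml, theta_map, abs(theta_ml - theta_map))
```

Of course, this only shows that the question has an affirmative answer in one nice conjugate model; the point of the question is what happens in general.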
Besides that, I found that the following conditions from [1] are sufficient:
Define $M(\theta) = \mathbb{E}_{\theta_0}[\ln p(y\mid \theta)]$. If we have that
$$ \sup_\theta \left| M_n(\theta) - M(\theta)\right| \overset{\mathbb{P}_{\theta_0}}{\rightarrow} 0, $$
$$ \sup_{\theta \,:\, d(\theta,\theta_0) \,\geq\, \varepsilon} M(\theta) < M(\theta_0), \quad \forall \varepsilon > 0,$$
then $\hat{\theta}^\text{MAP}_n \overset{\mathbb{P}_{\theta_0}}{\rightarrow} \theta_0$. The proof is essentially the same as in [1]. This, however, is unsatisfactory for at least two reasons. First, one has to make strong assumptions about the likelihood function, which may be too restrictive. Second, by imposing these restrictions, we are implicitly proving the consistency of the ML estimator again. It would be nicer to have fewer assumptions on the likelihood, or even to have conditions that work directly with the fact that $\hat{\theta}_n$ is consistent. Regarding the former, I tried to modify other consistency proofs for ML estimates given in [1], but it seems to me that these also require additional assumptions.
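For what it's worth, the reason the $\frac{1}{n}\ln p(\theta)$ penalty is harmless under these conditions can be made explicit. Assuming additionally that $\ln p$ is bounded, say $\sup_\theta |\ln p(\theta)| \leq C < \infty$, the penalized criterion inherits uniform convergence:

$$ \sup_\theta \left| M_n(\theta) + \frac{1}{n}\ln p(\theta) - M(\theta) \right| \;\leq\; \sup_\theta \left| M_n(\theta) - M(\theta)\right| + \frac{C}{n} \;\overset{\mathbb{P}_{\theta_0}}{\rightarrow}\; 0, $$

so the argmax argument from [1] applies verbatim with $\hat{\theta}^\text{MAP}_n$ in place of $\hat{\theta}_n$.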
So my question is: what are sufficient conditions for the consistency of the MAP estimate, given consistency of the ML estimate?
Edit:
I will add some thoughts so that one can see the place where I am stuck and maybe someone else knows some solution.
Assume the prior is as above and is bounded from above. Consider some small $\varepsilon > 0$. Then, since $\hat{\theta}_n^{MAP}$ maximizes the penalized criterion, $$ \hat{\theta}_n^{MAP} \notin B_{\varepsilon}(\theta_0) \implies M_n(\hat{\theta}_n^{MAP}) + \frac{\ln p(\hat{\theta}_n^{MAP})}{n} \geq M_n(\hat{\theta}_n) + \frac{\ln p(\hat{\theta}_n)}{n} $$
By the consistency of $\hat{\theta}_n$ and the continuity and positivity of the prior at $\theta_0$, for large enough $n$ we can replace the last term on the right by $\frac{\ln p(\theta_0)}{n} - \delta$ for some arbitrarily small $\delta > 0$, and we can certainly replace the term involving the prior on the left by its supremum.
We then have $$M_n(\hat{\theta}_n^{MAP}) + \frac{\underset{\theta}{\sup} \ln p(\theta) - \ln p(\theta_0)}{n} + \delta \geq M_n(\hat{\theta}_n)$$
At this point it seems to me that one needs some additional assumptions. If the convergence to the limit function $M(\theta)$ is uniform and the limit function is well separated at $\theta_0$, then of course the above inequality holds only with arbitrarily small probability. This is why the conditions given above work.
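To spell that last step out: under uniform convergence and well-separation, on the event $\hat{\theta}_n^{MAP} \notin B_{\varepsilon}(\theta_0)$ we get

$$ M_n(\hat{\theta}_n^{MAP}) \;\leq\; M(\hat{\theta}_n^{MAP}) + o_{\mathbb{P}}(1) \;\leq\; \sup_{\theta \,:\, d(\theta,\theta_0) \,\geq\, \varepsilon} M(\theta) + o_{\mathbb{P}}(1) \;<\; M(\theta_0) + o_{\mathbb{P}}(1), $$

while $M_n(\hat{\theta}_n) \geq M_n(\theta_0) \overset{\mathbb{P}_{\theta_0}}{\rightarrow} M(\theta_0)$ by the law of large numbers. Since the prior term and $\delta$ in the displayed inequality are arbitrarily small, that inequality can then only hold with probability tending to zero. Without uniform convergence, I do not see how to control $M_n(\hat{\theta}_n^{MAP})$ at a random point outside the ball, which is exactly where I am stuck.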
[1] van der Vaart, A. W., *Asymptotic Statistics*, Cambridge Series in Statistical and Probabilistic Mathematics 3, Cambridge: Cambridge University Press, xv, 443 p. (1998). ZBL0910.62001.