I'm working on an MCMC algorithm for a machine learning project that simulates an adaptive system with a reaction-diffusion SSA framework: it builds the model structure (reactions, via Green's reversible-jump MCMC) and refines the parameters (diffusion, via standard MCMC). I've been able to show that the chain is irreducible, positive recurrent, and aperiodic (i.e., ergodic), but I don't know how to show that it satisfies detailed balance. I would like to demonstrate that the approach I'm pursuing will converge to the posterior of interest before I proceed.
I've read elsewhere that ergodicity is a sufficient condition on the proposal for M-H MCMC (e.g., https://jellis18.github.io/post/2018-01-02-mcmc-part1/), but other sources (e.g., https://en.wikipedia.org/wiki/Metropolis%E2%80%93Hastings_algorithm) that provide a "formal derivation" list two requirements: (1) existence and (2) uniqueness of a stationary distribution. For (2), the latter lists ergodicity as sufficient, while for (1) it lists detailed balance as sufficient. My understanding was that ergodicity implied both the existence and uniqueness of a stationary distribution, and moreover that this stationary distribution would also be the limiting distribution. So here is my question: why is detailed balance listed separately from ergodicity as a requirement for the M-H proposal, when ergodicity would appear to satisfy both requirements on its own? What am I missing here?
Thank you for your time and thoughts.
I can't give you a nuanced theoretical account of the interplay of the conditions underlying convergence of the Metropolis-Hastings algorithm; if you want an account using the machinery of stochastic processes, Robert and Casella have done a better job than I ever could in Chapter 6 of "Monte Carlo Statistical Methods". I'm also certain that other members of this site could explain the nuances more fluently than I can.
To the best of my introductory knowledge of MCMC, and my experience with time series, the most plausible explanation I have for why the detailed balance condition is not simply redundant in light of the ergodicity requirement is practical: ergodicity can be difficult to assess* (although you have already established it in your context), whereas detailed balance is a local condition on the transition kernel that is often easy to verify directly.
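As a small aside on why the two conditions are logically distinct: ergodicity does not imply detailed balance, since an ergodic chain can carry a persistent probability flow around a cycle. A minimal sketch, with a transition matrix invented purely for illustration:

```python
import numpy as np

# A 3-state chain with a clockwise drift: irreducible and aperiodic
# (hence ergodic), but not reversible. The numbers are illustrative.
P = np.array([[0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8],
              [0.8, 0.1, 0.1]])

# P is doubly stochastic, so the uniform distribution is stationary.
pi = np.ones(3) / 3
assert np.allclose(pi @ P, pi)

# Detailed balance fails: probability flows around the cycle 0 -> 1 -> 2 -> 0.
flow = pi[:, None] * P  # flow[i, j] = pi_i * P(i -> j)
print(np.allclose(flow, flow.T))  # prints False for this chain
```

So a chain can have a unique stationary (and limiting) distribution without satisfying detailed balance with respect to it: detailed balance is a convenient sufficient condition for tying the stationary distribution to a specific target, not a consequence of ergodicity.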
From a practitioner's perspective, rather than that of someone concerned with the rigorous technicalities of how it all fits together, I refer you to Kevin Murphy's "Machine Learning: A Probabilistic Perspective", section 17.2.3.4, for the detailed balance condition in Markov chains and its utility here. See also p. 375 of David MacKay's "Information Theory, Inference and Learning Algorithms" here. Both of those expositions are introductory, whilst the Robert and Casella book is a little more involved.
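To make the detailed balance condition concrete: for a Metropolis-Hastings kernel on a finite state space you can check pi_i P(i -> j) = pi_j P(j -> i) directly, pair by pair. A minimal sketch in Python, with a target and proposal invented purely for illustration:

```python
import numpy as np

# Unnormalized target on a 5-point state space (values are arbitrary).
pi = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
pi = pi / pi.sum()
n = len(pi)

# Symmetric random-walk proposal on a ring: step to i-1 or i+1 (mod n).
Q = np.zeros((n, n))
for i in range(n):
    Q[i, (i - 1) % n] = 0.5
    Q[i, (i + 1) % n] = 0.5

# Metropolis-Hastings kernel: propose from Q, accept with
# min(1, pi_j q(j,i) / (pi_i q(i,j))); rejected mass stays put.
P = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i != j and Q[i, j] > 0:
            P[i, j] = Q[i, j] * min(1.0, (pi[j] * Q[j, i]) / (pi[i] * Q[i, j]))
    P[i, i] = 1.0 - P[i].sum()

# Detailed balance: pi_i P(i -> j) == pi_j P(j -> i) for every pair (i, j).
flow = pi[:, None] * P
assert np.allclose(flow, flow.T)

# Detailed balance implies stationarity: pi P == pi.
assert np.allclose(pi @ P, pi)
```

Because the flow matrix pi_i P(i -> j) is symmetric, summing its columns gives pi P = pi, which is exactly how detailed balance with respect to pi delivers the existence of pi as a stationary distribution.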
*That's why some introductory monographs on time series for practitioners do not cover ergodicity, or give it a passing mention at best, and instead have us check for stationarity (even though the historical development of time-series analysis in communications engineering was motivated by ergodic theory and statistical mechanics, e.g. here).