How is Simulated Annealing an "Adaptation of The Metropolis-Hastings Algorithm"?


I'm trying to understand how simulated annealing is related to the Metropolis-Hastings algorithm, if it is related at all. The Wikipedia page states that simulated annealing "is an adaptation of the Metropolis–Hastings algorithm". My understanding so far is that simulated annealing uses the Metropolis algorithm so that the system relaxes toward its "ground state", which makes the distribution easier to evaluate, and then slowly decreases the temperature so that it ends up accepting only better points and eventually finds the global optimum.

I understand that both are Markov chain Monte Carlo methods, so they are technically similar in that respect. My main difficulty is with the acceptance criterion (part of the transition kernel) of each method, between which I can't see any connection: one is a "simple" ratio of the target density at the candidate and at the current point, and the other is derived from the Boltzmann distribution of statistical mechanics.
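To make the comparison concrete, here is a minimal sketch of the two acceptance rules as they are usually written. It assumes a symmetric proposal (so the Hastings correction drops out), and the function names and sample values are illustrative only, not taken from any library. Plugging an unnormalized Boltzmann weight exp(-E/T) into the ratio rule is one way to compare the two numerically.

```python
import math


def metropolis_accept_prob(p_current, p_candidate):
    """Metropolis rule for a symmetric proposal: min(1, p(candidate) / p(current))."""
    return min(1.0, p_candidate / p_current)


def annealing_accept_prob(e_current, e_candidate, temperature):
    """Simulated-annealing rule: accept downhill moves, accept uphill with exp(-dE / T)."""
    delta = e_candidate - e_current
    if delta <= 0:
        return 1.0
    return math.exp(-delta / temperature)


def boltzmann_weight(energy, temperature):
    """Unnormalized Boltzmann weight exp(-E / T); the normalizing constant cancels in a ratio."""
    return math.exp(-energy / temperature)


# Hypothetical values: an uphill move from energy 1.0 to 1.5 at temperature 0.5.
e_current, e_candidate, temperature = 1.0, 1.5, 0.5

ratio_rule = metropolis_accept_prob(
    boltzmann_weight(e_current, temperature),
    boltzmann_weight(e_candidate, temperature),
)
boltzmann_rule = annealing_accept_prob(e_current, e_candidate, temperature)

print(ratio_rule, boltzmann_rule)
```

If the two printed values agree, that would suggest the Boltzmann rule is the ratio rule evaluated on a temperature-dependent target, but I'd like to know whether this holds in general.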

Is there an actual mathematical relationship between the acceptance criteria of the two methods, and is simulated annealing really a direct adaptation of the Metropolis-Hastings algorithm?