Why is it so difficult to sample from a Boltzmann distribution?

2.4k Views Asked by At

I am studying simulated annealing, and simulated annealing involves a Boltzmann distribution.

Typically I see the Boltzmann distribution written like:

$P(E_i) = e^{-E_i/(kT)}/Z$

where $Z$ is the sum of all of the energy in all of the states, and $P(E_i)$ is the energy in just one of the states.

The thing is in simulated annealing I see the pdf written with an integral:

$P(x) = \frac{e^{-f(x)/T}}{\int_{S}e^{-f(z)/T}dz}$

I understand from many sources that the Boltzmann distribution is reportedly very difficult to sample from because it involves this $f(z)$ as you can see above. I have two questions about that:

1) Why does the $f(z)$ not appear in the first equation I gave for the Boltzmann distribution? (e.g., it does not appear here on wikipedia: https://en.wikipedia.org/wiki/Maxwell–Boltzmann_distribution)

2) I'm able to fire up python and use scipy to easily sample from a Boltzmann distribution just by doing

import scipy from scipy import stats import numpy as np import matplotlib.pyplot as plt r = scipy.stats.boltzmann.rvs(lambda_=1.4, N=4, size=10)

without trouble. (source: https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.stats.boltzmann.html#id23)

So what is the big deal with sampling from a Boltzmann distribution? There are all these fancy methods in Simulated Annealing like Hit and Run, Metropolis sampling, etc.

Why are they necessary when I was able to sample from a Boltzmann distribution with ease?

2

There are 2 best solutions below

6
On

In the first equation, $f(z)$ is represented by $E_i $. This term is used to represent the energy repartition. It is just a continuous version of the Boltzmann disitribution (and not Maxwell-Boltzmann, which is a different law).

In addition, the main problem is the computation of $Z$.

In most pratical problems requiring such distribution, the number of states is enormous, and each of them has his proper energy.

Imagine you want to find the shortest way to travel through $N$ cities, you have to pass only once by city and finish by the one you started from. Then, you have $0.5(N-1)!$ possibilities. For $N=20$ only, you already have about $6\times10^{16}$ different states to consider.

This is why we need algorithms such as Metropolis, to avoid having to compute $Z$.

1
On

You may either have quantised energies or continuum of energies in your system. If you have a discrete set $\{E_{k}\}_{k}$ where $k$ is some index tuple (e.g. quantised energy, quantised angular momentum, spin, isospin etc....), Boltzmann says that a probability for a system to be be in the state characterised by $E_{k}$ is given by $$p_{k}=\frac{1}{Z}e^{-\beta{E}_{k}}$$ Where the normalization $Z$, known as the partition function is given by $$Z(\beta)=\sum_{\forall{k}}e^{-\beta{E}_{k}}$$ An example of a system may be a spin-$1/2$ particle in a harmonic trap then $k=(n, \sigma)$ with $n\in\mathbb{N}\cup\{0\}$ and $\sigma\in\{-1/2, 1/2\}$ and $E_{n, \sigma}=\omega(n+1/2)$. On the other hand, one may have a system with continuum of energy $\{E({k})\}_{k}$ where $k$ now belongs to some uncountable set, in that case, Boltzmann says that a probability to find the system in the infinitesimal energy window $[k, k+dk]$ is given by $$p(k)=\frac{1}{Z}e^{-\beta{E}(k)}dk$$ With $$Z(\beta)=\int_{\forall{k}}e^{-\beta{E}(k)}dk$$ Example may be a particle on $1D$ lattice with nearest neighbour hopping, then $E(k)=E_{0}-\alpha\cos{k}$, where $k\in[-\pi, \pi]$. And you also may have a mixed situation. E.g. some of the indices are discrete and some are continuous, then you have to sum over discrete and integrate over continuum.