Why do we use the exponential in the Boltzmann distribution?


Low effort question incoming. Given a set of states $x_i$, $i=1,\dots,n$, with energies $U(x_i)\geq 0$, we define the probability of a state $x$ as $$ \pi(x)=\frac{1}{Z_T}e^{-\frac{1}{T} U(x)} $$ where $Z_T$ is just a normalization constant. According to Wikipedia, this is the maximum entropy distribution on the set of states for a given mean energy (so I guess you can tune $T$ to get any mean).

Then I've seen superstatistics on Wikipedia, https://en.wikipedia.org/wiki/Superstatistics, where one defines $\beta=\frac{1}{T}$ and, given a distribution $f(\beta)$ of the inverse temperature on $(0,\infty)$, sort of extends the definition to $$ \pi(x)=\frac{1}{Z}\int_{0}^\infty d\beta\, f(\beta)e^{-\beta U(x)} $$ where $Z$ is again a normalization constant. I don't really know whether Bernstein's theorem on monotone functions, https://en.wikipedia.org/wiki/Bernstein's_theorem_on_monotone_functions, is relevant here: it says the completely monotone functions are exactly the mixtures of decaying exponentials, so the superstatistics form is completely monotone as a function of $U(x)$. I have two sort of informal questions:

  1. What is special about exponentials and these mixtures of exponentials? What would be lost by picking some other monotone function, say $f_T$ with $x<y \Rightarrow f_T(x)>f_T(y)$ (decreasing, mirroring the decay of $e^{-x/T}$), and defining $\pi(x)\propto f_T(U(x))$? Is complete monotonicity somehow playing an important hidden role?
  2. Why do people use the Boltzmann distribution for MCMC on, for example, the TSP? Why not any other family of distributions that concentrates into a delta at the minimum as $T\rightarrow 0$? Is there some kind of optimal convergence?
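To make the superstatistics part of question 1 concrete, here is a small numerical sketch of my own (the choice of a Gamma density for $f(\beta)$ and the parameter values are mine, not from the Wikipedia article): mixing exponentials against a Gamma density should reproduce its Laplace transform, the power law $(1+\theta U)^{-k}$, which is indeed completely monotone in $U$.

```python
import math
import numpy as np

# Superstatistics mixture: pi(x) ∝ ∫_0^∞ f(beta) exp(-beta U) dbeta.
# With f(beta) the Gamma(k, theta) density, this integral has the
# closed form (1 + theta*U)^(-k), the Laplace transform of the Gamma.
k, theta = 2.0, 0.5

beta = np.linspace(0.0, 60.0, 200_001)          # grid truncating (0, ∞)
dbeta = beta[1] - beta[0]
f = beta**(k - 1) * np.exp(-beta / theta) / (math.gamma(k) * theta**k)

def mixture(U):
    """Trapezoid-rule approximation of ∫ f(beta) exp(-beta*U) dbeta."""
    g = f * np.exp(-beta * U)
    return float(np.sum((g[1:] + g[:-1]) * 0.5 * dbeta))

for U in [0.0, 0.5, 1.0, 3.0]:
    closed = (1.0 + theta * U) ** (-k)
    print(f"U={U}: integral={mixture(U):.6f}, closed form={closed:.6f}")
```

So a Gamma mixture of Boltzmann factors gives a power-law weight in the energy rather than an exponential one, which seems to be the standard example of a completely monotone, non-exponential alternative.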
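For question 2, the setup I have in mind is simulated annealing: a Metropolis chain whose acceptance rule $\min(1, e^{-\Delta U/T})$ comes from the Boltzmann distribution, with $T$ lowered over time. A minimal sketch on a toy TSP instance (the instance, the 2-opt move, and the cooling schedule are my own arbitrary choices):

```python
import math
import random

random.seed(0)

# Toy TSP instance: n random cities in the unit square.
n = 12
cities = [(random.random(), random.random()) for _ in range(n)]

def tour_length(tour):
    """Total length of the closed tour visiting cities in the given order."""
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % n]])
               for i in range(n))

tour = list(range(n))
init_len = tour_length(tour)
best = init_len
T = 1.0
for step in range(20_000):
    # Propose a 2-opt move: reverse a random segment of the tour.
    i, j = sorted(random.sample(range(n), 2))
    cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
    dU = tour_length(cand) - tour_length(tour)
    # Boltzmann (Metropolis) acceptance: always accept downhill moves,
    # accept uphill moves with probability exp(-dU / T).
    if dU <= 0 or random.random() < math.exp(-dU / T):
        tour = cand
    best = min(best, tour_length(tour))
    T = max(1e-3, T * 0.9995)                    # geometric cooling

print(f"initial length {init_len:.3f}, best found {best:.3f}")
```

My question is whether anything would break, or converge more slowly, if the acceptance probability used some other decreasing function of $\Delta U$ in place of $e^{-\Delta U/T}$.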