I am following a seminar on computing MaxEnt distributions, and I am a bit confused about the differences between the general (analytical) template and the actual computational procedures followed by optimization packages in R or Python.
The general template is to maximize the entropy functional:
$H(x) = - \int p(x)\ln p(x)dx$
subject to a set of constraints on the moments or other functions of the distribution.
In an analytical setting, this requires no information beyond the values assigned to the constraints, and the problem is solved via Lagrange multipliers.
In the setting with only the normalization constraint, the Lagrangian is:
$J(p)=\int_{a}^{b} p(x)\ln p(x)dx-\lambda_{0}\left(\int_{a}^{b} p(x)dx-1\right)$
... which gives us the general solution in terms of $\lambda_{0}$ (setting the functional derivative $\ln p(x) + 1 - \lambda_{0}$ to zero):
$p(x)=e^{\lambda_{0}-1}$
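To complete the example: since this $p(x)$ is a constant in $x$, the normalization constraint itself fixes the multiplier, and the familiar result drops out:
$\int_{a}^{b} p(x)dx = (b-a)\,p(x) = 1 \quad\Rightarrow\quad p(x)=\frac{1}{b-a},$
i.e. the uniform density on $[a,b]$.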
Why do computational procedures such as the augmented Lagrangian require a "prior distribution" or initial set of values in order to compute the optimum?
See :
https://www.rdocumentation.org/packages/nloptr/versions/1.2.1/topics/auglag
Thank you!
I would think that this is because the algorithms need to start "searching" from somewhere: they hypothesise a solution and then try to improve on it. That would explain the initial set of values. I am not sure about the prior distribution (unless you mean an initial $p(\cdot)$ from which the algorithm begins its search).
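To make the point concrete, here is a minimal sketch in Python of the normalization-only MaxEnt problem solved numerically on a grid. It uses scipy's SLSQP rather than nloptr's `auglag`, but the relevant feature is the same: the solver must be handed a starting vector `p0`, from which it iterates toward the optimum (the grid size, support, and starting guess below are arbitrary choices for illustration):

```python
# Discretized MaxEnt with only the normalization constraint.
# Illustrative sketch: SLSQP instead of an augmented Lagrangian,
# but both are iterative and need an initial point to start from.
import numpy as np
from scipy.optimize import minimize

a, b, n = 0.0, 1.0, 50                 # support [a, b], n grid points
dx = (b - a) / n

# Arbitrary initial guess -- the "starting point" the question asks about.
p0 = np.random.default_rng(0).uniform(0.1, 1.0, n)

def neg_entropy(p):
    # Minimizing the negative entropy = maximizing H(p).
    return np.sum(p * np.log(p)) * dx

# Equality constraint: the density must integrate to 1.
cons = {"type": "eq", "fun": lambda p: np.sum(p) * dx - 1.0}

res = minimize(neg_entropy, p0, method="SLSQP",
               bounds=[(1e-9, None)] * n, constraints=cons)

# The iterates walk from p0 toward the uniform density 1/(b-a),
# matching the analytical solution.
print(np.max(np.abs(res.x - 1.0 / (b - a))))
```

Different starting vectors `p0` should converge to the same (unique) optimum here, since the problem is convex; the starting point only affects how the search proceeds, not where it ends up.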