Symbolic entropy maximization in SymPy

792 Views Asked by At

My goal is to use SymPy to solve certain problems of entropy maximization that are common in Physics in the field of statistical mechanics. Hence, I will:

  • Introduce one of the simplest versions of such problems.
  • Show my attempt to solve it in SymPy.
  • Criticize such attempt and ask my question, accordingly.

The simple entropy maximization problem I have mentioned is formulated as follows. We would like to maximize the entropy

$H = - \sum_x P_x \ln P_x$

subject to the following constraints: i) the normalization constraint

$1 = \sum_x P_x$

and ii) the constraint of average energy

$U = \sum_i E_x P_x$

Here the index $i$ runs over $x=1,2,...,n$. $E_x$ represents the energy of the system when it is in microscopic state $x$ and $P_x$ is the probability for microscopic state x to occur in the thermodynamic state of the system.

The solution to such a problem can be obtained by the method of Lagrange multipliers. In this context, it works as follows.

Firstly, the following Lagrangian is defined

$L = H + a( 1 - \sum_i P_x ) + b( U - \sum_i P_x E_x )$

Here, $a$ and $b$ are the Lagrange multipliers. The Lagrangian $L$ is a function of $a$, $b$ and the probabilities $P_x$ for $x=1,2,...,n$. The term $a( 1 - \sum_x P_x )$ correspond to the normalization constraint and the term $b( E - \sum_x P_x E_x )$ to the energy constraint.

Secondly, the partial derivatives of $L$ with respect to $a$, $b$ and the $P_x$ for the different $x=1,2,...,n$ are calculated. These result in

$\frac{\partial L}{\partial a} = 1 - \sum_x P_x$

$\frac{\partial L}{\partial b} = E - \sum_x E_x P_x$

$\frac{\partial L}{\partial P_x} = \frac{\partial H}{\partial P_x} - a - b E_x = - \ln P_x - 1 - a - b E_x$

Thirdly, we find the solution to the entropy maximization problem by equating these derivatives to zero. This makes sense since there are $2+n$ equations and we have $2+n$ unknowns: $a$, $b$ and the $P_x$. The solution from these equations reads

$P_x = \frac{\exp( - b E_x )}{Z}$

where

$Z = \sum_x \exp( - b E_x )$

is called the partition function and $b$ is implicitly determined by the equation

$E = \sum_x P_x E_x = \frac{1}{Z} \sum_x \exp( -b E_x ) E_x$

This completes the "hand-made" construction of the solution of the problem of entropy maximization.

Now, lets try to "imitate" such construction using SymPy. The idea is to automatize the process so, eventually, I can attack similar but more complicate problems just by "crunching" with the code. The following is my attempt up to now:

# Lets attempt to derive these analytical result using SymPy.

>>> import sympy as sy
>>> import sympy.tensor as syt

>>> # Here, n is introduced to specify an abstract range for x and y.
>>> n = sy.symbols( 'n' , integer = True )
>>> a , b = sy.symbols( 'a b' ) # The Lagrange-multipliers.
>>> x = syt.Idx( 'x' , n ) # Index x for P_x
>>> y = syt.Idx( 'y' , n ) # Index y for P_y; this is required to take derivatives according to SymPy rules.

>>> P = syt.IndexedBase( 'P' ) # The unknowns P_x.
>>> E = syt.IndexedBase( 'E' ) # The knowns E_x; each being the energy of state x.
>>> U = sy.Symbol( 'U' ) # Mean energy.
>>> 
>>> # Entropy
>>> H = sy.Sum( - P[x] * sy.log( P[x] ) , x )
>>> 
>>> # Lagrangian
>>> L = H + a * ( 1 - sy.Sum( P[x] , x ) ) + b * ( U - sy.Sum( E[x] * P[x] , x ) )

>>> # Lets compute the derivatives
>>> dLda = sy.diff( L , a )
>>> dLdb = sy.diff( L , b )
>>> dLdPy = sy.diff( L , P[y] )

>>> # These look like
>>> 
>>> print dLda
-Sum(P[x], (x, 0, n - 1)) + 1
>>> 
>>> print dLdb
U - Sum(E[x]*P[x], (x, 0, n - 1))
>>>  
>>> print dLdPy
-a*Sum(KroneckerDelta(x, y), (x, 0, n - 1)) - b*Sum(KroneckerDelta(x, y)*E[x], (x, 0, n - 1)) + Sum(-log(P[x])*KroneckerDelta(x, y) - KroneckerDelta(x, y), (x, 0, n - 1))

>>> # The following approach does not work
>>> 
>>> tmp = dLdPy.doit()
>>> print tmp
-a*Piecewise((1, 0 <= y), (0, True)) - b*Piecewise((E[y], 0 <= y), (0, True)) + Piecewise((-log(P[y]) - 1, 0 <= y), (0, True))
>>>     
>>> sy.solve( tmp , P[y] )
[]
>>> # As we can see, no solution was found for P[y]

>>> # Hence, we try an ad-hoc procedure
>>> Px = sy.Symbol( 'Px' )
>>> Ex = sy.Symbol( 'Ex' )
>>> tmp2 = dLdPy.doit().subs( P[y] , Px ).subs( E[y] , Ex ).subs( y , 0 )
>>> print tmp2
-Ex*b - a - log(Px) - 1
>>> Px = sy.solve( tmp2 , Px )
>>> print Px
[exp(-Ex*b - a - 1)]
>>> # This is the solution we wanted to find. After asking for normalization the constant "a" can be absorbed into 1/Z.

My critics and questions are the following:

  • I do not like the idea of using the "had-hoc" procedure. Why solve is not able to find the solution for P[y]?
  • Is there another more "correct" or preferable way to do this?.