I am going through the paper "Information theory and statistical mechanics" by E. T. Jaynes. In equation (2-4) he derives, via Lagrange multipliers, the probability distribution that maximizes entropy given a known expectation value of a function $f(x)$: $$ p_i = e^{-\lambda - \mu f(x_i)} $$
That is, we want to maximize the Shannon entropy of equation (2-3), $$ H(p_1, \ldots, p_n) = -K\sum_i p_i \ln p_i, $$ subject to two constraints: the known expectation value of $f(x)$, equation (2-1), and the normalization condition, equation (2-2):
Equation (2-1) $$ \langle f(x) \rangle = \sum_i p_i f(x_i) $$
Equation (2-2) $$ \sum_i p_i = 1 $$
I tried to derive (2-4) as follows. Introducing Lagrange multipliers $\lambda$ and $\mu$: $$ L = -K\sum_i p_i \ln p_i - \lambda\left(\sum_i p_i - 1\right) - \mu \left(\sum_i p_i f(x_i) - \langle f(x) \rangle\right) $$
Setting the derivative to zero to find the stationary point of $L$: $$ \frac{\partial L}{\partial p_i} = -K(\ln p_i + 1) - \lambda - \mu f(x_i) = 0 $$
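As a sanity check on this derivative (my own check, not from the paper; it takes $K = 1$ and treats a single $p_i$ as a scalar variable $p$, since the constant terms $\lambda$ and $\mu\langle f(x)\rangle$ drop out when differentiating), SymPy reproduces the same $-1$:

```python
import sympy as sp

# Per-component piece of the Lagrangian for one p_i, with K = 1.
p, lam, mu, f = sp.symbols('p lam mu f', positive=True)
L_i = -p * sp.log(p) - lam * p - mu * f * p

dL = sp.diff(L_i, p)              # -ln(p) - 1 - lam - mu*f
sol = sp.solve(sp.Eq(dL, 0), p)   # stationary point in p
print(dL)
print(sol)                        # one solution: p = exp(-lam - mu*f - 1)
```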
Taking $K = 1$, as Jaynes does: $$ \ln p_i = -\lambda - \mu f(x_i) - 1 $$
$$ p_i = e^{- \lambda - \mu f(x_i) - 1} $$
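To check this numerically, here is a toy example of my own (not from the paper): states $x = 1, 2, 3$ with $f(x) = x$ and target $\langle f(x) \rangle = 2.5$. I find the constrained entropy maximum by bisection on $\mu$ (normalization absorbs everything constant in the exponent) and verify that the result really is a constrained maximum of the exponential form:

```python
import numpy as np

# Toy problem: states x = 1, 2, 3, f(x) = x, target <f> = 2.5.
# The candidate maximizer is p_i ∝ exp(-mu * f(x_i)); normalization
# absorbs the constant part of the exponent.
x = np.array([1.0, 2.0, 3.0])
target = 2.5

def dist(mu):
    w = np.exp(-mu * x)
    return w / w.sum()

# <f> is monotone decreasing in mu, so bisect until the constraint holds.
lo, hi = -50.0, 50.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if dist(mid) @ x > target:
        lo = mid
    else:
        hi = mid
p = dist(0.5 * (lo + hi))

def H(q):
    return -(q * np.log(q)).sum()

# Perturb within the constraint set (sum d = 0 and sum d*x = 0):
# entropy should drop in both directions at a constrained maximum.
d = 1e-3 * np.array([1.0, -2.0, 1.0])
print(p, p @ x, H(p) > H(p + d), H(p) > H(p - d))
```

With equally spaced $f(x_i)$, the resulting $p$ comes out geometric ($p_2^2 = p_1 p_3$), consistent with $\ln p_i$ being affine in $f(x_i)$.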
Now, why is the $-1$ term in the exponent not present in equation (2-4)?