I am struggling to see how to derive the solution of the optimisation problem below. The problem and its solution come from here (see eqns 3 and 4). The authors state that this is a variational problem,
$$ p^{*}(a | w)=\underset{p(a | w)}{\arg \max } \sum_{a} \left[ p(a | w) U(w, a)-\frac{1}{\beta} p(a | w) \log \frac{p(a | w)}{p_{0}(a)} \right] $$
subject to $0<p(a | w)<1$ and $\sum_{a} p(a | w) =1$.
The answer is:
$$ p^{*}(a | w)=\frac{1}{Z(w)} p_{0}(a) e^{\beta U(w, a)} $$
where $Z(w)$ is the normalisation constant
$$ Z(w)=\sum_{a} p_{0}(a) e^{\beta U(w, a)} $$
Edit: Added the constraint that p lies between 0 and 1 (as it's a probability)
There's one extra constraint you've omitted, namely
$$ \sum_a x_a = 1\ , $$
where $\ x_a\ $ is shorthand for $\ p(a | w)\ $ with $\ w\ $ held fixed. The Lagrangian function $\ \mathcal{L}\ $ for the optimisation problem, then, is given by
\begin{eqnarray} \mathcal{L}(x,\lambda) &=& \sum_a \left[x_a U(w,a) -\frac{x_a} {\beta}\log\frac{x_a}{p_0(a)}\right]-\lambda\left(\sum_ax_a-1\right)\\ &=& \lambda + \sum_a x_a\left[ U(w,a) + \frac{\log p_0(a)}{\beta}-\lambda-\frac{\log x_a}{\beta}\right]\ , \end{eqnarray}
and the first-order optimality conditions are
\begin{eqnarray} 0&=&\frac{\partial \mathcal{L}}{\partial x_a}(x,\lambda)\\ &=& \left[ U(w,a) + \frac{\log p_0(a)}{\beta}-\lambda-\frac{\log x_a}{\beta}\right]-\frac{1}{\beta}\ \ \ \mbox{for all $\ a\ $,} \end{eqnarray}
where the final $\ -\frac{1}{\beta}\ $ comes from differentiating the factor $\ x_a\ $ against $\ -\frac{\log x_a}{\beta}\ $. Solving for $\ x_a\ $ gives
$$ x_a^*=p_0(a)e^{\beta U(w,a)-\beta\lambda-1}\ , $$
and plugging these values into the constraint $\ \sum_a x_a = 1\ $ gives
$$ e^{\beta\lambda+1}=\sum_a p_0(a)e^{\beta U(w,a)}=Z(w)\ , $$
and so
$$ x_a^*=\frac{p_0(a)e^{\beta U(w,a)}}{Z(w)}\ , $$
as given in the paper cited.
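As a quick numerical sanity check of the first-order conditions, here is a minimal Python sketch. The utilities, prior and $\beta$ are made-up illustrative values, not taken from the paper, and the function names are hypothetical. It computes $x^*$ from the closed form and confirms that every partial derivative of the objective equals the same multiplier $\lambda = (\log Z(w) - 1)/\beta$, which is what the stationarity condition requires.

```python
import math

def gibbs_policy(utilities, prior, beta):
    """Return x*_a = p0(a) * exp(beta * U(w, a)) / Z(w) for one fixed w."""
    # Work in log space and subtract the largest exponent for numerical stability.
    logits = [math.log(p0) + beta * u for u, p0 in zip(utilities, prior)]
    m = max(logits)
    weights = [math.exp(l - m) for l in logits]
    z = sum(weights)
    return [wgt / z for wgt in weights]

def objective_gradient(x, utilities, prior, beta):
    """Partial derivatives of the objective itself (multiplier term excluded)."""
    return [u - (math.log(xa / p0) + 1.0) / beta
            for xa, u, p0 in zip(x, utilities, prior)]

# Illustrative values only (assumptions, not from the source).
utilities = [1.0, 0.5, -2.0, 3.0]   # U(w, a) for a fixed w
prior = [0.1, 0.2, 0.3, 0.4]        # p0(a), sums to 1
beta = 2.0

x_star = gibbs_policy(utilities, prior, beta)
grad = objective_gradient(x_star, utilities, prior, beta)

# From e^{beta*lambda + 1} = Z(w), the multiplier is lambda = (log Z(w) - 1) / beta.
log_z = math.log(sum(p0 * math.exp(beta * u) for u, p0 in zip(utilities, prior)))
lam = (log_z - 1.0) / beta

print("x* =", x_star)
print("gradient entries =", grad)
print("lambda =", lam)
```

If the derivation is right, all gradient entries printed should agree with `lam` up to rounding, and `x_star` should sum to one.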
The second partial derivatives of the objective, which coincide with those of $\ \mathcal{L}\ $ because the constraint term is linear in $\ x\ $, are given by
$$ \frac{\partial^2 \mathcal{L}}{\partial x_a\partial x_b}(x,\lambda) = -\frac{\delta_{ab}}{\beta x_a}\ , $$
so the Hessian is diagonal, with diagonal entries $\ -\frac{1}{\beta x_a}<0\ $ provided $\ \beta>0\ $ and $\ x_a>0\ $. In that case the Hessian is negative definite over the whole domain of the objective, the objective is strictly concave, and $\ x^*\ $ is the unique global maximiser. If, instead, $\ \beta < 0\ $, the objective function would be strictly convex, and $\ x^*\ $ would be the unique global minimiser.
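The sign dependence on $\beta$ can also be checked by brute force. The sketch below is only an illustration under assumed values: the utilities, prior, sample count and helper names are all hypothetical. It draws random points on the probability simplex and compares the objective there with its value at $x^*$, once for $\beta>0$ and once for $\beta<0$. It also compares the value at $x^*$ with $\frac{1}{\beta}\log Z(w)$, which is what you get by substituting $x^*$ back into the objective.

```python
import math
import random

def objective(x, utilities, prior, beta):
    """Sum over a of x_a*U(w,a) - (x_a/beta)*log(x_a/p0(a)); 0*log(0) is taken as 0."""
    total = 0.0
    for xa, u, p0 in zip(x, utilities, prior):
        if xa > 0.0:
            total += xa * u - xa * math.log(xa / p0) / beta
    return total

def closed_form(utilities, prior, beta):
    """Return (x*, Z(w)) with x*_a = p0(a) * exp(beta * U(w, a)) / Z(w)."""
    weights = [p0 * math.exp(beta * u) for u, p0 in zip(utilities, prior)]
    z = sum(weights)
    return [wgt / z for wgt in weights], z

def random_simplex_point(n, rng):
    """Normalised exponential draws give a uniform sample from the simplex."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]

# Illustrative values only (assumptions, not from the source).
utilities = [1.0, 0.5, -2.0, 3.0]
prior = [0.1, 0.2, 0.3, 0.4]
rng = random.Random(0)  # fixed seed so the run is reproducible
samples = [random_simplex_point(len(prior), rng) for _ in range(2000)]

results = {}
for beta in (2.0, -2.0):
    x_star, z = closed_form(utilities, prior, beta)
    f_star = objective(x_star, utilities, prior, beta)
    values = [objective(x, utilities, prior, beta) for x in samples]
    results[beta] = (f_star, max(values), min(values), math.log(z) / beta)

for beta, (f_star, f_max, f_min, log_z_over_beta) in results.items():
    print(f"beta={beta}: f(x*)={f_star:.6f}, log(Z)/beta={log_z_over_beta:.6f}, "
          f"sample max={f_max:.6f}, sample min={f_min:.6f}")
```

For $\beta=2$ no sampled point should exceed $f(x^*)$, and for $\beta=-2$ none should fall below it. Random sampling cannot prove optimality, of course; it only fails to contradict the concavity argument.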