Partial derivative of summation over function


Hi, I'm trying to understand an algorithm that involves maximizing a function over a probability distribution $\pi(a)$ of a random variable $a$.

I don't understand why the first term of the partial derivative of the Lagrangian w.r.t. $\pi(a)$,

$$\frac{\partial}{\partial\pi(a)} \sum_a\pi(a)r(a),$$

equals $$r(a),$$

where $\pi(a)$ is a probability distribution and $a\in A$, with $A$ a finite set of actions that the agent can take.

Both the summation and the $\pi(a)$ function get eliminated in the above equation. Why is that?


We want to find: $$\max_{\pi(a)} E \left[ r(a)\right] + \beta \mathcal{H}(\pi(a)) $$ i.e. $$\max_{\pi(a)} \sum_{a} \pi(a) r(a) - \beta \sum_{a} \pi(a) \ln \pi(a) $$

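Forming the Lagrangian, with a multiplier $\lambda$ for the normalization constraint, we solve $\max_{\pi(a)} \min_{\lambda} \mathcal{L}(\pi(a),\lambda)$, where

$$\mathcal{L}(\pi(a),\lambda)=\sum_{a} \pi(a) r(a) - \beta \sum_{a} \pi(a) \ln \pi(a) + \lambda \Bigl( \sum_{a}\pi(a)-1 \Bigr)$$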

$$\frac{\partial}{\partial \pi(a)}\mathcal{L}(\pi(a),\lambda)=0 \text{ , }\frac{\partial}{\partial \lambda}\mathcal{L}(\pi(a),\lambda)=0$$

$$\frac{\partial}{\partial \pi(a)} \Bigl[ \sum_{a}\pi(a)r(a) - \beta \sum_{a}\pi(a) \ln \pi(a) + \lambda \Bigl( \sum_{a}\pi(a)-1 \Bigr) \Bigr] =0 $$

and $\sum_{a}\pi(a)=1$, since we are summing over all probabilities.

$$r(a) - \beta \ln \pi(a) - \beta +\lambda =0 $$
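A sketch of how each term in this line can be obtained, under two assumptions that are not stated in the original: the summation index is renamed to $a'$ so that it is distinct from the fixed action $a$ in $\partial/\partial\pi(a)$, and each $\pi(a')$ is treated as an independent variable (the constraint is handled by $\lambda$). Then $\partial\pi(a')/\partial\pi(a)$ is $1$ when $a'=a$ and $0$ otherwise, written $\delta_{aa'}$:

$$\frac{\partial}{\partial \pi(a)} \sum_{a'}\pi(a')r(a') = \sum_{a'}\delta_{aa'}\,r(a') = r(a)$$

$$\frac{\partial}{\partial \pi(a)} \Bigl( -\beta \sum_{a'}\pi(a') \ln \pi(a') \Bigr) = -\beta \bigl( \ln \pi(a) + 1 \bigr)$$

$$\frac{\partial}{\partial \pi(a)}\, \lambda \Bigl( \sum_{a'}\pi(a')-1 \Bigr) = \lambda$$

For instance, with only two actions $a_1, a_2$ the first sum is $\pi(a_1)r(a_1)+\pi(a_2)r(a_2)$, and its derivative with respect to $\pi(a_1)$ is just $r(a_1)$: only the one term containing $\pi(a_1)$ survives, which would explain why both the summation and $\pi(a)$ disappear.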

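$$\beta \ln \pi(a) = r(a) - \beta +\lambda$$

$$\pi(a) = \exp \Bigl[ \frac{1}{\beta}\bigl(r(a) - \beta +\lambda\bigr) \Bigr]$$

The factor $\exp\bigl[(\lambda - \beta)/\beta\bigr]$ does not depend on $a$, so the constraint $\sum_{a}\pi(a)=1$ fixes it to $1/Z$:

$$\pi(a) = \frac{1}{Z}\exp \Bigl[ \frac{r(a)}{\beta} \Bigr] \text{ where } Z = \sum_{a} \exp \Bigl[ \frac{r(a)}{\beta} \Bigr]$$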
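As a sanity check on this result, here is a minimal numerical sketch in plain Python. The reward values, the value of $\beta$, and all function names are made up for illustration and are not from the original algorithm. It checks that the gradient of the objective at the softmax distribution is the same for every action (so a single $\lambda$ can cancel it), and that randomly drawn distributions do not score higher.

```python
import math
import random

# Hypothetical rewards r(a) for three actions and a hypothetical beta.
r = [1.0, 2.0, 0.5]
beta = 0.7

def objective(pi):
    """Expected reward plus beta times the entropy of pi."""
    expected_reward = sum(p * ra for p, ra in zip(pi, r))
    entropy = -sum(p * math.log(p) for p in pi if p > 0.0)
    return expected_reward + beta * entropy

def softmax_policy(rewards, beta):
    """pi(a) = exp(r(a)/beta) / Z, shifted by max(rewards) for stability."""
    m = max(rewards)
    weights = [math.exp((ra - m) / beta) for ra in rewards]
    z = sum(weights)
    return [w / z for w in weights]

pi_star = softmax_policy(r, beta)

# Gradient of the unconstrained part: r(a) - beta*ln(pi(a)) - beta.
grad = [ra - beta * math.log(p) - beta for ra, p in zip(r, pi_star)]

# Stationarity needs grad[a] + lambda = 0 for every a, i.e. grad is constant.
lam = -grad[0]
stationary = all(abs(g + lam) < 1e-9 for g in grad)

def random_distribution(n, rng):
    """Draw a random point on the probability simplex."""
    x = [rng.expovariate(1.0) for _ in range(n)]
    s = sum(x)
    return [xi / s for xi in x]

rng = random.Random(0)
best_random = max(
    objective(random_distribution(len(r), rng)) for _ in range(1000)
)

print("pi* =", pi_star)
print("gradient is constant across actions:", stationary)
print("objective at pi*:", objective(pi_star))
print("best objective among random distributions:", best_random)
```

The objective is concave in $\pi$, so a stationary point on the simplex should be the maximum; the random comparison is only a rough check, not a proof.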

For reference, the original formulas are:

[formulas][1]

[1]: https://i.stack.imgur.com/7SLJw.png