Derivative of Cross Entropy of Finite Discrete Random Variables


Consider the following definition of the cross entropy of two finite discrete random variables $X$ and $Y$ with respective probability mass functions $p$ and $q$, each defined from the set $\{a_i \mid i=1,\dots,n\}$ to $[0,1]$ (for brevity, write $p_i=p(a_i)$ and $q_i=q(a_i)$):

$$H(p,q)=-\sum_{i=1}^{n}p_i\log(q_i)$$
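For concreteness, the definition can be evaluated directly; here is a minimal sketch (the example distributions `p` and `q` are made up for illustration):

```python
import math

def cross_entropy(p, q):
    # H(p, q) = -sum_i p_i * log(q_i), for two pmfs over the same support
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

# Example pmfs over a 3-element support (chosen arbitrarily)
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
```

By Gibbs' inequality, `cross_entropy(p, q)` is minimized over `q` exactly when `q == p`, where it equals the entropy of `p`.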

Now my question is: how does one go about calculating $\frac{\partial H(p,q)}{\partial q_i}$? My progress so far:

\begin{equation} \label{eq:eq1} \begin{split} \frac{\partial H(p,q)}{\partial q_i} & = \frac{\partial}{\partial q_i}\left(-\sum_{j=1}^{n}p_j\log(q_j) \right) \\ & = -\sum_{j=1}^{n}p_j\,\frac{\partial \log(q_j)}{\partial q_i} \\ & = -\sum_{j=1}^{n}p_j\,\frac{\partial \log(q_j)}{\partial q_j}\,\frac{\partial q_j}{\partial q_i} \\ & = -\sum_{j=1}^{n}\frac{p_j}{q_j}\,\frac{\partial q_j}{\partial q_i} \\ & = -\frac{p_i}{q_i}-\sum_{j=1,\,j\neq i}^{n}\frac{p_j}{q_j}\,\frac{\partial q_j}{\partial q_i} \end{split} \end{equation}
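As a sanity check on the derivation: if one treats the $q_j$ as unconstrained coordinates (so that $\partial q_j/\partial q_i = \delta_{ij}$ and the remaining sum vanishes), the derivative collapses to $-p_i/q_i$, which can be compared against a finite-difference approximation. A minimal sketch under that assumption:

```python
import math

def cross_entropy(p, q):
    # H(p, q) = -sum_i p_i * log(q_i)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

def analytic_grad(p, q, i):
    # Assumes the q_j are unconstrained coordinates, so
    # d q_j / d q_i = delta_ij and the sum collapses to -p_i / q_i.
    return -p[i] / q[i]

def numeric_grad(p, q, i, h=1e-7):
    # Central finite difference, perturbing only the i-th coordinate
    q_plus = list(q);  q_plus[i] += h
    q_minus = list(q); q_minus[i] -= h
    return (cross_entropy(p, q_plus) - cross_entropy(p, q_minus)) / (2 * h)
```

Note this check deliberately ignores the normalization constraint $\sum_j q_j = 1$; whether that constraint should be enforced is exactly the issue raised below.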

We know that $$q_j=1-\sum_{k=1,\,k\neq j}^{n}q_k.$$ This is the point where I get stuck: I do not know how to use this constraint to simplify the derivative above. Can anyone help me solve it or point me in the right direction?
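One possible direction, sketched under the assumption that $q_n$ is singled out as the dependent coordinate (so only $q_1,\dots,q_{n-1}$ are free variables):

```latex
% With q_n = 1 - \sum_{k \neq n} q_k treated as dependent, for i < n we have
% \partial q_j / \partial q_i = \delta_{ij} for j < n, and
% \partial q_n / \partial q_i = -1, so the sum in the last line collapses:
\frac{\partial H(p,q)}{\partial q_i}
  = -\frac{p_i}{q_i} - \frac{p_n}{q_n}\,\frac{\partial q_n}{\partial q_i}
  = -\frac{p_i}{q_i} + \frac{p_n}{q_n}.
```

If instead the $q_j$ are treated as unconstrained coordinates, $\partial q_j/\partial q_i = \delta_{ij}$ for all $j$ and the derivative is simply $-p_i/q_i$; which convention is appropriate depends on whether the constraint is enforced by substitution or handled separately (e.g. via a Lagrange multiplier).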