How is the gradient of this Bayesian expression computed?


From Bayes Theorem:

\begin{equation} p(x|y) = \frac{p(x) p(y|x)}{p(y)} = \frac{p(x) p(y|x)}{\int p(x) p(y|x)\, dx} \end{equation}

Taking the gradient of the logarithm of both sides with respect to $x$, we obtain: \begin{equation} \nabla_x \log p(x|y) = \nabla_x \log p(x) + \nabla_x \log p(y|x) \end{equation}

Could someone explain how this result was reached? My level of math is not very high, but as far as I understand, after applying the chain rule I get the following expression:

\begin{equation} \nabla_x \log p(x|y) = \frac{p(y|x)}{p(y)}\nabla_x \log p(x) + \frac{p(x)}{p(y)}\nabla_x \log p(y|x) \end{equation}


Best answer:

He's just using the properties of logarithms, then differentiating.

\begin{align} p(x|y) &= \frac{p(x) p(y|x)}{p(y)}\\ \implies\log{p(x|y)} &= \log{\frac{p(x) p(y|x)}{p(y)}}\\ &= \log{p(x)} + \log{p(y|x)} - \log{p(y)}\\ \implies \nabla_x\log{p(x|y)} &= \nabla_x\log{p(x)} + \nabla_x\log{p(y|x)} - \nabla_x\log{p(y)}\\ &= \nabla_x\log{p(x)} + \nabla_x\log{p(y|x)}\\ \end{align}

because $p(y)$ is constant with respect to $x$. Probabilists are careful to make clear which variables a density $p$ depends on. You can safely assume that $p(y)$ is not a function of $x$; indeed, if it were, Bayes's Theorem would not apply.
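The identity is easy to check numerically. The sketch below uses a hypothetical conjugate Gaussian model (an assumption for illustration, not from the question): prior $p(x) = \mathcal{N}(x; 0, 1)$ and likelihood $p(y|x) = \mathcal{N}(y; x, 1)$, so the posterior is $p(x|y) = \mathcal{N}(x; y/2, 1/2)$ in closed form. Finite differences then confirm that the gradient of the log posterior equals the sum of the gradients of the log prior and log likelihood, with $p(y)$ contributing nothing.

```python
import math

# Hypothetical 1-D Gaussian model (an assumption for illustration):
#   prior       p(x)   = N(x; 0, 1)
#   likelihood  p(y|x) = N(y; x, 1)
# Conjugacy gives the posterior p(x|y) = N(x; y/2, 1/2).

def log_prior(x):
    return -0.5 * x**2 - 0.5 * math.log(2 * math.pi)

def log_lik(x, y):
    return -0.5 * (y - x)**2 - 0.5 * math.log(2 * math.pi)

def log_post(x, y):
    # closed-form posterior for this conjugate model
    var = 0.5
    return -0.5 * (x - y / 2)**2 / var - 0.5 * math.log(2 * math.pi * var)

def grad(f, x, h=1e-6):
    # central finite-difference approximation of df/dx
    return (f(x + h) - f(x - h)) / (2 * h)

x, y = 0.3, 1.7
lhs = grad(lambda t: log_post(t, y), x)
rhs = grad(log_prior, x) + grad(lambda t: log_lik(t, y), x)
print(lhs, rhs)  # the two gradients agree; p(y) has dropped out
```

Note that no normalizing constant $p(y)$ ever appears on the right-hand side; this is exactly why score-based methods can work with unnormalized densities.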

Another answer:

A more detailed proof would go as follows:

\begin{align} \nabla_x \log p(x|y) &= \nabla_x \big(\log p(x|y) + \log p(y)\big) && \text{since $p(y)$ is constant in $x$}\\ &= \nabla_x \log\big(p(x|y)\, p(y)\big)\\ &= \nabla_x \log p(x, y)\\ &= \nabla_x \log\big(p(x)\, p(y|x)\big)\\ &= \nabla_x \log p(x) + \nabla_x \log p(y|x) \end{align}
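The middle step, $\log p(x|y) + \log p(y) = \log p(x, y)$, can also be checked numerically. The sketch below uses the same hypothetical Gaussian model as above (prior $\mathcal{N}(0,1)$, likelihood $\mathcal{N}(y; x, 1)$, an assumption for illustration), for which the evidence is $p(y) = \mathcal{N}(y; 0, 2)$; the two factorizations of the joint give identical $x$-gradients.

```python
import math

# Same hypothetical Gaussian model as before (an assumption):
#   p(x, y) = p(x) p(y|x) = p(x|y) p(y)
# with posterior N(y/2, 1/2) and evidence N(0, 2).

def log_joint(x, y):
    # log p(x, y) via the prior-likelihood factorization
    return (-0.5 * x**2 - 0.5 * math.log(2 * math.pi)
            - 0.5 * (y - x)**2 - 0.5 * math.log(2 * math.pi))

def log_post_plus_evidence(x, y):
    # log p(x|y) + log p(y) via the posterior-evidence factorization
    var_post, var_ev = 0.5, 2.0
    lp = -0.5 * (x - y / 2)**2 / var_post - 0.5 * math.log(2 * math.pi * var_post)
    le = -0.5 * y**2 / var_ev - 0.5 * math.log(2 * math.pi * var_ev)
    return lp + le

def grad(f, x, h=1e-6):
    # central finite-difference approximation of df/dx
    return (f(x + h) - f(x - h)) / (2 * h)

x, y = -0.4, 0.9
g_joint = grad(lambda t: log_joint(t, y), x)
g_chain = grad(lambda t: log_post_plus_evidence(t, y), x)
print(g_joint, g_chain)  # equal: p(y) contributes nothing to the x-gradient
```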