Jensen's inequality in derivation of EM algorithm


I am going through the derivation of the EM algorithm and got stuck on the following steps: [Image: notes showing the EM algorithm derivation]

For the equality to hold, f(x) has to be an affine function. Why does setting $q(x)=p(x|z,\theta)$ make f(x) an affine function?


The notes say that $f(x) = \log \frac{p(x,z|\theta)}{q(x)}$ must be affine, that is, a linear function plus a constant. If $q(x) = p(x|z,\theta) = \frac{p(x,z|\theta)}{p(z|\theta)}$, then $f(x) = \log \frac{p(x,z|\theta)}{q(x)} = \log p(z|\theta)$, which is a constant independent of $x$, and a constant is (trivially) an affine function. Choosing $q(x)$ this way is one choice that makes Jensen's inequality hold with equality.
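
To make this concrete, here is a small numerical sketch. The joint probabilities below are made-up values for a latent $x$ with three states and one fixed observed $z$; they are an assumption for illustration and do not come from the notes. It checks that with $q(x) = p(x|z,\theta)$ the function $f(x)$ takes the same value for every $x$ and the bound equals $\log p(z|\theta)$, while a different $q$ (uniform here) gives a strictly smaller bound.

```python
import math

# Hypothetical values of p(x, z | theta) for x = 0, 1, 2 and one fixed z.
joint = [0.10, 0.25, 0.05]

# p(z | theta) = sum over x of p(x, z | theta)
evidence = sum(joint)


def f_values(q):
    """f(x) = log(p(x, z | theta) / q(x)) for each x."""
    return [math.log(pxz / qx) for pxz, qx in zip(joint, q)]


def lower_bound(q):
    """Jensen lower bound E_q[f(x)] on log p(z | theta)."""
    return sum(qx * fx for qx, fx in zip(q, f_values(q)))


# q(x) = p(x | z, theta) = p(x, z | theta) / p(z | theta)
posterior = [pxz / evidence for pxz in joint]

# Any other distribution over x, here uniform, for comparison.
uniform = [1.0 / len(joint)] * len(joint)

print("log p(z|theta):       ", math.log(evidence))
print("f(x) under posterior: ", f_values(posterior))
print("bound under posterior:", lower_bound(posterior))
print("bound under uniform:  ", lower_bound(uniform))
```

Under the posterior every entry of `f_values` should coincide with `math.log(evidence)`, so the expectation of $f$ and $f$ of the expectation agree; under the uniform $q$ the entries differ and the bound falls below $\log p(z|\theta)$.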