Proof of the information bottleneck equations


In The Information Bottleneck Method, the third line of Eq. (31) is $p_{t+1}(y|\tilde{x})=\sum_y p(y|x)\,p_t(x|\tilde{x})$, which minimizes the term $\left\langle D_{KL}[p(y|x)\,\|\,p(y|\tilde{x})]\right\rangle_{p(x,\tilde{x})}$.

I think the sum over $y$ in the equation doesn't make sense. Should it be $x$?
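A quick numerical sanity check of the sum-over-$x$ reading, as a sketch only: the alphabet sizes, the random distributions and the helper names (`random_conditional`, `expected_kl`, `q_star`) are my own assumptions, not anything from the paper. It checks that $\sum_x p(y|x)\,p(x|\tilde{x})$ is normalized over $y$ and that it attains an expected KL no larger than randomly drawn alternatives.

```python
import numpy as np

rng = np.random.default_rng(0)
nx, nt, ny = 5, 3, 4  # sizes of X, X-tilde and Y (arbitrary choices)

def random_conditional(rows, cols):
    """Random row-stochastic matrix: each row is a distribution."""
    m = rng.random((rows, cols)) + 0.1
    return m / m.sum(axis=1, keepdims=True)

p_x = random_conditional(1, nx)[0]         # p(x)
p_y_given_x = random_conditional(nx, ny)   # p(y|x), indexed [x, y]
p_t_given_x = random_conditional(nx, nt)   # p(x~|x), indexed [x, t]

p_xt = p_x[:, None] * p_t_given_x                      # joint p(x, x~)
p_x_given_t = p_xt / p_xt.sum(axis=0, keepdims=True)   # p(x|x~), [x, t]

# Candidate update with the sum taken over x: q*(y|x~) = sum_x p(y|x) p(x|x~)
q_star = p_x_given_t.T @ p_y_given_x                   # indexed [t, y]

def expected_kl(q):
    """<D_KL[p(y|x) || q(y|x~)]> averaged over p(x, x~)."""
    log_ratio = np.log(p_y_given_x[:, None, :] / q[None, :, :])  # [x, t, y]
    kl = (p_y_given_x[:, None, :] * log_ratio).sum(axis=2)       # [x, t]
    return (p_xt * kl).sum()

rows_normalized = np.allclose(q_star.sum(axis=1), 1.0)
best = expected_kl(q_star)
others = [expected_kl(random_conditional(nt, ny)) for _ in range(200)]
is_minimum = all(best <= v + 1e-12 for v in others)
print(rows_normalized, is_minimum)
```

Taken literally, a sum over $y$ would give $\sum_y p(y|x)\,p(x|\tilde{x}) = p(x|\tilde{x})$, which no longer depends on $y$ at all, so it cannot define $p(y|\tilde{x})$.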

The derivative of $\left\langle D_{KL}[p(y|x)\,\|\,p(y|\tilde{x})]\right\rangle_{p(x,\tilde{x})}+\lambda(\tilde{x})\left(\sum_y p(y|\tilde{x})-1\right)$ with respect to $p(y|\tilde{x})$ gives $-\sum_x p(x,\tilde{x})\,p(y|x)\frac{1}{p(y|\tilde{x})}+\lambda(\tilde{x})$. How does this expression give the desired equation?
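For what it's worth, here is a sketch of how the stationarity condition seems to lead to the update once the sum is read as running over $x$. I am assuming one multiplier per $\tilde{x}$, written $\lambda(\tilde{x})$, since the normalization constraint holds separately for each $\tilde{x}$. Setting the derivative to zero and solving for $p(y|\tilde{x})$:

$$
\sum_x p(x,\tilde{x})\,\frac{p(y|x)}{p(y|\tilde{x})}=\lambda(\tilde{x})
\quad\Longrightarrow\quad
p(y|\tilde{x})=\frac{1}{\lambda(\tilde{x})}\sum_x p(x,\tilde{x})\,p(y|x).
$$

Summing both sides over $y$ and using $\sum_y p(y|\tilde{x})=1$ and $\sum_y p(y|x)=1$ fixes the multiplier:

$$
1=\frac{1}{\lambda(\tilde{x})}\sum_x p(x,\tilde{x})=\frac{p(\tilde{x})}{\lambda(\tilde{x})}
\quad\Longrightarrow\quad
\lambda(\tilde{x})=p(\tilde{x}).
$$

Substituting back, with $p(x|\tilde{x})=p(x,\tilde{x})/p(\tilde{x})$:

$$
p(y|\tilde{x})=\sum_x p(y|x)\,\frac{p(x,\tilde{x})}{p(\tilde{x})}=\sum_x p(y|x)\,p(x|\tilde{x}).
$$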