I have seen several derivations of the KL divergence. Up to this point, I understand everything:
Then, in the next step, we get:
See:
But this just cannot be true:
I am sure I am missing something, but I cannot work out what.
Additional information:
I do not understand how Q(z) * log(P(x)) becomes log(P(x)). My question is: why doesn't Q(z) * log(P(z,x)/Q(z)) become just log(P(z,x)/Q(z)), given that the sum of Q(z) over z is always one?
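Written out as sums (my transcription; I am assuming z is discrete and the sums run over its support), the two expressions I am comparing are:

$$\sum_z Q(z)\,\log P(x) \quad\text{vs.}\quad \sum_z Q(z)\,\log\frac{P(z,x)}{Q(z)},$$

with $\sum_z Q(z) = 1$ in both cases.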
In the lower part I have worked this out with two sets of random variables for the Q and P distributions. When I apply Q(z) * log(P(x)) with these values, I get a completely different result than I would with log(P(x)) alone.

q * x = x only if q = 1, but in my case, as is evident, q * x is not x, so q is not one.
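A minimal numeric sketch of the check described above. The values for Q(z) and P(z,x) here are assumed toy numbers, not the ones from the original example; the point is only to compare the two sums:

```python
import math

# Assumed toy values: Q is a distribution over two values of z (sums to 1),
# and P_joint holds P(z, x) for each z at a fixed x.
Q = [0.3, 0.7]
P_joint = [0.1, 0.2]
P_x = sum(P_joint)  # P(x) = sum_z P(z, x)

# Sum of Q(z) * log P(x) over z: log P(x) is the same number for every z,
# so it factors out of the sum, leaving log P(x) * sum_z Q(z) = log P(x).
lhs = sum(q * math.log(P_x) for q in Q)
print(lhs, math.log(P_x))  # these agree (up to floating point)

# Sum of Q(z) * log(P(z, x) / Q(z)): here the log term is different for
# each z, so the weights Q(z) cannot simply be dropped from the sum.
rhs = sum(q * math.log(p / q) for q, p in zip(Q, P_joint))
print(rhs)
```

So the factoring only works when the term multiplied by Q(z) does not itself depend on z.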