Question about KL divergence and this formula.


I was reading https://arxiv.org/pdf/1503.03585.pdf in order to understand https://arxiv.org/pdf/2006.11239.pdf (the latter paper is about denoising diffusion models).

But I can't figure out how the author derives this

enter image description here

from

enter image description here

enter image description here

The part that I can't understand is the KL divergence part. I can't figure out how $dx^{(0 \cdots T)}$ is changed to $dx^{(0)}dx^{(t)}$, that is, how

$$\sum_{t=2}^T\int dx^{(0 \cdots T)}\,q(x^{(0 \cdots T)})\log\left[\frac{p(x^{(t-1)} \mid x^{(t)})}{q(x^{(t-1)} \mid x^{(t)},x^{(0)})}\right]$$

is changed to

$$\sum_{t=2}^T\int dx^{(0)}dx^{(t)}\,q(x^{(0)},x^{(t)})\,D_{KL}\left(q(x^{(t-1)} \mid x^{(t)},x^{(0)}) \,\|\, p(x^{(t-1)} \mid x^{(t)})\right)$$
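For reference, here is a sketch of how this step could go for a single term $t$ of the sum. It assumes only that $q$ is a joint density over $x^{(0 \cdots T)}$ and the standard definition $D_{KL}(a \,\|\, b) = \int a \log(a/b)$; the intermediate lines are not copied from the paper. The log-ratio depends only on $x^{(0)}$, $x^{(t-1)}$ and $x^{(t)}$, so every other variable integrates out of $q$, and the remaining marginal factors by the product rule:

$$
\begin{aligned}
&\int dx^{(0 \cdots T)}\,q(x^{(0 \cdots T)})\log\left[\frac{p(x^{(t-1)} \mid x^{(t)})}{q(x^{(t-1)} \mid x^{(t)},x^{(0)})}\right] \\
&\quad= \int dx^{(0)}dx^{(t-1)}dx^{(t)}\,q(x^{(0)},x^{(t-1)},x^{(t)})\log\left[\frac{p(x^{(t-1)} \mid x^{(t)})}{q(x^{(t-1)} \mid x^{(t)},x^{(0)})}\right] \\
&\quad= \int dx^{(0)}dx^{(t)}\,q(x^{(0)},x^{(t)})\int dx^{(t-1)}\,q(x^{(t-1)} \mid x^{(t)},x^{(0)})\log\left[\frac{p(x^{(t-1)} \mid x^{(t)})}{q(x^{(t-1)} \mid x^{(t)},x^{(0)})}\right] \\
&\quad= -\int dx^{(0)}dx^{(t)}\,q(x^{(0)},x^{(t)})\,D_{KL}\left(q(x^{(t-1)} \mid x^{(t)},x^{(0)}) \,\|\, p(x^{(t-1)} \mid x^{(t)})\right)
\end{aligned}
$$

Note the minus sign in the last line: since $\int q\log(p/q) = -D_{KL}(q \,\|\, p)$, the two sums above should agree only up to that overall sign.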

I also can't understand how the author derives equation (22) from equation (21), both shown below this sentence.

enter image description here

enter image description here

Can anyone help me to understand how these formulas are derived?