Training loss on diffusion (from "Deep Unsupervised Learning using Nonequilibrium Thermodynamics")


In the paper "Deep Unsupervised Learning using Nonequilibrium Thermodynamics", the training loss is derived as follows: [image of the derivation, including equations (11) and (12), not reproduced]

I don't quite understand the omissions in the above formula. In particular, how does it get from (11) to (12)?

Can anyone shed more light on the development of the above formula?

Best answer

Write $h$ for the positive factor that multiplies $q \left(\mathbf{x}^{(1\ldots T)}|\mathbf{x}^{(0)} \right)$ inside the logarithm in (11). By Jensen's inequality, as indicated, $$ \log \left[ \int d\mathbf{x}^{(1\ldots T)} h \left( \mathbf{x}^{(1\ldots T)} \right) q \left(\mathbf{x}^{(1\ldots T)}|\mathbf{x}^{(0)} \right) \right] \geq \int d\mathbf{x}^{(1\ldots T)} \log h \left( \mathbf{x}^{(1\ldots T)} \right) q \left(\mathbf{x}^{(1\ldots T)}|\mathbf{x}^{(0)} \right) $$ since $ \int d\mathbf{x}^{(1\ldots T)} q \left(\mathbf{x}^{(1\ldots T)}|\mathbf{x}^{(0)} \right) =1 $ (so the integral is an expectation under $q$) and $\log$ is concave. In words: the log of an average is at least the average of the log.
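As a quick sanity check, the inequality can be tested numerically on a discrete stand-in for the integral. This is only a sketch: the five-state distribution `q`, the positive values `h`, and the helper name `jensen_gap` are made up for illustration and are not taken from the paper.

```python
import math
import random


def jensen_gap(h, q):
    """Return log(E_q[h]) - E_q[log h] for a discrete distribution q.

    h: positive values (a hypothetical stand-in for the integrand h)
    q: probabilities that sum to 1 (a stand-in for q(x^(1..T) | x^(0)))
    """
    log_of_mean = math.log(sum(hi * qi for hi, qi in zip(h, q)))
    mean_of_log = sum(qi * math.log(hi) for hi, qi in zip(h, q))
    return log_of_mean - mean_of_log


random.seed(0)

# Build an arbitrary normalised distribution over 5 states.
weights = [random.random() for _ in range(5)]
total = sum(weights)
q = [w / total for w in weights]

# Arbitrary positive values for h on the same 5 states.
h = [random.uniform(0.1, 10.0) for _ in range(5)]

gap = jensen_gap(h, q)
# Jensen's inequality for the concave log says this gap is never negative.
print(gap >= 0)
```

The gap shrinks to zero only when `h` is constant on the support of `q`, which is the equality case of Jensen's inequality.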

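The inequality holds for every $\mathbf{x}^{(0)}$, and $q \left(\mathbf{x}^{(0)}\right) \geq 0$, so it is preserved when both sides are averaged over $q \left(\mathbf{x}^{(0)}\right)$. Thus a lower bound of the expression in (11) is $$ \int d\mathbf{x}^{(0)} q \left(\mathbf{x}^{(0)}\right) \int d\mathbf{x}^{(1\ldots T)} q \left(\mathbf{x}^{(1\ldots T)}|\mathbf{x}^{(0)} \right) \log h \left( \mathbf{x}^{(1\ldots T)} \right) $$ which is (12) once the two integrals are merged using the product rule $ q \left(\mathbf{x}^{(0)}\right) q \left(\mathbf{x}^{(1\ldots T)}|\mathbf{x}^{(0)} \right) = q \left(\mathbf{x}^{(0\ldots T)}\right) $.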
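For reference, the whole step from (11) to (12) can be written as one chain. This is a sketch under an assumption: I am taking $h$ to be the ratio that appears inside the logarithm of (11), which, as far as I recall the paper's notation, is $ h = p \left(\mathbf{x}^{(T)}\right) \prod_{t=1}^{T} \frac{p \left(\mathbf{x}^{(t-1)}|\mathbf{x}^{(t)}\right)}{q \left(\mathbf{x}^{(t)}|\mathbf{x}^{(t-1)}\right)} $. Please check this against the paper before relying on it.

$$
\begin{aligned}
L &= \int d\mathbf{x}^{(0)} q \left(\mathbf{x}^{(0)}\right) \log \left[ \int d\mathbf{x}^{(1\ldots T)} q \left(\mathbf{x}^{(1\ldots T)}|\mathbf{x}^{(0)} \right) h \right] \\
&\geq \int d\mathbf{x}^{(0)} q \left(\mathbf{x}^{(0)}\right) \int d\mathbf{x}^{(1\ldots T)} q \left(\mathbf{x}^{(1\ldots T)}|\mathbf{x}^{(0)} \right) \log h \\
&= \int d\mathbf{x}^{(0\ldots T)} q \left(\mathbf{x}^{(0\ldots T)}\right) \log h
\end{aligned}
$$

The first line is (11), the inequality is Jensen applied to the inner integral for each fixed $\mathbf{x}^{(0)}$, and the last line uses $ q \left(\mathbf{x}^{(0)}\right) q \left(\mathbf{x}^{(1\ldots T)}|\mathbf{x}^{(0)} \right) = q \left(\mathbf{x}^{(0\ldots T)}\right) $ to obtain the form of (12).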