Factoring integral term on joint expectation


I am reading the paper "Transformers Can Do Bayesian Inference". I am lost in the proof of Insight 1, in which they derive (Equation 3):

$$-\int_{D,x,y}p(x,y,D)\log q_{\theta}(y|x,D) = -\int_{D,x}p(x,D)\int_{y}p(y|x,D)\log q_{\theta}(y|x,D)$$

How do they arrive at the right-hand side? Is this from the law of total expectation, or from some rule for manipulating integrals?
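For reference, here is a sketch of how I would expect the step to go. It assumes that the densities exist and that the integrand is integrable, so the triple integral can be written as an iterated one (Fubini's theorem). Neither assumption is stated in the line quoted above. The first step is the product rule for densities:

$$p(x,y,D) = p(y|x,D)\,p(x,D)$$

Substituting this in, and pulling $p(x,D)$ out of the integral over $y$ because it does not depend on $y$:

$$-\int_{D,x,y}p(x,y,D)\log q_{\theta}(y|x,D) = -\int_{D,x,y}p(x,D)\,p(y|x,D)\log q_{\theta}(y|x,D) = -\int_{D,x}p(x,D)\int_{y}p(y|x,D)\log q_{\theta}(y|x,D)$$

Read this way, it is the law of total expectation written out with integrals: the outer integral averages over $(x,D)$, and the inner integral is the conditional expectation of $\log q_{\theta}(y|x,D)$ given $(x,D)$.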