Question about time-reversal of Markov Chain with Gaussian transitions


Suppose that $X_0$ has distribution $Q$. I can generate a Markov chain starting at $X_0$ by adding i.i.d. Gaussian noise at each step: $$X_t = X_{t-1} + \epsilon_t$$ where $\epsilon_t \sim \mathcal{N}(0, \sigma^2)$. In this way, $$p(x_t \mid x_{t-1}) = \mathcal{N}(x_{t-1}, \sigma^2).$$
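To make the setup concrete, here is a minimal simulation of this forward chain. The choices of $\sigma$, $T$, and $Q$ (a standard normal) are illustrative assumptions, not part of the question:

```python
import numpy as np

rng = np.random.default_rng(0)

sigma = 0.5   # noise scale (assumption)
T = 10        # number of forward steps (assumption)
n = 100_000   # number of chains run in parallel

# X_0 ~ Q; here Q is taken to be N(0, 1) purely for illustration
x = rng.normal(0.0, 1.0, size=n)

for t in range(T):
    # forward step: X_t = X_{t-1} + eps_t, with eps_t ~ N(0, sigma^2)
    x = x + rng.normal(0.0, sigma, size=n)

# since the eps_t are independent of each other and of X_0,
# Var(X_T) = Var(X_0) + T * sigma^2
print(x.var())
```

Because the noise terms are independent, the variance grows linearly in $t$, which the printed sample variance should reflect.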

Isn't it true that $p(x_{t-1} \mid x_t) = \mathcal{N}(x_t, \sigma^2)$? That is, if I have a sample from $p(x_t)$ and I want a sample from $p(x_{t-1})$, can't I just subtract Gaussian noise?
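The proposed inversion can be written as a one-line sampler; whether it actually produces samples from $p(x_{t-1})$ is exactly what is being asked. A sketch of that proposal (`sigma` is an illustrative assumption; since $\mathcal{N}(0, \sigma^2)$ is symmetric, subtracting the noise is the same in distribution as adding it):

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 0.5  # same noise scale as the forward chain (assumption)

def naive_reverse_step(x_t):
    """The proposed reverse step: draw x_{t-1} from N(x_t, sigma^2),
    i.e. 'subtract' fresh Gaussian noise from the current sample."""
    return x_t - rng.normal(0.0, sigma, size=np.shape(x_t))
```

This draws from $\mathcal{N}(x_t, \sigma^2)$ as the question proposes; it encodes the proposal, not a claim that it is correct.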

I'm trying to learn about latent diffusion models. In this context, we think of $X_0$ as coming from some data-generating distribution, and we use a neural network to model $p(x_{t-1} \mid x_t)$. Obviously there would be no need to do this if it's straightforward to calculate $p(x_{t-1} \mid x_t)$, so what am I missing here?

In this paper: https://arxiv.org/pdf/2006.11239.pdf, the conditional distribution is given by: $$p(x_t \mid x_{t-1}) = \mathcal{N}(\sqrt{1 - \beta_t}\,x_{t-1}, \beta_t I)$$ so we would say that $\epsilon_t = X_t - X_{t-1} \sim \mathcal{N}\big((\sqrt{1 - \beta_t} - 1)x_{t-1}, \beta_t I\big)$. In this case, the noise we add at time step $t$ is not independent of $X_{t-1}$.
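One forward step of that transition can be sketched directly from the formula; the chosen `beta_t` value is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(2)

def ddpm_forward_step(x_prev, beta_t):
    """One forward step of the DDPM chain:
    x_t ~ N(sqrt(1 - beta_t) * x_prev, beta_t * I)."""
    eps = rng.normal(0.0, 1.0, size=np.shape(x_prev))
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * eps
```

The shrinkage factor $\sqrt{1 - \beta_t}$ applied to `x_prev` is what makes the effective increment $x_t - x_{t-1}$ depend on $x_{t-1}$, in contrast with the plain additive-noise chain above.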