Consider two Normal distributions, each written alongside its reparameterized form: $$ \begin{align} P(X_1)&=\mathcal N(0,\sigma^2), & X_1&=\sigma\epsilon \\ P(X_1|X_2)&=\mathcal N(f(x_2),\sigma^2), & X_1&=f(x_2)+\sigma\epsilon \end{align} $$ where $\epsilon\sim \mathcal N(0,1)$.
Now, I have another Normal distribution that is conditioned on $X_1$: $$ P(Y|X_1)=\mathcal N(g(x_1),\sigma^2) $$ My question is: which parameterization of $X_1$ do I sample from to compute $Y|X_1$, the first or the second? Do I need further context for $P(Y|X_1)$, or is the given information sufficient?
I am studying denoising diffusion models, and this problem arises in one of their derivations.
Since $Y$ depends only on $X_1$, why would you use $X_2$ to calculate $Y$ indirectly?
It depends on what you expect to know: do you know $X_1$ at all, or are you only given $X_2$? And is $X_2$ fixed, or will it vary between time periods?
If you will only have direct access to $X_2$, then you have to use the second option. If you have $X_1$, then there is no reason not to use $X_1$ directly.
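To make the two cases concrete, here is a minimal NumPy sketch. It assumes $Y$ is independent of $X_2$ given $X_1$, and the functions `f` and `g` below are hypothetical linear stand-ins chosen only for illustration; the question does not specify them. If $x_1$ is known, $Y$ is sampled from it directly. If only $x_2$ is known, $x_1$ is first sampled from $P(X_1|X_2)$ and then passed on to $P(Y|X_1)$.

```python
import numpy as np

def f(x2):
    # Hypothetical mean of X1 given X2 (an assumption, not from the question).
    return 0.5 * x2

def g(x1):
    # Hypothetical mean of Y given X1 (an assumption, not from the question).
    return 2.0 * x1

def sample_y_given_x1(x1, sigma, rng):
    # Y | X1 = x1 ~ N(g(x1), sigma^2), written as g(x1) + sigma * eps.
    eps = rng.standard_normal(np.shape(x1))
    return g(x1) + sigma * eps

def sample_y_given_x2(x2, sigma, rng, n):
    # Only x2 is known: draw n samples of X1 from N(f(x2), sigma^2),
    # then draw one Y for each sampled x1 using fresh, independent noise.
    x1 = f(x2) + sigma * rng.standard_normal(n)
    return sample_y_given_x1(x1, sigma, rng)

rng = np.random.default_rng(0)
y_direct = sample_y_given_x1(1.5, sigma=1.0, rng=rng)           # x1 known
y_via_x2 = sample_y_given_x2(3.0, sigma=1.0, rng=rng, n=10_000)  # only x2 known
```

With these particular linear choices, the samples drawn via $x_2$ have mean $g(f(x_2))$ and variance $4\sigma^2+\sigma^2$, because the noise from both sampling steps accumulates. Note that the two steps use independent draws of $\epsilon$.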