I have a few questions on conditional Gaussian distributions that I'm hoping to get some clarity on.
Let $P_{X \mid Y}(x \mid y) = N(x;\, y,\, 1)$.
- Can we deduce anything about $X$ and $Y$ from this alone, with no further information?
- Is this equivalent to saying $X = Y + \epsilon$, where $\epsilon \sim N(0, 1)$ is independent of $Y$? (See the quick sanity check right after this list.)
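To make the second bullet concrete, here is a small numerical sanity check I put together myself (the particular value $y = 2.5$ and the sample size are arbitrary choices of mine, not from any paper): it draws from $N(y, 1)$ directly and compares against $y + \epsilon$ with $\epsilon \sim N(0, 1)$.

```python
# My own sanity check of the equivalence in the second bullet: for a fixed y,
# draws from N(y, 1) should look the same as y + eps with eps ~ N(0, 1).
import numpy as np

rng = np.random.default_rng(0)
y = 2.5          # condition on a fixed (arbitrary) value of Y
n = 100_000

direct = rng.normal(loc=y, scale=1.0, size=n)          # X | Y = y  ~  N(y, 1)
reparam = y + rng.normal(loc=0.0, scale=1.0, size=n)   # y + eps,  eps ~ N(0, 1)

print(direct.mean(), direct.std())    # ~ 2.5, ~ 1.0
print(reparam.mean(), reparam.std())  # ~ 2.5, ~ 1.0
```

The empirical means and standard deviations agree, but of course that only illustrates the claim for one value of $y$; it doesn't settle the question of whether the two statements are equivalent in general.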
The reason I have these questions is that I'm reading literature that uses a Markov chain over a series of timesteps. It is stated that $\mathbf{x}_t \mid \mathbf{x}_{t-1} \sim N(\mathbf{x}_{t-1}, \mathbb{I})$, and all of the literature describes each timestep as "adding Gaussian noise" to the previous one.
In general, I understand how this can be viewed as adding noise to the previous timestep: the conditional distribution is centered on the value of the previous timestep, so we can expect to stay close to it. What I don't see is how we can guarantee that this noise is Gaussian.
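For concreteness, this is the sampling procedure I have in mind when the papers say "adding Gaussian noise" (the dimension, number of steps, and starting point below are arbitrary assumptions of mine): each step is drawn as the previous one plus a standard normal vector, which is how I read $\mathbf{x}_t \mid \mathbf{x}_{t-1} \sim N(\mathbf{x}_{t-1}, \mathbb{I})$ operationally.

```python
# My own illustration of the "adding Gaussian noise" reading of the chain
# (dimension d, number of steps T, and x_0 are arbitrary assumptions):
# drawing x_t ~ N(x_{t-1}, I) step by step is implemented here as
# x_t = x_{t-1} + eps_t with eps_t ~ N(0, I).
import numpy as np

rng = np.random.default_rng(1)
d, T = 3, 5                  # dimension and number of timesteps (assumed)
x = np.zeros(d)              # x_0 (assumed starting point)

trajectory = [x]
for t in range(1, T + 1):
    eps = rng.normal(size=d) # eps_t ~ N(0, I)
    x = x + eps              # x_t = x_{t-1} + eps_t, i.e. a draw from N(x_{t-1}, I)
    trajectory.append(x)

print(np.stack(trajectory))  # rows are x_0, x_1, ..., x_T
```

My question is essentially whether this "previous value plus Gaussian increment" picture is forced by the conditional distribution alone, or whether it is an extra modelling assumption.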
Does anyone have insight into what I am missing? Or are the papers I am reading not 100% precise? I should note that these papers are not in pure math but in machine learning, so I don't think it would be out of the question that there is some misphrasing.
Thanks in advance!