Suppose a model, $$y = x + \eta$$
In engineering terms, think of $y$ as the observed data, $x$ as the desired (unknown) object, and $\eta$ as the noise; so we have a noisy observation. Suppose $\eta$ is a random variable with density $\mathcal{N}(\eta;0,1)$. When formulating an inverse problem, people generally write the generative model of this observation process as follows. $$p(y|x) = \mathcal{N}(y;x,1)$$ It is intuitive that, if the noise is zero-mean, you can regard $y$ as a Gaussian random variable with mean $x$ when $x$ is treated as a deterministic quantity. However, I want to do this trick rigorously: in the latter probabilistic model there is no mention of $\eta$; we have somehow 'deleted' it from the model.
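(As a quick sanity check of the claim, not a proof: the following sketch samples the model $y = x + \eta$ for an arbitrarily chosen fixed $x$ and verifies that the samples of $y$ behave like draws from $\mathcal{N}(x, 1)$. The value of $x$ and the sample size are my own illustrative choices.)

```python
import numpy as np

rng = np.random.default_rng(0)
x = 2.5                                # the fixed "true" object, chosen arbitrarily
eta = rng.standard_normal(200_000)     # eta ~ N(0, 1)
y = x + eta                            # the observation model y = x + eta

# If p(y|x) = N(y; x, 1), the samples of y should have mean ~ x and variance ~ 1
print(y.mean())  # close to x = 2.5
print(y.var())   # close to 1.0
```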
(To note the relation to inverse problems: $p(y|x)$ is the likelihood, and by placing priors on $x$, people generally estimate features of the posterior distribution via Bayes' theorem. This is called the Bayesian approach to inverse problems; however, all I want here is a rigorous justification of the trick above, measure-theoretic if necessary.)
If somebody points out a reference or gives a proof, I will be very happy. Thanks.
Here is one proof:
Let $y(t;\eta)$ be a random process with respect to the random parameter $\eta$. Assume $\eta$ has a standard Gaussian distribution, and, for the time being, assume only that $y$ admits a mean-square convergent expansion (i.e. $y$ has a finite second moment).
Using the Wiener-Askey polynomial chaos, we can write $\eta$ in terms of a random variable $\zeta$ belonging to a distribution of our choosing:
$$\eta = \sum_{i=0}^\infty \eta_i \Phi_i(\zeta).$$
Similarly, we may do the same for $y$:
$$y = \sum_{i=0}^\infty y_i \Phi_i(\zeta).$$
Choose $\zeta$ to be a standard normal r.v.; then the $\Phi_i(\zeta)$ are the (probabilists') Hermite polynomials. Using the Galerkin method, we project the expansion onto each basis function $\Phi_i$. It is not hard to show that projecting $\eta$ annihilates the coefficients $\eta_i$ for $i > 1$, leaving
$$\eta = \eta_0 + \eta_1\zeta,$$
and it follows naturally that $\eta_0$ is the mean of the distribution and $\eta_1$ is the standard deviation.
Substituting this into the definition of the process, we have
$$\sum_{i=0}^\infty y_i\Phi_i(\zeta) = y\left(t;\sum_{i=0}^1 \eta_i \Phi_i(\zeta)\right) = x+\sum_{i=0}^1 \eta_i \Phi_i(\zeta).$$
Applying the Galerkin method again, we find that $y_0 = x+\eta_0$ and $y_1 = \eta_1$. Since $\eta_1 = 1$, we get $y_1 = 1$; and since $\eta_0 = 0$, $y_0 = x$.
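A numerical sketch of these Galerkin projections, under my own choice of names and of the illustrative value of $x$: using probabilists' Hermite polynomials $He_i$ with Gauss-Hermite quadrature, so that expectations are taken against the standard Gaussian measure of $\zeta$, the computed coefficients of $\eta$ and of $y = x + \eta$ should match the ones derived above.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval

# Quadrature nodes/weights for the weight exp(-z^2/2); normalize the weights
# so that sums against them approximate expectations under N(0, 1).
nodes, weights = hermegauss(30)
weights = weights / np.sqrt(2 * np.pi)

def He(i, z):
    """Evaluate the i-th probabilists' Hermite polynomial He_i at z."""
    c = np.zeros(i + 1)
    c[i] = 1.0
    return hermeval(z, c)

def project(f, i):
    """Galerkin coefficient <f, He_i> / <He_i, He_i> under the Gaussian measure."""
    return np.sum(weights * f(nodes) * He(i, nodes)) / np.sum(weights * He(i, nodes) ** 2)

x = 2.5                    # an arbitrary deterministic x for illustration
eta = lambda z: z          # eta is itself standard normal: eta(zeta) = zeta
y = lambda z: x + eta(z)   # the process y = x + eta

eta_coeffs = [project(eta, i) for i in range(4)]  # ~ [0, 1, 0, 0]: eta_0 = 0, eta_1 = 1
y_coeffs = [project(y, i) for i in range(4)]      # ~ [x, 1, 0, 0]: y_0 = x, y_1 = 1
```

The quadrature is exact for these low-degree polynomials, so the coefficients come out to machine precision.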
Hence, we've effectively removed the dependence on $\eta$, because $\eta$ is zero-mean. The only remaining effect of $\eta$ is on the variance/standard deviation, but since you assume it to be $1$, it just sort of washes out.
Therefore, your conditional probability statement is essentially just telling you to shift your distribution by $x$ to account for the influence of $\eta$.
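That shift can be stated in one line: $\mathcal{N}(y; x, 1)$ is the standard normal density evaluated at $y - x$. A minimal check (the helper `normal_pdf` and the sample values are my own, not from the post):

```python
import math

def normal_pdf(t, mean=0.0):
    """Density of N(mean, 1) at t."""
    return math.exp(-0.5 * (t - mean) ** 2) / math.sqrt(2 * math.pi)

x, y = 2.5, 3.1                  # arbitrary illustrative values
p_cond = normal_pdf(y, mean=x)   # p(y|x) = N(y; x, 1)
p_noise = normal_pdf(y - x)      # density of eta = y - x under N(0, 1)
# p_cond and p_noise agree: conditioning on x just shifts the noise density
```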