I am learning about Linear regression with Gaussian noise, and the author wrote the statement:
$$p(y|\mathbf{x}, \mathbf{\theta})=N(y|\mu(\mathbf{x}), \sigma(\mathbf{x}) )$$
I am a bit confused about the conditional inside the normal distribution (i.e. $y|\mu(\mathbf{x})$). From what I know, the normal distribution is parametrized by $\mu$ and $\sigma$, but here it seems to be parametrized by $y$ given $\mu$? Can someone give me a simple example of what he means here?
The bar inside $N(\cdot)$ is just notation, not a new conditioning: $N(y|\mu(\mathbf{x}), \sigma(\mathbf{x}))$ means "the normal density evaluated at $y$, with parameters $\mu(\mathbf{x})$ and $\sigma(\mathbf{x})$." The point is that $y$ given $x$ can follow a different normal distribution for each $x$, with parameters $\mu(x)$ and $\sigma(x)$ that depend on $x$.
At $x=1$, $y|x=1$ can follow a normal distribution with mean $0$ and standard deviation $1$, but at $x=2$, $y|x=2$ can follow a normal distribution with mean $\pi$ and standard deviation $\sqrt{42}$.
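A small sketch of this idea (the specific mapping from $x$ to $(\mu, \sigma)$ below is just the two hypothetical cases above, not anything from the book): for each value of $x$ we draw $y$ from its own normal distribution, and the sample statistics of $y|x$ recover that distribution's parameters.

```python
import math
import random

def sample_y_given_x(x):
    # Hypothetical lookup table matching the two example cases:
    # y|x=1 ~ N(0, 1), y|x=2 ~ N(pi, sqrt(42))
    params = {1: (0.0, 1.0), 2: (math.pi, math.sqrt(42))}
    mu, sigma = params[x]
    return random.gauss(mu, sigma)

random.seed(0)
samples = [sample_y_given_x(2) for _ in range(100_000)]

# The empirical mean of y|x=2 should be close to pi,
# and its empirical standard deviation close to sqrt(42).
est_mean = sum(samples) / len(samples)
est_std = math.sqrt(sum((s - est_mean) ** 2 for s in samples) / len(samples))
print(est_mean, est_std)
```

Repeating this with `x = 1` would instead give a sample mean near $0$ and a sample standard deviation near $1$; the distribution of $y$ changes as $x$ changes, which is all the notation $p(y|\mathbf{x}, \mathbf{\theta}) = N(y|\mu(\mathbf{x}), \sigma(\mathbf{x}))$ is saying.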