How data goes to the other side in maximum likelihood estimation

I am reading the book Mathematics for Machine Learning and I am a bit confused about maximum likelihood estimation. I understand that the likelihood is the probability of observing certain data $x$ given a set of parameters $\theta$: $$p(x|\theta)$$ So, maximizing it means you have chosen the set of parameters under which the data are most likely to have been measured. But then the book gives an example for linear regression where it assumes a Gaussian likelihood function: $$p(y_n|\mathbf{x}_n, \mathbf{\theta}) = \mathcal{N}(y_n|\mathbf{x}_n^\intercal\mathbf{\theta}, \sigma^2)$$ What I don't understand is how the data $\mathbf{x}_n$ now appears on the other side of the $|$. I think that $\mathbf{x}_n$ is part of the observed data, along with $y_n$, and that $\mathbf{\theta}$ is the set of parameters, which should be on the right side of the conditional probability.
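For concreteness, here is a small numeric sketch of the quantity the book writes down, as I understand it (the function name and the example values are mine, not the book's):

```python
import numpy as np

def gaussian_likelihood(y_n, x_n, theta, sigma=1.0):
    """Evaluate p(y_n | x_n, theta) = N(y_n | x_n^T theta, sigma^2).

    x_n and theta both sit "inside" the mean x_n^T theta, which is
    what confuses me: x_n is data, yet it is conditioned on.
    """
    mean = x_n @ theta  # predicted mean of y_n
    return np.exp(-(y_n - mean) ** 2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

# Illustrative values only
x_n = np.array([1.0, 2.0])
theta = np.array([0.5, -0.25])
y_n = 0.1

print(gaussian_likelihood(y_n, x_n, theta))
```

So numerically the likelihood treats $\mathbf{x}_n$ and $\mathbf{\theta}$ the same way (both are just inputs to the density), which only adds to my confusion about why one is "data" and the other "parameters".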