Every textbook I have read writes down the likelihood function without explaining where it comes from. I am thoroughly confused, so much so that I have resorted to memorizing the different likelihood functions.
For example, to write the likelihood function of linear regression we assume the model:
$y = w'x + \epsilon$, where $\epsilon \sim N(0, 1/\beta)$.
Then, for i.i.d. samples $D_n = \{(x_i, y_i) : i = 1, \dots, n\}$, the likelihood is
$L(w,\beta) = \prod_{i=1}^{n} \sqrt{\beta/2\pi}\, \exp\!\left(-\beta (y_i - w'x_i)^2 / 2\right)$
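To make sure I at least understand what this formula computes, I wrote a small numerical sketch (the data, the true weights, and the precision $\beta$ below are all my own made-up assumptions, just to make the formula concrete):

```python
import numpy as np

# Toy data: my own assumptions, only to make L(w, beta) concrete
rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=(n, 2))
w_true = np.array([1.5, -0.5])
beta = 4.0                                  # noise precision, Var(eps) = 1/beta
y = x @ w_true + rng.normal(scale=np.sqrt(1 / beta), size=n)

def log_likelihood(w, beta, x, y):
    """log of L(w, beta) = prod_i sqrt(beta/2pi) * exp(-beta (y_i - w'x_i)^2 / 2)."""
    resid = y - x @ w
    return 0.5 * resid.size * np.log(beta / (2 * np.pi)) - 0.5 * beta * np.sum(resid**2)

# Sanity check: the true weights should score higher than a clearly wrong guess
print(log_likelihood(w_true, beta, x, y) > log_likelihood(np.zeros(2), beta, x, y))
```

This at least confirms to me that the likelihood measures how well parameters explain the observed data, even if I don't see where the formula itself comes from.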
And for logistic regression:
$L(p_0, \mu_0, \mu_1, \Sigma) = \prod_{i=1}^{n} \left(p_0 f_0(x_i)\right)^{1-y_i} \left((1-p_0) f_1(x_i)\right)^{y_i}$
where $D_n = \{(x_i, y_i) : i = 1, \dots, n\}$ are i.i.d. samples with $y_i \in \{0, 1\}$, $f_0$ is the density of $N(\mu_0, \Sigma)$, and $f_1$ is the density of $N(\mu_1, \Sigma)$.
Here, $f_0 = p(x_i \mid y_i = 0)$ and $f_1 = p(x_i \mid y_i = 1)$. Why does a prior ($p_0$) suddenly appear here? Why didn't we consider one in the linear regression case? Also, how is the likelihood for logistic regression derived? Is there a general way of obtaining the likelihood function for any model? I am a beginner in data science, so please keep that in mind when answering.
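Again, to check my own understanding, I tried to compute this likelihood numerically. All the parameters below ($p_0$, $\mu_0$, $\mu_1$, $\Sigma$) and the simulated data are my own made-up assumptions, and I matched the exponents to the definition $f_0 = p(x_i \mid y_i = 0)$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed toy parameters, only to make the formula concrete
p0 = 0.4                                  # P(y = 0)
mu0, mu1 = np.array([0.0, 0.0]), np.array([2.0, 2.0])
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])

# Simulate D_n = {(x_i, y_i)}: labels first, then x_i from the class density
n = 200
y = (rng.random(n) > p0).astype(int)      # y_i = 1 with probability 1 - p0
L_chol = np.linalg.cholesky(Sigma)
x = np.where(y[:, None] == 0, mu0, mu1) + rng.normal(size=(n, 2)) @ L_chol.T

def gauss_pdf(x, mu, Sigma):
    """Density of N(mu, Sigma) evaluated at each row of x."""
    d = x - mu
    quad = np.einsum('ij,jk,ik->i', d, np.linalg.inv(Sigma), d)
    norm = 1.0 / np.sqrt((2 * np.pi) ** x.shape[1] * np.linalg.det(Sigma))
    return norm * np.exp(-0.5 * quad)

def log_likelihood(p0, mu0, mu1, Sigma, x, y):
    """log prod_i (p0 f0(x_i))^{1 - y_i} * ((1 - p0) f1(x_i))^{y_i}."""
    f0 = gauss_pdf(x, mu0, Sigma)
    f1 = gauss_pdf(x, mu1, Sigma)
    return np.sum((1 - y) * np.log(p0 * f0) + y * np.log((1 - p0) * f1))

# Sanity check: the true parameters should beat the same model with swapped means
print(log_likelihood(p0, mu0, mu1, Sigma, x, y) >
      log_likelihood(p0, mu1, mu0, Sigma, x, y))
```

So mechanically I can compute both likelihoods, but I still don't see the recipe that takes me from "the model" to "the likelihood" in each case.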
I have gone through several blogs before posting here, but in every one of them, after defining the model, the author somehow magically writes down the expression for the likelihood function and starts maximizing it (maximum likelihood estimation). Can anyone please uncover the steps between defining a model and obtaining its likelihood function?