In Murphy's book (pp. 242-243), the reader is asked to describe the advantages and disadvantages of the generative model for linear regression, compared with the standard discriminative approach.
I'm looking for a complete answer to this, since it is not clear to me what should be said.
What I have so far:
Disadvantages:
It is an indirect method, since we have to estimate the joint distribution of $(Y, \mathbf{X})$ before finding the optimal predictor;
There are many more parameters to estimate: $\frac{(p+1)(p+2)}{2}$ in the covariance matrix $\Sigma$ of the joint distribution, plus $p+1$ for the means of $\mathbf{X}$ and $Y$.
Advantages:
Better accuracy for small training sets;
One can use the joint distribution to generate new data similar to the existing data.
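To make the two routes concrete, here is a minimal sketch of how I understand them, assuming a joint Gaussian over $(\mathbf{X}, Y)$ fitted by plain maximum likelihood. The synthetic data and all variable names (`w_gen`, `w_ols`, and so on) are my own and not from the book.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (hypothetical, for illustration only): y = w0 + w^T x + noise
n, p = 500, 3
w_true = np.array([1.5, -2.0, 0.5])
w0_true = 0.7
X = rng.normal(size=(n, p))
y = w0_true + X @ w_true + 0.1 * rng.normal(size=n)

# Generative route: estimate the mean and covariance of the joint vector (x, y).
Z = np.column_stack([X, y])
mu = Z.mean(axis=0)
Sigma = np.cov(Z, rowvar=False, bias=True)  # MLE covariance (divides by n)

# Condition the joint Gaussian on x to get the predictor E[y | x] = w0 + w^T x.
Sigma_xx = Sigma[:p, :p]
Sigma_xy = Sigma[:p, p]
w_gen = np.linalg.solve(Sigma_xx, Sigma_xy)
w0_gen = mu[p] - w_gen @ mu[:p]

# Discriminative route: ordinary least squares on the design matrix [1, x].
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
w0_ols, w_ols = coef[0], coef[1:]

# The fitted joint model can also be used to generate new (x, y) pairs.
Z_new = rng.multivariate_normal(mu, Sigma, size=5)

print("generative:    ", w0_gen, w_gen)
print("discriminative:", w0_ols, w_ols)
```

With the unrestricted maximum likelihood estimates used here, the two sets of weights coincide numerically; only the generative route gives a model that can be sampled from.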
There is also one point about the disadvantages that I don't fully understand: it is said that the number of parameters actually needed is $p+1$. I think this refers to $\mathbf{w}$ and $w_0$, but I'm not sure.
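Here is a small sketch of the counts, assuming the $p+1$ does refer to $\mathbf{w}$ and $w_0$. The function names are my own, and the noise variance is left out of the discriminative count.

```python
def generative_param_count(p: int) -> int:
    """Free parameters of a joint Gaussian over (x, y) with x in R^p."""
    d = p + 1                      # dimension of the joint vector (x, y)
    cov_params = d * (d + 1) // 2  # symmetric covariance: (p+1)(p+2)/2
    mean_params = d                # mean of (x, y): p+1
    return cov_params + mean_params


def discriminative_param_count(p: int) -> int:
    """Weights w (p entries) plus the offset w_0; noise variance not counted."""
    return p + 1


for p in (1, 10, 100):
    print(p, generative_param_count(p), discriminative_param_count(p))
```

The generative count grows quadratically in $p$, while the discriminative count grows linearly.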
Thanks in advance.