I just learned about Naive Bayes, whose model has the form
$$ P(y, x_1, \dotsc, x_n) = p(y) \prod_{i=1}^n p(x_i \mid y). $$
The lecture says:
> Each factor $ p(x_i \mid y) $ can be completely described by a small number of parameters (4 parameters with 2 degrees of freedom to be exact). The entire distribution is parametrized by O(n) parameters, which we can tractably estimate from data and make predictions.
My questions are:
- What are the model parameters in Naive Bayes learning?
- How should I understand the statement that $ p(x_i \mid y) $ can be completely described by a small number of parameters (4 parameters with 2 degrees of freedom, to be exact)?
Thanks.
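
For concreteness, here is a minimal sketch of one way to read the quoted count. It assumes that $y$ and every $x_i$ are binary, which the excerpt does not state explicitly. Under that assumption, $ p(x_i \mid y) $ is a 2×2 table with 4 entries, and each row (one per value of $y$) must sum to 1, which leaves 2 free numbers. The function names and the example probabilities are hypothetical and chosen only for illustration.

```python
def cpt_counts(num_x_values=2, num_y_values=2):
    """Count the entries and free parameters of a table for p(x_i | y).

    Assumes discrete x_i and y (binary by default; this is an assumption).
    """
    entries = num_x_values * num_y_values
    # Each row (one per value of y) sums to 1, removing one degree of freedom.
    free = (num_x_values - 1) * num_y_values
    return entries, free


def naive_bayes_free_params(n, num_x_values=2, num_y_values=2):
    """Free parameters of p(y) * prod_i p(x_i | y) with n features."""
    prior_free = num_y_values - 1  # p(y) must also sum to 1
    _, per_factor_free = cpt_counts(num_x_values, num_y_values)
    return prior_free + n * per_factor_free


# A made-up table for one factor: rows are y = 0, 1; columns are x_i = 0, 1.
example_cpt = [
    [0.75, 0.25],  # p(x_i = 0 | y = 0), p(x_i = 1 | y = 0)
    [0.50, 0.50],  # p(x_i = 0 | y = 1), p(x_i = 1 | y = 1)
]

entries, free = cpt_counts()
total_free = naive_bayes_free_params(n=10)
```

Under these assumptions, the total number of free parameters is $1 + 2n$, which grows linearly in $n$ and matches the $O(n)$ claim in the quoted passage.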