Gaussian Naive Bayes Classifier - Formulation of Weights


I'm trying to understand this lecture note on how the Gaussian naive Bayes classifier is derived: ohio state CSE 788.04.

I'm not sure how the weights are derived in equations 17, 18, and 19.

$$P(Y=1|X)=\frac{1}{1+\exp\left(\ln\frac{1-\pi}{\pi}+\sum_i\left(\frac{\mu_{i0}-\mu_{i1}}{\sigma^2_i}X_i+\frac{\mu^2_{i1}-\mu^2_{i0}}{2\sigma^2_i}\right)\right)}$$

Somehow this is equivalent to:

$$P(Y=1|X)=\frac{1}{1+\exp\left(w_0+\sum^n_{i=1}w_iX_i\right)}$$

Where:

$$w_i=\frac{\mu_{i0}-\mu_{i1}}{\sigma^2_i}$$

$$w_0=ln\frac{1-\pi}{\pi}+\sum_i\frac{\mu^2_{i1}-\mu^2_{i0}}{2\sigma^2_i}$$

I'm not sure why the weights take this form. For each non-intercept weight $w_i$, the weight is the difference between the mean of the Gaussian distribution of $X_i$ when $Y=0$ and its mean when $Y=1$, divided by the variance of $X_i$. I'm not sure I quite understand that?


Best answer:

Here, his intention is not to derive the weights in any specific form.

His only objective is to show that the logistic model (a discriminative model) can be obtained from naive Bayes, a generative model. Because he has assumed normality for the $X_{i}$'s, the class-conditional distribution of each $X_{i}$ is a normal distribution. Plugging these densities into Bayes' rule (equation 5) and simplifying, he shows that the logistic function can be obtained from naive Bayes. The weights on the variables $X_{i}$ come out in the stated form purely as a matter of algebra; they have no special statistical significance.
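Concretely, the step being referred to can be sketched as follows (with $\pi = P(Y=1)$ and shared variances $\sigma_i^2$ across classes, as in the lecture). Bayes' rule with the naive independence assumption gives

$$P(Y=1|X)=\frac{\pi\prod_i \mathcal{N}(X_i;\mu_{i1},\sigma_i^2)}{\pi\prod_i \mathcal{N}(X_i;\mu_{i1},\sigma_i^2)+(1-\pi)\prod_i \mathcal{N}(X_i;\mu_{i0},\sigma_i^2)}=\frac{1}{1+\frac{1-\pi}{\pi}\prod_i\frac{\mathcal{N}(X_i;\mu_{i0},\sigma_i^2)}{\mathcal{N}(X_i;\mu_{i1},\sigma_i^2)}},$$

and expanding each Gaussian ratio inside an exponential,

$$\frac{\mathcal{N}(X_i;\mu_{i0},\sigma_i^2)}{\mathcal{N}(X_i;\mu_{i1},\sigma_i^2)}=\exp\left(\frac{\mu_{i0}-\mu_{i1}}{\sigma_i^2}X_i+\frac{\mu_{i1}^2-\mu_{i0}^2}{2\sigma_i^2}\right),$$

yields exactly the argument of the exponential in the question.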

He clearly states, towards the end of that section (page 3), the reason for preferring the discriminative model over the generative one.


The argument in the exponential function is

\begin{equation} \ln \left( \frac{1-\pi}{\pi} \right)+\sum_{i=1}^{n} \left( \frac{\mu_{i0}-\mu_{i1}}{\sigma_i^2}X_i + \frac{\mu_{i1}^2 - \mu_{i0}^2}{2\sigma_i^2} \right). \end{equation}

Note that the summation is over both fractions, since there are terms in both that depend on $i$. The above can then be written as

\begin{equation} \ln \left( \frac{1-\pi}{\pi} \right)+\sum_{i=1}^{n} \frac{\mu_{i0}-\mu_{i1}}{\sigma_i^2}X_i + \sum_{i=1}^{n} \frac{\mu_{i1}^2 - \mu_{i0}^2}{2\sigma_i^2}. \end{equation}

Rearranging, we get \begin{equation} \underbrace{\ln \left( \frac{1-\pi}{\pi} \right)+ \sum_{i=1}^{n} \frac{\mu_{i1}^2 - \mu_{i0}^2}{2\sigma_i^2}}_{w_0} + \sum_{i=1}^{n} \underbrace{\frac{\mu_{i0}-\mu_{i1}}{\sigma_i^2}}_{w_i}X_i, \end{equation}

which gives the expressions for the weights.
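As a sanity check, the identity can also be verified numerically. The sketch below (the toy parameter values are my own assumptions, not from the lecture) computes the posterior $P(Y=1|X)$ once directly via Bayes' rule with Gaussian class-conditionals, and once via the logistic form with the weights $w_0, w_i$ as defined above; the two agree to machine precision.

```python
import math

# Hypothetical toy parameters (values chosen for illustration only)
pi = 0.3                      # P(Y = 1)
mu0 = [1.0, -2.0, 0.5]        # means of X_i given Y = 0
mu1 = [0.2, 1.5, -1.0]        # means of X_i given Y = 1
sigma2 = [0.8, 1.2, 2.0]      # shared variances sigma_i^2
x = [0.4, -0.3, 1.1]          # a sample point

def gaussian(v, mu, s2):
    """Density of N(mu, s2) at v."""
    return math.exp(-(v - mu) ** 2 / (2 * s2)) / math.sqrt(2 * math.pi * s2)

# Posterior P(Y=1 | X=x) directly from Bayes' rule with the naive assumption
num = pi * math.prod(gaussian(v, m, s) for v, m, s in zip(x, mu1, sigma2))
den = num + (1 - pi) * math.prod(gaussian(v, m, s) for v, m, s in zip(x, mu0, sigma2))
posterior_bayes = num / den

# Posterior from the logistic form with the weights of equations 17-19
w0 = math.log((1 - pi) / pi) + sum(
    (m1 ** 2 - m0 ** 2) / (2 * s) for m0, m1, s in zip(mu0, mu1, sigma2)
)
w = [(m0 - m1) / s for m0, m1, s in zip(mu0, mu1, sigma2)]
posterior_logistic = 1.0 / (1.0 + math.exp(w0 + sum(wi * xi for wi, xi in zip(w, x))))

assert abs(posterior_bayes - posterior_logistic) < 1e-12
```

The agreement holds for any choice of parameters, which is the point of the answer: the weights are just the coefficients that fall out of collecting the $X_i$-dependent and constant terms of the exponent.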