Gaussian Naive Bayes Classifier - Need help understanding the formulation


I'm reading these lecture notes on the Gaussian naive Bayes classifier: Ohio State CSE 788.04

In Section 1.1, the formulation of the Gaussian naive Bayes classifier, he says that $X = \langle X_1, \dots, X_n \rangle$, where each $X_i$ is a continuous random variable.

But in equation $5$:

$$P(Y=1|X)=\frac{P(Y=1)P(X|Y=1)}{P(Y=1)P(X|Y=1)+P(Y=0)P(X|Y=0)}$$

Why wasn't the chain rule of probability applied, since $X$ is now a vector of conditions instead of a single condition? (Question 1)

Then again in equation $7$:

$$P(Y=1|X)=\frac{1}{1+\exp\left(\ln\frac{P(Y=0)P(X|Y=0)}{P(Y=1)P(X|Y=1)}\right)}$$

becomes equation $8$:

$$P(Y=1|X)=\frac{1}{1+\exp\left(\ln\frac{P(Y=0)}{P(Y=1)}+\sum_{i=1}^n \ln\frac{P(X_i|Y=0)}{P(X_i|Y=1)}\right)}$$

From $7$ to $8$, it looks like he used the properties of the natural log to split the ratio into two terms (Question 2):

$$\ln\left(\frac{P(Y=0)P(X|Y=0)}{P(Y=1)P(X|Y=1)}\right)$$

$$=\ln\left(\frac{P(Y=0)}{P(Y=1)}\right)+\ln\left(\frac{P(X|Y=0)}{P(X|Y=1)}\right)$$

But I'm not sure where the summation came from.

Can anyone point me to the methods and techniques used for Questions 1 and 2?

BEST ANSWER

In Equation 7, consider the numerator in the argument of $\ln$.

$$P(X|Y=0) = P(X_1, X_2, \dots, X_n | Y=0) = \prod_{i=1}^n P(X_i|Y=0),$$

where the last equality holds because $X_j$ and $X_k$ are independent (conditioned on $Y = 0$) for all $j \neq k$. This conditional independence is the defining assumption of the naive Bayes model, and it also answers Question 1: the chain rule is not needed because the joint class-conditional density factors directly into a product of per-feature densities.
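As a quick numeric illustration of this factorization, here is a sketch in Python. All parameters below (the per-feature means `mu0`, standard deviations `sigma0`, and the observation `x`) are made-up toy values, not anything from the lecture notes: under the naive Bayes assumption, the class-conditional likelihood of the whole vector is just the product of per-feature Gaussian densities.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a univariate Gaussian N(mu, sigma^2) evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical per-feature parameters for class Y=0 (toy values for illustration).
mu0 = [0.0, 1.0, -1.0]
sigma0 = [1.0, 0.5, 2.0]

# One observation of X = <X_1, X_2, X_3>.
x = [0.2, 1.1, -0.5]

# Naive Bayes: P(X|Y=0) = product over i of P(X_i|Y=0).
likelihood_y0 = 1.0
for xi, m, s in zip(x, mu0, sigma0):
    likelihood_y0 *= gaussian_pdf(xi, m, s)
```

The same loop with the class-1 parameters gives $P(X|Y=1)$; the two products then plug straight into Equation 5.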

The same can be said of the denominator:

$$P(X|Y=1) = \prod_{i=1}^n P(X_i|Y=1).$$

From these two equations, the summation in Equation 8 arises from

\begin{align} \ln \left( \frac{P(X|Y=0) \cdot P(Y=0)}{P(X|Y=1) \cdot P(Y=1)} \right) &= \ln \left( \frac{\prod_{i=1}^n P(X_i|Y=0) \cdot P(Y=0)}{\prod_{i=1}^n P(X_i|Y=1) \cdot P(Y=1)} \right) \\ &= \sum_{i=1}^n \ln \left( \frac{P(X_i|Y=0)}{P(X_i|Y=1)} \right) + \ln \left( \frac{P(Y=0)}{P(Y=1)} \right). \end{align}
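To check that the whole derivation is consistent, a short numeric sanity check is possible: compute the posterior once via Bayes' rule (Equation 5) and once via the sigmoid of the log-odds sum (Equation 8), and confirm they agree. All parameters below (priors, per-feature means, and standard deviations) are made-up toy values.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a univariate Gaussian N(mu, sigma^2) evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical toy model: class prior and per-feature Gaussians for each class.
prior1 = 0.3
prior0 = 1.0 - prior1
mu = {0: [0.0, 0.0], 1: [1.0, 2.0]}
sigma = {0: [1.0, 1.0], 1: [1.0, 1.5]}

x = [0.7, 1.2]  # one observation

def likelihood(y):
    """P(X|Y=y) under the naive Bayes factorization."""
    p = 1.0
    for xi, m, s in zip(x, mu[y], sigma[y]):
        p *= gaussian_pdf(xi, m, s)
    return p

# Equation 5: direct Bayes rule.
posterior_direct = prior1 * likelihood(1) / (
    prior1 * likelihood(1) + prior0 * likelihood(0)
)

# Equation 8: log prior ratio plus sum of per-feature log-likelihood ratios.
log_odds = math.log(prior0 / prior1) + sum(
    math.log(gaussian_pdf(xi, m0, s0) / gaussian_pdf(xi, m1, s1))
    for xi, m0, s0, m1, s1 in zip(x, mu[0], sigma[0], mu[1], sigma[1])
)
posterior_sum = 1.0 / (1.0 + math.exp(log_odds))

# The two routes agree up to floating-point rounding.
assert abs(posterior_direct - posterior_sum) < 1e-12
```

The agreement is exact algebraically; the tolerance only absorbs floating-point rounding.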