Naive Bayes: log-odds derivation


How does one go from line 17 to line 18 in the picture below, i.e., the conversion to a linear function of the input variable?

[Image: lines 17–18 of the Naive Bayes derivation from the linked notes]

Source: http://pages.cs.wisc.edu/~jerryzhu/cs769/nb.pdf


The question was: Why is the following true? $$ \log p(y = 1 | x) - \log p(y = 0 | x) = (\log \theta_1 - \log \theta_0)^T x + (\log p(y = 1) - \log p(y = 0)) $$


From the problem statement, we are given the distribution of the input $x$ given the class label $y$: $$ p (x | y) = \textrm{const}\cdot\prod_{w = 1}^{v} \theta_{y_w}^{c_w}. $$ Also from the problem statement, $$ x = \begin{bmatrix} c_1 & c_2 & \cdots & c_v\end{bmatrix}^T. $$ Notice that $x,\theta_y \in \mathbb{R}^v$. $$ \begin{aligned} \log p(x | y) & = \log \textrm{const} + \log \prod_{w = 1}^{v} \theta_{y_w}^{c_w} \\ & = \log \textrm{const} + \sum_{w = 1}^{v} \log \theta_{y_w}^{c_w} \\ & = \log \textrm{const} + \sum_{w = 1}^{v} c_w \log \theta_{y_w} \\ & = \log \textrm{const} + \begin{bmatrix} c_1 & c_2 & \cdots & c_v \end{bmatrix}\begin{bmatrix} \log \theta_{y_1} \\ \log \theta_{y_2} \\ \vdots \\ \log \theta_{y_v}\end{bmatrix} \\ & = \log \textrm{const} + x^T \log \theta_y \\ & = \log \textrm{const} + [\log \theta_y]^T x \end{aligned} $$ Here the logarithm of a vector is taken elementwise: $$ \log \left(\begin{bmatrix} \theta_{y_1} \\ \theta_{y_2} \\ \vdots \\ \theta_{y_v}\end{bmatrix} \right) = \begin{bmatrix} \log \theta_{y_1} \\ \log \theta_{y_2} \\ \vdots \\ \log\theta_{y_v}\end{bmatrix} $$ Then, $$ \log p(x | y = i) = \log \textrm{const} + [\log \theta_i]^T x. $$ My understanding is that when $y$ takes the value $0$ or $1$, the parameter vector $\theta_y$ is the corresponding pre-determined, known vector $\theta_0$ or $\theta_1$.
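The key identity above, $\log \prod_w \theta_{y_w}^{c_w} = [\log \theta_y]^T x$, is easy to check numerically. A minimal sketch, with a made-up probability vector $\theta_y$ and count vector $x$ (both values are assumptions, not from the source):

```python
import numpy as np

# Assumed toy word-probability vector theta_y (sums to 1) and count vector x.
theta_y = np.array([0.5, 0.3, 0.2])
x = np.array([2.0, 1.0, 3.0])  # word counts c_1, ..., c_v

# Direct evaluation: log of the product of theta_{y,w}^{c_w}
direct = np.log(np.prod(theta_y ** x))

# Linear-in-x form: [log theta_y]^T x
linear = np.log(theta_y) @ x

assert np.isclose(direct, linear)
```

Working in the log domain like this also avoids the underflow that the raw product would suffer for realistic document lengths.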

Now, going back to the first line of the question, $$ \begin{aligned} \log p(y = 1 | x) - \log p(y = 0 | x) & = \log \frac{p(y = 1 | x)}{p(y = 0 | x)} \\ & = \log \left(\frac{\frac{p(y = 1 , x)}{p(x)}} {\frac{p(y = 0 , x)}{p(x)}}\right)\\ & = \log \frac{p(y = 1 , x)}{p(y = 0 , x)} \\ & = \log \frac{p(x | y = 1)\cdot p(y = 1)}{p(x | y = 0)\cdot p(y = 0)} \\ & = \log \frac{p(x | y = 1)}{p(x | y = 0)} + \log \frac{p(y = 1)}{p(y = 0)} \\ & = \log p(x | y = 1) - \log p(x | y = 0) + \log p(y = 1) - \log p(y = 0) \\ & = \log \textrm{const} + [\log \theta_1]^T x - \log \textrm{const} - [\log \theta_0]^T x + \log p(y = 1) - \log p(y = 0) \\ & = ([\log \theta_1]^T - [\log \theta_0]^T)x + \log p(y = 1) - \log p(y = 0) \\ & = (\log \theta_1 - \log \theta_0)^T x + \log p(y = 1) - \log p(y = 0) \end{aligned} $$
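The whole chain, from the posterior log odds down to the linear form, can also be verified numerically. A minimal sketch with assumed toy parameters (per-class word probabilities and class priors are made up; the shared constant cancels, so the joint is computed up to that constant):

```python
import numpy as np

# Assumed toy parameters: per-class word probabilities and class priors.
theta = {0: np.array([0.6, 0.3, 0.1]),
         1: np.array([0.2, 0.3, 0.5])}
prior = {0: 0.4, 1: 0.6}
x = np.array([1.0, 2.0, 4.0])  # word counts

# log p(x, y) up to the shared multinomial constant (it cancels in the odds)
def log_joint(y):
    return np.log(theta[y]) @ x + np.log(prior[y])

# Posterior via normalisation, then the left-hand side: log-odds of y=1 vs y=0
joints = np.exp([log_joint(0), log_joint(1)])
posterior = joints / joints.sum()
lhs = np.log(posterior[1]) - np.log(posterior[0])

# Right-hand side: the linear function of x from line 18
rhs = (np.log(theta[1]) - np.log(theta[0])) @ x \
      + np.log(prior[1]) - np.log(prior[0])

assert np.isclose(lhs, rhs)
```

This is exactly why Naive Bayes with multinomial likelihoods yields a linear decision boundary in $x$: the log-odds is an affine function of the count vector.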