So for the single valued model we have:
$p(C_1|\textbf{x}) = \frac{p(\textbf{x}|C_1)p(C_1)}{p(\textbf{x}|C_1)p(C_1)+p(\textbf{x}|C_2)p(C_2)}$
If we rearrange the terms, we can write this as a sigmoid function:
$p(C_1|\textbf{x}) = \sigma(a) = \frac{1}{1+\exp(-a)}$ ...(4.57)
where $a=\ln \frac{p(\textbf{x}|C_1)p(C_1)}{p(\textbf{x}|C_2)p(C_2)}$ ...(4.58)
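Spelling out the rearrangement (this is the step between the two formulas): divide the numerator and denominator of Bayes' theorem by the numerator $p(\textbf{x}|C_1)p(C_1)$:
$$p(C_1|\textbf{x}) = \frac{1}{1+\frac{p(\textbf{x}|C_2)p(C_2)}{p(\textbf{x}|C_1)p(C_1)}} = \frac{1}{1+\exp(-a)},$$
since by 4.58, $\exp(-a) = \frac{p(\textbf{x}|C_2)p(C_2)}{p(\textbf{x}|C_1)p(C_1)}$.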
Then we moved on to the continuous input case, where we assumed the class-conditional densities $p(\textbf{x}|C_k)$ were Gaussian:
$p(\textbf{x}|C_k) = \frac{1}{(2\pi)^{D/2}}\frac{1}{|\Sigma|^{1/2}}\exp\left(-\frac{1}{2}(\textbf{x}-\mu_k)^T\Sigma^{-1}(\textbf{x}-\mu_k)\right)$ ...(4.64)
Then I became confused when the text said that, using 4.57 and 4.58, we have $p(C_1|\textbf{x}) = \sigma(\textbf{w}^T\textbf{x}+w_0)$
where:
$\textbf{w} = \Sigma ^{-1}(\mu_1-\mu_2)$
$w_0=-\frac{1}{2}\mu_1^T\Sigma^{-1}\mu_1+\frac{1}{2}\mu_2^T\Sigma^{-1}\mu_2+\ln\frac{p(C_1)}{p(C_2)}$
Is it saying that if I plug everything into the sigmoid, I will recover $p(C_1|\textbf{x}) = \frac{p(\textbf{x}|C_1)p(C_1)}{p(\textbf{x}|C_1)p(C_1)+p(\textbf{x}|C_2)p(C_2)}$,
where $p(\textbf{x}|C_1)$ and $p(\textbf{x}|C_2)$ are the normal distributions as in 4.64? Why can't we just use Bayes' theorem as it is? Why do we have to create a sigmoid function seemingly out of nowhere?
"Can't we just use the Bayes theorem as it is?"
Yes we can. And yes, Bayes' theorem does indeed reproduce the formula you quoted:
\begin{align} p(C_1|\mathbf x) & = \frac{p(\mathbf x | C_1) p(C_1)}{p(\mathbf x | C_1) p(C_1) + p(\mathbf x|C_2) p(C_2)} \\ & = \frac{p(C_1)\exp\left( - \tfrac 1 2 (\mathbf x - \mu_1)^T\Sigma^{-1}(\mathbf x - \mu_1) \right)}{p(C_1) \exp\left( - \tfrac 1 2 (\mathbf x - \mu_1)^T\Sigma^{-1}(\mathbf x - \mu_1) \right)+ p(C_2) \exp\left( - \tfrac 1 2 (\mathbf x - \mu_2)^T\Sigma^{-1}(\mathbf x - \mu_2) \right)} \\ & = \frac{\exp\left( ( \mu_1 - \mu_2)^T\Sigma^{-1} \mathbf x - \tfrac 1 2 \mu_1^T\Sigma^{-1} \mu_1+ \tfrac 1 2 \mu_2^T\Sigma^{-1} \mu_2 + \ln \tfrac {p(C_1)}{p(C_2)} \right)}{\exp\left( ( \mu_1 - \mu_2)^T\Sigma^{-1} \mathbf x - \tfrac 1 2 \mu_1^T\Sigma^{-1} \mu_1+ \tfrac 1 2 \mu_2^T\Sigma^{-1} \mu_2 + \ln \tfrac {p(C_1)}{p(C_2)} \right) + 1} \\ & = \sigma(\mathbf w^T \mathbf x + w_0) \end{align} [NB in the second line, I omitted the normalization factors $\frac{1}{(2\pi)^{D/2}|\Sigma|^{1/2}}$ in the numerator and denominator, since they cancel out. In the third line, I divided the numerator and denominator by $p(C_2)\exp\left( - \tfrac 1 2 (\mathbf x - \mu_2)^T\Sigma^{-1}(\mathbf x - \mu_2) \right)$ and expanded the quadratic forms; the $\mathbf x^T \Sigma^{-1} \mathbf x$ terms cancel because both classes share the same $\Sigma$.]
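If you want to convince yourself numerically that the two expressions agree, here is a small sketch (the means, covariance, and priors are made-up illustrative values, not from the text) that evaluates the posterior both ways:

```python
import numpy as np

# Hypothetical 2-D example: two Gaussian classes sharing one covariance Sigma.
mu1 = np.array([1.0, 0.0])
mu2 = np.array([-1.0, 1.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
p1, p2 = 0.6, 0.4          # priors p(C1), p(C2)

Sinv = np.linalg.inv(Sigma)

def gauss(x, mu):
    """Multivariate normal density (4.64) with shared covariance Sigma."""
    d = x - mu
    norm = 1.0 / ((2 * np.pi) ** (len(x) / 2) * np.sqrt(np.linalg.det(Sigma)))
    return norm * np.exp(-0.5 * d @ Sinv @ d)

def posterior_bayes(x):
    """p(C1|x) straight from Bayes' theorem."""
    n1, n2 = p1 * gauss(x, mu1), p2 * gauss(x, mu2)
    return n1 / (n1 + n2)

# The linear-sigmoid form, with w and w0 defined as above.
w = Sinv @ (mu1 - mu2)
w0 = -0.5 * mu1 @ Sinv @ mu1 + 0.5 * mu2 @ Sinv @ mu2 + np.log(p1 / p2)

def posterior_sigmoid(x):
    return 1.0 / (1.0 + np.exp(-(w @ x + w0)))

x = np.array([0.3, -0.7])
# The two computations agree up to floating-point rounding.
print(posterior_bayes(x), posterior_sigmoid(x))
```

The agreement holds for any choice of means, shared covariance, and priors, which is exactly what the algebra above shows.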
"What is the benefit in writing it in this way?"
Writing the result in this way makes it clear that the decision as to which class $\mathbf x$ is most likely to belong to is given in terms of a linear function of $\mathbf x$:
$$ p(C_1|\mathbf x ) > \tfrac 1 2 \ \iff \ \sigma(\mathbf w^T \mathbf x + w_0) > \tfrac 1 2 \ \iff \ \mathbf w^T \mathbf x + w_0 > 0.$$
In other words, the decision boundary is the hyperplane $\mathbf w^T \mathbf x + w_0 = 0$. (The boundary is linear in $\mathbf x$ precisely because the two classes share the same covariance $\Sigma$, which makes the quadratic terms cancel.)
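The equivalence of the probability threshold and the sign test can be checked directly; a minimal sketch with made-up values for $\mathbf w$ and $w_0$ (not taken from the text):

```python
import math

# Illustrative 2-D weight vector and bias, chosen arbitrarily.
w = [1.5, -0.8]
w0 = 0.2

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def activation(x):
    """a = w^T x + w0, the argument of the sigmoid."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w0

# Because sigma is monotone with sigma(0) = 1/2, the two decision
# criteria agree: sigmoid(a) > 1/2 if and only if a > 0.
x = [0.5, 1.0]
a = activation(x)
assert (sigmoid(a) > 0.5) == (a > 0)
```

This is why one can classify using only the sign of the linear function, without ever evaluating the sigmoid.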