While deriving the naive Bayes classifier formula, specifically at the step $$p(C | F_1,...,F_N) = \frac{\Big(\prod_{i=1}^{N}p(F_i|C)\Big) p(C)}{p(F_1,...,F_N)},$$ an idea came to me: what if $F_1,...,F_N$ are pairwise independent events? Then $$p(C | F_1,...,F_N) = \frac{\Big(\prod_{i=1}^{N}p(F_i|C)\Big) p(C)}{\prod_{i=1}^{N}p(F_i)} = \Big(\prod_{i=1}^{N}\frac{p(F_i | C)}{p(F_i)}\Big)p(C).$$ But if $p(F_i | C) > p(F_i)$ for every $i$, we have a product of $N$ numbers each larger than 1, so if $N$ is sufficiently large, this product can be arbitrarily large and the whole expression can exceed 1, which cannot happen to a probability.
Obviously, I'm wrong somewhere, but I cannot see where. Can somebody help?
First of all, I think you meant: what if $F_1,...,F_N$ are *mutually* independent? Pairwise independence is not enough for $p(F_1,...,F_N)$ to factor into $\prod_{i=1}^{N}p(F_i)$.
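To see why pairwise independence is weaker, here is the classic XOR example (a standard textbook construction, not part of the question): three events that are pairwise independent but not mutually independent, so the joint probability does not factor into the product of marginals.

```python
# Two fair coins X, Y; a third event Z = X XOR Y.
# Each of X, Z and Y, Z and X, Y is an independent pair,
# yet X, Y, Z are not mutually independent.
from itertools import product

# All four equally likely outcomes (x, y, x XOR y), each with probability 1/4.
outcomes = [(x, y, x ^ y) for x, y in product([0, 1], repeat=2)]

def p(event):
    # Probability that event(x, y, z) holds under the uniform distribution.
    return sum(1 for o in outcomes if event(*o)) / len(outcomes)

pX = p(lambda x, y, z: x == 1)  # 1/2
pY = p(lambda x, y, z: y == 1)  # 1/2
pZ = p(lambda x, y, z: z == 1)  # 1/2

# Pairwise independent: p(X and Z) = p(X) p(Z), etc.
assert p(lambda x, y, z: x == 1 and z == 1) == pX * pZ
assert p(lambda x, y, z: y == 1 and z == 1) == pY * pZ

# ...but not mutually independent: x = y = 1 forces z = 1 XOR 1 = 0,
# so p(X and Y and Z) = 0, while p(X) p(Y) p(Z) = 1/8.
p_all = p(lambda x, y, z: x == 1 and y == 1 and z == 1)
assert p_all == 0.0
assert p_all != pX * pY * pZ
```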
Secondly, think about it this way: if $p(F_i|C_k)$ is really high but $p(F_i)$ is really low, what does that tell you about $p(C_k)$? It would have to be really low. More formally, use Bayes' rule for $p(F_i|C_k)$:
$$p(F_i|C_k)=\frac{p(F_i)}{p(C_k)}p(C_k|F_i)$$
$$\frac{p(F_i|C_k)}{p(F_i)}=\frac{p(C_k|F_i)}{p(C_k)}$$
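A quick numerical sanity check of this ratio identity, using a toy joint distribution of my own (the numbers are not from the question):

```python
# Bayes' rule rearranged: p(F|C) / p(F) = p(C|F) / p(C).
# Toy numbers (assumed for illustration): p(F and C), p(C), p(F).
p_joint = 0.03  # p(F and C)
p_C = 0.10
p_F = 0.20

lhs = (p_joint / p_C) / p_F  # p(F|C) / p(F)
rhs = (p_joint / p_F) / p_C  # p(C|F) / p(C)

# Both sides equal (up to floating-point rounding); here the ratio
# is about 1.5, so F is positively informative about C.
assert abs(lhs - rhs) < 1e-12
```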
Now let $p(F_i)=\epsilon$ and $p(F_i|C_k)=1-\epsilon$ for some arbitrary $\epsilon$ between 0 and 1.
$\frac{1-\epsilon}{\epsilon}=\frac{p(C_k|F_i)}{p(C_k)}$
$(1-\epsilon)p(C_k) = \epsilon p(C_k|F_i)$
Now you have it: if $\epsilon$ is super small, then $1-\epsilon \approx 1$, so
$p(C_k) \approx \epsilon p(C_k|F_i)$
So a large ratio $\frac{p(F_i|C_k)}{p(F_i)}$ forces $p(C_k)$ itself to be small, and that $p(C_k)$ factor ensures the final probability stays less than or equal to 1.
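Here is the single-feature case worked numerically (my own toy numbers, chosen to make the ratio large): even when the likelihood ratio $\frac{p(F|C)}{p(F)}$ is far above 1, consistency forces $p(C)$ to be so small that the posterior never exceeds 1.

```python
# Toy numbers (assumed for illustration): a rare class whose feature
# is almost certain given the class.
p_C = 0.01          # prior: rare class
p_F_given_C = 0.99  # feature almost certain given the class
p_F = 0.02          # marginal feature probability

# Consistency constraint: p(F) >= p(F|C) * p(C), since
# p(F) = p(F|C) p(C) + p(F|not C) p(not C).
assert p_F >= p_F_given_C * p_C

ratio = p_F_given_C / p_F  # about 49.5, far larger than 1
posterior = ratio * p_C    # Bayes' rule: p(C|F), about 0.495

# The constraint above gives ratio <= 1 / p(C), so ratio * p(C) <= 1.
assert ratio > 1
assert posterior <= 1
```

The same compensation is what keeps the question's product $\Big(\prod_{i}\frac{p(F_i|C)}{p(F_i)}\Big)p(C)$ from exceeding 1: many factors above 1 are only consistent with a correspondingly tiny $p(C)$.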