Using the Naive Bayes formula to classify text I have something like...
$$ P(Cat|Word1) = \frac{P(Word1|Cat) * P(Cat)}{P(Word1)} $$
Using a small example ...
Cat1 = 4 documents
= 1x word 'Hello'
= 3x word 'World'
Cat2 = 10 documents
= 10x word 'Hello'
= 1x word 'World'
Total of 14 docs. with 2 'categories'
I can then calculate the probability of Cat1 and Cat2
$$ P(Cat1|Hello,World) = \frac{P(Hello|Cat1) * P(World|Cat1) * P(Cat1)}{P(Hello) * P(World)} $$
For category 1
$$ P(Cat1|Hello,World) = \frac{\frac{1}{4} * \frac{3}{4} * \frac{4}{14}}{\frac{11}{14} * \frac{4}{14}} \approx 0.2386 $$
And category 2
$$ P(Cat2|Hello,World) = \frac{\frac{10}{10} * \frac{1}{10} * \frac{10}{14}}{\frac{11}{14} * \frac{4}{14}} \approx 0.3182 $$
But I am struggling to interpret the values being returned:
- Does it mean that there is a 24% chance of category 1 and a 32% chance of category 2?
- Does it mean that there is a slightly better chance of category 2 vs category 1?
- How much more likely is one category over the other?
How can I interpret the actual values being returned?
Bayes' theorem says: $P(A|B) = \frac{P(B|A)P(A)}{P(B)}$. Naive Bayes additionally assumes that the features $b_i$ are conditionally independent given the class, so that $P(B|A) = \prod_i P(b_i|A)$.
However, the example you give does not fit Naive Bayes: if we treat your counts as the exact data, that data does not satisfy the class-conditional independence assumption. With the data assumed below, $P(Hello,World|Cat1) = 0$ but $P(Hello|Cat1)*P(World|Cat1) = \frac{1}{4} * \frac{3}{4} = \frac{3}{16}$, so clearly the two are not the same.
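A quick way to see the mismatch is to count directly. This is only a sketch: the question gives word counts, not per-document contents, so the layout of Cat1 used here (three documents containing only 'World', one containing only 'Hello') is an assumption, and the names `cat1_docs` and `frac_containing` are made up for illustration.

```python
from fractions import Fraction

# Assumed layout for Cat1: one possibility consistent with the counts
# in the question (1x 'Hello', 3x 'World' across 4 documents).
cat1_docs = [{"World"}, {"World"}, {"World"}, {"Hello"}]

def frac_containing(docs, words):
    """Fraction of documents that contain every word in `words`."""
    hits = sum(1 for doc in docs if set(words) <= doc)
    return Fraction(hits, len(docs))

# Joint probability P(Hello, World | Cat1), counted directly.
joint = frac_containing(cat1_docs, ["Hello", "World"])

# What the naive (independence) assumption would predict instead.
product = frac_containing(cat1_docs, ["Hello"]) * frac_containing(cat1_docs, ["World"])

print(joint, product)  # 0 3/16
```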
Nevertheless, if you just use Bayes' theorem directly, you should be able to get the right probability.
Assume your data is: Cat1 = [(World) $\times$ 3, (Hello)]; Cat2 = [(Hello) $\times$ 9, (Hello, World)]
Then use Bayes' theorem (not Naive Bayes):
$P(Cat1|Hello,World) = \frac{P(Hello,World|Cat1)*P(Cat1)}{P(Hello,World)} = \frac{0*\frac{4}{14}}{\frac{1}{14}} = 0$
(Since we know the exact distribution, we can get $P(Hello,World|Cat1)=0$ and $P(Hello,World)=\frac{1}{14}$)
$P(Cat2|Hello, World) = \frac{P(Hello,World|Cat2)*P(Cat2)}{P(Hello,World)} = \frac{\frac{1}{10}*\frac{10}{14}}{\frac{1}{14}}=1$
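The two calculations above can be reproduced by counting. The sketch below encodes the document-level data assumed in this answer and applies Bayes' theorem term by term (likelihood, prior, evidence); the names `docs`, `contains_all` and `posterior` are illustrative only.

```python
from fractions import Fraction

# Document-level data assumed in this answer: each document is a set of words.
docs = {
    "Cat1": [{"World"}] * 3 + [{"Hello"}],
    "Cat2": [{"Hello"}] * 9 + [{"Hello", "World"}],
}
n_total = sum(len(ds) for ds in docs.values())  # 14 documents

def contains_all(doc_list, query):
    """Number of documents that contain every query word."""
    return sum(1 for d in doc_list if set(query) <= d)

def posterior(cat, query):
    """P(cat | query) via Bayes' theorem, with no independence assumption."""
    likelihood = Fraction(contains_all(docs[cat], query), len(docs[cat]))
    prior = Fraction(len(docs[cat]), n_total)
    # Evidence P(query); assumed non-zero, i.e. some document matches the query.
    evidence = Fraction(sum(contains_all(ds, query) for ds in docs.values()), n_total)
    return likelihood * prior / evidence

p_cat1 = posterior("Cat1", ["Hello", "World"])
p_cat2 = posterior("Cat2", ["Hello", "World"])
print(p_cat1, p_cat2)  # 0 1
```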
So in this case, to answer your three questions:
- Yes: there is a 0% chance of Cat1 and a 100% chance of Cat2.
- It means the document has to be in Cat2.
- Cat2 is 100% likely; Cat1 is ruled out entirely.
You can interpret the output of Naive Bayes the same way, but when you use real data to describe the distribution, please check that the independence assumption actually holds.
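For comparison, here is a rough sketch of what the Naive Bayes numbers would look like if you accepted the independence assumption anyway. It uses the counts from the question and normalizes the per-category scores so they sum to 1. That normalization step is my addition, not something in the question: it replaces the $P(Hello) * P(World)$ denominator, which is only a valid $P(Hello,World)$ if the words are also independent overall, and which is why the two values in the question do not sum to 1. The names `n_docs`, `word_docs` and `naive_bayes_posteriors` are hypothetical.

```python
from fractions import Fraction

# Counts from the question: documents per category, and how many
# documents in each category contain each word.
n_docs = {"Cat1": 4, "Cat2": 10}
word_docs = {
    "Cat1": {"Hello": 1, "World": 3},
    "Cat2": {"Hello": 10, "World": 1},
}

def naive_bayes_posteriors(query):
    """Normalized Naive Bayes posteriors for the words in `query`."""
    total = sum(n_docs.values())
    scores = {}
    for cat, n in n_docs.items():
        score = Fraction(n, total)  # prior P(cat)
        for word in query:
            score *= Fraction(word_docs[cat][word], n)  # P(word | cat)
        scores[cat] = score
    evidence = sum(scores.values())  # sum over categories, so posteriors sum to 1
    return {cat: s / evidence for cat, s in scores.items()}

nb = naive_bayes_posteriors(["Hello", "World"])
print(nb["Cat1"], nb["Cat2"])  # 3/7 4/7
```

Under the naive assumption Cat2 comes out at 4/7 against 3/7 for Cat1, so about 1.33 times as likely. The exact calculation above gives 1 against 0 instead, which is the effect of the independence assumption not holding for this data.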