Find maximum likelihood estimator of joint probability distribution of Bernoulli variables (for use in mutual information/feature selection)

14 Views Asked by Bumbble Comm At 09 Apr 2026 - 2:38

When performing feature selection by finding mutual information estimates for class C and feature U (both binary), we need to estimate joint probabilities like P(C=1, U=1). This site claims that the maximum likelihood estimate of this probability is

(# of documents where C=1 and U=1) / (# total documents)

Why is this the maximum likelihood estimate? Is it because we assume that C and U are each Bernoulli, so that the joint distribution of the pair (C, U) is categorical over the four outcomes (0,0), (0,1), (1,0), (1,1)? Or do we separately assume that C and U are Bernoulli and that the pair (C, U) is categorical? I know that the MLE for a categorical distribution is (# of events in category k) / (# total events).
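To make the counting estimate concrete, here is a minimal Python sketch. It assumes each document is reduced to a pair (C, U) of 0/1 values and that the pair is modeled as one categorical variable with four outcomes. The function names `joint_mle` and `mutual_information` and the toy data are hypothetical, made up for illustration only, and the mutual information shown is just the plug-in value computed from the estimated table.

```python
import math
from collections import Counter


def joint_mle(pairs):
    """Relative-frequency estimate of P(C=c, U=u) for binary C and U.

    Under the assumption that (C, U) is a single categorical variable with
    four outcomes, count / total is the maximum likelihood estimate.
    """
    n = len(pairs)
    if n == 0:
        raise ValueError("need at least one (C, U) observation")
    counts = Counter(pairs)
    # Counter returns 0 for outcomes that never occur in the data.
    return {(c, u): counts[(c, u)] / n for c in (0, 1) for u in (0, 1)}


def mutual_information(joint):
    """Plug-in mutual information in bits from a 2x2 joint table."""
    p_c = {c: joint[(c, 0)] + joint[(c, 1)] for c in (0, 1)}
    p_u = {u: joint[(0, u)] + joint[(1, u)] for u in (0, 1)}
    mi = 0.0
    for (c, u), p in joint.items():
        if p > 0:  # treat 0 * log(0) as 0
            mi += p * math.log2(p / (p_c[c] * p_u[u]))
    return mi


# Hypothetical toy data: one (C, U) pair per document.
docs = [(1, 1), (1, 1), (1, 0), (0, 1), (0, 0), (0, 0), (0, 0), (1, 1)]
joint = joint_mle(docs)
mi = mutual_information(joint)
print(joint[(1, 1)], mi)
```

Here `joint[(1, 1)]` is exactly (# of documents where C=1 and U=1) / (# total documents), which is the quantity in the formula above.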
