Bayes Classifier: Number of Parameters Needed


I am trying to understand Bayesian classifiers, and I have the following question:

Let's assume we have a vector $X=(X_1,\ldots,X_n)$ and $Y$, where each $X_i$ and $Y$ take boolean values. We are trying to compute $P(Y|X)=\frac{P(X|Y)P(Y)}{P(X)}$. The lecture notes I am reading say that to estimate $P(X_1,\ldots,X_n|Y)$, we need to estimate $2(2^n-1)$ parameters. What does this mean, and how do we get it? The vector $X$ can take $2^n$ possible values, and $Y$ can take 2 possible values, so for each value of $Y$ we need to consider $2^n$ parameters, giving a total of $2^{n+1}$ parameters. How do we get $2(2^n-1)$, i.e., why do we subtract 2?


Best answer:

Here we have $X_i, Y$ binary variables.

  • First consider $Y=0$: we need $P(X|Y=0)$ for each of the $2^n$ possible values of $X=(X_1,X_2,\ldots,X_n)$.

  • But we also know that $\sum\limits_{(x_1,\ldots,x_n)\in\{0,1\}^n}P(X_1=x_1,\ldots,X_n=x_n|Y=0)=1$, since this is a probability mass function.

  • This reduces the number of parameters to estimate by 1, as shown below.

  • Suppose we estimate the probabilities for all but one boolean assignment of the $X_i$s, say every assignment except a specific one, $V=\{X_1:b_1,\ldots,X_n:b_n\}$; that is a total of $2^n-1$ value assignments. Let $p=\sum\limits_{x\in\{0,1\}^n\setminus\{V\}}P(X=x|Y=0)$ be the sum of those $2^n-1$ estimated probabilities. Then $P(X=V|Y=0)$ no longer needs to be estimated: it can be computed directly as $P(X=V|Y=0)=1-p$.

  • Hence, it's sufficient to estimate $2^n-1$ parameters for this case $Y=0$.

  • Similarly, we need to estimate another $2^n-1$ parameters for the case $Y=1$, i.e., for $P(X|Y=1)$.

  • Hence, the total number of parameters to estimate is $2(2^n-1)=2^{n+1}-2$.
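The counting argument above can be sketched in Python. The pmf values below are made up (standing in for estimates from data); the point is that the last entry of each conditional table is recovered from the sum-to-1 constraint rather than estimated:

```python
import random

def free_parameters(n):
    """Free parameters needed to specify P(X_1,...,X_n | Y)
    with binary X_i and binary Y: for each of the 2 values of Y,
    the 2^n probabilities must sum to 1, so only 2^n - 1 are free."""
    return 2 * (2 ** n - 1)

# Sanity check of the normalization argument for n = 3:
n = 3
random.seed(0)
for y in (0, 1):
    raw = [random.random() for _ in range(2 ** n)]
    probs = [r / sum(raw) for r in raw]   # a valid pmf over {0,1}^n
    estimated = probs[:-1]                # the 2^n - 1 free parameters
    recovered = 1 - sum(estimated)        # last value forced by sum-to-1
    assert abs(recovered - probs[-1]) < 1e-12

print(free_parameters(3))  # 2 * (2^3 - 1) = 14
```

For $n=3$, this gives $2(2^3-1)=14$ parameters rather than the naive $2^{3+1}=16$.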