Let $Y$ have a Bernoulli distribution with $P(Y=0)=0.2$ and let $X$ have a Bernoulli distribution with:
$$f(X|Y) = \begin{cases} 0.7 & \quad X=Y\\ 0.3 & \quad \text{ else}\\ \end{cases} $$
Find the naive Bayes classifier of $Y$ based on two samples of $X$.
I want to find $\hat y=\arg \max_{i=1,\dots,N} P(y=w_i|x)$, and I know I need to calculate it using Bayes' rule: $$P(Y|X)=\frac{P(X|Y)P(Y)}{P(X)}$$ Here is what I tried: $$P(Y=1|X)=\frac{P(X|Y=1)P(Y=1)}{P(X)}=\frac{P(X=?|Y=1)\cdot 0.8}{P(X)=??}$$
Do I have to calculate all 4 $P(Y=1|X=1),P(Y=1|X=0),P(Y=0|X=1),P(Y=0|X=0)$?
How do I figure out $\hat y=\arg \max_{i=1,\cdots,N} P(y=w_i|x)$ from these? Is this the max of $P(Y=1|X=1),P(Y=1|X=0),P(Y=0|X=1),P(Y=0|X=0)$?
Is there a good reference or example on the subject?
For the case of one observation:
Since $P(X)$ does not depend on $\omega$, it can be dropped from the maximization:

$$\hat y=\arg \max_{\omega\in\Omega} P(Y=\omega|X)= \arg \max_{\omega=0,1}\{P(X|Y=0)P(Y=0),\;P(X|Y=1)P(Y=1)\}$$

Here the first entry corresponds to $\omega=0$ and the second to $\omega=1$. Let $m$ be the number of times $x=1$ (so $m\in\{0,1\}$), hence:

$$\hat y=\arg \max_{\omega=0,1}\{P(Y=0)P(X=1|Y=0)^mP(X=0|Y=0)^{1-m},\;P(Y=1)P(X=1|Y=1)^mP(X=0|Y=1)^{1-m}\}$$

$$=\arg \max_{\omega=0,1}\{0.2\cdot 0.3^m\, 0.7^{1-m},\;0.8\cdot 0.7^m\, 0.3^{1-m}\}=\begin{cases} \arg \max_{\omega=0,1}\{0.2\cdot 0.3,\;0.8\cdot 0.7\} & \quad m=1\\ \arg \max_{\omega=0,1}\{0.2\cdot 0.7,\;0.8\cdot 0.3\} & \quad m=0\\ \end{cases}$$

$$\hat y=\begin{cases} \arg \max_{\omega=0,1}\{0.06,\;0.56\} & \quad m=1\\ \arg \max_{\omega=0,1}\{0.14,\;0.24\} & \quad m=0\\ \end{cases}=\begin{cases} 1 & \quad m=1\\ 1 & \quad m=0\\ \end{cases}$$

Hence $\hat y=1$ always, with the error probability:

$$P_e(\hat y)=P(\hat y\neq y)=P(\hat y=0|y=1)P(y=1)+P(\hat y=1|y=0)P(y=0)$$

Since $\hat y=1$ always, the first term vanishes:

$$P_e(\hat y)=P(\hat y=1|y=0)P(y=0)=1\cdot 0.2=0.2$$
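The comparison above can be checked numerically. The following is only a minimal sketch: the function name `map_decision` and its parameters `p_y1` and `p_match` are made up for illustration, and the observations are assumed conditionally independent given $Y$ (the naive Bayes assumption).

```python
def map_decision(m, n, p_y1=0.8, p_match=0.7):
    """MAP estimate of Y from m ones among n observations of X.

    Hypothetical helper: p_y1 is P(Y=1), p_match is P(X=Y | Y).
    Returns (decision, score for Y=0, score for Y=1).
    """
    # Unnormalised posterior for Y=0: P(Y=0) * P(X=1|Y=0)^m * P(X=0|Y=0)^(n-m)
    score0 = (1 - p_y1) * (1 - p_match) ** m * p_match ** (n - m)
    # Unnormalised posterior for Y=1: P(Y=1) * P(X=1|Y=1)^m * P(X=0|Y=1)^(n-m)
    score1 = p_y1 * p_match ** m * (1 - p_match) ** (n - m)
    # Ties are broken in favour of 0 here; that choice is arbitrary.
    decision = 1 if score1 > score0 else 0
    return decision, score0, score1


if __name__ == "__main__":
    # One observation: m is either 0 or 1.
    for m in (0, 1):
        print(m, map_decision(m, n=1))
```

With a single observation both values of `m` should give decision 1, because the prior $P(Y=1)=0.8$ outweighs the likelihood ratio $0.7/0.3$.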
For the case of two observations:
Assuming $X_1$ and $X_2$ are conditionally independent given $Y$:

$$\hat y=\arg \max_{\omega\in\Omega} P(Y=\omega|X_1,X_2)= \arg \max_{\omega=0,1}\left\{P(Y=0)\prod_{k=1}^2P( X_k|Y=0),\;P(Y=1)\prod_{k=1}^2P( X_k|Y=1)\right\}$$

Let $m$ be the number of times $x_k=1$ (so $m\in\{0,1,2\}$), hence:

$$\hat y=\arg \max_{\omega=0,1}\{P(Y=0)P(X_k=1|Y=0)^mP(X_k=0|Y=0)^{2-m},\;P(Y=1)P(X_k=1|Y=1)^mP(X_k=0|Y=1)^{2-m}\}$$

$$=\arg \max_{\omega=0,1}\{0.2\cdot 0.3^m\, 0.7^{2-m},\;0.8\cdot 0.7^m\, 0.3 ^{2-m}\}=\begin{cases} \arg \max_{\omega=0,1}\{0.2\cdot 0.7^2,\;0.8\cdot 0.3^2\} & \quad m=0\\ \arg \max_{\omega=0,1}\{0.2\cdot 0.7\cdot 0.3,\;0.8\cdot 0.7\cdot 0.3\} & \quad m=1\\ \arg \max_{\omega=0,1}\{0.2\cdot 0.3^2,\;0.8\cdot 0.7^2\} & \quad m=2\\ \end{cases}$$

$$\hat y=\begin{cases} \arg \max_{\omega=0,1}\{0.098,\;0.072\} & \quad m=0\\ \arg \max_{\omega=0,1}\{0.042,\;0.168\} & \quad m=1\\ \arg \max_{\omega=0,1}\{0.018,\;0.392\} & \quad m=2\\ \end{cases}=\begin{cases} 0 & \quad m=0\\ 1 & \quad m=1\\ 1 & \quad m=2\\ \end{cases}$$

Hence $\hat y=\begin{cases} 0 & \quad m=0\\ 1 & \quad \text{else}\\ \end{cases}$ with the error probability:

$$P_e(\hat y)=P(\hat y\neq y)=P(\hat y=0|y=1)P(y=1)+P(\hat y=1|y=0)P(y=0)$$

The classifier outputs $0$ only when $x_1=x_2=0$, and outputs $1$ whenever at least one observation equals $1$:

$$P_e(\hat y)=P(x_1=x_2=0|y=1)P(y=1)+P(x_1\neq 0 \text{ or } x_2\neq 0|y=0)P(y=0)$$

$$=P(x_1=x_2=0|y=1)P(y=1)+\bigl(1-P(x_1=x_2=0|y=0)\bigr)P(y=0)$$

$$P_e(\hat y)=0.8\cdot 0.3^2+(1-0.7^2)\cdot 0.2=0.072+0.102=0.174$$
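As a sanity check on the error probability, one can enumerate every outcome $(y, x_1, \dots, x_n)$ and add up the joint probability of those the classifier gets wrong. This is a rough sketch under the same conditional-independence assumption; the names `P_Y`, `likelihood`, `joint`, `classify` and `error_probability` are illustrative only.

```python
from itertools import product

# Prior from the problem statement: P(Y=0)=0.2, P(Y=1)=0.8.
P_Y = {0: 0.2, 1: 0.8}


def likelihood(x, y, p_match=0.7):
    """P(X=x | Y=y): 0.7 when x equals y, 0.3 otherwise."""
    return p_match if x == y else 1 - p_match


def joint(y, xs):
    """P(Y=y, X_1=xs[0], ..., X_n=xs[-1]) assuming the X_k are independent given Y."""
    p = P_Y[y]
    for x in xs:
        p *= likelihood(x, y)
    return p


def classify(xs):
    """Naive Bayes / MAP decision: the label with the larger joint probability."""
    return max((0, 1), key=lambda y: joint(y, xs))


def error_probability(n):
    """Sum the joint probability of every (y, xs) that is misclassified."""
    return sum(
        joint(y, xs)
        for y in (0, 1)
        for xs in product((0, 1), repeat=n)
        if classify(xs) != y
    )


if __name__ == "__main__":
    print(error_probability(1))  # about 0.2, up to floating-point rounding
    print(error_probability(2))  # about 0.174, up to floating-point rounding
```

Brute-force enumeration is exponential in `n`, so it is only meant for checking small cases like the ones worked out by hand here.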