Bayesian fallacy in Binomial example

There are $50$ students in a course. Suppose that each student independently decides to continue the course or drop out of the course randomly. Let $X$ be the total number of students who continue with the course. Each student continues with a constant probability $p$, which is drawn randomly with $P(p=0)=0.1, P(p=0.75) = 0.6, P(p=1) = 0.3$ prior to the decisions being taken.

Find $E(X), Var(X)$. Also, comment on the distribution of $X$.

I seem to have a logical error somewhere in one of my two approaches:

Approach 1

We know that $X|P=p \sim Bin(50,p)$. So, $E(X) = E(X|P=0)P(P=0) + E(X|P=0.75)P(P=0.75) + E(X|P=1)P(P=1) = 22.5 + 15 = 37.5$.

Now, $V(X) = V(E(X|P))+E(V(X|P)) = V(50P)+E(50P(1-P)) = 50^2 V(P) + 50 E(P-P^2) = K$ which is some value.

Approach 2

Let $X_i = 1$ if student $i$ continues (which happens with probability $p$) and $X_i = 0$ otherwise, so that $X = X_1 + \dots + X_{50}$.

Then $E(X_i) = E(E(X_i|P)) = E(P) = 0.75$ and so $E(X) = 50 \cdot 0.75 = 37.5$.

$V(X_i) = V(E(X_i |P)) + E(V(X_i |P)) = V(P)+E(P-P^2) = E(P)-E(P)^2$, so $V(X) = 50*V(X_i) \neq K$.
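To see concretely that the two numbers differ, both expressions can be evaluated exactly from the stated prior. This is only a sketch using Python's `fractions` module; the names `prior`, `expect`, `K` and `naive` are illustrative choices and do not come from the original post.

```python
from fractions import Fraction

# Prior on P as stated in the question: value -> probability.
prior = {
    Fraction(0): Fraction(1, 10),
    Fraction(3, 4): Fraction(6, 10),
    Fraction(1): Fraction(3, 10),
}
n = 50  # number of students


def expect(f):
    """Return E[f(P)] under the discrete prior."""
    return sum(w * f(p) for p, w in prior.items())


mean_p = expect(lambda p: p)                   # E(P)
var_p = expect(lambda p: p * p) - mean_p ** 2  # V(P)

# Approach 1: V(X) = 50^2 V(P) + 50 E(P - P^2)
K = n ** 2 * var_p + n * expect(lambda p: p - p * p)

# Approach 2 as written: 50 * V(X_i), with V(X_i) = E(P) - E(P)^2
naive = n * (mean_p - mean_p ** 2)

print(mean_p, var_p, K, naive)  # 3/4 3/40 1545/8 75/8
```

So Approach 1 gives $K = 1545/8 = 193.125$, while $50 \cdot V(X_i) = 75/8 = 9.375$.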

Why are the two variances from the two approaches unequal? Is it because in approach 2, $X_i |P$ are not independent and thus we cannot just simply add their variances?

There is 1 solution below


You have to make assumptions that you believe match reality (this is a subjective judgment, and all statisticians have to make it). Let us say, then, that there is a random value $P$ with the distribution stated. Once $P$ is "sampled", the 50 students decide, independently of each other, whether to continue with the course, each continuing with probability equal to the sampled value of $P$. Write $\xi_i$ for the indicator that student $i$ continues. Under these circumstances, given $P = p,$ the random variable $Y = \sum\limits_{i = 1}^{50} \xi_i$ (the $X$ of the question) is $\mathsf{Bin}(50, p).$ Then, $$ \mathbf{E}(Y \mid P = p) = 50p, \quad \mathbf{V}(Y \mid P = p) = 50p(1-p). $$ We know that $$ \mathbf{E}(Y) = \mathbf{E}(\mathbf{E}(Y \mid P)), \quad \mathbf{V}(Y) = \mathbf{E}(\mathbf{V}(Y \mid P)) + \mathbf{V}(\mathbf{E}(Y \mid P)). $$ Therefore, $$ \mathbf{E}(Y) = \mathbf{E}(50P) = 50 \mathbf{E} (P) $$ and $$ \begin{align*} \mathbf{V}(Y) &= \mathbf{E}(50P(1-P)) + \mathbf{V}(50P) \\ &= 50 \Big\{ \mathbf{E}(P) - \mathbf{E}(P^2) \Big\} + 2500 \Big\{ \mathbf{E}(P^2) - \mathbf{E}(P)^2 \Big\}. \end{align*} $$ Note that the two terms carry different factors ($50$ and $50^2 = 2500$), so the $\mathbf{E}(P^2)$ terms do not cancel. From the prior, $\mathbf{E}(P) = 0.6 \cdot \dfrac{3}{4} + 0.3 = \dfrac{3}{4}$ and $\mathbf{E}(P^2) = 0.6 \cdot \dfrac{9}{16} + 0.3 = \dfrac{51}{80},$ so that $\mathbf{E}(Y) = \dfrac{75}{2}$ and $\mathbf{V}(Y) = 50 \cdot \dfrac{9}{80} + 2500 \cdot \dfrac{3}{40} = \dfrac{1545}{8}.$
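On the question's guess about independence: given $P$ the indicators are independent, but marginally they are not, since for $i \neq j$ we have $\mathrm{Cov}(\xi_i, \xi_j) = \mathbf{E}(P^2) - \mathbf{E}(P)^2 = \mathbf{V}(P) \neq 0$. The variance of the sum therefore picks up $50 \cdot 49$ covariance terms on top of $50 \, \mathbf{V}(\xi_i)$. The sketch below checks this exactly and also builds the marginal distribution of the total as a mixture of binomials, assuming the prior from the question; the names `prior`, `binom_pmf`, `pmf` and `var_from_indicators` are hypothetical.

```python
from fractions import Fraction
from math import comb

n = 50
# Prior on P as stated in the question: value -> probability.
prior = {
    Fraction(0): Fraction(1, 10),
    Fraction(3, 4): Fraction(6, 10),
    Fraction(1): Fraction(3, 10),
}


def binom_pmf(k, p):
    """Exact Bin(n, p) probability of k successes."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Marginal pmf of the total: a prior-weighted mixture of binomials.
pmf = [sum(w * binom_pmf(k, p) for p, w in prior.items()) for k in range(n + 1)]

total = sum(pmf)
mean = sum(k * q for k, q in enumerate(pmf))
var = sum(k * k * q for k, q in enumerate(pmf)) - mean ** 2

# Same variance from the indicators, including the covariance terms.
mean_p = sum(w * p for p, w in prior.items())
var_p = sum(w * p * p for p, w in prior.items()) - mean_p ** 2
var_indicator = mean_p - mean_p ** 2  # V(xi_i)
cov_pair = var_p                      # Cov(xi_i, xi_j) for i != j
var_from_indicators = n * var_indicator + n * (n - 1) * cov_pair

print(mean, var, var_from_indicators)  # 75/2 1545/8 1545/8
print(float(pmf[0]), float(pmf[n]))    # point masses at 0 and 50
```

The marginal distribution is not binomial: it has point masses slightly above $0.1$ at $0$ and slightly above $0.3$ at $50$, with the remaining mass of about $0.6$ spread like a $\mathsf{Bin}(50, 0.75)$ around $37.5$.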