Doubt in deducing probability distribution from given data


I am solving some questions from MIT OCW's Introduction to Probability and Statistics (Spring 2014): https://ocw.mit.edu/courses/mathematics/18-05-introduction-to-probability-and-statistics-spring-2014/exams/MIT18_05S14_Prac_Exam2.pdf

On page 3, question 14(a) reads:

Question-14) Peter and Jerry disagree over whether 18.05 students prefer Bayesian or frequentist statistics. They decide to pick a random sample of 10 students from the class and get Shelby to ask each student which they prefer. They agree to start with a prior f(θ) ∼ beta(2, 2), where θ is the percent that prefer Bayesian.

Let $x_1$ be the number of people in the sample who prefer Bayesian statistics. What is the pmf of $x_1$?

I have two arguments to deduce the pmf of $x_1$ :-

$\bf Argument-1 $

Let Bin(n,p) denote the binomial distribution with number of trials n and probability of success p in each trial.

There are 10 students in the sample. We ask each student whether they prefer frequentist or Bayesian statistics. Let preferring Bayesian count as a success. Each trial then results in either success or failure, so each trial is a Bernoulli trial. Hence $x_1$ follows a binomial distribution with n=10. The success probability p is drawn from the prior beta(2,2), so the distribution of p is $f(\theta) \sim$ beta(2,2).

So, $x_1$ ~ Bin(10,$\theta$).

$\bf Argument-2$

We are given that $\theta$ is the "percent that prefer Bayesian". So, $\frac{x_1 * 100}{10} = \theta$. So, $x_1=\frac{\theta}{10}$. Since $$f(\theta)=\operatorname{Beta}(2,2)=\frac{\Gamma(2+2)}{\Gamma(2)\Gamma(2)} \theta^{2-1} (1-\theta)^{2-1},$$

replacing $\theta$ with $\theta/10$, we get $$f(\theta/10)=\frac{\Gamma(2+2)}{\Gamma(2)\Gamma(2)} (\theta/10)^{2-1} (1-\theta/10)^{2-1}.$$

Since $\theta/10$ is $x_1$, we get $x_1 \sim$ Beta(2,2).

Please tell me which argument is correct.


You can immediately rule out Argument 2 because $x_1$, being the number of students in the sample of $10$, must be an integer in the set $\{0, 1, \ldots, 10\}$, whereas a beta distribution is continuous on $[0,1]$. That you could even say $x_1 \sim \operatorname{Beta}$ is a completely obvious mistake that no one should ever make. I did not even read any part of that argument except the conclusion and without even looking at your math, I am certain it is wrong.

Never forget that probability distributions that model some random process must be consistent with the outcomes that can be observed from that process.

Now that we have cleared up this misunderstanding, it is worth considering your Argument 1 in more detail.

First, the conditional distribution of $x_1$ given $\theta$ is obviously binomial; specifically, $$x_1 \mid \theta \sim \operatorname{Binomial}(n = 10, p = \theta).$$ But that is not what the question asked for; consequently, what you wrote is incomplete. The question, as I understand it, is asking for the marginal or unconditional distribution of $x_1$, namely

$$\Pr[x_1 = x] = \int_{\theta = 0}^1 \Pr[x_1 = x \mid p = \theta]f(\theta) \, d\theta,$$ where $\theta \sim \operatorname{Beta}(2,2)$. This is simply $$\Pr[x_1 = x] = \int_{\theta = 0}^1 \binom{10}{x} \theta^x (1-\theta)^{10-x} \frac{\Gamma(4)}{\Gamma(2)^2} \theta (1-\theta) \, d\theta.$$ Complete the integration and this gives you the PMF of $x_1$. Notice that the integral is with respect to $\theta$ and the result is a function on the support of $x_1 \in \{0, 1, \ldots, 10\}$.
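As a check on the integration, here is a minimal sketch that evaluates the integral exactly. It assumes the standard identity $\int_0^1 \theta^{a-1}(1-\theta)^{b-1}\,d\theta = B(a,b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}$, which for positive integers reduces to factorials. The function names `beta_fn` and `marginal_pmf` are made up for this illustration and are not from the exam or its solutions.

```python
from fractions import Fraction
from math import comb, factorial


def beta_fn(a, b):
    """Beta function B(a, b) for positive integers a, b, as an exact fraction."""
    return Fraction(factorial(a - 1) * factorial(b - 1), factorial(a + b - 1))


def marginal_pmf(x, n=10, a=2, b=2):
    """Pr[x_1 = x] when x_1 | theta ~ Binomial(n, theta) and theta ~ Beta(a, b).

    The integrand is C(n, x) * theta^(x+a-1) * (1-theta)^(n-x+b-1) / B(a, b),
    so the integral over theta equals C(n, x) * B(x+a, n-x+b) / B(a, b).
    """
    return comb(n, x) * beta_fn(x + a, n - x + b) / beta_fn(a, b)


# Exact PMF on the support {0, 1, ..., 10}
pmf = [marginal_pmf(x) for x in range(11)]

for x, p in enumerate(pmf):
    print(x, p, float(p))
```

With $n = 10$ and a $\operatorname{Beta}(2,2)$ prior, these values agree with the closed form $\Pr[x_1 = x] = \frac{(x+1)(11-x)}{286}$ for $x \in \{0, 1, \ldots, 10\}$, and they sum to $1$, as a PMF must.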