Bayesian probability on Bernoulli distribution


Let $D$ be a Bernoulli distribution with $P[X=1] = \theta$ (and so $P[X=0]=1-\theta$). Let $\chi = \{0,1\}$ be an iid sample drawn from $D$. Assume a prior distribution on $\theta$, with $\theta$ uniformly distributed between 0 and .25.

What is the value of $p(\theta)$ for $\theta=\frac{1}{8}$? What is the value of $p(\theta \vert \chi)$ for $\theta=\frac{1}{8}$?

I'm confused by what the question is asking for, and how everything ties together. There are other parts, but I think if I can grasp what's happening here I will be able to figure out the rest. I think I am supposed to be finding the probability that the random variable $\theta$ takes on the value $\frac{1}{8}$ given that it is uniformly distributed over the interval $[0,\frac{1}{4}]$, but isn't this probability 0, because the probability of choosing any given point in an interval is 0?

I know I must be thinking about this incorrectly because I should use Bayes' rule for the second part, and $p(\theta)$ should be interpreted as the prior probability of $\theta$, which definitely should not be 0.

This is a homework question, so I'm not looking for an explicit answer, but any hints would be very appreciated.


There are 2 best solutions below

On BEST ANSWER

I'd say it's natural that you're confused.

The phrasing "What is the value of $p(\theta)$ for $\theta=\frac{1}{8}$?" is slightly confusing. First, as you rightly noted, $\theta$ is a continuous random variable, so $p(\theta)$ is actually a density function.

Then, let's guess that "value of $p(\theta)$ for $\theta=\frac{1}{8}$" simply amounts to evaluating $p(\theta)$ at that value [*]. In that case, the value you'd get (let's call it $p_\theta(\frac{1}{8})$) is not a probability; it's just the value of a probability density.

True, the probability that $\theta$ takes that particular value is zero, but that doesn't matter. What matters is that the probability that $\theta$ takes a value in an interval of length $h$ around that value is approximately $p_\theta(\frac{1}{8})\, h$ for small $h$. Because of this, it makes sense to compare this prior density with the a posteriori density $p(\theta \vert \chi)$ evaluated at the same point.

[*] Presumably, it's stated in that convoluted way because it would be even more confusing to write $p(\frac{1}{8})$. This confusion is a consequence of the common abuse of notation of writing $p(x)$ and $p(y)$ to mean different density functions; we should write $p_X(x)$, $p_Y(y)$, etc.
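To make the "density times interval length" reading concrete, here is a minimal Python sketch. It assumes only the prior from the question (uniform on $[0, 0.25]$); the function names `prior_density` and `prior_interval_prob` are made up for this illustration. It compares the exact probability of a small interval around $\frac{1}{8}$ with the density at $\frac{1}{8}$ multiplied by the interval length $h$.

```python
def prior_density(theta, lo=0.0, hi=0.25):
    """Density of a Uniform(lo, hi) prior evaluated at theta (hypothetical helper)."""
    return 1.0 / (hi - lo) if lo <= theta <= hi else 0.0


def prior_interval_prob(a, b, lo=0.0, hi=0.25):
    """Exact P(a <= theta <= b) under the Uniform(lo, hi) prior (hypothetical helper)."""
    left, right = max(a, lo), min(b, hi)
    return max(right - left, 0.0) / (hi - lo)


theta0 = 1 / 8
for h in (0.1, 0.01, 0.001):
    exact = prior_interval_prob(theta0 - h / 2, theta0 + h / 2)
    approx = prior_density(theta0) * h  # density value times interval length
    print(f"h={h}: exact={exact:.6f}, density*h={approx:.6f}")

# A single point is an interval of length zero, so its probability is zero
# even though the density there is positive.
print(prior_interval_prob(theta0, theta0))
```

For a uniform prior the two numbers agree (up to floating-point rounding) as long as the interval stays inside $[0, 0.25]$; for a non-uniform density the agreement would only be approximate and would improve as $h$ shrinks.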

On

I think they just want the probability density. So in this case $p(\theta=\frac 18)=4$, because you are spreading the total amount of probability, $1$, uniformly over an interval of length $\frac 14$. (Visualise a rectangle with base $\frac 14$, area $1$, and hence height $4$.)

For the second bit they again want the probability density, now for the cases where you condition on $\chi=0$ and $\chi=1$. I'll leave you to try to work this out with Bayes' rule.
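If you want a way to check your own Bayes' rule calculation numerically, the following is a rough Python sketch rather than a worked solution. It assumes the sample is passed as a list of observed 0/1 values (so `[0]`, `[1]`, or `[0, 1]` depending on how you read $\chi$), and it approximates the normalising constant with a midpoint Riemann sum over the prior's support. The function names and the grid size `n` are choices made for this illustration only.

```python
def prior_density(theta, lo=0.0, hi=0.25):
    """Density of the Uniform(lo, hi) prior at theta."""
    return 1.0 / (hi - lo) if lo <= theta <= hi else 0.0


def likelihood(theta, sample):
    """Bernoulli likelihood of an iid 0/1 sample given theta."""
    ones = sum(sample)
    zeros = len(sample) - ones
    return theta ** ones * (1.0 - theta) ** zeros


def posterior_density(theta, sample, lo=0.0, hi=0.25, n=10_000):
    """Bayes' rule: prior * likelihood / evidence.

    The evidence (the integral of prior * likelihood over [lo, hi]) is
    approximated with a midpoint Riemann sum on n equal subintervals.
    """
    step = (hi - lo) / n
    evidence = 0.0
    for i in range(n):
        mid = lo + (i + 0.5) * step  # midpoint of the i-th subinterval
        evidence += likelihood(mid, sample) * prior_density(mid, lo, hi) * step
    return likelihood(theta, sample) * prior_density(theta, lo, hi) / evidence


# Example calls; compare the results with your hand calculation.
for sample in ([0], [1], [0, 1]):
    print(sample, posterior_density(1 / 8, sample))
```

With an empty sample the likelihood is identically 1, so the function returns the prior density, which is a quick sanity check that the normalisation is working.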