Let $D$ be a Bernoulli distribution with $P[X=1] = \theta$ (and so $P[X=0]=1-\theta$). Let $\chi = \{0,1\}$ be an iid sample drawn from $D$. Assume a prior distribution on $\theta$, with $\theta$ uniformly distributed between 0 and 0.25.
What is the value of $p(\theta)$ for $\theta=\frac{1}{8}$? What is the value of $p(\theta \vert \chi)$ for $\theta=\frac{1}{8}$?
I'm confused by what the question is asking for, and how everything ties together. There are other parts, but I think if I can grasp what's happening here I will be able to figure out the rest. I think I am supposed to be finding the probability that the random variable $\theta$ takes on the value $\frac{1}{8}$ given that it is uniformly distributed over the interval $[0,\frac{1}{4}]$, but isn't this probability 0, because the probability of choosing any given point in an interval is 0?
I know I must be thinking about this incorrectly because I should use Bayes' rule for the second part, and $p(\theta)$ should be interpreted as the prior probability of $\theta$, which definitely should not be 0.
This is a homework question, so I'm not looking for an explicit answer, but any hints would be very appreciated.
I'd say it's natural that you're confused: the phrasing "What is the value of $p(\theta)$ for $\theta=\frac{1}{8}$?" is slightly confusing. First, as you rightly noted, $\theta$ is a continuous random variable, so $p(\theta)$ is actually a density function.
Then, let's guess that "the value of $p(\theta)$ for $\theta=\frac{1}{8}$" simply amounts to evaluating $p(\theta)$ at that value [*]. In that case, the value you'd get (let's call it $p_\theta(\frac{1}{8})$) is not a probability; it's just the value of a probability density.
True, the probability that $\theta$ takes that particular value is zero, but that doesn't matter. What matters is that the probability that $\theta$ takes a value in an interval of length $h$ around that value is approximately $p_\theta(\frac{1}{8})\, h$ for small $h$. Because of this, it makes sense to compare this density value with the a posteriori one.
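To make the density-versus-probability distinction concrete, here is a minimal Python sketch. Since this is homework, it deliberately uses a different, made-up prior (uniform on $[0, 0.5]$, evaluated at $0.2$) rather than the one in your problem; the function names are my own and purely illustrative.

```python
def uniform_pdf(x, a, b):
    """Density of a Uniform(a, b) random variable evaluated at x."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_cdf(x, a, b):
    """P[T <= x] for T ~ Uniform(a, b)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def interval_prob(x0, h, a, b):
    """Exact probability of landing in the length-h interval centered at x0."""
    return uniform_cdf(x0 + h / 2, a, b) - uniform_cdf(x0 - h / 2, a, b)


# Hypothetical prior, NOT the one from the question: Uniform(0, 0.5).
a, b = 0.0, 0.5
x0 = 0.2

density = uniform_pdf(x0, a, b)
print("density at x0:", density)  # larger than 1, so it cannot be a probability

# The density times a small interval length approximates a real probability.
for h in (0.1, 0.01, 0.001):
    print(h, interval_prob(x0, h, a, b), density * h)
```

Note that the density here exceeds 1, which a probability never could, while density times $h$ matches the actual probability of the small interval. (For a uniform density the match is exact as long as the interval stays inside $[a, b]$; for a general density it is only approximate for small $h$.)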
[*] Presumably, it's stated in that convoluted way because it would be even more confusing to write $p(\frac{1}{8})$. This confusion is a consequence of the common abuse of notation of writing $p(x)$ and $p(y)$ to mean different density functions; we should write $p_X(x)$, $p_Y(y)$, etc.