What does the probability of a hypothesis mean, and how is the formula for it derived?

I am reading Solomon Kullback's Information Theory and Statistics and am puzzled by how, on page 17, he derives the conditional probability of a hypothesis given the observed data:

$$P\left(H_{i} \mid x\right)=\frac{P\left(H_{i}\right) f_{i}(x)}{P\left(H_{1}\right) f_{1}(x)+P\left(H_{2}\right) f_{2}(x)}[\lambda],$$

where $H_{i}, i=1,2,$ is the hypothesis that $X$ is from the statistical population with probability measure $\mu_i$. The function $f_i$ was defined earlier as the generalized probability density: consider the probability spaces $\left(\mathscr{X}, \mathscr{S}, \mu_{i}\right), i=1,2$, and assume $\mu_1$ and $\mu_2$ are absolutely continuous with respect to one another, denoted $\mu_{1} \equiv \mu_{2}$. Let $\lambda$ be another probability measure such that $\lambda \equiv \mu_{1}, \lambda \equiv \mu_{2}$. Then by the Radon-Nikodym theorem there exist functions $f_i(x), i =1,2$, called generalized probability densities, unique up to sets of $\lambda$-measure 0, $\lambda$-measurable, with $0<f_{i}(x)<\infty[\lambda], i = 1,2$, such that $$\mu_{i}(E)=\int_{E} f_{i}(x) d \lambda(x), \quad i=1,2$$ for all $E \in \mathscr{S}$. The symbol $[\lambda]$ means that the statement holds except on a set $E$ with $\lambda(E)=0$ (I don't know why we need this...).
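To make these definitions concrete, here is a minimal sketch on a finite sample space, where every measure is just a table of point masses. The three points, the masses of `mu1` and `mu2`, the prior weights, and the choice $\lambda = (\mu_1 + \mu_2)/2$ are all assumptions made for illustration and do not come from Kullback's text. It only evaluates the formula in question; it does not justify it.

```python
from fractions import Fraction as F

# Hypothetical three-point sample space; all numbers are illustrative.
points = ["a", "b", "c"]
mu1 = {"a": F(1, 2), "b": F(1, 4), "c": F(1, 4)}
mu2 = {"a": F(1, 6), "b": F(1, 3), "c": F(1, 2)}
prior = {1: F(1, 3), 2: F(2, 3)}  # assumed values for P(H_1), P(H_2)

# One possible dominating probability measure: lambda = (mu1 + mu2) / 2.
lam = {x: (mu1[x] + mu2[x]) / 2 for x in points}

# On a finite space the Radon-Nikodym density is a ratio of point masses.
f = {
    1: {x: mu1[x] / lam[x] for x in points},
    2: {x: mu2[x] / lam[x] for x in points},
}

def posterior(i, x):
    """Evaluate P(H_i) f_i(x) / (P(H_1) f_1(x) + P(H_2) f_2(x))."""
    denominator = prior[1] * f[1][x] + prior[2] * f[2][x]
    return prior[i] * f[i][x] / denominator

for x in points:
    print(x, posterior(1, x), posterior(2, x))
```

Because `lam[x]` appears in both numerator and denominator, it cancels, so in this toy setting the posterior does not depend on which dominating measure is chosen.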

However, if we apply Bayes' theorem, we get

$$P\left(H_{i} \mid x\right)=\frac{P\left(x \mid H_i\right) P\left(H_{i}\right)}{\sum_{j=1}^{2} P\left(x \mid H_{j}\right) P\left(H_{j}\right)},$$ which needs further justification before it can be rewritten as $$\frac{f_{i}(x) P(H_{i})}{\sum_{j=1}^{2} f_{j}(x) P\left(H_{j}\right)}.$$

I don't know how this step is justified. I am also having trouble understanding where $P$ is defined. The notation $P$ first appears at this point in Kullback's book, so I am not sure what space it is defined on. In my understanding we need to work with a measure space $(\Omega, \mathcal{F}, P)$ on which $P$ is clearly defined. Here the probability space no longer seems to be $\left(\mathscr{X}, \mathscr{S}, \mu_{i}\right), i=1,2$, but something else. Without a clearly defined space, I don't know what kind of sets the events $H_1$ and $H_2$ are, and so I can't tell whether $P(H_1)$ and $P(H_2)$ are well-defined.
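One common way to make this precise is to put the hypothesis index and the observation on a single space. This is only a sketch, under assumptions that Kullback's text as quoted here does not state: the product space $\Omega$, the prior weights $\pi_i$, and the coordinate map $X$ are all introduced for illustration.

$$\Omega=\{1,2\} \times \mathscr{X}, \quad \mathcal{F}=2^{\{1,2\}} \otimes \mathscr{S}, \quad P(\{i\} \times E)=\pi_{i}\, \mu_{i}(E), \quad \pi_{1}+\pi_{2}=1 .$$

With $H_i=\{i\} \times \mathscr{X}$ and $X(i, x)=x$, this gives $P(H_i)=\pi_i$, and the marginal law of $X$ is $P_X=\pi_1 \mu_1+\pi_2 \mu_2$, which has density $\pi_1 f_1+\pi_2 f_2$ with respect to $\lambda$. For every $E \in \mathscr{S}$,

$$\int_{E} \frac{\pi_{i} f_{i}(x)}{\pi_{1} f_{1}(x)+\pi_{2} f_{2}(x)}\, d P_{X}(x)=\int_{E} \pi_{i} f_{i}(x)\, d \lambda(x)=\pi_{i}\, \mu_{i}(E)=P\left(H_{i} \cap\{X \in E\}\right) .$$

Under these assumptions the ratio is a version of the conditional probability of $H_i$ given $X=x$. Such a version is determined only up to $P_X$-null sets, which coincide with the $\lambda$-null sets because $P_X \equiv \lambda$. That would account for the qualifier $[\lambda]$.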