I am having a little difficulty with the following use of the law of total probability in an NLP-related paper.
Say $q^i$ is a partition of my continuous sample space. The authors define the following probability using the law of total probability:
$p(x|\alpha)=\int p(x,q^i|\alpha)dq^i$.
But as far as I know, the law of total probability should instead give the following:
$p(x|\alpha)=\int p(x|q^i, \alpha)dq^i$
It would be nice if someone could shed some light on this.
Best,
The authors seem to be right. Let $f_{X\mid A}(\cdot\mid a)$ and $f_{X,Q\mid A}(\cdot,\cdot\mid a)$ denote the conditional PDFs of $X$ and $(X,Q)$ conditionally on $A=a$. Then, for every $x$ and $a$, $$f_{X\mid A}(x\mid a)=\int f_{X,Q\mid A}(x,q\mid a)\,\mathrm dq.$$ This is the exact analogue of the marginalization property of (unconditional) PDFs, which reads $$f_{X}(x)=\int f_{X,Q}(x,q)\,\mathrm dq.$$ You might want to indicate a source for the alternative identity you are suggesting.
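For what it is worth, here is a short sketch of how the authors' formula relates to the form of the law of total probability you may have in mind. It assumes that the conditional densities $f_{X\mid Q,A}$ and $f_{Q\mid A}$ exist; these two symbols are notation introduced here, not taken from the paper. By the product rule, $$f_{X,Q\mid A}(x,q\mid a)=f_{X\mid Q,A}(x\mid q,a)\,f_{Q\mid A}(q\mid a),$$ so the marginalization identity can equivalently be written $$f_{X\mid A}(x\mid a)=\int f_{X\mid Q,A}(x\mid q,a)\,f_{Q\mid A}(q\mid a)\,\mathrm dq.$$ The identity proposed in the question drops the weight $f_{Q\mid A}(q\mid a)$. Without it the right-hand side is in general not a density in $x$: integrating $\int f_{X\mid Q,A}(x\mid q,a)\,\mathrm dq$ over $x$ gives $\int 1\,\mathrm dq$, which is the length of the range of $q$ rather than $1$.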