The equation that puzzles me is:
$$p(x|\chi)=\int_{\theta\in\Theta}p(x|\theta)p(\theta|\chi)d\theta$$
where $\chi$ is a set of samples drawn from a sample space parameterized by $\theta\in\Theta$. Following the Bayesian tradition, we want to estimate the density of a newly seen sample $x$ given the old samples $\chi$.
In my opinion, the equation should instead be written as:
$$p(x|\chi)=\int_{\theta\in\Theta}p(x|\theta,\chi)p(\theta|\chi)d\theta$$
because this follows the rules of conditioning:
$$p(x,\theta|\chi)=p(x|\theta,\chi)p(\theta|\chi)$$
and marginalizing out $\theta$ then yields $p(x|\chi)$.
One explanation I found convincing is that all samples drawn from a generative process parameterized by $\theta$ are assumed to be i.i.d. Hence the conditioning on $\chi$ (the previous samples) can simply be dropped once $\theta$ is given, since $x\perp\chi\mid\theta$. This gives:
$$p(x|\theta,\chi)=p(x|\theta)$$
Conditional independence means: $p(x|\theta,\chi)=\frac{p(x,\chi|\theta)}{p(\chi|\theta)}=\frac{p(x|\theta)p(\chi|\theta)}{p(\chi|\theta)}=p(x|\theta)$.
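To convince myself numerically, here is a small sketch (my own illustration, not from any particular textbook) using a conjugate Beta-Bernoulli model, where the posterior predictive $p(x|\chi)=\int p(x|\theta)p(\theta|\chi)d\theta$ has a known closed form, so the integral can be checked directly. The prior parameters `a, b` and the data `chi` are arbitrary choices for the demonstration:

```python
import math

# Beta-Bernoulli conjugate model (illustrative values, chosen arbitrarily).
# Prior: p(theta) = Beta(a, b); data chi are i.i.d. Bernoulli(theta).
a, b = 2.0, 2.0
chi = [1, 1, 0, 1, 0, 1, 1]          # previously seen samples
heads, n = sum(chi), len(chi)

# Conjugacy: posterior p(theta | chi) = Beta(a + heads, b + n - heads).
a_post, b_post = a + heads, b + n - heads

# Closed-form posterior predictive: p(x=1 | chi) = E[theta | chi].
pred_closed = a_post / (a_post + b_post)

def beta_pdf(t, alpha, beta):
    """Density of Beta(alpha, beta) at t in (0, 1)."""
    log_norm = (math.lgamma(alpha + beta)
                - math.lgamma(alpha) - math.lgamma(beta))
    return math.exp(log_norm
                    + (alpha - 1) * math.log(t)
                    + (beta - 1) * math.log(1 - t))

# Numerical version of the integral
#   p(x=1 | chi) = integral over theta of p(x=1|theta) p(theta|chi) dtheta,
# where p(x=1|theta) = theta, via a midpoint Riemann sum on (0, 1).
N = 10_000
grid = [(i + 0.5) / N for i in range(N)]
pred_integral = sum(t * beta_pdf(t, a_post, b_post) for t in grid) / N

print(pred_closed)      # closed-form posterior predictive P(x=1 | chi)
print(pred_integral)    # numerical integral; agrees closely
```

The two numbers match, which is exactly the point of the i.i.d. argument above: inside the integral, $p(x|\theta,\chi)$ reduces to $p(x|\theta)$, so the integrand only needs the likelihood term and the posterior over $\theta$.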
I am not sure whether my derivation and argument are sound. Could you tell me more about this kind of statistical inference? Thanks a lot :)