I understand EP by reading Minka's thesis:
http://research.microsoft.com/en-us/um/people/minka/papers/ep/minka-ep-uai.pdf
I'm trying to apply it to solve a Bayesian inference problem. However, I'm confused by its way of propagating the prior's effect into the approximating distribution.
Although the formulation states $t_0(\theta) = p(\theta)$, the algorithm seems to do no computation that involves this term. We just start from an initial Gaussian, replace one of the approximate factors with the corresponding true likelihood, and match moments. So over the whole process, we only work with $q$, the approximation, and the true likelihoods $p(x_i|\theta)$.
I suspect my understanding is wrong somewhere. Could anyone help me with this? Which step in EP takes the prior into account? I've been struggling with this for many days... thanks!
The prior is taken into account by the initial Gaussian: $q$ is initialized to the prior, which is exactly the role of $t_0(\theta) = p(\theta)$. After that, the prior factor is never refined; only the likelihood sites are. The algorithm in Section 3.1 uses the prior only in step 2, and similarly in Section 5.
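To see the bookkeeping concretely, here is a minimal sketch of EP (my own toy example, not the clutter problem from the thesis): inferring a Gaussian mean with a Gaussian prior and Gaussian likelihoods. Since every factor is Gaussian, moment matching is exact and EP recovers the conjugate posterior in one sweep, but the structure is the same as in the general case: the prior enters only through the initialization of $q$, and the loop updates only the likelihood sites.

```python
def ep_gaussian_mean(xs, prior_mean=0.0, prior_var=1.0, lik_var=1.0, n_iters=5):
    """Toy EP for a Gaussian mean with known noise variance.

    The prior enters exactly once: q(theta) is initialized to
    t_0(theta) = p(theta). It is never touched again; the loop
    only refines the likelihood site approximations.
    """
    # Work in natural parameters: precision tau, and nu = tau * mean.
    prior_tau = 1.0 / prior_var
    prior_nu = prior_tau * prior_mean

    # One site approximation per data point, initialized flat (tau = nu = 0).
    site_tau = [0.0] * len(xs)
    site_nu = [0.0] * len(xs)

    # q = prior * prod(sites); with flat sites, q starts equal to the prior.
    q_tau = prior_tau + sum(site_tau)
    q_nu = prior_nu + sum(site_nu)

    for _ in range(n_iters):
        for i, x in enumerate(xs):
            # 1. Remove site i from q to form the cavity distribution q^{\i}.
            cav_tau = q_tau - site_tau[i]
            cav_nu = q_nu - site_nu[i]
            # 2. Combine the cavity with the exact likelihood p(x_i | theta).
            #    Here the likelihood is Gaussian, so the product is Gaussian
            #    and moment matching is exact (trivial).
            new_tau = cav_tau + 1.0 / lik_var
            new_nu = cav_nu + x / lik_var
            # 3. Update site i so that cavity * site reproduces the new q.
            site_tau[i] = new_tau - cav_tau
            site_nu[i] = new_nu - cav_nu
            q_tau, q_nu = new_tau, new_nu

    return q_nu / q_tau, 1.0 / q_tau  # posterior mean and variance


mean, var = ep_gaussian_mean([1.0, 2.0, 3.0])
# Matches the exact conjugate posterior:
# precision = 1 + n = 4, mean = sum(xs) / 4 = 1.5, variance = 0.25
```

In a non-Gaussian model (like the clutter problem), step 2 would require projecting the tilted distribution onto a Gaussian by computing its moments, but the prior's role is unchanged: it is folded into $q$ once at initialization and sits there untouched while the sites are iteratively refined.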