I am learning policy gradients from the slides of Stanford's CS231 reinforcement learning lecture:
\begin{align} \tau &= (s_0, a_0, r_0, s_1, a_1, r_1, ...) \\ J(\theta)&=\mathbb{E}_\tau [r(\tau)] \\ &=\int_\tau r(\tau) p(\tau;\theta)d\tau \\ \nabla_\theta J(\theta) &= \int_\tau r(\tau)\nabla_\theta p(\tau;\theta)d\tau \end{align}
Can anyone tell me why the last integral is intractable?
It's not that the integral itself is mathematically unsolvable; you have to understand "intractable" in context (i.e., with respect to Monte Carlo simulation).
If I'm recalling it right, this is the mathematical basis of the REINFORCE algorithm, where you want to maximize the objective function through gradient updates estimated by Monte Carlo simulation. Here comes the problem: in the last integral there is no $p(\tau;\theta)$ factor readily available that lets you write it in expectation form, so you cannot estimate it by sampling trajectories. The missing piece is just that $p$ term, but with a bit of algebra (the log-derivative trick) you can extract such a $p$ term inside the integral and transform the gradient into an expectation w.r.t. the policy distribution.
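Concretely, the "bit of algebra" is the likelihood-ratio identity $\nabla_\theta p = p \, \nabla_\theta \log p$, applied inside the integral from the question:

\begin{align} \nabla_\theta J(\theta) &= \int_\tau r(\tau)\, p(\tau;\theta)\, \nabla_\theta \log p(\tau;\theta)\, d\tau \\ &= \mathbb{E}_\tau\!\left[ r(\tau)\, \nabla_\theta \log p(\tau;\theta) \right] \end{align}

Now there is an explicit $p(\tau;\theta)$ to integrate against, so the gradient is an expectation you can approximate by sampling trajectories from the current policy.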
Once you have that nice expectation form of the gradient, you can estimate it from sampled trajectories in the Monte Carlo setting, and you are good to go.
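To make the Monte Carlo estimation concrete, here is a minimal sketch (not from the slides) for a hypothetical one-step softmax "bandit" policy, where a trajectory is just a single action, $r(\tau)$ is that action's reward, and $\nabla_\theta \log \pi(a;\theta)$ has a closed form. The setup (two arms, rewards `[1.0, 0.0]`) is an assumption purely for illustration:

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def reinforce_gradient(theta, rewards, n_samples=5000, rng=None):
    """Monte Carlo estimate of grad J = E[ r(a) * grad log pi(a; theta) ]
    for a one-step softmax policy (toy stand-in for trajectories)."""
    rng = rng or np.random.default_rng(0)
    probs = softmax(theta)
    grad = np.zeros_like(theta)
    for _ in range(n_samples):
        a = rng.choice(len(theta), p=probs)   # sample from the policy
        # grad_theta log softmax(theta)[a] = one_hot(a) - probs
        glogp = -probs.copy()
        glogp[a] += 1.0
        grad += rewards[a] * glogp
    return grad / n_samples

theta = np.zeros(2)                 # uniform initial policy
rewards = np.array([1.0, 0.0])      # arm 0 pays 1, arm 1 pays 0
g = reinforce_gradient(theta, rewards)
# the estimated gradient pushes probability mass toward the better arm
```

For this toy case the exact gradient is $[0.25, -0.25]$, so you can check that the sample average converges to it; that convergence is exactly why the expectation form is "tractable" while the raw $\int_\tau r(\tau)\nabla_\theta p(\tau;\theta)d\tau$ form is not.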