How to derive the policy gradient for finite-difference methods used for policy search?


According to *A Survey on Policy Search for Robotics* (https://spiral.imperial.ac.uk/bitstream/10044/1/12051/4/2300000021-Deisenroth-Vol2-ROB-021_published.pdf),

[The screenshots show the survey's finite-difference gradient estimator: small perturbations $\Delta\theta_i$ of the policy parameters $\theta$ are applied, the resulting changes in expected return $\Delta\hat{J}_i \approx J(\theta + \Delta\theta_i) - J(\theta)$ are collected, and the gradient is estimated by least squares as]

$$\nabla_\theta J \approx (\Delta\Theta^{\mathsf T} \Delta\Theta)^{-1} \Delta\Theta^{\mathsf T} \Delta\hat{J},$$

[where the rows of $\Delta\Theta$ are the perturbations $\Delta\theta_i^{\mathsf T}$.]

Firstly, I cannot see how $\nabla_\theta J$ can be derived from the perturbations. Using a first-order Taylor expansion, I can only see that the term given at the end is $\nabla R$. Is it that $\nabla_\theta J = \nabla R$? If so, why? If not, how can I derive this result?
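For concreteness, here is a minimal sketch (not from the post) of the finite-difference estimator as I understand it: a first-order Taylor expansion gives $\Delta\hat{J}_i \approx \Delta\theta_i^{\mathsf T}\, \nabla_\theta J$ for each perturbation, so stacking the perturbations yields an over-determined linear system whose least-squares solution is the gradient estimate. The quadratic objective below is a toy stand-in for the true expected return $J(\theta)$, which in practice would be estimated from rollouts.

```python
import numpy as np

def J(theta):
    # Toy stand-in for the expected return J(theta);
    # in a real policy-search setting this would be estimated from rollouts.
    return -np.sum((theta - 1.0) ** 2)

rng = np.random.default_rng(0)
theta = np.zeros(3)   # current policy parameters
N = 50                # number of perturbations
sigma = 1e-3          # perturbation scale

# Rows of dTheta are the perturbations Delta-theta_i;
# dJ holds the corresponding return differences Delta-J_i.
dTheta = rng.normal(scale=sigma, size=(N, theta.size))
dJ = np.array([J(theta + d) - J(theta) for d in dTheta])

# Least-squares solution of dTheta @ grad = dJ, i.e.
# grad ≈ (dTheta^T dTheta)^{-1} dTheta^T dJ
grad_fd, *_ = np.linalg.lstsq(dTheta, dJ, rcond=None)

# Analytic gradient of the toy objective, for comparison
grad_true = -2.0 * (theta - 1.0)
print(grad_fd, grad_true)
```

The least-squares form reduces to the pseudo-inverse expression from the survey; with perturbations small enough that higher-order Taylor terms are negligible, the estimate matches the analytic gradient closely.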