In the TRPO paper, the authors say:
"we first prove that minimizing a certain surrogate objective function guarantees policy improvement with non-trivial step sizes."
What counts as a non-trivial step size here? Something like 1e-2 or 1e-3?
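For context, the guarantee I believe the quote refers to is the lower bound in the paper's Theorem 1. This is written from my own reading, so the notation may not match the paper exactly: $\eta$ is the expected discounted return, $L_{\pi}$ is the surrogate objective, $A_{\pi}$ is the advantage function, and $\gamma$ is the discount factor.

$$
\eta(\tilde{\pi}) \ge L_{\pi}(\tilde{\pi}) - C \, D_{\mathrm{KL}}^{\max}(\pi, \tilde{\pi}), \qquad C = \frac{4 \epsilon \gamma}{(1 - \gamma)^2}, \qquad \epsilon = \max_{s,a} \lvert A_{\pi}(s, a) \rvert
$$

If that is right, the "step" would be measured by the KL divergence between the old policy $\pi$ and the new policy $\tilde{\pi}$, which is why I am unsure how it maps to concrete numbers.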
TRPO paper: arxiv.org/pdf/1502.05477.pdf