Confusion with the chain rule: derivative of a scalar with respect to a vector

I am studying reinforcement learning and came across the following derivation: [image: the derivation in question]

Here $\delta$ is a function of $\theta$, and $\phi$ is independent of $\theta$.

I am confused by the third line (annotated with a red arrow). I applied the following identity:

[image: the identity used]

In my derivation, the third line becomes:

$$ -\alpha\, \mathbb{E}[\phi\phi^T]^{-1}\, \mathbb{E}[\delta\phi]\, \left(\nabla_\theta \mathbb{E}[\delta\phi]\right) $$

What am I missing?
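
For reference, here is a component-wise sketch of the chain rule for a quadratic form. It assumes the objective being differentiated is $J(\theta) = \mathbb{E}[\delta\phi]^T\, \mathbb{E}[\phi\phi^T]^{-1}\, \mathbb{E}[\delta\phi]$, which the expression above suggests but which only the original image can confirm. The names $J$, $u$, and $A$ are introduced here for illustration only. Write $u(\theta) = \mathbb{E}[\delta\phi]$ and $A = \mathbb{E}[\phi\phi^T]^{-1}$, where $A$ is symmetric and does not depend on $\theta$:

$$
\begin{aligned}
\frac{\partial J}{\partial \theta_k}
  &= \sum_{i,j} \frac{\partial u_i}{\partial \theta_k} A_{ij} u_j
   + \sum_{i,j} u_i A_{ij} \frac{\partial u_j}{\partial \theta_k}
   = 2 \sum_{i,j} \frac{\partial u_i}{\partial \theta_k} A_{ij} u_j, \\
\nabla_\theta J
  &= 2 \left(\frac{\partial u}{\partial \theta}\right)^{T} A\, u,
  \qquad \left(\frac{\partial u}{\partial \theta}\right)_{ik} = \frac{\partial u_i}{\partial \theta_k} = \mathbb{E}\!\left[\phi_i \frac{\partial \delta}{\partial \theta_k}\right], \\
\nabla_\theta J
  &= 2\, \mathbb{E}\!\left[(\nabla_\theta \delta)\, \phi^T\right] \mathbb{E}[\phi\phi^T]^{-1}\, \mathbb{E}[\delta\phi].
\end{aligned}
$$

Under these assumptions, the Jacobian of $\mathbb{E}[\delta\phi]$ enters transposed and on the left of $\mathbb{E}[\phi\phi^T]^{-1}\,\mathbb{E}[\delta\phi]$. In the expression from my derivation it sits on the right of a column vector, so the dimensions there do not match.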