In the article "Barren plateaus in quantum neural network training landscapes", the objective function $E(\theta)$ is defined as $$ E(\theta) = \langle 0|U(\theta)^\dagger H U(\theta)|0\rangle $$ and one has $$ \partial_k E = i\langle 0|U_-^\dagger\,[V_k, U_+^\dagger H U_+]\,U_-|0\rangle, $$ where $U_- = \prod^{k-1}_{l=0}U_l(\theta_l)W_l$ and $U_+ = \prod^{L}_{l=k}U_l(\theta_l)W_l$, together with the assumption that $$ p(U) = \int dU_+\,p(U_+)\int dU_-\,p(U_-)\,\delta(U_+U_- - U). $$ From this assumption, $\langle\partial_k E\rangle = \int dU\, p(U)\,\partial_k \langle 0|U(\theta)^\dagger H U(\theta)|0\rangle$.
Here is my question. The final expectation of the partial derivative is shown as $$ \langle\partial_k E\rangle = i\int dU_-\,p(U_-)\,\mathrm{Tr}\Big\{\rho_-\int dU_+\,p(U_+)\,[V, U_+^\dagger H U_+]\Big\}. $$ I want to know why the trace appears in this last equation!
I suppose the trace appears because they define $\rho_- = U_-|0\rangle\langle 0|\,U_-^\dagger$, i.e. the projector onto $U_-|0\rangle$. By cyclicity of the trace, for any operator $X$ one has $\langle 0|U_-^\dagger X U_-|0\rangle = \mathrm{Tr}\big(U_-|0\rangle\langle 0|U_-^\dagger\, X\big) = \mathrm{Tr}(\rho_-\, X)$. Applying this to your equation for $\partial_k E$ gives $$ \partial_k E = i\,\mathrm{Tr}\big(\rho_-\,[V_k, U_+^\dagger H U_+]\big). $$ Now take the expectation with respect to the law $p$ and use your formula for $p(U)$: since $\rho_-$ depends only on $U_-$ and the commutator only on $U_+$, $$ \langle\partial_k E\rangle = i \int p(U)\,\mathrm{Tr}\big(\rho_-\,[V_k, U_+^\dagger H U_+]\big)\,\mathrm dU = i \int p(U_-)\,\mathrm{Tr}\!\left(\rho_- \int p(U_+)\,[V_k, U_+^\dagger H U_+]\,\mathrm dU_+\right) \mathrm dU_-, $$ which is your formula, except that there is a $V_k$ instead of $V$.
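To make the cyclicity step concrete, here is a minimal numerical sanity check (my own sketch, not from the paper): it draws random unitaries $U_\pm$ and random Hermitian $H$, $V$, and verifies that $i\langle 0|U_-^\dagger [V, U_+^\dagger H U_+] U_-|0\rangle$ equals $i\,\mathrm{Tr}(\rho_-[V, U_+^\dagger H U_+])$ with $\rho_- = U_-|0\rangle\langle 0|U_-^\dagger$. All function names here are my own choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # Hilbert-space dimension (e.g. 2 qubits)

def random_unitary(d):
    """Random unitary via QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, r = np.linalg.qr(z)
    # fix the phases of the diagonal of r so the distribution is Haar
    return q * (np.diag(r) / np.abs(np.diag(r)))

def random_hermitian(d):
    m = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return (m + m.conj().T) / 2

U_minus, U_plus = random_unitary(d), random_unitary(d)
H, V = random_hermitian(d), random_hermitian(d)

ket0 = np.zeros(d, dtype=complex)
ket0[0] = 1.0

# A = [V, U_+^dagger H U_+]
M = U_plus.conj().T @ H @ U_plus
A = V @ M - M @ V

# left-hand side: i <0| U_-^dagger A U_- |0>
lhs = 1j * (ket0.conj() @ U_minus.conj().T @ A @ U_minus @ ket0)

# right-hand side: i Tr(rho_- A) with rho_- = U_-|0><0|U_-^dagger
psi = U_minus @ ket0
rho_minus = np.outer(psi, psi.conj())
rhs = 1j * np.trace(rho_minus @ A)

print("identity holds:", np.allclose(lhs, rhs))
```

Since $\rho_-$ is rank one, $\mathrm{Tr}(\rho_- A)$ is nothing more than the expectation value $\langle 0|U_-^\dagger A U_-|0\rangle$ rewritten, which is exactly why the trace appears.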