Let us define a parametric statistical model as a triplet consisting of a probability space $( \Omega, \mathcal{F}, P )$, a parameter space $\Theta$ (open subset of $\mathbb{R}^d$), and a measurable map
$$ X : \Omega \times \Theta \to \mathcal{X} \subset \mathbb{R}^n, \qquad (\omega, \theta) \mapsto X(\omega, \theta), $$
where $\Theta$ is equipped with its Borel $\sigma$-field. Let us define a statistic $T: \mathcal{X} \rightarrow \mathbb{R}^m$ as a measurable map.
Suppose also that the family of random variables $\{X(\cdot, \theta),\ \theta \in \Theta\}$ is regular in the following sense:
- $X$ has a density $p(\cdot\,; \theta) \in C^1$ for every $\theta \in \Theta$, with support $\operatorname{supp}(X)$ independent of $\theta$.
- $X(\omega, \cdot) \in C^1$ as a function of $\theta$, for all $\omega \in \Omega$. Furthermore, $\partial_{\theta} X_j \in L^2(\Omega)$ and $E(\partial_{\theta} X_j \mid X = x) \in C^1$ as a function of $x$, for all $\theta \in \Theta$ and $j = 1, \dots, n$.
I then find the following identity stated:
$$ E\left( \sum_{j=1}^n \partial_{x_j} T(X) \partial_{\theta} X_j \right) = \int_{\mathbb{R}^n} \sum_{j=1}^n \partial_{x_j} T(x) E[ \partial_{\theta} X_j | X = x ] p(x; \theta) dx $$
How does one prove this equality?
Also, if $\partial_{\theta} X_j$ were a function of $X$, would there be anything wrong in writing $ E\left( \sum_{j=1}^n \partial_{x_j} T(X) \partial_{\theta} X_j \right) = \int_{\mathbb{R}^n} \sum_{j=1}^n \partial_{x_j} T(x) \partial_{\theta} x_j(\theta) p(x; \theta) dx $ ?
I am a bit lost on the need for the conditional expectation.
EDIT: An example for $X$ could be $X = (X_1, X_2, \dots, X_n)$ with $X_j = U_j / \theta$, where the $U_j$ are i.i.d. random variables with $U_j \sim \mathrm{Exp}(1)$.
Using the tower law, and the fact that $\partial_{x_j} T(X)$ is $\sigma(X)$-measurable (so it can be pulled out of the inner conditional expectation),
$$ \begin{align*} E\left( \sum_{j=1}^n \partial_{x_j} T(X) \partial_{\theta} X_j \right) &= E\left( E\left( \sum_{j=1}^n \partial_{x_j} T(X) \partial_{\theta} X_j \mid X \right)\right)\\ &= E\left( \sum_{j=1}^n \partial_{x_j} T(X) E\left( \partial_{\theta} X_j \mid X \right)\right)\\ &=\int_{\mathbb{R}^n} \sum_{j=1}^n \partial_{x_j} T(x) E[ \partial_{\theta} X_j | X = x ] p(x; \theta) dx \end{align*}$$
The last equality is just the definition of conditional expectation: $E[\partial_\theta X_j \mid X = x]$ is the measurable function of $x$ that, integrated against the density $p(x;\theta)$, reproduces the expectation.
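As a quick Monte Carlo sanity check of the identity, take the exponential example from your edit, $X_j = U_j/\theta$ with $U_j \sim \mathrm{Exp}(1)$, so $\partial_\theta X_j = -U_j/\theta^2 = -X_j/\theta$ (here a function of $X$, so the conditional expectation is trivial), and pick $T(x) = \sum_j x_j$ (my choice, not from the question), giving $\partial_{x_j} T \equiv 1$. Since $E[X_j] = 1/\theta$, both sides of the identity should equal $-n/\theta^2$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, theta, N = 5, 2.0, 1_000_000

# X_j = U_j / theta with U_j ~ Exp(1), so partial_theta X_j = -U_j/theta^2 = -X_j/theta
U = rng.exponential(1.0, size=(N, n))
X = U / theta
dX = -X / theta  # partial_theta X_j; here it happens to be a function of X itself

# T(x) = sum_j x_j  =>  partial_{x_j} T = 1 for every j
lhs = np.mean(np.sum(1.0 * dX, axis=1))

print(lhs, -n / theta**2)  # Monte Carlo estimate vs exact value -n/theta^2
```

The estimate should match $-n/\theta^2 = -1.25$ to within Monte Carlo error.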
As you say, if $\partial_\theta X_j$ is a function of $X$ and $\theta$, then the conditional expectation isn't required. I think in normal usage, that would be true, but it isn't actually implied by the assumptions; for example, suppose $\Omega = \mathbb R^{2n}$, and $X_j = \omega_j + \theta \omega_{n+j}$. Then $\partial_\theta X_j = \omega_{n+j}$ is not a function of $X$ and $\theta$.
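To make that counterexample concrete, suppose additionally (my assumption, just for illustration) that under $P$ the coordinates of $\omega$ are i.i.d. standard normal. Writing a single coordinate as $X = Z + \theta W$ with $Z, W \sim N(0,1)$ independent, we have $\partial_\theta X = W$, and Gaussian conditioning gives $E[W \mid X] = \theta X / (1 + \theta^2)$, which is genuinely different from $W$ itself. A small simulation comparing the empirical conditional mean of $W$ near a point $X \approx x_0$ with this formula:

```python
import numpy as np

rng = np.random.default_rng(1)
theta, N = 0.7, 2_000_000

# One coordinate suffices: X = Z + theta*W with Z, W iid N(0,1),
# so partial_theta X = W, and Gaussian conditioning gives
# E[W | X] = theta * X / (1 + theta^2).
Z = rng.standard_normal(N)
W = rng.standard_normal(N)
X = Z + theta * W

# Empirical conditional mean of W, given X in a narrow slab around x0
x0 = 1.0
mask = np.abs(X - x0) < 0.02
empirical = W[mask].mean()
predicted = theta * x0 / (1 + theta**2)

print(empirical, predicted)  # these agree, but W itself is not a function of X
```

So the identity with the conditional expectation still holds, while the version with $\partial_\theta x_j(\theta)$ in place of $E[\partial_\theta X_j \mid X = x]$ would be meaningless here.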