What is the relationship between the Leibniz integral rule and the dominated convergence theorem?


The Leibniz integral rule states

$$ {\displaystyle {\frac {d}{dx}}\left(\int _{a}^{b}f(x,t)\,dt\right)=\int _{a}^{b}{\frac {\partial }{\partial x}}f(x,t)\,dt} $$

when the integration bounds do not depend on $x$, the variable with respect to which we differentiate.

The expectation of a continuous random variable (equivalently, an expectation with respect to a continuous distribution) is defined as an integral. More precisely, let $X$ be a continuous r.v. with density $p_{\theta}(x)$, parametrized by $\theta$. Then we have

$$ \mathbb{E}_{p_{\theta}(x)}\left[ f(x) \right] = \int p_{\theta}(x) f(x) dx $$

In certain cases, you need to take the derivative of this expectation with respect to the parameters $\theta$ (this is common in certain machine learning problems):

$$\frac{d}{d \theta}\mathbb{E}_{p_{\theta}(x)}\left[ f(x) \right]$$

Some papers seem to apply the Leibniz integral rule to get

\begin{align} \frac{d}{d \theta}\mathbb{E}_{p_{\theta}(x)}\left[ f(x) \right] & \stackrel{?}{=} \mathbb{E}_{p_{\theta}(x)}\left[ \frac{d}{d \theta} f(x) \right] \\ & \stackrel{?}{=} \int \frac{d}{d \theta} \left[ f(x) p_{\theta}(x) \right] dx \\ \end{align}

Some papers say that we can bring the derivative inside the expectation because of the dominated convergence theorem, which I am not familiar with. Could someone clarify the relationship between the Leibniz integral rule above and dominated convergence (specifically, in the context of taking derivatives of expectations, i.e. probability theory and statistics)? Is the DCT just the way to prove the Leibniz integral rule? If so, can you show that?

Moreover, as you can see above, I have $\mathbb{E}_{p_{\theta}(x)}\left[ \frac{d}{d \theta} f(x) \right] \stackrel{?}{=} \int \frac{d}{d \theta} \left[ f(x) p_{\theta}(x) \right] dx$. However, $p_{\theta}(x) \frac{d}{d \theta} f(x) \neq \frac{d}{d \theta} \left[ f(x) p_{\theta}(x) \right]$, so I suspect I've done something wrong, or that the DCT and the Leibniz integral rule are not applicable in the same contexts. Maybe the Leibniz integral rule is not directly applicable to expectations because they involve random variables and densities?
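As a concrete sanity check (not part of the original question), the second identity can be verified numerically on a hypothetical toy example: take $p_\theta = \mathcal{N}(\theta, 1)$ and $f(x) = x^2$, so that $\mathbb{E}_{p_\theta}[f(X)] = \theta^2 + 1$ and the analytic derivative is $2\theta$. A minimal sketch, assuming NumPy:

```python
import numpy as np

# Sanity check of d/dθ E[f(X)] = ∫ d/dθ [f(x) p_θ(x)] dx on a toy example:
# p_θ = N(θ, 1), f(x) = x², so E[f(X)] = θ² + 1 and d/dθ E[f(X)] = 2θ.
theta = 1.3
x = np.linspace(theta - 10.0, theta + 10.0, 200001)
dx = x[1] - x[0]

p = np.exp(-(x - theta) ** 2 / 2) / np.sqrt(2 * np.pi)  # Gaussian density
f = x ** 2

# For the Gaussian, d/dθ p_θ(x) = (x - θ) p_θ(x); f does not depend on θ,
# so d/dθ [f(x) p_θ(x)] = f(x) (x - θ) p_θ(x).
rhs = np.sum(f * (x - theta) * p) * dx   # ∫ d/dθ [f(x) p_θ(x)] dx
lhs = 2 * theta                          # analytic d/dθ (θ² + 1)
print(lhs, rhs)                          # both ≈ 2.6
```

Note that the derivative hits the product $f(x) p_\theta(x)$, not $f$ alone, which is exactly the distinction raised above.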


There are 3 best solutions below

Answer 1 (9 votes):

What you want to do is to bring the limit operation inside the integral sign. This cannot be done in general; one classic counterexample is that if

$$f_n(x)=\begin{cases} n & x \in [0,1/n] \\ 0 & \text{otherwise} \end{cases}$$

then $$\lim\limits_{n \to \infty} \int_0^1 f_n(x)\, dx = 1 \quad\text{but}\quad \int_0^1 \lim\limits_{n \to \infty} f_n(x)\, dx = 0.$$

The Leibniz rule for fixed limits can be justified by the dominated convergence theorem, which is the most widely used method to prove that one can interchange limit and integral. It works fine in the setting of probability theory, since probability theory is based on the Lebesgue integral (which is how expectation is usually defined).
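A minimal numerical sketch of this counterexample (assuming NumPy):

```python
import numpy as np

def f(n, x):
    """The counterexample: f_n(x) = n on [0, 1/n], 0 otherwise."""
    return np.where((0.0 <= x) & (x <= 1.0 / n), float(n), 0.0)

# ∫_0^1 f_n dx = n * (1/n) = 1 exactly, for every n.
for n in [1, 10, 1000]:
    print(n, n * (1.0 / n))  # always 1.0

# But for any fixed x > 0, f_n(x) = 0 as soon as n > 1/x,
# so the pointwise limit is 0 (for x > 0), and ∫_0^1 lim f_n dx = 0.
x0 = 0.25
print([float(f(n, x0)) for n in [1, 10, 1000]])  # [1.0, 0.0, 0.0]
```

The mass of $f_n$ escapes into an ever-taller, ever-thinner spike at the origin, which is precisely what a dominating function rules out.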

As for the rest of your question, passing $\frac{d}{d\theta}$ through $\mathbb{E}_{p_\theta}$ is definitely not valid, but $\frac{d}{d\theta} \mathbb{E}_{p_\theta}[f(X)]=\int \frac{d}{d\theta} \left [ f(x) p_\theta(x) \right ] dx$ is correct (assuming $p_\theta$ is the density function of the distribution of $X$).

Answer 2 (0 votes):

First: $\frac{df(x)}{d\theta}=0$, since $f$ does not depend on $\theta$.

So your problem reduces to the following. There are two things to distinguish:

1-Dominated convergence theorem

Let $f_n : X \to \mathbb{R}$ be a sequence of functions converging pointwise to $f$, such that:

  • there exists an integrable $g$ such that for all $n \in \mathbb{N}$ and almost every $x \in X$, $|f_n(x)|\leq |g(x)|$.

It follows that $f$ is integrable and

$$ \int f = \lim_{n\to\infty} \int f_n = \int \lim_{n\to\infty} f_n. $$
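To contrast with the spike counterexample, here is a hypothetical dominated example where the interchange does hold, say $f_n(x) = x^n$ on $[0,1]$ with dominating function $g \equiv 1$ (a sketch assuming NumPy):

```python
import numpy as np

# Dominated example: f_n(x) = x**n on [0, 1], with |f_n(x)| <= g(x) = 1,
# and g is integrable on [0, 1].  Pointwise, f_n -> 0 for x in [0, 1).
x = np.linspace(0.0, 1.0, 100001)
dx = x[1] - x[0]

for n in [1, 10, 100, 1000]:
    integral = np.sum(x ** n) * dx   # Riemann sum ≈ exact value 1 / (n + 1)
    print(n, integral)

# The integrals tend to 0, matching ∫ lim f_n = ∫ 0 = 0, as the DCT guarantees.
```

Here $\lim \int f_n = \int \lim f_n = 0$, exactly as the theorem promises.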


The Leibniz rule (also known as differentiation under the integral sign) is a separate theorem, whose proof uses the DCT.


2-Leibniz derivation rule

Take a function $p:\Theta \times X \to Y$ verifying the following conditions:

  • For all $x\in X$, the function $\theta \mapsto p(\theta,x)$ is piecewise continuous and integrable over $\Theta$;
  • $p$ admits a partial derivative $\frac{\partial p}{\partial x}$ defined on $\Theta\times X$;
  • For all $x\in X$, the function $\theta \mapsto \frac{\partial p}{\partial x}(\theta,x)$ is piecewise continuous over $\Theta$;
  • For all $\theta \in \Theta$, the function $x\mapsto \frac{\partial p}{\partial x}(\theta,x)$ is continuous on $X$;
  • There exists $g:\Theta\to\mathbb R_+$, piecewise continuous and integrable, such that for all $x\in X$ and all $\theta\in \Theta$, $$\left|\frac{\partial p}{\partial x}(\theta,x)\right|\leq g(\theta).$$

Then $x \mapsto \int_\Theta p(\theta,x)\, d\theta$ is differentiable on $X$, with derivative $x \mapsto \int_\Theta \frac{\partial p}{\partial x}(\theta,x)\, d\theta$.
Answer 3 (0 votes):

We wish to prove $$ \frac{d}{dx}\left(\int _{a}^{b}f(x,t)\,dt\right)=\int _{a}^{b}\frac{\partial }{\partial x}f(x,t)\,dt. $$ We assume that $f:[c,d]\times [a,b]\to \mathbb{R}$ and that $f(x, \cdot)$ is integrable for each $x$. Further assume that there exists $g\in L^1$ such that $\left|\frac{\partial f}{\partial x} (x,t)\right|\leq g(t)$ for all $x$ and $t$. Let $F(x)=\int_a^b f(x,t) \,dt$.

To this end, put $h_n(t)=\frac{f(x_n, t)-f(x_0, t)}{x_n-x_0}$, where $x_n\to x_0$. Then $h_n(t)\to \frac{\partial f}{\partial x}(x_0, t)$ as $n\to \infty$. Further, by the mean value theorem, $$ |h_n(t)|\leq \sup_{x\in[c, d]}\left|\frac{\partial f}{\partial x} (x,t)\right|\leq g(t). $$ Hence the dominated convergence theorem implies that $$ F'(x_0)=\lim_{n\to\infty}\frac{F(x_n)-F(x_0)}{x_n-x_0}=\lim_{n\to\infty}\int_a^b h_n(t)\,dt=\int_a^b \frac {\partial }{\partial x} f(x_0,t)\, dt. $$
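This argument can be checked numerically on a concrete, hypothetical example, say $f(x,t)=\sin(xt)$ on $t\in[0,1]$: the difference quotient of $F$ at $x_0$ should agree with $\int_0^1 t\cos(x_0 t)\,dt$. A minimal sketch, assuming NumPy:

```python
import numpy as np

# Numerical check of F'(x0) = ∫_0^1 ∂f/∂x (x0, t) dt for f(x, t) = sin(x t),
# where ∂f/∂x (x, t) = t cos(x t).
t = np.linspace(0.0, 1.0, 100001)
dt = t[1] - t[0]

def integrate(y):
    """Trapezoid rule on the fixed grid t."""
    return (np.sum(y) - 0.5 * (y[0] + y[-1])) * dt

def F(x):
    return integrate(np.sin(x * t))

x0, h = 1.7, 1e-5
lhs = (F(x0 + h) - F(x0 - h)) / (2 * h)   # difference quotient of F at x0
rhs = integrate(t * np.cos(x0 * t))       # ∫_0^1 ∂f/∂x (x0, t) dt
print(lhs, rhs)                           # agree to ~6 decimal places
```

Here $|\partial f/\partial x| = |t\cos(xt)| \le 1$ on $[0,1]$, so $g(t) \equiv 1$ serves as the integrable dominating function required by the theorem.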