General Convergence of Exponential of Function to Dirac Delta


Let $f:\mathcal{X}\rightarrow \mathcal{F}$ be a function with a unique, positive maximum at $x^*=\arg\sup_{x}f(x)$, where $\mathcal{X}$ and $\mathcal{F}$ are both bounded. Let $f$ be locally smooth about $x^*$, that is, $\exists\ \Delta>0$ such that $f$ is $C^2$ on $\{x : \lVert x-x^*\rVert<\Delta\}$. Is it possible to prove that the distribution $p_\epsilon(x)=z_\epsilon\exp\left(\frac{f(x)}{\epsilon}\right)$, where $z_\epsilon=\left(\int\exp\left(\frac{f(x)}{\epsilon}\right)dx\right)^{-1}$ is the normalisation constant, converges to a Dirac delta distribution centred at $x^*$ in the limit $\epsilon\rightarrow 0$? I've been trying to formally show that

$$a)\quad\lim_{\epsilon\rightarrow0}\int z_\epsilon\exp\left(\frac{f(x)}{\epsilon}\right)\varphi(x)dx=\varphi(x^*)$$

for any continuous $\varphi(x)$, but I have limited background in analysis. Moreover, if this result holds, can it be extended to any joint $p_\epsilon(x)=z_\epsilon\exp\left(\frac{f(x)}{\epsilon}\right)q(x)$ with $z_\epsilon=\left(\int\exp\left(\frac{f(x)}{\epsilon}\right)q(x)dx\right)^{-1}$, where $q(x)$ is another distribution? That is, can we prove

$$b)\quad\lim_{\epsilon\rightarrow0}\int z_\epsilon\exp\left(\frac{f(x)}{\epsilon}\right)q(x)\varphi(x)dx=\varphi(x^*),$$

again for any continuous $\varphi(x)$?

For those interested, the context of this comes from reinforcement learning where we seek a policy $\pi(a\vert s)$ that is greedy with respect to a value function $Q(a,s)$, that is $\pi(a\vert s)=\delta(a\in\arg\max_{a'}Q(a',s))$.
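As a discrete illustration of this setting, here is a minimal sketch of a Boltzmann (softmax) policy over a finite action set, showing how it concentrates on the greedy action as $\epsilon$ shrinks. The function name `boltzmann_policy` and the Q-values are hypothetical and chosen only for illustration. Subtracting the maximum before exponentiating leaves the normalised distribution unchanged and avoids overflow.

```python
import math


def boltzmann_policy(q_values, eps):
    """Return the probabilities proportional to exp(q / eps).

    Subtracting max(q) first does not change the normalised result,
    but keeps every exponent <= 0 so exp() cannot overflow.
    """
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / eps) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical Q(a, s) values for three actions in one fixed state.
q_values = [1.0, 2.5, 2.0]

for eps in (1.0, 0.1, 0.01):
    probs = boltzmann_policy(q_values, eps)
    print(eps, [round(p, 4) for p in probs])
```

As `eps` decreases, the probability mass moves onto the action with the largest Q-value, which is the discrete counterpart of the delta limit asked about above.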

Many thanks in advance.


There are 1 best solutions below


I think I've come up with a proof for a). Sorry if it is a bit slow; computer scientists like things step-by-step!

Firstly, we define the auxiliary function to be
\begin{align} g(x):= f(x)-f(x^*). \end{align}

Note that $g(x)\le 0$, with equality only at $x=x^*$. Substituting $f(x)=g(x)+f(x^*)$ into $p_\varepsilon(x)$:
\begin{align}
p_\varepsilon(x)=&\frac{\exp\left(\frac{g(x)+f(x^*)}{\varepsilon}\right)}{\int_\mathcal{X}\exp\left(\frac{g(x)+f(x^*)}{\varepsilon}\right)dx},\\
=&\frac{\exp\left(\frac{g(x)}{\varepsilon}\right)\exp\left(\frac{f(x^*)}{\varepsilon}\right)}{\int_\mathcal{X}\exp\left(\frac{g(x)}{\varepsilon}\right)\exp\left(\frac{f(x^*)}{\varepsilon}\right)dx},\\
=&\frac{\exp\left(\frac{g(x)}{\varepsilon}\right)}{\int_\mathcal{X}\exp\left(\frac{g(x)}{\varepsilon}\right)dx}.\quad (1)
\end{align}

Now, substituting (1) into the limit in a) yields:
\begin{align}
\lim_{\varepsilon\rightarrow0}\int_{\mathcal{X}} \varphi(x)p_\varepsilon(x)dx=\lim_{\varepsilon\rightarrow0}\left(\int_{\mathcal{X}} \varphi(x)\frac{\exp\left(\frac{g(x)}{\varepsilon}\right)}{\int_\mathcal{X}\exp\left(\frac{g(x)}{\varepsilon}\right)dx}dx\right).
\end{align}

Using the substitution $u:=\frac{(x^*-x)}{\sqrt{\varepsilon}}$ to transform both integrals, we obtain (for $x\in\mathbb{R}^n$ the Jacobian factor is $\varepsilon^{n/2}$ rather than $\sqrt{\varepsilon}$, but it cancels between numerator and denominator either way)
\begin{align}
\lim_{\varepsilon\rightarrow0}\int_{\mathcal{X}} \varphi(x)p_\varepsilon(x)dx&=\lim_{\varepsilon\rightarrow0}\left(\int_{\mathcal{U}} \varphi(x^*-\sqrt{\varepsilon}u)\frac{\exp\left(\frac{g(x^*-\sqrt{\varepsilon}u)}{\varepsilon}\right)}{\int_\mathcal{U}\exp\left(\frac{g(x^*-\sqrt{\varepsilon}u)}{\varepsilon}\right)\sqrt{\varepsilon}du}\sqrt{\varepsilon}du\right),\\
&=\lim_{\varepsilon\rightarrow0}\left(\frac{\int_{\mathcal{U}} \varphi(x^*-\sqrt{\varepsilon}u)\exp\left(\frac{g(x^*-\sqrt{\varepsilon}u)}{\varepsilon}\right)du}{\int_\mathcal{U}\exp\left(\frac{g(x^*-\sqrt{\varepsilon}u)}{\varepsilon}\right)du}\right).\quad (2)
\end{align}

We now find $\lim_{\varepsilon\rightarrow0}\left(\frac{g(x^*-\sqrt{\varepsilon}u)}{\varepsilon}\right)$. Applying $\mathrm{L'H\hat{o}pital's}$ rule twice, differentiating with respect to $\sqrt{\varepsilon}$ each time (the second application uses $\nabla f(x^*)=0$ at the maximum), we have:
\begin{align}
\lim_{\varepsilon\rightarrow0}\left(\frac{g(x^*-\sqrt{\varepsilon}u)}{\varepsilon}\right)&=\lim_{\varepsilon\rightarrow0}\left(\frac{\partial_{\sqrt{\varepsilon}}g(x^*-\sqrt{\varepsilon}u)}{\partial_{\sqrt{\varepsilon}}\varepsilon}\right),\\
&=\lim_{\varepsilon\rightarrow0}\left(\frac{\partial_{\sqrt{\varepsilon}}f(x^*-\sqrt{\varepsilon}u)}{\partial_{\sqrt{\varepsilon}}\varepsilon}\right),\\
&=\lim_{\varepsilon\rightarrow0}\left(\frac{-u^\top\nabla f(x^*-\sqrt{\varepsilon}u)}{2\sqrt{\varepsilon}}\right),\\
&=\lim_{\varepsilon\rightarrow0}\left(\frac{\partial_{\sqrt{\varepsilon}}\left(-u^\top\nabla f(x^*-\sqrt{\varepsilon}u)\right)}{\partial_{\sqrt{\varepsilon}}(2\sqrt{\varepsilon})}\right),\\
&=\lim_{\varepsilon\rightarrow0}\left(\frac{u^\top\nabla^2 f(x^*-\sqrt{\varepsilon}u)u}{2}\right),\\
&=\frac{u^\top\nabla^2 f(x^*)u}{2}.
\end{align}

The integrand in the numerator of (2) therefore converges pointwise to $\varphi(x^*)\exp\left(\frac{u^\top\nabla^2 f(x^*)u}{2}\right)$, that is
\begin{align}
\lim_{\varepsilon\rightarrow0}\left(\varphi(x^*-\sqrt{\varepsilon}u)\exp\left(\frac{g(x^*-\sqrt{\varepsilon}u)}{\varepsilon}\right)\right)=\varphi(x^*)\exp\left(\frac{u^\top\nabla^2 f(x^*)u}{2}\right),\quad (3)
\end{align}
and the integrand in the denominator converges pointwise to $\exp\left(\frac{u^\top\nabla^2 f(x^*)u}{2}\right)$, that is
\begin{align}
\lim_{\varepsilon\rightarrow0}\left(\exp\left(\frac{g(x^*-\sqrt{\varepsilon}u)}{\varepsilon}\right)\right)=\exp\left(\frac{u^\top\nabla^2 f(x^*)u}{2}\right).\quad (4)
\end{align}

Assuming the second-order sufficient conditions for $x^*$ to be a maximum hold, we have $u^\top\nabla^2 f(x^*)u \le 0$ $\forall\ u\in\mathcal{U}$, with equality only when $u=0$. This implies that the limits in (3) and (4) are both bounded (indeed Gaussian-shaped, hence integrable) functions of $u$.
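The limit of $g(x^*-\sqrt{\varepsilon}u)/\varepsilon$ can be sanity-checked numerically. The sketch below assumes a one-dimensional example that is not from the question, $f(x)=\cos(x-0.3)$, for which $x^*=0.3$ and $f''(x^*)=-1$, so the predicted limit is $-u^2/2$. The names `X_STAR`, `HESSIAN` and `scaled_g` are illustrative only.

```python
import math

X_STAR = 0.3     # assumed maximiser of the example f below
HESSIAN = -1.0   # f''(X_STAR) for f(x) = cos(x - X_STAR)


def f(x):
    """Example objective (an assumption): smooth, unique maximum at X_STAR."""
    return math.cos(x - X_STAR)


def scaled_g(u, eps):
    """Evaluate g(x* - sqrt(eps) * u) / eps, with g(x) = f(x) - f(x*)."""
    x = X_STAR - math.sqrt(eps) * u
    return (f(x) - f(X_STAR)) / eps


u = 1.5
target = 0.5 * HESSIAN * u * u  # predicted limit u^T H u / 2

for eps in (1e-1, 1e-2, 1e-4):
    print(eps, scaled_g(u, eps), target)
```

The printed values approach `target` as `eps` decreases, in line with the L'Hôpital computation.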

By definition, we have $g(x^*-\sqrt{\varepsilon} u)\le0\ \forall\ u\in\mathcal{U}$, which implies that $\left\lvert\exp\left(\frac{g(x^*-\sqrt{\varepsilon}u)}{\varepsilon}\right)\right\rvert\le1$. Consequently, the integrand in the numerator of (2) is dominated by $\lVert \varphi(\cdot)\rVert_\infty$, that is \begin{align} \left\lvert\varphi(x^*-\sqrt{\varepsilon}u)\exp\left(\frac{g(x^*-\sqrt{\varepsilon}u)}{\varepsilon}\right)\right\rvert\le\lVert \varphi(\cdot)\rVert_\infty,\quad (5) \end{align} and the integrand in the denominator is dominated by $1$, that is \begin{align} \left\lvert\exp\left(\frac{g(x^*-\sqrt{\varepsilon}u)}{\varepsilon}\right)\right\rvert\le1.\quad (6) \end{align}

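Together, (3)-(6) are the ingredients for applying the dominated convergence theorem. One caveat: the constant bounds in (5) and (6) are integrable only over a bounded region, whereas $\mathcal{U}$ grows as $\varepsilon\rightarrow0$, so a fully rigorous argument needs an integrable dominating function (for instance a Gaussian envelope near $u=0$, plus the assumption that $g$ stays bounded away from $0$ outside a neighbourhood of $x^*$). Granting this, we can commute all limits and integrals in (2), yielding our desired result: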

\begin{align}
\lim_{\varepsilon\rightarrow0}\int_{\mathcal{X}} \varphi(x)p_\varepsilon(x)dx&=\lim_{\varepsilon\rightarrow0}\left(\frac{\int_{\mathcal{U}} \varphi(x^*-\sqrt{\varepsilon}u)\exp\left(\frac{g(x^*-\sqrt{\varepsilon}u)}{\varepsilon}\right)du}{\int_\mathcal{U}\exp\left(\frac{g(x^*-\sqrt{\varepsilon}u)}{\varepsilon}\right)du}\right),\\
&=\frac{\int_{\mathcal{U}}\lim_{\varepsilon\rightarrow0}\left( \varphi(x^*-\sqrt{\varepsilon}u)\exp\left(\frac{g(x^*-\sqrt{\varepsilon}u)}{\varepsilon}\right)\right)du}{\int_\mathcal{U}\lim_{\varepsilon\rightarrow0}\left(\exp\left(\frac{g(x^*-\sqrt{\varepsilon}u)}{\varepsilon}\right)\right)du},\\
&=\frac{\int_{\mathcal{U}} \varphi(x^*)\exp\left(\frac{u^\top\nabla^2 f(x^*)u}{2}\right)du}{\int_\mathcal{U}\exp\left(\frac{u^\top\nabla^2 f(x^*)u}{2}\right)du},\\
&=\varphi(x^*)\frac{\int_{\mathcal{U}} \exp\left(\frac{u^\top\nabla^2 f(x^*)u}{2}\right)du}{\int_\mathcal{U}\exp\left(\frac{u^\top\nabla^2 f(x^*)u}{2}\right)du},\\
&=\varphi(x^*).
\end{align}
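The final result can also be checked numerically. The following is a rough sketch under assumptions that are not part of the question: $\mathcal{X}=[-1,1]$, $f(x)=\cos(x-0.3)$ (unique maximum at $x^*=0.3$), test function $\varphi(x)=x^2+1$, and a plain midpoint rule for both integrals. It evaluates $\int\varphi(x)p_\varepsilon(x)dx$ using the shifted exponent $g=f-f(x^*)$ from (1), which also keeps the exponentials from overflowing.

```python
import math

X_STAR = 0.3             # assumed maximiser of the example f
LOW, HIGH = -1.0, 1.0    # assumed bounded domain X


def f(x):
    """Example objective (an assumption) with a unique maximum on [LOW, HIGH]."""
    return math.cos(x - X_STAR)


def phi(x):
    """Example continuous test function (an assumption)."""
    return x * x + 1.0


def expectation(eps, n=20000):
    """Midpoint-rule estimate of the integral of phi(x) * p_eps(x) over X.

    Uses exp(g / eps) with g = f - f(x*), as in equation (1); the common
    factor exp(f(x*) / eps) and the grid spacing cancel in the ratio.
    """
    h = (HIGH - LOW) / n
    num = 0.0
    den = 0.0
    for i in range(n):
        x = LOW + (i + 0.5) * h
        w = math.exp((f(x) - f(X_STAR)) / eps)
        num += phi(x) * w
        den += w
    return num / den


for eps in (1e-1, 1e-2, 1e-4):
    print(eps, expectation(eps), phi(X_STAR))
```

The estimates move towards `phi(X_STAR)` as `eps` shrinks. The grid must stay fine relative to the peak width $\sqrt{\varepsilon}$, so much smaller values of `eps` would need a larger `n`.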