Asymptotic behavior of $\mathbb{E}\left[\exp\left(-|X|^\nu\right)\right]$ with respect to the mean of the normal random variable $X$


Here is a statement that I would like to prove:

Let $X \sim \mathcal{N}(\mu, 1)$. Let $\nu >0$, show that $$ -\log \mathbb{E}\left[\exp\left(-|X|^\nu\right)\right] \quad\underset{\mu \to +\infty}{\sim}\quad \mu^\nu $$ where $f(x) \sim g(x)$ means $f(x) = g(x) + o(g(x))$.

I have checked numerically that it appears to hold for every $\nu>0$ I tried, but I cannot prove it.
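For reference, here is the kind of numerical check one can run. This is only a sketch (it assumes SciPy, and the integration window and peak-locating grid are my own choices); it works in the log domain so that $\exp(-\mu^\nu)$ never underflows:

```python
# Sketch: the ratio -log E[exp(-|X|^nu)] / mu^nu should drift toward 1
# as mu grows (at least for nu < 2), with X ~ N(mu, 1).
import numpy as np
from scipy import integrate

def neg_log_expectation(mu, nu):
    """-log E[exp(-|X|^nu)] for X ~ N(mu, 1), computed in the log domain
    so that exp(-mu^nu) never underflows."""
    g = lambda t: -np.abs(mu + t) ** nu - t ** 2 / 2  # log-integrand at x = mu + t
    ts = np.linspace(-mu, 40.0, 20001)                # coarse grid to locate the peak
    t0 = ts[np.argmax(g(ts))]
    # integrate the shifted, well-scaled integrand around the peak
    val, _ = integrate.quad(lambda t: np.exp(g(t) - g(t0)), t0 - 15, t0 + 15)
    return -(g(t0) + np.log(val / np.sqrt(2 * np.pi)))

for nu in (0.5, 1.0, 1.5):
    ratios = [neg_log_expectation(mu, nu) / mu ** nu for mu in (10.0, 50.0, 200.0)]
    print(nu, ratios)  # each sequence should move toward 1
```

For $\nu=1.5$ the convergence is visibly slow (the correction term is of order $\mu^{\nu-1}$ relative to $\mu^\nu$), which is consistent with the asymptotic statement holding only to first order.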

Attempt of proof: Here is what I have tried: \begin{align} \mathbb{E}\left[\exp\left(-|X|^\nu\right)\right] = \sum_{k=0}^\infty \frac{(-1)^k}{k!}\mathbb{E}\left[|X|^{\nu k}\right] \end{align} We have, for any $p>0$ [see here], $$ \mathbb{E}\left[|X|^{p}\right] = \frac{2^{\frac{p}{2}} \Gamma\left[\frac{1}{2} +\frac{p}{2}\right]}{\sqrt{\pi}} M\left(-\frac{p}{2}, \frac{1}{2}, -\frac{\mu^2}{2}\right) $$ where $M$ is the Kummer function. And we have [see here] $$ M\left(-\frac{p}{2}, \frac{1}{2}, -\frac{\mu^2}{2}\right) \quad\underset{|\mu| \to \infty}{\sim}\quad \frac{\Gamma(1/2) \left(\frac{\mu^2}{2}\right)^{\frac{p}{2}}}{\Gamma\left(\frac{1}{2} + \frac{p}{2}\right)} $$ Plugging these two together gives $$ \mathbb{E}\left[|X|^{p}\right] \quad\underset{\mu \to +\infty}{\sim}\quad \mu^p $$ Now I would like to conclude that \begin{align} \mathbb{E}\left[\exp\left(-|X|^\nu\right)\right] &\quad\underset{\mu \to +\infty}{\sim}\quad \sum_{k=0}^\infty \frac{(-1)^k}{k!}\mu^{k \nu} = \exp\left(-\mu^{\nu}\right) \end{align} but I cannot justify this step, because I have no control over the error term as a function of $k$ (which would be needed for a dominated or monotone convergence argument). Maybe this reference could help: here.
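As a sanity check on the two displayed formulas, one can compare the closed form against direct quadrature. A sketch (assuming SciPy, whose `hyp1f1` implements Kummer's $M$):

```python
# Sketch: cross-check the Kummer-function formula for E[|X|^p], X ~ N(mu, 1),
# against direct quadrature, and observe E[|X|^p] / mu^p -> 1.
import numpy as np
from scipy import integrate, special

def abs_moment_closed_form(mu, p):
    # E[|X|^p] = 2^(p/2) * Gamma(1/2 + p/2) / sqrt(pi) * M(-p/2, 1/2, -mu^2/2)
    return (2 ** (p / 2) * special.gamma(0.5 + p / 2) / np.sqrt(np.pi)
            * special.hyp1f1(-p / 2, 0.5, -mu ** 2 / 2))

def abs_moment_quad(mu, p):
    f = lambda x: np.abs(x) ** p * np.exp(-(x - mu) ** 2 / 2) / np.sqrt(2 * np.pi)
    return integrate.quad(f, mu - 12, mu + 12)[0]  # density negligible outside

for mu, p in [(0.0, 1.0), (3.0, 0.7), (10.0, 2.5)]:
    print(mu, p, abs_moment_closed_form(mu, p), abs_moment_quad(mu, p))
print(abs_moment_closed_form(20.0, 2.5) / 20.0 ** 2.5)  # should be near 1
```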

Note 1: I have also tried using the Delta method, but I did not succeed.

Note 2: In fact the statement that I need is a bit weaker: $$ \log\left[C_\nu-\log \mathbb{E}\left[\exp\left(-|X|^\nu\right)\right] \right]\quad\underset{\mu \to \infty}{\sim}\quad \nu \log \mu $$ where $C_\nu$ is a constant ensuring that the quantity inside the outer log is positive when $|\mu|>0$.


There are 3 answers below.

BEST ANSWER

Here is an argument for $\leq$:

Fix $v>0$. Define $f:[0,\infty)\rightarrow\mathbb{R}$ by $f(y) = \exp(-y^v)$. Then: \begin{align} f'(y) &= -vy^{v-1} \exp(-y^v)\\ f''(y) &= [(vy^{v-1})^2 - v(v-1)y^{v-2}]\exp(-y^v) \end{align} Notice that if $0 < v \leq 1$ then both terms in the bracket are nonnegative, so $f$ is a convex function over the domain $y \geq 0$. Further, if $v>1$, then $f''(y) \geq 0$ exactly when $vy^v \geq v-1$, so there is a threshold $\theta = \left(\frac{v-1}{v}\right)^{1/v}>0$ such that $f$ is convex over $[\theta, \infty)$.

Case $0 < v \leq 1$:

By convexity of $f$ for this case we get by Jensen's inequality: $$ E[\exp(-|X|^v)] = E[f(|X|)] \geq f(E[|X|]) = \exp(-E[|X|]^v) $$ Taking $-\log()$ of both sides gives: $$ \boxed{-\log(E[\exp(-|X|^v)]) \leq E[|X|]^v} $$ In fact this holds for any random variable $X$ provided that $E[|X|]$ is finite. Notice that if $X$ is $N(\mu,1)$ and $\mu$ is large then $E[|X|]\approx \mu$. This is because: $$ |X| = X - 2X1\{X<0\} \implies E[|X|] = \mu - 2E[X1\{X<0\}] $$ and $E[X1\{X<0\}]$ is very small when $\mu\rightarrow\infty$.
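The boxed bound and the claim $E[|X|]\approx\mu$ are easy to confirm numerically. Below is my own sketch (it assumes SciPy, and uses the exact identity $E[|X|] = \mu(2\Phi(\mu)-1) + 2\varphi(\mu)$, which is not stated in the answer):

```python
# Sketch: verify -log E[exp(-|X|^v)] <= E[|X|]^v for several v in (0, 1],
# and that E[|X|] = mu*(2*Phi(mu) - 1) + 2*phi(mu) is close to mu for large mu.
import numpy as np
from scipy import integrate, stats

def expect(fn, mu):
    """E[fn(X)] for X ~ N(mu, 1), by quadrature on mu +/- 12 sds."""
    f = lambda x: fn(x) * stats.norm.pdf(x - mu)
    return integrate.quad(f, mu - 12, mu + 12)[0]

for v in (0.3, 0.7, 1.0):
    for mu in (0.5, 2.0, 10.0):
        lhs = -np.log(expect(lambda x: np.exp(-np.abs(x) ** v), mu))
        abs_mean = mu * (2 * stats.norm.cdf(mu) - 1) + 2 * stats.norm.pdf(mu)
        print(v, mu, lhs, abs_mean ** v)  # lhs should not exceed the last column
```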

Case $v>1$:

Recall that $f(y)$ is convex over $y \in [\theta, \infty)$. Define the event $A=\{|X|\geq \theta\}$. Then taking expectations conditioned on $A$ we get by a similar argument: $$ \boxed{-\log(E[\exp(-|X|^v) \: |A]) \leq E[|X| \: | A]^v} $$ Now when $\mu\rightarrow \infty$ we get $P[A]\rightarrow 1$ and also $E[|X|] \approx \mu$ for large $\mu$.
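The conditional bound can also be spot-checked numerically. In the sketch below (assuming SciPy) I use the explicit threshold $\theta = ((v-1)/v)^{1/v}$, the root of $f''$; that formula is my computation rather than something stated in the answer:

```python
# Sketch: check -log E[exp(-|X|^v) | A] <= E[|X| | A]^v for v > 1,
# where A = {|X| >= theta} and theta = ((v-1)/v)^(1/v) is where f'' vanishes.
import numpy as np
from scipy import integrate, stats

def cond_expect(fn, mu, theta):
    """E[fn(X) | |X| >= theta] for X ~ N(mu, 1)."""
    f = lambda x: fn(x) * stats.norm.pdf(x - mu)
    num = (integrate.quad(f, theta, mu + 12)[0]
           + integrate.quad(f, -mu - 12, -theta)[0])
    prob = stats.norm.sf(theta - mu) + stats.norm.cdf(-theta - mu)  # P[|X| >= theta]
    return num / prob

v, mu = 2.0, 4.0
theta = ((v - 1) / v) ** (1 / v)
lhs = -np.log(cond_expect(lambda x: np.exp(-np.abs(x) ** v), mu, theta))
rhs = cond_expect(np.abs, mu, theta) ** v
print(theta, lhs, rhs)  # lhs should not exceed rhs
```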


Here is an approach to the lower bound (see my other answer for the upper bound).

Fix $v>0$, $\mu>1$ and let $X$ be $N(\mu, 1)$. Define the event $B_{\mu} = \{\mu - \sqrt{\mu} \leq X \leq \mu + \sqrt{\mu}\}$ and note that $\lim_{\mu\rightarrow\infty} P[B_{\mu}] =1$. Since $\mu>1$ we know $\mu - \sqrt{\mu}>0$. By Taylor's theorem we have, if $x \in [\mu - \sqrt{\mu}, \mu + \sqrt{\mu}]$: $$ \exp(-x^v) \leq \exp(-\mu^v) + (x-\mu)(-v\mu^{v-1}) \exp(-\mu^v) + \frac{(x-\mu)^2}{2}\exp(-\mu^v)h(\mu) $$ where $h(\mu)$ also depends on $v$ and I will not write it out since I'm lazy...

Hence, conditioning on $B_{\mu}$: $$ E[\exp(-X^v) | B_{\mu}] \leq \exp(-\mu^v) + \underbrace{E[X-\mu | B_{\mu}]}_{=0}(-v\mu^{v-1})\exp(-\mu^v) + \underbrace{E\left[\frac{(X-\mu)^2}{2} | B_{\mu} \right]}_{\leq 1/2}\exp(-\mu^v)h(\mu)$$ where the underbrace values hold because the distribution of $X$ is symmetric about $\mu$, and conditioning on being in an interval about $\mu$ reduces the variance. We get: $$ E[\exp(-X^v)|B_{\mu}] \leq \exp(-\mu^v)[ 1 + (1/2)h(\mu)] $$ Taking the $-\log(\cdot)$ of both sides gives $$ \boxed{-\log(E[\exp(-X^v)|B_{\mu}])\geq \mu^v - \log(1 + (1/2)h(\mu))} $$ and if someone is less lazy than me, that person could likely show $\log(1+(1/2)h(\mu))$ is asymptotically negligible in comparison to $\mu^v$.

Then note that for large $\mu$ we have $P[B_{\mu}]\approx 1$ and $X=|X|$ with high probability, leading to the desired result.


Below I complete Michael's proof, and then raise an issue.

1. Recall that $$ f''(\mu) = \left[ \nu\mu^{\nu} + (1 - \nu) \right] \nu \mu^{\nu-2} e^{-\mu^\nu} $$ I have chosen to focus on $x \in [\mu - \mu^\rho, \mu + \mu^\rho]$ for $0 < \rho <1$, and I have defined $$h(\mu) = 2 \nu (\mu - \mu^{\rho})^{2\nu-2} \exp(\mu^\nu - (\mu - \mu^{\rho})^{\nu})$$ so that, for $\mu$ large enough, $\exp(-\mu^\nu)h(\mu) \geq f''(x)$ for all $x \in [\mu - \mu^\rho, \mu + \mu^\rho]$; the targeted inequality then holds: $$ \exp(-x^\nu) \leq \exp(-\mu^\nu) + (x-\mu)(-\nu\mu^{\nu-1}) \exp(-\mu^\nu) + \frac{(x-\mu)^2}{2}\exp(-\mu^\nu)h(\mu) $$

It remains to show that $\log(1+h(\mu)/2)$ is negligible in comparison with $\mu^\nu$ for all $\nu >0$. We have $$ \log(1 + h(\mu)/2) \sim \log(h(\mu)/2) = \log \nu + (2\nu-2)\log(\mu - \mu^{\rho}) + \mu^\nu - (\mu - \mu^{\rho})^{\nu} $$ And the result follows since $$ (\mu - \mu^{\rho})^{\nu} = \mu^\nu (1-\mu^{\rho-1})^{\nu} = \mu^\nu (1+o(1)) = \mu^\nu + o(\mu^\nu) $$ Then, following Michael's arguments, the result holds true for all $\nu >0$.
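This negligibility claim is easy to check numerically, provided one works in the log domain, since $\exp(\mu^\nu - (\mu-\mu^\rho)^\nu)$ overflows long before the ratio stops shrinking. A sketch:

```python
# Sketch: log(h(mu)/2) / mu^nu should tend to 0, roughly like nu * mu**(rho - 1),
# for h(mu) = 2*nu*(mu - mu**rho)**(2*nu - 2) * exp(mu**nu - (mu - mu**rho)**nu).
import numpy as np

def log_h_over_2(mu, nu, rho):
    # evaluate log(h(mu)/2) directly; exp(mu**nu - (mu - mu**rho)**nu) would overflow
    return (np.log(nu) + (2 * nu - 2) * np.log(mu - mu ** rho)
            + mu ** nu - (mu - mu ** rho) ** nu)

nu, rho = 1.5, 0.5
ratios = [log_h_over_2(mu, nu, rho) / mu ** nu for mu in (1e2, 1e4, 1e6)]
print(ratios)  # decreasing toward 0
```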

2. Nevertheless, I have an issue. For $\nu=2$, I know that $$ -\log \mathbb{E}\left[\exp\left(-|X|^2\right)\right] = \frac13 \mu^2 + \frac12 \log 3 $$ which contradicts the result because of the factor $1/3$ (up to normalization, this is the density of a Gaussian of variance $1$ convolved with a Gaussian kernel of variance $1/2$, evaluated at $\mu$). See with Maple:

-log(int(1/sqrt(2*Pi)*exp(-x^2)*exp(-(x-mu)^2/2), x=-infinity..infinity));
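For readers without Maple, an equivalent quadrature check in Python (my sketch, assuming SciPy):

```python
# Sketch: confirm -log E[exp(-X^2)] = mu^2/3 + (1/2)*log(3) for X ~ N(mu, 1).
import numpy as np
from scipy import integrate

def neg_log_e(mu):
    f = lambda x: np.exp(-x ** 2) * np.exp(-(x - mu) ** 2 / 2) / np.sqrt(2 * np.pi)
    c = mu / 3  # the integrand peaks at x = mu/3
    return -np.log(integrate.quad(f, c - 10, c + 10)[0])

for mu in (0.0, 1.0, 3.0, 6.0):
    print(mu, neg_log_e(mu), mu ** 2 / 3 + 0.5 * np.log(3))
```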

My numerical simulations show that the asymptotic behavior is $\mu^\nu$ for $\nu < 2$, but not for $\nu = 2$. For $\nu > 2$, my simulations are too unstable to draw any conclusion. Where in the proof is the reason to discard $\nu = 2$?

We can check with Maple that it works for $\nu=1$ (for other values of $\nu$, Maple cannot evaluate the limit in closed form):

> limit(-log(int(1/sqrt(2*Pi)*exp(-abs(x)^1)*exp(-(x-mu)^2/2), x=-infinity..infinity))/mu^1, mu=infinity);
1

> limit(-log(int(1/sqrt(2*Pi)*exp(-abs(x)^2)*exp(-(x-mu)^2/2), x=-infinity..infinity))/mu^2, mu=infinity);                                                                          
1/3

Thanks