Large deviation theory: standard normal CDF


[Image: excerpt from the paper showing the derivation leading to equation (2.4)]

I'm reading this paper and have followed the derivation up to equation (2.4). However, I am not able to understand how (2.4) is derived. It implies that $$\lim_{N \rightarrow \infty} - \frac{2}{N} \log \left( \Phi\left(\frac{D\sqrt{N}}{\sigma\sqrt{T}}\right)\right) = \frac{D^2}{2\sigma^2 T} $$

Please could someone help with this derivation?


There are 2 answers below.


This is not an answer, but rather a long comment. Hopefully it can be of help.

Regarding my original comment: what I meant is that if we denote by $\Phi$ the standard Gaussian CDF, defined as $\Phi(x) :=\mathbb P (Z\le x)$ where $Z\sim\mathcal N(0,1)$, then $\lim_{x\to\infty}\Phi(x)=1$.
Hence $\lim_{x\to\infty}\log(\Phi(x))=\log(1)=0$, and similarly $\lim_{x\to\infty}\log(\Phi(x))/x=0$, which is why I think there is a problem somewhere.


If we overlook the above "detail", here is a lead on where the result comes from. First recall that $\Phi$ is given by the expression $$\Phi(x) = \frac 1 2 \left[1+\text{erf}\left(\frac{x}{\sqrt 2}\right)\right], $$ where $\text{erf}$ is the error function. Furthermore, on the Wikipedia page for the error function, we are given that the complementary error function $\text{erfc} := 1-\text{erf}$ has the following asymptotic expansion: $$\text{erfc}(x)= \frac{1}{x\sqrt{\pi}}e^{-x^2} \cdot \sum_{k=0}^\infty (-1)^k~\frac{(2k-1)!!}{(2x^2)^k } $$

Given that all higher-order terms become negligible as $x\to\infty$, we deduce the following asymptotic equivalent: $$ \text{erfc}(x)\sim_{x\to\infty}\frac{1}{x\sqrt{\pi}}e^{-x^2}, $$ which gives the following asymptotic equivalents for $\text{erf}$ and $\Phi$ (for $\Phi$, substitute $x/\sqrt 2$ and use $\Phi(x)=1-\frac12\text{erfc}(x/\sqrt 2)$): $$\begin{cases}\text{erf}(x)\sim_{x\to\infty}1-\frac{1}{x\sqrt{\pi}}e^{-x^2}\\ \Phi(x)\sim_{x\to\infty}1-\frac{1}{x\sqrt{2\pi}}e^{-x^2/2}\end{cases} $$ Plugging $x:=\frac{D\sqrt{N}}{\sigma\sqrt{T}}$ into the equivalent for $\Phi$ yields $$\Phi\left(\frac{D\sqrt{N}}{\sigma\sqrt{T}}\right)\sim_{N\to\infty}1-\frac{\sigma\sqrt{T}}{D\sqrt{2\pi N}}\exp\left(-\frac{D^2 N}{2\sigma^2 T}\right) $$ At this point, you would want to compose with $\log$ and multiply by $-2/N$, but you will find (e.g. by remembering that $\log(1-h)\sim_{h\to0}-h$) that the limit is equal to $0$, again...
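As a side note, the tail equivalent $1-\Phi(x)\sim \frac{1}{x\sqrt{2\pi}}e^{-x^2/2}$ (the classical Mills-ratio estimate) can be sanity-checked numerically; a minimal Python sketch using the standard library's `math.erfc`:

```python
import math

def gauss_tail(x):
    # Exact tail P(Z > x) = 1 - Phi(x), computed stably via erfc.
    return 0.5 * math.erfc(x / math.sqrt(2))

def tail_equivalent(x):
    # Leading-order equivalent e^{-x^2/2} / (x * sqrt(2*pi)).
    return math.exp(-x * x / 2) / (x * math.sqrt(2 * math.pi))

for x in (2.0, 5.0, 10.0):
    # The ratio tends to 1 as x grows.
    print(x, gauss_tail(x) / tail_equivalent(x))
```

The ratio is already within about 1% of $1$ at $x=10$, while any wrong constant in the prefactor would show up as a ratio bounded away from $1$.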

However, if we had replaced $\Phi$ by the tail function $\tilde\Phi:=1-\Phi$, we would get $$\tilde\Phi\left(\frac{D\sqrt{N}}{\sigma\sqrt{T}}\right)\sim_{N\to\infty}\frac{\sigma\sqrt{T}}{D\sqrt{2\pi N}}\exp\left(-\frac{D^2 N}{2\sigma^2 T}\right), $$ implying $$\log\left[\tilde\Phi\left(\frac{D\sqrt{N}}{\sigma\sqrt{T}}\right)\right]\sim_{N\to\infty}-\frac{D^2 N}{2\sigma^2 T} + \log\left(\frac{\sigma\sqrt{T}}{D\sqrt{2\pi N}}\right), $$ and finally, since only the term linear in $N$ survives the division by $N$, $$\lim_{N\to\infty}\frac{-2}{N}\log\left[\tilde\Phi\left(\frac{D\sqrt{N}}{\sigma\sqrt{T}}\right)\right]=\frac{D^2}{\sigma^2 T}, $$ which is the "desired result" up to a factor of $2$.
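This limit can be checked numerically. The Python sketch below evaluates the scaled log-tail for the arbitrary (hypothetical) choice $D=\sigma=T=1$; the values converge to $D^2/(\sigma^2 T)=1$, i.e. twice the $D^2/(2\sigma^2 T)$ quoted in the question, consistent with the factor-of-$2$ remark in the other answer:

```python
import math

# Hypothetical sample parameters; any positive D, sigma, T work.
D, sigma, T = 1.0, 1.0, 1.0

def scaled_log_tail(N):
    # Computes -(2/N) * log P(Z > D*sqrt(N) / (sigma*sqrt(T))),
    # with the tail evaluated stably via erfc.
    t = D * math.sqrt(N) / (sigma * math.sqrt(T))
    log_tail = math.log(0.5 * math.erfc(t / math.sqrt(2)))
    return -2.0 * log_tail / N

for N in (10, 100, 1000):
    # Decreases toward D**2 / (sigma**2 * T) = 1.0;
    # convergence is slow because of the log(N)/N correction term.
    print(N, scaled_log_tail(N))
```

The slow convergence is exactly the $\log\left(\frac{\sigma\sqrt T}{D\sqrt{2\pi N}}\right)/N$ term in the display above, which vanishes in the limit but is visible at moderate $N$.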


My thoughts: everything after the "However" part is not valid with respect to the claim we want to prove, and I doubt the original authors of the paper (which is apparently quite a famous paper) made a mistake; but at the same time it seems to be the only way to resolve the paradox!

What I suspect is that either:

  • There is some underlying convention/notation/terminology that I'm unaware of which implies that $\Phi$ should really be understood as $1-\Phi$,
  • Or the $\le$ symbol has to be changed into a $>$ symbol at some point (I don't know much about large deviations theory, but I recall, as can be seen on its Wikipedia page, that results in that field are usually of the form $\mathbb P(M_N > x)\le\exp(-N I(x))$),
  • Or I have made a mistake somewhere (very likely too !)

At this point I leave it to you or another more knowledgeable person to figure out the solution.


I think what the paper calls the CDF is actually the tail function $P(Z>t)$.

There are very well-known bounds on the Gaussian tail function: https://www.johndcook.com/blog/norm-dist-bounds/.

These imply in particular that $\log P(Z>t) \sim -t^2/2$ as $t\to+\infty$, which does give equation (2.4).
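One classical pair of such bounds (of the kind discussed on the linked page) is $\frac{t}{1+t^2}\,\varphi(t) < P(Z>t) < \frac{\varphi(t)}{t}$ for $t>0$, where $\varphi$ is the standard normal density. Since both bounds have logarithm $\sim -t^2/2$, the tail is squeezed; a short Python sketch checks this:

```python
import math

def phi(t):
    # Standard normal density.
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

def tail(t):
    # P(Z > t), computed via the complementary error function.
    return 0.5 * math.erfc(t / math.sqrt(2))

for t in (1.0, 3.0, 10.0):
    lower = t / (1 + t * t) * phi(t)   # classical lower bound
    upper = phi(t) / t                 # classical upper bound
    assert lower < tail(t) < upper
    # Both bounds have log ~ -t^2/2, so log(tail(t)) / (-t^2/2) -> 1.
    print(t, math.log(tail(t)) / (-t * t / 2))
```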

You have wrongly added a factor of $2$ in your last display.