Efficiently evaluate $\underset{X \gets \mathcal{N}(\mu,\sigma^2)}{\mathbb{E}}\left[\left(1 - s + s \cdot e^X \right)^{\sqrt{-1} \cdot t}\right]$


Let $\mu, \sigma, s, t \in \mathbb{R}$ with $\sigma>0$ and $0 \le s \le 1$. Define $$a_{\mu, \sigma, s, t} := \underset{X \gets \mathcal{N}(\mu,\sigma^2)}{\mathbb{E}}\left[\left(1 - s + s \cdot e^X \right)^{i \cdot t}\right]$$ $$= \frac{1}{\sqrt{2\pi \sigma^2}}\int_{-\infty}^\infty \exp\left(\frac{-(x-\mu)^2}{2\sigma^2} + i \cdot t \cdot \log\left(1-s+s\cdot e^x\right)\right) \mathrm{d}x,$$ where $i^2=-1$.

I would like to be able to efficiently compute the value of $a_{\mu, \sigma, s, t}$ numerically. A closed-form solution seems too good to be true. But I'm hoping for something like a rapidly converging series.

To be a bit more precise, I want a procedure (i.e., implementable on a computer) that, given inputs $\mu,\sigma,s,t,\varepsilon\in\mathbb{R}$, computes $\tilde{a}_{\mu,\sigma,s,t}$ with $|\tilde{a}_{\mu,\sigma,s,t}-a_{\mu,\sigma,s,t}|\le\varepsilon$. The running time of the procedure (assuming basic arithmetic operations take unit time) should be polynomial in $\log(1/\varepsilon)$. An explicit series with exponentially decaying terms would suffice for this. Why do I need such fast runtime? The quantity of interest is exponentially small $|a_{\mu,\sigma,s,t}|\approx\exp(-s^2t^2\sigma^2/2)$, so $\varepsilon$ needs to be exponentially small too (otherwise $\tilde{a}_{\mu,\sigma,s,t}=0$ provides a trivial estimate), but I still want polynomial runtime.


Firstly, $a_{\mu, \sigma, s, t}$ is well defined. It's the expectation of a bounded and continuous function of a Gaussian. In particular, there is an obvious Monte Carlo algorithm for computing this value: Sample $X \gets \mathcal{N}(\mu,\sigma^2)$ and compute $\left(1 - s + s \cdot e^X \right)^{i \cdot t}$; repeat this procedure and average the results. But to get accuracy $\varepsilon$, this algorithm would require roughly $1/\varepsilon^2$ repetitions. I want an algorithm that runs in something like $\log(1/\varepsilon)$ time. Numerical integration methods are a bit faster (roughly $1/\varepsilon$ steps), but that's still not as rapid as I'd like.
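That Monte Carlo baseline can be sketched as follows (the function name, seed, and parameter choices are my own illustrations, not part of the question):

```python
import cmath
import math
import random

def mc_estimate(mu, sigma, s, t, n_samples, seed=0):
    """Naive Monte Carlo estimate of a_{mu,sigma,s,t}.

    Accuracy scales like 1/sqrt(n_samples), so this is only the slow
    baseline described above, not the polylog(1/eps)-time algorithm sought.
    """
    rng = random.Random(seed)
    total = 0 + 0j
    for _ in range(n_samples):
        x = rng.gauss(mu, sigma)
        # (1 - s + s*e^x)^(it) = exp(i*t*log(1 - s + s*e^x));
        # the base is a positive real number, so math.log is safe.
        total += cmath.exp(1j * t * math.log(1 - s + s * math.exp(x)))
    return total / n_samples
```

For $s=1$ the estimate can be checked against the closed form $e^{i\mu t - \sigma^2 t^2/2}$ from the special cases below.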

There are a few easy special cases:

  • If $t=0$, then $a_{\mu, \sigma, s, t}=1$.
  • If $s=0$, then $a_{\mu, \sigma, s, t}=1$.
  • If $s=1$, then $a_{\mu, \sigma, s, t}=e^{i \cdot \mu \cdot t - \sigma^2 \cdot t^2 / 2}$. (This is just the characteristic function of the Gaussian.)

As a rough approximation $\log(1-s+s\cdot e^x) \approx sx$, whence $a_{\mu,\sigma,s,t} \approx \mathbb{E}[e^{i t s X}] = e^{its\mu - t^2s^2\sigma^2/2}$.
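A quick numerical sanity check of this rough approximation (the plain trapezoid-rule quadrature and all parameter values here are my own arbitrary choices, good enough for a comparison but not the fast algorithm being asked for):

```python
import cmath
import math

def a_quad(mu, sigma, s, t, n=20001, width=10.0):
    """Trapezoid-rule evaluation of a_{mu,sigma,s,t} over mu +/- width*sigma."""
    lo, hi = mu - width * sigma, mu + width * sigma
    h = (hi - lo) / (n - 1)
    total = 0 + 0j
    for k in range(n):
        x = lo + k * h
        w = 1.0 if 0 < k < n - 1 else 0.5  # trapezoid endpoint weights
        total += w * math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) \
                   * cmath.exp(1j * t * math.log(1 - s + s * math.exp(x)))
    return total * h / math.sqrt(2 * math.pi * sigma ** 2)

def a_approx(mu, sigma, s, t):
    """Rough approximation exp(i*t*s*mu - t^2 s^2 sigma^2 / 2)."""
    return cmath.exp(1j * t * s * mu - t ** 2 * s ** 2 * sigma ** 2 / 2)
```

At $s=1$ the quadrature should reproduce the Gaussian characteristic function essentially exactly, while for intermediate $s$ the rough approximation is close but not exact.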


One thing I tried is a binomial series expansion: $$\left(1 - s + s \cdot e^x \right)^{i \cdot t} = \sum_{k=0}^\infty {it \choose k} \cdot (1-s)^{it-k} \cdot s^k \cdot e^{k \cdot x}.$$ But this series only converges if $x<\log(1/s-1)$. In particular, when I try to evaluate this series with a Gaussian $X$, the terms grow exponentially, as $\mathbb{E}[e^{k \cdot X}] = e^{\mu k + k^2 \sigma^2 / 2}$.

Here's another approach: Define $$f(x) := \left(1-s+s\cdot e^x\right)^{i \cdot t} = \exp(i \cdot t \cdot \log(1-s+s\cdot e^x)).$$ Then $a_{\mu,\sigma,s,t} = \mathbb{E}[f(X)]$ for $X \gets \mathcal{N}(\mu,\sigma^2)$. Now let $$\hat{f}(\xi) = \int_{-\infty}^\infty f(x) \cdot e^{-2\pi i \xi x} \mathrm{d}x$$ be the Fourier transform of $f$. By Parseval's theorem (with the conjugate on the second factor; the Gaussian density is real), $$a_{\mu,\sigma,s,t} = \mathbb{E}[f(X)] = \int_{-\infty}^\infty \hat{f}(\xi) \cdot \overline{\hat{X}(\xi)} \mathrm{d}\xi,$$ where $\hat{X}(\xi) = \mathbb{E}[e^{-2\pi i \xi X}] = e^{-2\pi \mu \xi i -2\pi^2 \sigma^2 \xi^2}$ is the Fourier-Stieltjes transform of $X \gets \mathcal{N}(\mu,\sigma^2)$.

Now this would be progress if $\hat{f}$ is easier to work with than $f$. Unfortunately, $\hat{f}$ is not even well-defined because $f$ is not an integrable function. Nevertheless, I do feel like this approach might be salvageable.

Any suggestions for how to approach this would be greatly appreciated!

There are 2 answers below.

This is not an answer, but I couldn't post it as a comment because I wanted to include this picture:

[image omitted: Maple timings at increasing digits of accuracy]

Maple seems to be able to do it in time roughly proportional to the number of digits of accuracy. So, as long as you believe it, it's sort of an implicit proof that it can be done in $\operatorname{polylog}(1/\varepsilon)$ time, maybe even $\widetilde{O}(\log(1/\varepsilon))$ time...


This is not a complete answer to my question, but it might be possible to turn it into one.

A Taylor series in $t$ gives $$a_{\mu,\sigma,s,t} = \underset{X \gets \mathcal{N}(\mu,\sigma^2)}{\mathbb{E}}\left[\exp\left( i \cdot t \cdot \log \left( 1-s + s \cdot e^X \right) \right) \right] = 1 + \sum_{n=1}^\infty \frac{(i \cdot t)^n}{n!} \underset{X \gets \mathcal{N}(\mu,\sigma^2)}{\mathbb{E}}\left[\left( \log \left( 1-s + s \cdot e^X \right) \right)^n \right].$$

Since $|\log(1-s+s\cdot e^x)| \le |x|$ for all $x \in \mathbb{R}$, we have $$\left|\underset{X \gets \mathcal{N}(\mu,\sigma^2)}{\mathbb{E}}\left[\left( \log \left( 1-s + s \cdot e^X \right) \right)^n \right]\right| \le \underset{X \gets \mathcal{N}(\mu,\sigma^2)}{\mathbb{E}}[|X|^n]$$ $$\le 2^n \cdot \left( |\mu|^n + \underset{X \gets \mathcal{N}(\mu,\sigma^2)}{\mathbb{E}}[|X-\mu|^n] \right) \le 2^n \cdot (|\mu|^n + \sigma^n \cdot (n-1)!!).$$ This implies that the Taylor series above converges for all $t \in \mathbb{C}$.
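The convergence claim can be sanity-checked numerically by comparing the truncated Taylor series against a direct quadrature, with the moments of $\log(1-s+s\cdot e^X)$ computed by the same quadrature. This of course sidesteps the real challenge of evaluating the moments analytically; the grid size, truncation order, and parameter values are my own arbitrary choices.

```python
import cmath
import math

def gauss_grid(mu, sigma, n=20001, width=10.0):
    """Trapezoid nodes and Gaussian-density weights on mu +/- width*sigma."""
    lo = mu - width * sigma
    h = 2 * width * sigma / (n - 1)
    xs, ws = [], []
    for k in range(n):
        x = lo + k * h
        w = (1.0 if 0 < k < n - 1 else 0.5) * h \
            * math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) \
            / math.sqrt(2 * math.pi * sigma ** 2)
        xs.append(x)
        ws.append(w)
    return xs, ws

mu, sigma, s, t = 0.2, 0.4, 0.5, 1.5
xs, ws = gauss_grid(mu, sigma)
g = [math.log(1 - s + s * math.exp(x)) for x in xs]

# Direct quadrature of a_{mu,sigma,s,t} = E[exp(i*t*g(X))].
direct = sum(w * cmath.exp(1j * t * gi) for w, gi in zip(ws, g))

# Truncated Taylor series in t; moments computed by the same quadrature,
# so the truncation error is the only discrepancy between the two values.
series = 0 + 0j
for n_ in range(41):
    moment = sum(w * gi ** n_ for w, gi in zip(ws, g))
    series += (1j * t) ** n_ / math.factorial(n_) * moment
```

With 40 terms the two values agree to many digits, consistent with the factorial decay of the coefficients.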

But this leaves us with the challenge of evaluating the moments of $\log(1-s+s\cdot e^X)$ for $X \gets \mathcal{N}(\mu,\sigma^2)$. I don't know how to do this for all $n$ yet, but I can calculate the first moment, which is a start.


First, $$\log(1-s+s\cdot e^{X}) = \log\left(1 + \exp\left(X + \log\left(\frac{s}{1-s}\right)\right)\right) + \log(1-s),$$ assuming $0 < s < 1$ (the endpoints $s \in \{0,1\}$ were already handled above). To reduce notational clutter, we can absorb the $\log(s/(1-s))$ term into $\mu$. Thus the problem reduces to computing $\mathbb{E}[\log(1+\exp(X))]$ for $X \gets \mathcal{N}(\mu,\sigma^2)$.

Second, $$\log(1+\exp(x)) = \max\{0,x\} + \log(1+\exp(-|x|)) = \frac{x+|x|}{2} + \sum_{k=1}^\infty \frac{(-1)^{k+1}}{k} \exp(-k|x|)$$ for all $x \in \mathbb{R}$. The Taylor series $\log(1+v) = \sum_{k=1}^\infty \frac{(-1)^{k+1}}{k} v^k$ converges for $v \in (-1,1]$, which covers $v=\exp(-|x|)$ for all $x \in \mathbb{R}$. Now we've reduced the problem to computing $\mathbb{E}[|X|]$ and $\mathbb{E}[\exp(-k|X|)]$.
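A quick check of this decomposition (the truncation level `K` is an arbitrary choice of mine; the series converges geometrically with ratio $e^{-|x|}$, so it slows down near $x=0$):

```python
import math

def softplus_series(x, K=80):
    """log(1+e^x) via max(0,x) + sum_{k=1}^{K} (-1)^{k+1} e^{-k|x|} / k.

    Alternating series, so the error is at most the first omitted term,
    e^{-(K+1)|x|}/(K+1); convergence degrades as x approaches 0.
    """
    total = max(0.0, x)
    for k in range(1, K + 1):
        total += (-1) ** (k + 1) * math.exp(-k * abs(x)) / k
    return total
```

This agrees with a direct evaluation of $\log(1+e^x)$ (via `math.log1p`) away from $x=0$.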

Third, let $f(\kappa) = \mathbb{E}[\exp(-\kappa |X|)]$ for $X \gets \mathcal{N}(\mu,\sigma^2)$. Then $$f(\kappa) = \frac{1}{\sqrt{2\pi\sigma^2}} \int_0^\infty \exp\left( \frac{-1}{2\sigma^2}(x-\mu)^2 - \kappa x \right) \mathrm{d}x + \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^0 \exp\left( \frac{-1}{2\sigma^2}(x-\mu)^2 + \kappa x \right) \mathrm{d}x$$ $$= \frac{1}{\sqrt{2\pi\sigma^2}} \int_0^\infty \exp\left( \frac{-1}{2\sigma^2}(x-\mu+\kappa\sigma^2)^2 - \mu\kappa + \kappa^2\sigma^2/2 \right) \mathrm{d}x + \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^0 \exp\left( \frac{-1}{2\sigma^2}(x-\mu - \kappa\sigma^2)^2 + \mu \kappa + \kappa^2\sigma^2/2 \right) \mathrm{d}x$$ $$= \exp(- \mu\kappa + \kappa^2\sigma^2/2 ) \cdot \mathbb{P}[\mathcal{N}(\mu-\kappa\sigma^2,\sigma^2) \ge 0] + \exp( \mu \kappa + \kappa^2\sigma^2/2) \cdot \mathbb{P}[\mathcal{N}(\mu+\kappa\sigma^2,\sigma^2) \le 0]$$ $$= \exp(- \mu\kappa + \kappa^2\sigma^2/2 ) \cdot \mathbb{P}[\mathcal{N}(0,1) \ge \kappa\sigma-\mu/\sigma] + \exp( \mu \kappa + \kappa^2\sigma^2/2) \cdot \mathbb{P}[\mathcal{N}(0,1) \ge \kappa\sigma+\mu/\sigma].$$ So we've reduced computing $f(\kappa)$ to evaluating the complementary error function. This is a well-studied function that can be computed to high precision efficiently; for example, via its Taylor series for small arguments and its continued-fraction or asymptotic expansions for large ones.
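This closed form can be sketched using $\mathbb{P}[\mathcal{N}(0,1)\ge z] = \tfrac12\operatorname{erfc}(z/\sqrt2)$ and checked against a direct quadrature (function names, grid parameters, and test values are my own choices):

```python
import math

def upper_tail(z):
    """P[N(0,1) >= z] via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def f_closed(kappa, mu, sigma):
    """E[exp(-kappa*|X|)] for X ~ N(mu, sigma^2), from the derivation above."""
    return (math.exp(-mu * kappa + kappa ** 2 * sigma ** 2 / 2)
            * upper_tail(kappa * sigma - mu / sigma)
            + math.exp(mu * kappa + kappa ** 2 * sigma ** 2 / 2)
            * upper_tail(kappa * sigma + mu / sigma))

def f_quad(kappa, mu, sigma, n=40001, width=12.0):
    """Trapezoid-rule check of the same expectation."""
    lo = mu - width * sigma
    h = 2 * width * sigma / (n - 1)
    total = 0.0
    for k in range(n):
        x = lo + k * h
        w = 1.0 if 0 < k < n - 1 else 0.5
        total += w * math.exp(-(x - mu) ** 2 / (2 * sigma ** 2) - kappa * abs(x))
    return total * h / math.sqrt(2 * math.pi * sigma ** 2)
```

The two agree to high accuracy across a range of $\kappa$ values.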

Fourth, all that remains is to compute $\mathbb{E}[|X|]$ for $X \gets \mathcal{N}(\mu,\sigma^2)$. This is easy. In particular, $\mathbb{E}[|X|] = -f'(0)$ and we can compute the derivative of the expression for $f(\kappa)$ above.
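Carrying out that differentiation yields the standard folded-normal mean formula (the closed form below is that well-known formula, which is what $-f'(0)$ works out to; the quadrature check and parameter values are my own):

```python
import math

def abs_mean(mu, sigma):
    """E[|X|] for X ~ N(mu, sigma^2): the folded-normal mean,
    sigma*sqrt(2/pi)*exp(-mu^2/(2 sigma^2)) + mu*erf(mu/(sigma*sqrt(2)))."""
    return (sigma * math.sqrt(2 / math.pi) * math.exp(-mu ** 2 / (2 * sigma ** 2))
            + mu * math.erf(mu / (sigma * math.sqrt(2))))

def abs_mean_quad(mu, sigma, n=40001, width=12.0):
    """Trapezoid-rule check of E[|X|]."""
    lo = mu - width * sigma
    h = 2 * width * sigma / (n - 1)
    total = 0.0
    for k in range(n):
        x = lo + k * h
        w = 1.0 if 0 < k < n - 1 else 0.5
        total += w * abs(x) * math.exp(-(x - mu) ** 2 / (2 * sigma ** 2))
    return total * h / math.sqrt(2 * math.pi * sigma ** 2)
```

For $\mu=0$ this reduces to the familiar $\mathbb{E}[|X|] = \sigma\sqrt{2/\pi}$.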


This shows that, in principle at least, we can compute the linear coefficient of the Taylor series. I don't know if this approach generalizes to higher order terms; it certainly gets messy. Overall, this doesn't feel like the right way to compute the coefficients. In particular, splitting the integral into two parts (i.e., using the non-differentiable absolute value function) doesn't seem like the right way to do this; the only reason to split it like this is to ensure that the Taylor series in the second step converges. There's probably a better way.

Something that is potentially related is the logit-normal distribution. This is the distribution of $\frac{1}{1+\exp(-X)}$ for $X \gets \mathcal{N}(\mu,\sigma^2)$. The reason I think this may be related is that the logistic function $\frac{1}{1+\exp(-x)}$ is the derivative of the function we are interested in, $\log(1+\exp(x))$. The moments of the logit-normal distribution have been studied and can be efficiently computed, so that may offer some inspiration for this question.