Importance sampling of a finite path of a stochastic difference equation


Before turning to the question, let me briefly recap what importance sampling of random variables is about. Suppose $\xi$ is a real-valued random variable with density $f$, and let $g:\Bbb R\to \Bbb R$ be some function. The task is to use Monte Carlo to compute the integral $$ \mathsf E[g(\xi)] = \int_\Bbb R g(x)f(x)\mathrm dx \approx \frac1N\sum_{i=1}^N g(\xi^i) $$ where $\xi^i$ are iid random variables distributed as $\xi$. Since the approximation error may have a large variance, the idea of importance sampling is to change the density of $\xi$ to some $\hat f$ (with $\hat f > 0$ wherever $gf \neq 0$) and use that $$ \int_\Bbb R g(x)f(x)\mathrm dx = \int_\Bbb R g(x)w(x)\hat f(x)\mathrm dx $$ where the weighting function is given by the ratio $w = f/\hat f$. As a result, $$ \mathsf E[g(\xi)] = \mathsf E[g(\hat \xi)w(\hat \xi)] \approx \frac1N\sum_{i=1}^N g(\hat\xi^i)w(\hat\xi^i) $$ where $\hat\xi^i$ are iid with density $\hat f$. One then solves an optimization problem to find the best choice of $\hat f$, that is, the one that minimizes the variance of the estimator.
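To make the recap concrete, here is a minimal sketch of the weighted estimator. The setting is my own choice for illustration and not part of the problem: $f$ is the standard normal density, $g = 1_{(c,\infty)}$, and $\hat f$ is the $N(\mu,1)$ density, so that $w(x) = f(x)/\hat f(x) = \exp(-\mu x + \mu^2/2)$. The function name `tail_prob_is` is hypothetical.

```python
import math
import random


def tail_prob_is(threshold, n_samples, shift, seed=0):
    """Estimate P(xi > threshold) for xi ~ N(0, 1) by sampling from N(shift, 1).

    Assumption for illustration: the proposal density f_hat is a mean-shifted
    normal, so the weight f / f_hat has the closed form used below.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(shift, 1.0)  # draw xi_hat from the proposal f_hat
        if x > threshold:  # g(x) = indicator of (threshold, infinity)
            # w(x) = f(x) / f_hat(x) = exp(-shift * x + shift^2 / 2)
            total += math.exp(-shift * x + 0.5 * shift * shift)
    return total / n_samples


estimate = tail_prob_is(threshold=4.0, n_samples=100_000, shift=4.0)
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))  # closed-form normal tail
print(estimate, exact)
```

With the proposal centred at the threshold, most samples land in the region that matters, which is why the weighted average is far less noisy than a plain Monte Carlo estimate of such a small probability.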


In my case I have a similar problem. Consider a discrete-time stochastic process $X$ with state space $E$, given by a stochastic difference equation of the form $$ X_{k+1} = r(X_k,\eta_k), \quad X_0 = x\in E, \tag{1} $$ where $(\eta_k)$ is a sequence of iid real-valued random variables with density $h$, and $r$ is a jointly measurable function. Let $\mathsf P$ be the induced probability measure on $E^{n+1}$, that is, the law of $(X_0,\dots,X_n)$, and let $A$ be a measurable subset of $E^{n+1}$. I am interested in using importance sampling to evaluate $\mathsf P(A)$.

The method of importance sampling described above easily extends to random elements with values in $\Bbb R^m$, provided their density is known explicitly. In my case $E$ is a subset of $\Bbb R^m$ and I know the function $h$, but it would be almost impossible to get an expression for the density of $X = (X_0,X_1,\dots,X_n)$, since the function $r$ may be extremely complicated. For this reason, I restrict myself to changes of $\mathsf P$ that are induced by a change of the distribution of $\eta$: $$ \begin{align} \mathsf P(A) &= \int_{E^{n+1}} 1_A(x_0,\dots,x_n)\mathsf P(\mathrm dx_0\times\dots\times\mathrm dx_n) \\ & = \int_{\Bbb R^n}1_A(R(y_0,\dots,y_{n-1}))h(y_0)\dots h(y_{n-1})\mathrm dy_0\dots \mathrm dy_{n-1} \\ & = \int_{\Bbb R^n}1_A(R(y_0,\dots,y_{n-1}))w(y_0,\dots,y_{n-1})\hat h(y_0)\dots \hat h(y_{n-1})\mathrm dy_0\dots \mathrm dy_{n-1} \\ & = \mathsf E[1_A(\hat X_0,\dots,\hat X_n)w(\hat \eta_0,\dots,\hat \eta_{n-1})] \end{align} $$ where $R$ is the function that maps every noise realization $(\eta_0,\dots,\eta_{n-1})$ to the path $(X_0,\dots,X_n)$ of the process according to $(1)$, $\hat h$ is a new density, the $\hat \eta_k$ are iid with density $\hat h$, the weight is $w(y_0,\dots,y_{n-1}) = \prod_{k=0}^{n-1} h(y_k)/\hat h(y_k)$, and $$ (\hat X_0,\dots,\hat X_n) = R(\hat \eta_0,\dots,\hat\eta_{n-1}) $$ is the path driven by the new noise.
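As a sanity check of the identity above, here is a minimal sketch of the resulting estimator. Everything concrete in it is an assumption made for illustration only: $h$ is the standard normal density, $\hat h$ is the $N(\mu,1)$ density (so each factor of $w$ is $\exp(-\mu y_k + \mu^2/2)$), and the test case uses the toy recursion $r(x,y) = x + y$ with $A = \{x_n > c\}$, chosen because $\mathsf P(A)$ is then known in closed form. The name `path_is_estimate` is hypothetical; a real $r$ and $A$ can be passed in as callables.

```python
import math
import random


def path_is_estimate(r, in_A, x0, n_steps, n_paths, shift, seed=0):
    """Estimate P(A) for X_{k+1} = r(X_k, eta_k) with eta_k ~ N(0, 1).

    Assumption for illustration: the noise is resampled from N(shift, 1),
    and the weight is the product of h(y_k) / h_hat(y_k) along the path.
    """
    rng = random.Random(seed)
    total = 0.0
    for _path_index in range(n_paths):
        path = [x0]
        log_w = 0.0
        for _step in range(n_steps):
            y = rng.gauss(shift, 1.0)  # eta_hat_k drawn from h_hat
            # log h(y) - log h_hat(y) for N(0, 1) versus N(shift, 1)
            log_w += -shift * y + 0.5 * shift * shift
            path.append(r(path[-1], y))  # X_hat_{k+1} = r(X_hat_k, eta_hat_k)
        if in_A(path):
            total += math.exp(log_w)
    return total / n_paths


# Toy check (hypothetical r and A): a random walk started at 0, ending above 10.
n, level = 10, 10.0
estimate = path_is_estimate(
    r=lambda x, y: x + y,
    in_A=lambda path: path[-1] > level,
    x0=0.0,
    n_steps=n,
    n_paths=20_000,
    shift=level / n,
)
exact = 0.5 * math.erfc(level / math.sqrt(2.0 * n))  # X_n ~ N(0, n)
print(estimate, exact)
```

Note that the code never needs the density of the path itself: only $h$, $\hat h$, and the ability to simulate $(1)$ are used, which is exactly the point of changing the measure through the noise.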


My question is the following: to do importance sampling for $(1)$, I need to change the measure from $\mathsf P$ to some $\hat{\mathsf P}$ in such a way that I get an explicit expression for the Radon-Nikodym derivative $w = \mathrm d\mathsf P/\mathrm d\hat{\mathsf P}$.

Is the approach I described the only one that guarantees an explicit expression for $w$? I am fairly sure there must be some literature on importance sampling for stochastic difference equations that deals with such problems, but I have not found anything particularly appropriate yet.