Law of Large Numbers for Stochastic Processes


Consider the following Stochastic Process:

$$B(t)∼N(μt,σt)$$

Here is a simulation for multiple possible trajectories of this Stochastic Process (R Code):


library(ggplot2)
library(dplyr)

## Helper function to zero out the first increment so each path starts at 0
first_to_zero <- function(v) {
    v[1] <- 0
    v
}

n_sep <- 101
n_iter <- 1000
time_length <- 10
mu <- 0.5  # Define the drift term

data2 <- data.frame(id  = rep(seq_len(n_iter), each = n_sep),
                    x   = rep(seq(from = 0, to = time_length, length.out = n_sep), n_iter),
                    z = rnorm(n = n_sep * n_iter, mean = 0, sd = sqrt(time_length / (n_sep - 1)))) %>%  # increment sd is sqrt(dt), where dt = time_length / (n_sep - 1)
    group_by(id) %>%
    mutate(z = first_to_zero(z),
           y   = mu * x + cumsum(z))  # Add the drift 

ggplot(data = data2, mapping = aes(x = x, y = y, group = id)) +
    geom_line(alpha = 1 / (n_iter / 100)) +
    scale_x_continuous(breaks = seq(0, time_length, by = 0.5)) +
    geom_vline(xintercept = seq(0, time_length, by = 0.5), linetype = "dotted", color = "red", alpha = 0.5) +  # alpha must lie in [0, 1]
    theme_bw() + theme(legend.key = element_blank()) 

Suppose we are interested in estimating the parameters of this Stochastic Process (i.e. $σ$ and $μ$) based on a single trajectory of this Stochastic Process (i.e. via Maximum Likelihood Estimation).

My Question:

  • As we observe this process for longer (i.e. for larger values of $t$), will our estimates of $σ$ and $μ$ improve? In other words (note that the MLE estimators are functions of $n$, i.e. of $t$), does it hold that, for every $\epsilon > 0$,

$$\lim_{{n \to \infty}} P\left( \left| \mu - \hat{\mu}_{MLE} \right| > \epsilon \right) = 0$$ $$\lim_{{n \to \infty}} P\left( \left| \sigma - \hat{\sigma}_{MLE} \right| > \epsilon \right) = 0$$

  • If I (whimsically and) incorrectly treated this Stochastic Process as i.i.d. Gaussian, is it possible to compare the estimates obtained from the likelihood function of the Stochastic Process with the estimates obtained from the (incorrect) i.i.d. Normal likelihood? Can the maximum error be bounded?
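For the consistency question, a quick numerical sketch may help. Assuming the intended process is $B(t) = \mu t + \sigma W(t)$ observed on a grid with spacing $\Delta t$ (so the increments are i.i.d. $N(\mu\Delta t, \sigma^2\Delta t)$), the MLEs have closed forms, and one can watch the estimation error shrink as the horizon $T$ grows. The values of `mu`, `sigma`, and `dt` below are illustrative choices, not anything fixed by the question:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, dt = 0.5, 1.0, 0.1  # assumed true drift, volatility, and grid spacing

def mle_from_path(T):
    """Simulate B(t) = mu*t + sigma*W(t) on a grid and return (mu_hat, sigma_hat)."""
    n = round(T / dt)
    # increments are i.i.d. N(mu*dt, sigma^2*dt)
    incr = rng.normal(mu * dt, sigma * np.sqrt(dt), size=n)
    mu_hat = incr.sum() / (n * dt)                       # equals B_T / T
    sigma2_hat = np.mean((incr - mu_hat * dt) ** 2) / dt
    return mu_hat, np.sqrt(sigma2_hat)

for T in (10, 1000, 100000):
    mu_hat, sigma_hat = mle_from_path(T)
    print(f"T={T:>6}: mu_hat={mu_hat:.3f}, sigma_hat={sigma_hat:.3f}")
```

Note that $\hat{\mu}$ improves only as the horizon $T$ grows (its variance is $\sigma^2/T$), while $\hat{\sigma}^2$ improves with the number of increments $n$, so a finer grid over a fixed horizon also helps for $\sigma$.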

At first I thought the answer is not certain (i.e. longer observation periods might not always yield better estimates), because the Stochastic Process has non-constant, growing variance, which would invalidate the Law of Large Numbers and the Central Limit Theorem. But then I thought that perhaps the Stochastic Process can be re-parametrized (https://en.wikipedia.org/wiki/Donsker%27s_theorem) in terms of the Wiener Process (independent Gaussian increments), so that the Law of Large Numbers would apply after all, suggesting that longer observation periods do bring the parameter estimates closer to their true values (in probability).

I think a similar argument holds for the Central Limit Theorem, i.e. for larger values of $t$ (i.e. of $n$), the sampling distribution of the MLE estimates approaches a Normal Distribution. (I suspect this is not quite right as stated, since there are many versions of the CLT that may be better suited to this situation, e.g. CLTs for Martingales.)

$$\sqrt{n}(\hat{\mu}_{MLE} - \mu) \xrightarrow{d} N(0, \sigma^2)$$ $$\sqrt{n}(\hat{\sigma}^2_{MLE} - \sigma^2) \xrightarrow{d} N(0, 2\sigma^4)$$
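Under the drift-diffusion reading $B(t) = \mu t + \sigma W(t)$, the first limit can be checked by simulation; in fact $\hat{\mu} = B_T/T$ is exactly Gaussian with variance $\sigma^2/T$, so $\sqrt{T}(\hat{\mu}-\mu) \sim N(0, \sigma^2)$ for every $T$, not only in the limit. A rough Monte Carlo check, with all parameter values below being illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, T, dt = 0.5, 1.0, 10.0, 0.01  # assumed drift, volatility, horizon, step
n = round(T / dt)

# Monte Carlo: many independent trajectories, one mu_hat per trajectory.
reps = 20000
incr = rng.normal(mu * dt, sigma * np.sqrt(dt), size=(reps, n))
mu_hat = incr.sum(axis=1) / T          # B_T / T for each trajectory
z = np.sqrt(T) * (mu_hat - mu)         # should look like N(0, sigma^2)

print("mean (expect ~0):", z.mean())
print("var (expect ~sigma^2):", z.var())
```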

Any thoughts on this?

Thanks!


BEST ANSWER:

As the assumptions are not fully clear, I address the OP's question under a few different sets of assumptions:

Case 1: $B(t)$ independent for all $t$ and $B(t)∼N(μt,σ^2t^2)$

The answer is yes: based on a single trajectory, one can accurately estimate the parameters. If the index set is countable, having the trajectory for a longer period helps; otherwise, having it over a small interval is enough.

Let a single trajectory of $B(t)∼N(μt,σ^2t^2)$ be given at time instants $t=1, ..., T$, denoted by $B_1,..., B_T$. Then $B_1/1,...,B_T/T$ form a simple random sample of size $T$ from $N(μ,σ^2)$, i.e. they are i.i.d., and you can apply standard methods for point or interval estimation of $μ$ and $σ$. Indeed, as $T$ increases, the accuracy improves.
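This reduction to an i.i.d. sample is easy to check numerically; the parameter values below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, T = 0.5, 2.0, 50000  # assumed parameters and sample size
t = np.arange(1, T + 1)

# Independent observations B_t ~ N(mu*t, sigma^2 * t^2), t = 1..T
B = rng.normal(mu * t, sigma * t)
X = B / t                       # i.i.d. N(mu, sigma^2)

# Standard i.i.d.-sample estimates recover the parameters.
mu_hat = X.mean()
sigma_hat = X.std(ddof=1)
print(mu_hat, sigma_hat)
```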

However, if the index set is a continuum, from a single trajectory over any small interval, you can accurately determine the parameters (almost surely) because you can take a sample of size infinity from the trajectory (Remark: this case is of interest only in thorey, as a trajectory cannot be given for any $t\in[0,1]$ in practice).

Case 2: $B(t)$ independent for all $t$ and $B(t)∼N(μt,σ^2t)$

In this case, one can use $X_1=B_1/1,...,X_T=B_T/T$ as independent RVs with distributions $X_i∼N(μ,\frac{σ^2}{i})$, $i=1,...,T$. Here, the ML method can be used with the following non-standard likelihood function:

$$ l(\mu,\sigma^2)= \prod_{i=1}^{T} \frac{1}{ \sqrt{2 \pi \frac{\sigma^2}{i}} } \exp \left[ -\frac{i(x_i-\mu)^2}{2\sigma^2} \right].$$
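Setting the derivatives of the corresponding log-likelihood to zero gives closed-form maximizers, namely the weighted mean $\hat{\mu} = \sum_i i\,x_i / \sum_i i$ and $\hat{\sigma}^2 = \frac{1}{T}\sum_i i\,(x_i-\hat{\mu})^2$. A sketch of this estimator on simulated data (parameter values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, T = 0.5, 1.5, 20000  # assumed parameters and sample size
i = np.arange(1, T + 1)

# X_i = B_i / i ~ N(mu, sigma^2 / i), independent
x = rng.normal(mu, sigma / np.sqrt(i))

# Closed-form maximizers of the weighted likelihood above:
mu_hat = np.sum(i * x) / np.sum(i)           # precision-weighted mean
sigma2_hat = np.mean(i * (x - mu_hat) ** 2)  # weighted residual variance
print(mu_hat, np.sqrt(sigma2_hat))
```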

Case 3: $B(t)∼N(μt,σ^2t^2)$ is a continuous Gaussian process

If $B(t)∼N(μt,σ^2t^2)$ is a continuous Gaussian process, it is impossible to accurately estimate the parameters based on a single trajectory of the process. Note that $B(t)$ has the following linear kernel:

$$C(s,t)=ts\sigma^2.$$

This means that for all $t$,

$$B(t)=tZ$$

where $Z$ follows the normal distribution $N(μ,σ^2)$. In fact, for any $t \neq s$, the correlation of $B(t)$ and $B(s)$ is 1. Hence, all points on a single trajectory are generated linearly from a single observation of $N(μ,σ^2)$.
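This degeneracy is easy to see numerically: every point of a simulated trajectory recovers the same single draw $Z$, so a longer trajectory adds no information. A minimal sketch with illustrative parameter values:

```python
import numpy as np

rng = np.random.default_rng(4)
mu, sigma = 0.5, 1.0  # assumed parameters

# One trajectory of the degenerate process B(t) = t * Z, Z ~ N(mu, sigma^2)
Z = rng.normal(mu, sigma)
t = np.linspace(0.1, 10, 100)
B = t * Z

# Every ratio B(t)/t yields the same single draw Z: no averaging is possible.
ratios = B / t
print(ratios.min(), ratios.max())
```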

Case 4: $B(t)∼N(μt,σ^2t)$ is a continuous Gaussian process

If $B(t)∼N(μt,σ^2t)$ is a continuous Gaussian process, it is possible to accurately estimate the parameters based on a single trajectory of the process. In this case, we have the following representation:

$$B(t)=t\mu+ \sigma W(t)$$

where $W(t)$ denotes the Wiener process. The kernel $C(s,t)$ of $B(t)$ is given by

$$C(s,t)=\min\{t,s\}\sigma^2,$$

which implies that non-overlapping increments of $B(t)$ are independent (their covariances are zero and the process is Gaussian). Then, considering $$\Delta B(t)=B(t+\Delta t)-B(t)∼N(μ\Delta t,σ^2\Delta t),$$ by discretizing an interval into equal steps and computing the increments, we can generate a simple random sample (of any size) from $N(μ\Delta t,σ^2\Delta t)$.

ANOTHER ANSWER:

Assume that the process considered is given by $B_t = \sigma W_t + \mu t$, where $W$ is a standard Brownian motion, and assume that you observe $B_0=0$ and $B_{k\Delta t}$ for $1 \le k \le n$. Then the increments $B_{(k+1)\Delta t}-B_{k\Delta t}$ are i.i.d. with distribution $\mathcal{N}(\mu\Delta t,\sigma^2\Delta t)$. So you are led to estimating both parameters of a normal distribution on $\mathbb{R}$ from an $n$-sample.
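A minimal sketch of this estimation scheme, under the same assumptions and with illustrative values for $\mu$, $\sigma$, $\Delta t$, $n$: the increment sample mean and standard deviation, rescaled by $\Delta t$, recover the drift and volatility.

```python
import numpy as np

rng = np.random.default_rng(5)
mu, sigma, dt, n = 0.5, 1.0, 0.01, 100000  # assumed parameters and grid

# Increments of B_t = sigma*W_t + mu*t on a grid: i.i.d. N(mu*dt, sigma^2*dt)
incr = rng.normal(mu * dt, sigma * np.sqrt(dt), size=n)

# Standard normal-sample estimates, rescaled back by dt:
mu_hat = incr.mean() / dt
sigma_hat = incr.std(ddof=1) / np.sqrt(dt)
print(mu_hat, sigma_hat)
```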