Asymptotic Distribution of the Cramer-von Mises Changepoint Estimate Under the Null Hypothesis

Let $X_1,X_2,\ldots ,X_n$ be a random sample. We want to test whether, for some $1\leq c \leq n-1$, the distributions of $X_1,X_2,\ldots ,X_c$ and $X_{c+1},X_{c+2},\ldots ,X_n$ differ. For this we use the two-sample Cramer-von Mises test statistic, defined as $$ W_{n}(c) = \frac{c(n-c)}{n} \int_{-\infty}^\infty \left\{F_c(x) -G_{n-c}(x)\right\}^2 dH_n(x), $$ where $F_c$ is the empirical CDF of the first $c$ sample points and $G_{n-c}$ is the empirical CDF of the remaining $n-c$ sample points. The combined empirical distribution function is $$ H_{n}(x) = \frac{cF_{c}(x)+(n-c) G_{n-c}(x)}{n}. $$ We call $\hat{c}$ the (rescaled) changepoint estimate, defined as $$ \hat{c}=\frac1{n-1}\operatorname{argmax}_{1 \le c \le n-1} W_n(c). $$ Now assume the null hypothesis holds, i.e. $X_1,X_2,\ldots ,X_c$ and $X_{c+1},X_{c+2},\ldots ,X_n$ have the same distribution. Simulations suggest that asymptotically (as $n\to \infty$) the distribution of $\hat{c}$ concentrates near $0$ and $1$. What would be the best approach to prove this?

Here is the simulated distribution of $\hat{c}$ for $n=50$: https://ibb.co/pzwdcgQ and for $n=500$: https://ibb.co/hf9VVdJ
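For readers who want to reproduce a simulation of this kind, here is a minimal Python sketch. It is not the code behind the linked plots. It assumes continuous data without ties (so $dH_n$ puts mass $1/n$ on each observation) and draws standard normal samples under the null; the function names `cvm_profile`, `changepoint_estimate` and `simulate_null` are made up for this illustration.

```python
import numpy as np

def cvm_profile(x):
    """Return W_n(c) for c = 1, ..., n-1 (assumes no ties in x)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    ranks = np.argsort(np.argsort(x))  # 0-based rank of each X_i
    # below[i, j] is True when X_i <= j-th order statistic
    below = ranks[:, None] <= np.arange(n)[None, :]
    # counts[c-1, j] = number of X_1..X_c that are <= j-th order statistic
    counts = np.cumsum(below, axis=0)[:-1]
    c = np.arange(1, n)[:, None]
    F = counts / c                                          # F_c at each data point
    G = (np.arange(1, n + 1)[None, :] - counts) / (n - c)   # G_{n-c} at each data point
    # Integrating against dH_n is an average over the n observations
    integral = np.mean((F - G) ** 2, axis=1)
    return c[:, 0] * (n - c[:, 0]) / n * integral

def changepoint_estimate(x):
    """Rescaled argmax, as in the definition of c-hat above."""
    w = cvm_profile(x)
    return (np.argmax(w) + 1) / (len(x) - 1)

def simulate_null(n, reps=1000, seed=0):
    """Draw c-hat repeatedly under the null (i.i.d. standard normal data)."""
    rng = np.random.default_rng(seed)
    return np.array(
        [changepoint_estimate(rng.standard_normal(n)) for _ in range(reps)]
    )
```

Calling `simulate_null(50)` and `simulate_null(500)` and plotting histograms of the results should give pictures comparable to the ones linked above. Since the statistic depends on the data only through ranks, the choice of the normal distribution should not matter for continuous data.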

Best answer

I would say this is a fairly deep question. The basic tools needed to establish asymptotic results of this type are those of weak convergence of empirical processes. In your case, if you change the normalization $c(n-c)/n$ to something like $[c(n-c)]^\alpha/n$ with $\alpha \in (0,1)$, or compute the argmax over a trimmed domain, $\operatorname{argmax}_{\theta n \le c \le (1-\theta)n}$ with $\theta \in (0,1/2)$, then the limiting distribution of $\hat{c}$ may be obtained as the maximal argument of a Gaussian process. A good place to start learning about results of this type is the textbook Limit Theorems in Change-Point Analysis by Miklós Csörgő and Lajos Horváth.
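To experiment with the two modifications numerically, one possible sketch in Python follows. It assumes continuous data without ties and uses the weight $[c(n-c)]^\alpha/n$ exactly as written above; the names `cvm_integral` and `weighted_argmax` are hypothetical, and this only illustrates the definitions, it proves nothing about the limit.

```python
import numpy as np

def cvm_integral(x):
    """Unweighted integral of (F_c - G_{n-c})^2 dH_n for c = 1, ..., n-1.

    Assumes no ties, so dH_n puts mass 1/n on each observation.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    ranks = np.argsort(np.argsort(x))  # 0-based rank of each X_i
    # counts[c-1, j] = number of X_1..X_c that are <= j-th order statistic
    counts = np.cumsum(ranks[:, None] <= np.arange(n), axis=0)[:-1]
    c = np.arange(1, n)[:, None]
    diff = counts / c - (np.arange(1, n + 1) - counts) / (n - c)
    return np.mean(diff ** 2, axis=1)

def weighted_argmax(x, alpha=1.0, theta=0.0):
    """Argmax over c of [c(n-c)]^alpha / n times the integral.

    alpha = 1, theta = 0 recovers the original W_n(c).
    theta > 0 restricts the search to theta*n <= c <= (1-theta)*n.
    Returns the unscaled index c in {1, ..., n-1}.
    """
    n = len(x)
    c = np.arange(1, n)
    stat = (c * (n - c)) ** alpha / n * cvm_integral(x)
    keep = (c >= theta * n) & (c <= (1 - theta) * n)
    stat = np.where(keep, stat, -np.inf)  # exclude trimmed-away indices
    return int(c[np.argmax(stat)])
```

For example, `weighted_argmax(x, alpha=0.5)` uses the lighter weight, while `weighted_argmax(x, theta=0.1)` keeps the original weight but ignores the outer 10% of indices on each side. Repeating either call over many null samples gives an empirical picture of how the modification changes the distribution of the argmax.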