Maximum product of spacings estimation involves maximizing this function:
$$ S_n(\theta) = \frac1{n+1}\sum_{i=1}^{n+1} \log D_i(\theta),\quad D_i(\theta) = F(x_{(i)};\theta) - F(x_{(i-1)};\theta) $$
Here, $F(x;\theta)\in[0,1]$ is a cumulative distribution function (CDF) and $x_{(1)}<\dots<x_{(n)}$ are the ordered observations, with $F(x_{(0)};\theta)=0$ and $F(x_{(n+1)};\theta)=1$. Since $x_{(i)}>x_{(i-1)}\;\forall i$, we have $D_i(\theta)>0$. Also, since $F(x;\theta)$ is a CDF, its differences (spacings) $D_i(\theta)$ must add to one: $\sum_{i=1}^{n+1}D_i(\theta)=1$.
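To make the definitions concrete, here is a minimal sketch that computes the spacings and $S_n(\theta)$ for a small sample. It assumes the usual convention $F(x_{(0)};\theta)=0$ and $F(x_{(n+1)};\theta)=1$; the normal CDF, the sample values, and the function names are illustrative choices of mine, not taken from the papers.

```python
import math

def normal_cdf(x, mu, sigma):
    # CDF of N(mu, sigma^2), written with the error function from the stdlib.
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def spacings(sample, cdf):
    # Assumed convention: F is 0 to the left of the sample and 1 to the right,
    # so n observations produce n + 1 spacings.
    values = [0.0] + [cdf(x) for x in sorted(sample)] + [1.0]
    return [b - a for a, b in zip(values, values[1:])]

def mean_log_spacing(sample, cdf):
    # S_n(theta): the average of the log spacings.
    d = spacings(sample, cdf)
    return sum(math.log(v) for v in d) / len(d)

sample = [-1.2, -0.3, 0.1, 0.8, 1.9]   # made-up data, n = 5
cdf = lambda x: normal_cdf(x, 0.0, 1.0)  # theta = (mu, sigma) = (0, 1)

d = spacings(sample, cdf)
s_n = mean_log_spacing(sample, cdf)

print("spacings:", d)
print("sum of spacings:", sum(d))
print("S_n:", s_n, " -log(n+1):", -math.log(len(sample) + 1))
```

The spacings telescope, so their sum is $1$ up to floating-point rounding regardless of $\theta$.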
Papers on this method say that $S_n(\theta) \le -\log(n+1)$:
- (Cheng and Amin, sec. 2, p. 397) says: "$\log G$ [which is what I call $S_n(\theta)$] is always bounded by $-\log(n+1)$".
- (Ekström, sec. 2) says that $-\frac1{n+1}\sum_{i=1}^{n+1}\log((n+1)D_i)$ is bounded from below by $0$, which again implies that $S_n(\theta)$ is bounded from above by $-\log(n+1)$.
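As a numerical sanity check (not a proof), the sketch below evaluates both quantities over randomly drawn parameters and records the extremes. The normal model, the parameter ranges, and the sample are my own illustrative assumptions; the only thing taken from the papers is the pair of claimed bounds.

```python
import math
import random

def normal_cdf(x, mu, sigma):
    # CDF of N(mu, sigma^2) via the stdlib error function.
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def spacings(sample, mu, sigma):
    # Assumed convention: F = 0 before the first and F = 1 after the last observation.
    values = [0.0] + [normal_cdf(x, mu, sigma) for x in sorted(sample)] + [1.0]
    return [b - a for a, b in zip(values, values[1:])]

sample = [-1.2, -0.3, 0.1, 0.8, 1.9]  # made-up data, n = 5
n = len(sample)
bound = -math.log(n + 1)

rng = random.Random(0)  # fixed seed so the run is reproducible
largest_s_n = -math.inf
smallest_ekstrom = math.inf

for _ in range(1000):
    mu = rng.uniform(-1.0, 1.0)
    sigma = rng.uniform(0.5, 2.0)
    d = spacings(sample, mu, sigma)
    # S_n(theta) as defined at the top of the post.
    s_n = sum(math.log(v) for v in d) / (n + 1)
    # Ekström's form; algebraically equal to -S_n(theta) - log(n+1).
    ekstrom = -sum(math.log((n + 1) * v) for v in d) / (n + 1)
    largest_s_n = max(largest_s_n, s_n)
    smallest_ekstrom = min(smallest_ekstrom, ekstrom)

print("largest S_n found:", largest_s_n, " claimed upper bound:", bound)
print("smallest Ekström quantity found:", smallest_ekstrom, " claimed lower bound: 0")
```

In this experiment no parameter draw pushes $S_n(\theta)$ above $-\log(n+1)$, which is consistent with both statements but of course does not explain them.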
How is this bound derived?
My attempt so far:
- $F(x;\theta)\in[0,1]\;\forall x,\theta$.
- $x_{(i)}>x_{(i-1)}\;\forall i$, so each difference of $F$ values lies in the same interval: $D_i(\theta)\in[0,1]$. I suspect this interval is too wide, and I don't know whether I should account for the fact that the $D_i$ sum to $1$. Bounding each term separately only gives $\sum_{i=1}^{n+1}D_i\in[0,n+1]$, which is true but much looser than the actual constraint.
- If $D_i(\theta)\in[0,1]$, then $\log D_i(\theta)\in[-\infty,0]$.
- The average of these logs is in the same interval, so $S_n(\theta)\in[-\infty,0]$.
But the upper bound should be $-\log(n+1)<0$. Where does it come from?
References
- Cheng, R.C.H. and Amin, N.A.K. (1983) ‘Estimating Parameters in Continuous Univariate Distributions with a Shifted Origin’, Journal of the Royal Statistical Society: Series B (Methodological), 45(3), pp. 394–403. Available at: https://doi.org/10.1111/j.2517-6161.1983.tb01268.x.
- Ekström, M. (2008) ‘Alternatives to maximum likelihood estimation based on spacings and the Kullback–Leibler divergence’, Journal of Statistical Planning and Inference, 138(6), pp. 1778–1791. Available at: https://doi.org/10.1016/j.jspi.2007.06.031.