When can we exchange expectation and maximum for asymptotic results?


Motivated by the analysis of algorithms, consider the following setup.

Assume we have discrete random variables $X^{(n)}_1, \dots, X^{(n)}_n$ which we cannot assume to be identically distributed or independent. The distribution of the $X^{(n)}_i$ can depend on both $i$ and $n$. Let

$\qquad\displaystyle X^{(n)} = \max_{i \in [1..n]} X^{(n)}_i$

be the maximum of these. Assume furthermore that we have shown that $\mathbb{E}[X^{(n)}_i] \in O(f(n))$ for all $i$ as $n \to \infty$; in particular, $\mathbb{E}[X^{(n)}_i]$ depends on $n$. Here, $f : \mathbb{N} \to \mathbb{R}$ is a "simple" increasing function, e.g. polynomial or polylogarithmic¹.

Under which conditions can we conclude that

$\qquad\displaystyle \mathbb{E}[X^{(n)}] \in O\bigl(\max_{i \in [1..n]} \mathbb{E}[X^{(n)}_i]\bigr) = O(f(n))$?


  1. Actually, we'd have $\mathbb{E}[X^{(n)}_i] = g(i,n)$ for some "nice" function $g$. Since our interest is in an asymptotic bound in $n$, we drop the dependence on $i$, ensuring that $f(n)$ is an upper bound on $g(i,n)$ for all $i \leq n$ (up to a constant factor).
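To see why the exchange can fail without further conditions, here is a hypothetical illustration (not from the question): i.i.d. Geometric(1/2) variables each have constant mean $\mathbb{E}[X_i] = 2$, yet the mean of their maximum grows like $\log_2 n$, so $\mathbb{E}[\max_i X_i]$ is not $O(\max_i \mathbb{E}[X_i])$.

```python
import random

# Illustration (assumed example): i.i.d. Geometric(1/2) variables have
# constant mean E[X_i] = 2, but the mean of their maximum grows with n,
# roughly like log2(n).
random.seed(0)

def geom(p=0.5):
    """Number of fair-coin flips up to and including the first head."""
    k = 1
    while random.random() >= p:
        k += 1
    return k

trials = 2000
avg_max = {}
for n in [10, 100, 1000]:
    avg_max[n] = sum(max(geom() for _ in range(n)) for _ in range(trials)) / trials
    print(n, round(avg_max[n], 2))  # grows roughly like log2(n)
```

So any sufficient condition must rule out this kind of heavy enough tail relative to $f$.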


BEST ANSWER

(Note: this answer addressed an earlier version of the question. My understanding was that the distribution of $X_j$ was fixed. The current version of the question indicates that the parameters of its distribution also depend on $n$. The bounds still apply, but may be less directly useful for this scenario.)


Since you are presumably interested in how the maximum behaves as $n$ grows, let $X_{(n)}$ denote the $n$-th order statistic, i.e. the maximum among the $n$ random variables. Let $\mu_j = E[X_j]$ and $\sigma_j^2 = \text{Var}[X_j]$ for each $j$, and let $\overline{\mu} = \frac{1}{n}\sum_{j=1}^n \mu_j$. Also let $S^2 = \frac{1}{n}\sum_{j=1}^n (X_j - \frac{1}{n}\sum_{i=1}^n X_i)^2$ denote the sample variance.

It is known (see pp. 48–49 of Arnold and Balakrishnan) that $$ \overline{\mu} + E[S]/\sqrt{n-1} \le E[X_{(n)}] \le \overline{\mu} + E[S]\sqrt{n-1}. $$ Further, Arnold and Groeneveld showed that $$ \overline{\mu} \le E[X_{(n)}] \le \overline{\mu} + \sqrt{\frac{n-1}{n}\sum_{j=1}^n (\text{Var}[X_j] + (\mu_j - \overline{\mu})^2)}, $$ if this expression is more useful for your application.
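As a numerical sanity check of the Arnold–Groeneveld bound (an illustration with assumed distributions, not part of the original answer), take independent $X_j \sim \text{Binomial}(j, 1/2)$ for $j = 1, \dots, n$, so that means and variances differ across $j$, and compare a Monte Carlo estimate of $E[X_{(n)}]$ with the bound:

```python
import math
import random

# Assumed example: X_j ~ Binomial(j, 1/2), independent, j = 1..n.
random.seed(1)
n = 20
p = 0.5
mu = [j * p for j in range(1, n + 1)]              # E[X_j] = j/2
var = [j * p * (1 - p) for j in range(1, n + 1)]   # Var[X_j] = j/4
mu_bar = sum(mu) / n

# Arnold–Groeneveld upper bound on E[X_(n)].
upper = mu_bar + math.sqrt(
    (n - 1) / n * sum(v + (m - mu_bar) ** 2 for v, m in zip(var, mu))
)

def binom(j):
    """Sample Binomial(j, p) by summing j Bernoulli trials."""
    return sum(random.random() < p for _ in range(j))

trials = 5000
emax = sum(max(binom(j) for j in range(1, n + 1)) for _ in range(trials)) / trials
print(round(mu_bar, 2), round(emax, 2), round(upper, 2))  # mu_bar <= E[max] <= upper
```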

  • B. C. Arnold and N. Balakrishnan, Relations, Bounds and Approximations for Order Statistics. Lecture Notes in Statistics 53. Springer-Verlag, 1989.
  • B. C. Arnold and R. A. Groeneveld, Bounds on expectations of linear systematic statistics based on dependent samples, Mathematics of Operations Research 4 (1979), 441–447.

If the variables are independent and have the same mean and variance as well, then Gumbel and also Hartley and David showed that $E[X_{(n)}] \le \mu + \sigma(n-1)/\sqrt{2n-1}$, although your last comment indicates this doesn't apply. Some further bounds were derived by Downey.

  • Peter J. Downey, Distribution-free bounds on the expectation of the maximum with scheduling applications, Operations Research Letters 9, 1990, 189–201. doi:10.1016/0167-6377(90)90018-Z
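The i.i.d. bound quoted above, $E[X_{(n)}] \le \mu + \sigma(n-1)/\sqrt{2n-1}$, can be checked on a toy example (an assumed illustration: fair six-sided dice):

```python
import random

# Toy check of the i.i.d. bound E[X_(n)] <= mu + sigma * (n-1) / sqrt(2n-1),
# using n = 10 fair six-sided dice (assumed example).
random.seed(3)
n = 10
mu = 3.5                       # mean of one die
sigma = (35 / 12) ** 0.5       # standard deviation of one die

bound = mu + sigma * (n - 1) / (2 * n - 1) ** 0.5

trials = 5000
emax = sum(max(random.randint(1, 6) for _ in range(n)) for _ in range(trials)) / trials
print(round(emax, 2), round(bound, 2))  # E[max of 10 dice] vs the bound
```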

More information seems to be needed about the precise behaviour of $\text{Var}[X_j]$ or $\mu_j$, but this should get you part of the way there.

ANSWER

The following method can yield bounds stronger than the one András cites but requires even more knowledge about the distribution of the $X^{(n)}_i$. The idea is to use bounds on the tail probabilities of the $X^{(n)}_i$ to bound the tail of their maximum $X^{(n)}$.

We start with a lemma from Cover/Thomas [1] (Lemma 11.9.1, p. 392 in the 2nd edition):

Lemma

Let $Y$ be any random variable and let $M_Y(z)$ be the moment generating function of $Y$, i.e. $M_Y(z) = \mathbb{E}[e^{zY}]$.

Then

$\qquad\displaystyle \Pr[Y \geq a] \leq \frac{M_Y(z)}{e^{za}}$

for all $z \geq 0$.
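To illustrate the lemma numerically (an assumed example, not from the answer): for $Y \sim \text{Poisson}(\lambda)$ the MGF is $M_Y(z) = \exp(\lambda(e^z - 1))$, and minimizing $M_Y(z)/e^{za}$ over $z \geq 0$ gives $z = \ln(a/\lambda)$ for $a > \lambda$. The resulting bound can be compared against the exact tail:

```python
import math

# Chernoff-style tail bound for Y ~ Poisson(lam), an assumed illustration.
# MGF: M_Y(z) = exp(lam * (e^z - 1)); optimal z for Pr[Y >= a] is ln(a / lam).
lam, a = 4.0, 12
z = math.log(a / lam)                               # optimal parameter
chernoff = math.exp(lam * (math.exp(z) - 1) - z * a)  # M_Y(z) / e^{z a}

# Exact tail Pr[Y >= a] for comparison.
pmf = [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(a)]
exact = 1.0 - sum(pmf)

print(exact, chernoff)  # exact <= chernoff, per the lemma
```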

So if we can find the moment generating function of $X^{(n)}_i$ (e.g. via its probability generating function), we get bounds whose quality we can adjust by choosing both $a = c \cdot f(n)$ and $z$ appropriately. If all goes well, we get a uniform bound of the form

$\qquad \displaystyle \Pr[X^{(n)}_i \geq c \cdot f(n)] \leq \alpha(n) \in o(n^{-1})$.

Note that, in particular, $\alpha$ does not depend on $i$. Then, we can conclude that

$\qquad \displaystyle \Pr[X^{(n)} \geq c \cdot f(n)] \leq \sum_{i=1}^n \Pr[X^{(n)}_i \geq c \cdot f(n)] \leq n \cdot \alpha(n) \in o(1)$

using $\sigma$-subadditivity (the union bound). From this, the desired bound $\mathbb{E}[X^{(n)}] \in O(f(n))$ follows, provided the tail also controls the expectation; for instance, if $X^{(n)} \leq B(n)$ for some bound $B(n)$ with $B(n) \cdot n \cdot \alpha(n) \in O(f(n))$, then $\mathbb{E}[X^{(n)}] \leq c \cdot f(n) + B(n) \cdot \Pr[X^{(n)} \geq c \cdot f(n)] \in O(f(n))$.
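Here is a concrete instance of the recipe (an assumed toy example, not from the answer): for i.i.d. Geometric(1/2) variables, $\Pr[X_i \geq a] = 2^{-(a-1)}$ exactly, so with $f(n) = \log_2 n$ and $c = 3$ we get $\alpha(n) = 2/n^3 \in o(n^{-1})$, and the union bound yields $\Pr[X^{(n)} \geq 3\log_2 n] \leq n \cdot \alpha(n) = 2/n^2 \in o(1)$:

```python
import math
import random

# Assumed toy example: i.i.d. Geometric(1/2) variables, exact tail
# Pr[X_i >= a] = 2^{-(a-1)} used in place of an MGF-derived bound.
random.seed(2)

def geom(p=0.5):
    """Number of fair-coin flips up to and including the first head."""
    k = 1
    while random.random() >= p:
        k += 1
    return k

n = 64
a = 3 * math.log2(n)            # threshold c * f(n) with c = 3
union_bound = n * (2 / n ** 3)  # n * alpha(n) = 2 / n^2

trials = 2000
hits = sum(max(geom() for _ in range(n)) >= a for _ in range(trials))
print(hits / trials, union_bound)  # empirical tail vs union bound
```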


  1. T. M. Cover and J. A. Thomas, Elements of Information Theory, 2nd edition, Wiley-Interscience, 2006.