Expected Value of the maximum of a set of random variables (not indep)

838 Views Asked by At

As a preface, this is related to a graph of $n+1$ nodes, which is generated through (equally likely) random attachments to existing nodes - this occurs in $n$ "stages" of attachment. Suppose that we have a set of $k= $ $n \choose 2$ discrete random variables $X_{i} \in \{1,...,n\}$ (for $1 \leq i \leq k$) that are identically distributed, but not independent, and for which we know the marginal distribution (presumably the frequency function, which we will call $f_{X}$, but you can use the moment generating function $M_{X}$ or the cumulative distribution function $F_{X}$ if absolutely necessary) for any given $X_{i}$ and the joint distribution (call its frequency $f$, the moment generating function $M$, and the cumulative distribution function $F$). We need to find the expected value of the maximum of this set. In other words, we should compute $$\mathbb{E}( \text{max}\{{X_{1},X_{2},...,X_{k}\}}).$$ An approximation for large $n$ would also be helpful. I believe that this should involve the distribution of order statistics coupled with the known distribution, but I'm not quite sure what approach to take here. The reason I am asking is because I am generally unfamiliar with order statistics, especially in regards to expectation, although I have seen some derivations of the distribution of these random variables with some independence assumptions and some proofs involving order statistics, specifically in the texts $\textit{A First Course in Probability}$ by Sheldon Ross (Chapter $6$) and $\textit{Mathematical Statistics and Data Analysis}$ by John Rice (Chapter $3, 4$). For instance, I have shown that the expectation of the $j$th-order statistic of a sample of $n$ independent random variables $(X_{1},X_{2},...,X_{n})$ uniform on $[0,1]$, which we denote $X_{(j)}$, is given by $$\mathbb{E}(X_{(j)})=\frac{j}{n+1},$$ where $X_{(1)} \leq X_{(2)} \leq \,... \leq X_{(n)}$. Indeed, the marginal distributions are of the Beta family, which is where the result comes from.

It might be useful to have the following fact, although it is not completely applicable here. Suppose $X_{1},X_{2},...,X_{n}$ are i.i.d. discrete random variables with CDF $F$ and PMF $f$. Consider the set of order statistics $\{X_{(i)} \,|\, X_{(i)}=X_{j} \,\,\text{for some}\,\, 1 \leq i,j \leq n, \,\, X_{(1)} \leq X_{(2)} \leq \,... \leq X_{(n)}\}$. The PMF of the $k$th-order statistic, $X_{(k)}$, is given by $$P(X_{(k)}=x)=\sum_{j=1}^{n-k} {n \choose j} (\,(1-F(x))^{j}(F(x))^{n-j}-(1-F(x)+f(x))^{j}(F(x)-f(x))^{n-j}\,)$$

If clarification is needed for the question at hand, I can add to this post and answer questions below. I am looking for a closed form general solution, if possible. In our case, with the above notation, we are trying to determine $$\mathbb{E}(X_{(k)}),$$ where it is implicit that $X_{(k)}=\text{max}\{X_{1},X_{2},...,X_{k}\}$ from the definition of these order statistics.