I thought up this expected value problem:
Two hundred people are each assigned an integer from 1-100 so that there is a pair of each number. They are standing outside of a large waiting room. A random person will enter the room, one at a time. If their pair is there, they both continue on their way, otherwise they wait in the room until their pair arrives.
This goes on until there are no people left outside of the waiting room.
Question: What is the expected value of what will be the greatest sum of the values in the waiting room? A pair that immediately leaves doesn't contribute to the sum.
My thoughts: The lower bound is clearly 100, if everyone got their pair immediately. The upper bound is 1+2+3+...+100 = 5050, if there were no pairs for the first 100 people. In the simplified case of 4 people (1,1,2,2) there are 8 orders that will lead to a 2 and 1 being in the room at the same time and 16 orders of 2 being alone, so the expected value should be $\frac{8(2+1)+16(2)}{4!} = 2 \frac 13$. How can this idea be applied to the larger group?
I interpret it as the following: let $(_1,…,_N)$ be random (multi-)permutation of $(1,1,2,2,…,N,N)$. Let $_=\{i \leq n|\forall \leq j \neq i \implies _i \neq x_j\}$ and $$f(n)=\sum_{i \in A_n}x_i.$$ What is $E[\max_n(f(n))]$?
I do not have a full answer, but I will give a lower bound. Define random variables $X_{i,n} = i$ if exactly one copy of $i$ appears in the permutation in the first $n$-elements, and otherwise $0$. Then $f(n) = \sum_{i=1}^N X_{i,n}$.
Equivalently, we may say that $X_{i,n} = 0$ iff either both copies of $i$ are in the first $n$ elements of the list, or neither are.
The probably that they are both in the first $n$ elements is $$\frac{{n \choose 2}}{{2N \choose 2}}$$ since there are ${n \choose 2}$ such pairs out of ${2N \choose 2}$ total. Similarly, the probability that both pairs do not appear in the first $n$ elements is $$\frac{{2N-n \choose 2}}{{2N \choose 2}}.$$ Thus
$$P(X_{i,n} = i) = 1-\frac{{n \choose 2}+{2N-n \choose 2}}{{2N \choose 2}}$$
and so $$E[X_{i,n}] = i\left(1-\frac{{n \choose 2}+{2N-n \choose 2}}{{2N \choose 2}}\right).$$
Then \begin{align*} E[f(n)] &= E[\sum_{i=1}^N X_{i,n}]\\\\ &= \sum_{i=1}^N E[X_{i,n}]\\\\ &= \left(\frac{N(N+1)}{2}\right)\left(1-\frac{{n \choose 2}+{2N-n \choose 2}}{{2N \choose 2}}\right). \end{align*}
Now since $\max f(n) \geq f(i)$ for any $i$, we have $E(\max(f(n)) \geq E[f(i)]$ for any $i$. Choosing $i = N$ and simplifying then gives
$$E[\max f(n)] \geq \left(\frac{N}{2N-1}\right)\left(\frac{N(N+1)}{2}\right).$$
In a simulation, taking $100$ random permutations for each $N$ from $1$ to $500$, the bound seems not far off as $N$ becomes large:
For the particular case $N=100$ the bound is about 9% low.