Probability of a duplicate when sampling random pairs from a set ("the lunch tag problem")

112 Views Asked by Bumbble Comm At 01 Apr 2026 - 11:30

Consider a set $S$ of $n$ elements. Define a "full pairing" of $S$ as some division of the set into unordered subsets of size 2, such that every element of $S$ is contained in exactly one subset. (For simplicity, suppose $n$ is even.)

Suppose that we conduct a random full pairing of $S$; that is, we randomly assign all elements of $S$ to an unordered pair s.t. every element is assigned to exactly one pair and there are $n/2$ pairs. Suppose further that $k$ unique pairs are already known, and it is our naive hope that this random pairing process does not yield any of these $k$ known pairs.

Question: What is the probability that this random full pairing contains at least one duplicate from the $k$ known pairs?

Example: Suppose $S=\{1,2,3,4,5,6\}$. One full pairing would be $\{\{1,2\},\{3,4\},\{5,6\}\}$. Suppose that we observed this full pairing, hence $k=3$ pairs observed since they are all unique. What is the probability that a new, entirely random full pairing of $S$ yields at least one duplicate compared to the already-observed pairs?

For background: this is inspired by the case of a "lunch tag" program of sorts, where individuals sign up to get randomly paired to have lunch (etc.) together. Let's consider the lunch tag problem, the problem of creating these assignments over and over again, from the same fixed pool, while preventing "repeat tags" if at all possible (i.e. while maximizing new tags). I have been curious about some intuition behind this lunch tag process. For example, as you iterate through the process, how quickly does the number of possible assignments (constrained to those which have NOT been observed/assigned before) shrink to zero? I've done my best above to precisely frame the desired intuition-building question, but apologies if it is unclear.

My thoughts so far: if $k=1$, this doesn't seem too bad. Let's say $P^{*}$ is the known pair, and $P_1, P_2, \dots, P_{N/2}$ are the draws from one random full pairing. Then $pr(\text{duplicate}) = pr(P_1=P^{*} \text{ or } P_2=P^{*} \text{ or } \dots \text{ or } P_{n/2} = P^{*})$ $= pr(P_1=P^{*}) + pr(P_2=P^{*}) + \dots pr(P_{n/2} = P^{*})$ $= \frac{n}{2}pr(P=P^{*})$ for any randomly-drawn $P$. Since there are $C^{n}_2 = \frac{n(n-1)}{2}$ possible pairs, I think that this probability $pr(P=P^{*}) = \frac{2}{n(n-1)}$. Hence the final answer should be $\frac{1}{n-1}$. I'm stuck on how to extend this to $k>1$ though, since that would introduce more than one possible "match" or duplicate, which would make the probabilities dependent and more complicated ... of course I'm not 100% sure this line of thinking is correct anyways! :)

P.S. would also welcome answers that can generalize to $m$-sized subsets (and for simplicity $m$ divides $n$) rather than just pairs!

Edit: I spent some time trying to simulate this process numerically. For each $n$, the code keeps increasing $k$ until it's unable to create the set of unique known pairs of size $k$. At 20 iterations per parameter configuration $(N,k)$, the estimation still appears a bit noisy. But here's what I've found if it may help.

Estimated probability of a random full pairing containing at least one duplicate: view 1, view 2, view 3. Interestingly, it seems that the maximum number of known pairs that can be drawn while still retaining uniqueness is $k_{\text{max}}=n^2/2 -n$. (Quadratic regression on the simulated $k_{max}$ suggested these parameters which resulted in an exact fit.) I thought this was surprising but hopefully it's not too obvious!

Python code (originally a jupyter notebook). It could use some polishing up—e.g. if it were informed by the maximum value on $k$, then it could run much more quickly—but it's what I've got for now.

Original Q&A

Probability of a duplicate when sampling random pairs from a set ("the lunch tag problem")

Related Questions in PROBABILITY

Related Questions in COMBINATORICS

Related Questions in PERMUTATIONS

Related Questions in SAMPLING

Trending Questions

Popular # Hahtags

Popular Questions