Algorithm for finding lowest average grade difference

197 Views Asked by At

Given a class with n boys and n girls, in which the girls recieved the grades P1,...,Pn, and the boys recieved the grades S1,...,Sn in an exam, find a pairing of girl-boy in a way that minimizes the average difference between the grades in the couples.

For example, if p1=30, p2=60, s1=50, s2=90, we should pair girl #1 with boy #1 (20 points difference) and girl #2 with boy #2 (30 points difference), and we will get a minimal average difference of (30+20)/2 = 25.

Prove that the following algorithm is optimal: Pair the girl with the lowest grade to the boy with the lowest grade. Then pair the girl with the second lowest grade to the boy with the second lowest grade, etc.


In my solution, I tried using the greedy choice property (showing that there exist an optimal solution where a certain element is in the solution, and then using induction to prove that all elements are in the optimal solution) :

Let A1<=...<=An be the girl's grades sorted, and B1<=...<=Bn be the boys' grades sorted.

Claim - There exists an optimal solution which includes the pair A1-B1 (the boy with the lowest grade paired to the girl with the lowest grade).

Proof - Assume by contradiction that the statement is false. Therefore, no optimal solution includes A1-B1 as a pair. Assume A1-Bi (i>1) and B1-Aj (j>1) are pairs in the solution. We know that A1<=Aj and B1<=Bi. How do I continue from here ?

Thanks in advance.

2

There are 2 best solutions below

0
On

You could do this by case analysis, considering various cases of the relative order of $A_1$, $A_j$, $B_1$, and $B_i$. Here is a somewhat technical solution using a geometric approach.

Imagine $A_1$, $A_j$, $B_1$, and $B_i$ as points on the number line. Let $f:[0,1]\to \mathbb{R}$ and $g:[0,1]\to \mathbb{R}$ be defined as follows: $$f(t) = A_1 \cdot t + B_i \cdot (1-t) \\ g(t) = A_j \cdot t + B_1 \cdot (1-t)$$ Then $f$ and $g$ are straight-line parametrizations of the paths from $A_1$ to $B_i$ and $A_j$ to $B_1$, respectively. Let $|\cdot|$ denote path length over $[0,1]$. Then $|f| = |A_1 - B_i|$ and $|g| = |A_j-B_1|$.

Let $h = g-f$. Then $h(0) = A_j-A_1 \geq 0$, and $h(1) = B_1-B_i \leq 0$. Since $h$ is continuous, there is some $t_0$ such that $h(t_0) = 0$, which means $f(t_0) = g(t_0)$.

Let $$\tilde{f}(t) = \left\{ \begin{array}{ll} f(t) & t\leq t_0 \\ g(t) & t \geq t_0 \end{array} \right. \mbox{,}$$ and let $$\tilde{g}(t) = \left\{ \begin{array}{ll} g(t) & t\leq t_0 \\ f(t) & t \geq t_0 \end{array} \right.\mbox{.}$$ Then $\tilde{f}$ is a path from $A_1$ to $B_1$, and $\tilde{g}$ is a path from $A_j$ to $B_i$, such that $|\tilde{f}| + |\tilde{g}| = |f| + |g|$. From this we can conclude that $$|A_1 -B_1| + |A_j-B_i| \leq |\tilde{f}| + |\tilde{g}| = |f| + |g| = |A_1-B_i| + |A_j-B_1|\mbox{.}$$ This shows that by pairing $A_1$ with $B_1$ and $A_j$ with $B_i$, we can get at least as good a solution as before, contradicting the initial claim.

0
On

See my latest response here. I think the last I wrote (Dec 14 at 14:53) is a rigorous proof.
First I prove it for N=2, then try to prove it for N, see to what it is equivalent,
and then I do induction on N. Btw, if it's not good, I would be glad to hear where
my flaw is.

https://stackoverflow.com/questions/20575447/finding-lowest-average-grade-difference/20584730