I have a sample data set that looks like this:
x y w
1 1 5 1
2 1 6 2
3 1 7 3
4 2 8 4
5 2 7 5
6 3 5 6
7 4 6 7
8 4 5 8
x
and y
represent indices from datax
and datay
. w
represents a score from comparing datax[x]
with datay[y]
. I want to maximize the total score (or w
) from d
, where each value of x
is matched to at most one value of y
, and vice versa.
The result should look like this:
x y w
1 1 6 2
2 2 7 5
3 3 5 6
4 4 5 8
Where the sum of all w
values is maximized, and each x
and each y
show up only once in the result.
How do I set up this optimization problem?
You must pick $(2,8)\to5$ as this is the only occurrence of 8.
You must pick $(3,5)\to6$ as this is the only occurrence of 3.
You then cannot pick $(2,7)$ so you must pick $(1,7)\to3$ as this is the only remaining occurrence of 7.
You then cannot pick $(1,6)$ so you must pick $(4,6)\to7$ as this is the only remaining occurrence of 6.
So there is no optimization as you have no other options.
So $w=5+6+3+7=21$