Optimising the grouping of elements of discrete sets.

121 Views Asked by At

Suppose I have three sets $$ A = \{a_1, a_2, a_3, a_4\}\\ B=\{b_1, b_2, b_3, b_4\}\\ C=\{c_1, c_2, c_3, c_4\} $$

A function $$f(x_i, y_i)=r : x_i,y_i\in \{A, B, C\}, r \in ℝ$$ is defined between the elements of the sets.

Consider the element $b_1$. If one were to calculate $f(b_1, a_i) \forall a_i \in A $, there will be one particular $a_i$ for which the function will be minimum. Let's say it was $a_3$. That is, $f(b_1, a_3) < f(b_1, a_i)$ for all other $i$. We say that $b_1$ and $a_3$ can be paired together.

Similarly, there is $c_i \in C$ as well, such that the function will yield the minimum when paired with $b_1$. Let's say that it was $c_4$. If $c_4$ pairs with $b_1$, and $b_1$ is already paired with $a_3$, it is guaranteed that $c_3$ with pair with $a_3$ if the function is computed between $c_3$ and all elements of $A$.

Hence we create one group from the original sets namely, $G_1 = \{a_3, b_1, c_4\}$.

Similarly, the task is to create four (there are four elements in the sets) different groups out of the different possible groupings of the elements such that the pairwise function between the elements of the group will be most optimal.

Now a simple algorithm for doing this would be fix one set, let's say $A$ and loop over the elements of $B$ and $C$. For every element in $B$, find the one in $A$ which minimises the function and group them together. Repeat for $C$ and you are done.

This is good if the data is perfect and there actually is a pairing between all the elements of the list. But sometimes the data is corrupted, missing or otherwise just messy and you end up having two elements of $B$ matched to one particular element in $A$.

Suppose while looping through $B$, initially $b_1$ matched with $a_3$ but so did $b_3$. Now, to resolve this conflict, we can of course pick out which function, $f(b_1, a_3)$ or $f(b_3, a_3)$ is lesser and attach that to $a_3$. Suppose that turned out to be $b_3$. But now there is the question, what do I pair with $b_1$ (or do I pair anything at all because that's also an option). Of course, you can go and loop through all the elements of $A$ again, leaving out $a_3$ and do the pairing. But implementing this in a program gets really messy at this point.

I was wondering if there are any known algorithms for handling these kinds of grouping problems elegantly. Maybe some way of rephrasing this as a path finding problem in a graph or as a weight minimising problem or something.

If someone can tell me which direction to take here, which topics should I be looking into, that would be great. Thanks!