How to measure similarity of partitions / partitioning?

1.4k Views Asked by At

Suppose a set of elements of finite size.

E.g.:

$X = \left\lbrace a,b,c,d,e,f,g \right\rbrace$

There are several ways to partition $X$.

E.g:

$P_1 = \left\lbrace \left\lbrace a,b \right\rbrace, \left\lbrace c \right\rbrace, \left\lbrace d,e,f,g \right\rbrace \right\rbrace$

$P_2 = \left\lbrace \left\lbrace c,f,g \right\rbrace, \left\lbrace a,e \right\rbrace, \left\lbrace b,d \right\rbrace \right\rbrace$

$P_3 = \left\lbrace \left\lbrace a,b,c \right\rbrace, \left\lbrace d,e,f,g \right\rbrace \right\rbrace$

How can one measure the similarity of those partitionings?

In this example, $P_1$ and $P_3$ are similar.

I think I can measure similarity of individual partitions (e.g. using [Jaccard similarity](https://en.wikipedia.org/wiki/Jaccard_index $\rightarrow$ $J(\left\lbrace a,b \right\rbrace, \left\lbrace a,b,c \right\rbrace)=\frac{2}{3}$), but I have no idea on how to use this information to find which partitionings are similar.

Any idea?

Hint:

Given a pair of partitionings $P_1$ and $P_2$, we can construct a weighted bipartite graph from elements of $P_1$ to elements of $P_2$, that is $G=\left\langle P_1 \cup P_3, P_1 \times P_3 \right\rangle$, where each weight is:

$w_{i,j} = J(p_{1,i},p_{2,j}) = \frac{\left\vert\ p_{1,i}\ \cap\ p_{2,j}\ \right\vert}{\left\vert\ p_{1,i}\ \cup\ p_{2,j}\ \right\vert}$

where $p_{1,i} \in P_1$ and $p_{2,j} \in P_2$

The two most similar partitionings are the solution to the maximum weighted bipartite matching problem.


Sorry if my vocabulary is not correct, my background is not in Mathematics.

1

There are 1 best solutions below

0
On BEST ANSWER

Based on your suggestion, given two partitions $P_1$ and $P_2$, you could construct a complete bipartite graph, where $P_1$ and $P_2$ are the parts, and assign to every edge the Jaccard similarity weight.

Taking the average weight (for example) of the edges in a maximum matching (which you can find using the augmented path algorithm or max-flow) will give you a number in $[0,1]$.

Partitions also correspond with Young tableaux and conjugacy classes of permutations which could also be used to describe similarity (e.g. distance in the Cayley graph of $S_n$, with transpositions as generators, between representatives of each conjugacy class).