If I have two sets, I can calculate their similarity coefficient using the Jaccard index. Is there an algorithm that can calculate similarity when the sets contain a variable number of entities? For example, consider these two pairs of sets:
{A1,B1,C1} and {A1,B2,C1,D1}
{A1,B1,C1} and {A1,B3,C4,D5}
I can say that the first pair is more similar, but how do I calculate that mathematically?
The Jaccard index $\displaystyle J(A,B) = {{|A \cap B|}\over{|A \cup B|}} = {{|A \cap B|}\over{|A| + |B| - |A \cap B|}}$ handles sets of different sizes as part of its definition.
In your examples it would give $\dfrac25$ for the first pair and $\dfrac16$ for the second, and the first value is indeed higher than the second.
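If it helps to check the arithmetic, here is a minimal Python sketch of the formula above applied to the two pairs from the question. The function name `jaccard` and the choice to return 1 for two empty sets are my own assumptions, not part of the definition (the index is undefined when the union is empty).

```python
from fractions import Fraction


def jaccard(a, b):
    """Return |A ∩ B| / |A ∪ B| as an exact fraction."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Both sets are empty, so the ratio is 0/0. Returning 1 here is
        # only a convention assumed for this sketch.
        return Fraction(1)
    return Fraction(len(a & b), len(union))


# The two pairs from the question.
first = jaccard({"A1", "B1", "C1"}, {"A1", "B2", "C1", "D1"})
second = jaccard({"A1", "B1", "C1"}, {"A1", "B3", "C4", "D5"})

print(first, second)   # 2/5 1/6
print(first > second)  # True
```

Using `Fraction` keeps the results exact, so they can be compared directly with the hand-computed values.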