Cardinality of Intersection and Union of Multiple Sets Given Overlap Coefficient(s)


I would like to reason about the intersection and union of a number of sample sets, given an assumed similarity between them. The calculation doesn't have to be exact; a reasonable estimate is fine.

Specifically, given the cardinalities of the individual sets and an assumed overlap coefficient (Szymkiewicz-Simpson coefficient), I would like to calculate the cardinalities of the union and the intersection.

For two sets $A$ and $B$, the overlap coefficient $C_{(A,B)}$ is:

$$C_{(A,B)} = \frac{|A \cap B|}{\min(|A|,|B|)}$$

I can calculate the intersection and union like this:

$$|A \cap B| = \min(|A|,|B|) \times C_{(A,B)} \\ |A \cup B| = |A|+|B| - \min(|A|,|B|) \times C_{(A,B)}$$
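To make the two-set case concrete, here is a minimal Python sketch of the formulas above. The function name `two_set_estimate` and its parameter names are just ones I picked for illustration, and the results are estimates, so they need not be integers.

```python
def two_set_estimate(size_a, size_b, overlap):
    """Estimate (|A ∩ B|, |A ∪ B|) from the two set sizes and
    the overlap (Szymkiewicz-Simpson) coefficient."""
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap coefficient must lie in [0, 1]")
    # |A ∩ B| = min(|A|, |B|) * C
    intersection = min(size_a, size_b) * overlap
    # |A ∪ B| = |A| + |B| - |A ∩ B|
    union = size_a + size_b - intersection
    return intersection, union


print(two_set_estimate(100, 40, 0.25))
```

For example, with $|A| = 100$, $|B| = 40$ and $C_{(A,B)} = 0.25$, the estimated intersection is $10$ and the estimated union is $130$.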

How can I generalize this for a finite number of sets assuming the same overlap coefficient? What about a matrix of overlap coefficients? Can you recommend a resource that discusses these calculations?

That is, given the cardinalities of sets $A_1, \dots, A_n$ and a single overlap coefficient $C$ that applies to every pair of sets, how do I calculate:

$$|\bigcup^n_{i=1}A_i| \\ |\bigcap^n_{i=1}A_i|$$
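This is not an answer, but as a sanity check, any estimate has to respect bounds that follow from the pairwise intersections alone, since pairwise coefficients do not determine the higher-order overlaps. Below is a rough Python sketch of those bounds, assuming at least two sets and a symmetric matrix of pairwise coefficients. The names `pairwise_bounds`, `sizes` and `coeffs` are hypothetical, and the bounds used (a Bonferroni-type inequality for the union, a pigeonhole argument inside one set for the intersection) are simple ones, not necessarily the tightest available.

```python
from itertools import combinations


def pairwise_bounds(sizes, coeffs):
    """Return ((union_lo, union_hi), (inter_lo, inter_hi)) for n >= 2 sets,
    given their sizes and a symmetric matrix of pairwise overlap coefficients."""
    n = len(sizes)
    # Pairwise intersection sizes implied by the coefficients:
    # |A_i ∩ A_j| = C_ij * min(|A_i|, |A_j|)
    inter = {(i, j): coeffs[i][j] * min(sizes[i], sizes[j])
             for i, j in combinations(range(n), 2)}

    def pair(i, j):
        return inter[(min(i, j), max(i, j))]

    total = sum(sizes)
    # Union: at least the largest set, and at least the sum of sizes
    # minus all pairwise intersections (Bonferroni inequality).
    union_lo = max(max(sizes), total - sum(inter.values()))
    # Union: at most the sum of sizes minus the largest pairwise intersection.
    union_hi = total - max(inter.values())
    # Intersection: at most the smallest pairwise intersection.
    inter_hi = min(inter.values())
    # Intersection: for a fixed k, the n-1 sets A_k ∩ A_j all lie inside A_k,
    # so their common part has at least sum_j |A_k ∩ A_j| - (n-2)|A_k| elements.
    inter_lo = max(0, max(
        sum(pair(k, j) for j in range(n) if j != k) - (n - 2) * sizes[k]
        for k in range(n)))
    return (union_lo, union_hi), (inter_lo, inter_hi)


sizes = [10, 8, 6]
coeffs = [[1.0, 0.5, 0.5],
          [0.5, 1.0, 0.5],
          [0.5, 0.5, 1.0]]
print(pairwise_bounds(sizes, coeffs))
```

With sizes $10, 8, 6$ and a common coefficient of $0.5$, this gives a union between $14$ and $20$ and an intersection between $0$ and $3$. For $n = 2$ both intervals collapse to the exact two-set values, which is the behaviour I would expect any generalization to have.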