I have $n$ unknown finite sets $A_1,A_2,...,A_n$. I only know:
- $|A_n| = X_n$
- $ |\cup^n_{x=1}A_x| = X$
- $ |A_x \cup A_y| \leq |A_x| + |A_y|\ with\ x,y \in \{1,...,n\}$
Can I estimate the size of $ \cup^6_{x=0}A_x $ better than using the inequality above.
My current approach would be running a simulation but I would expect a mathematical approach that is simpler and probably a less ad-hoc.
Are you looking for a probabalistic answer?
Without any additional knowledge about the sets, we assume the elements are random. If all the sets were disjoint then we would have $|\cup_{x=1}^n A_x|=\sum_{x=1}^n |A_x|:=Y$, that is $X=Y$. But if we have non-empty intersections, then $X<Y$. Hence, we could say, that the probably a random element from a random set will repeat in another set is $\frac{Y-X}{Y}$.
This would then appear to form a binomial calculation based on how large $n$ is and how large each set is. For example, given a set $A$, how much of $A$ will repeat in another set $B$. Well for each element in $A$ the odds it will repeat in any of the sets is $\frac{Y-X}{Y}$, so the odds it will repeat specifically in $B$ is $\frac{Y-X}{Y}\frac{|B|}{Y-|A|}$.
Continuing in this fashion we could find what the expected size of the union of two random sets is. Similarly we could go through the calculations for 3 random sets unioning, and 4, etc.