Calculating the probability of at least one of $X_i \subset X$ using Inclusion-Exclusion

149 Views Asked by At

enter image description here

The Problem statement and the solution is mentioned in the image provided. Please explain me how the set is constructed and how do we get the probability in the form of the provided function $f(x)$ using Inclusion-Exclusion.

1

There are 1 best solutions below

0
On BEST ANSWER

Supposing that we have $q$ of these sets $X_j$ we start by constructing the poset for use with PIE. The nodes $Q$ of this poset are subsets of $[q]$ and represent subsets $T \subseteq S$ that contain the elements of $X_j$ with $j\in Q$ plus possibly more, so they could contain the elements of an $X_k$ with $k\notin Q.$ A subset $T$ is represented by the term $z^{|T|}$ so that we will not be using the cardinality of the set of subsets $T$ of $S$ represented at $Q$ but rather a refinement, namely an ordinary generating function of these sets by the number of elements.

We now ask about the generating function in $z$ of the subsets of $S$ that do not contain any of the $X_j$ as a subset. The weight of a node $Q$ of the poset is $(-1)^{|Q|}.$ There are two cases for $T.$ If $T$ does not contain any of the $X_j$ as a subset it will be included only at the base node $Q=\emptyset$ getting total weight $z^{|T|} \times (-1)^{|\emptyset|} = z^{|T|}.$ On the other hand if $T$ contains exactly the elements of the $X_j$ with $j\in P$ (plus possibly some others that do not contribute to form a different $X_j$) for some $P\subseteq [q]$ where $P\ne\emptyset$ then it is included in all nodes $Q$ that represent subsets of $P$ for a total weight of

$$z^{|T|} \sum_{Q\subseteq P} (-1)^{|Q|} = z^{|T|} \sum_{k=0}^{|P|} {|P|\choose k} (-1)^k = 0.$$

We see that indeed the only non-zero contribution originates with those $T$ not containing any $X_j.$ We may now apply PIE knowing that the generating function at $Q$ is

$$z^{|\bigcup_{j\in Q} X_j|} (1+z)^{|S|-|\bigcup_{j\in Q} X_j|} = (1+z)^{|S|} \left(\frac{z}{1+z}\right)^{|\bigcup_{j\in Q} X_j|}.$$

We thus have for the generating function $Y(z)$ of the subsets not containing any of the $X_j$ by PIE the closed form

$$Y(z) = (1+z)^{|S|} \sum_{Q\subseteq [q]} (-1)^{|Q|} \left(\frac{z}{1+z}\right)^{|\bigcup_{j\in Q} X_j|}.$$

Now a set $T$ with $|T|$ elements represented by $z^{|T|}$ has probability $(1-p)^{|S|-|T|} p^{|T|}$ hence the desired probability is given by

$$(1-p)^{|S|} Y(p/(1-p)) \\ = (1-p)^{|S|} (1+p/(1-p))^{|S|} \sum_{Q\subseteq [q]} (-1)^{|Q|} \left(\frac{p/(1-p)}{1+p/(1-p)}\right)^{|\bigcup_{j\in Q} X_j|} \\ = \sum_{Q\subseteq [q]} (-1)^{|Q|} p^{|\bigcup_{j\in Q} X_j|}.$$

We now know the probability of none of the $X_j$ being a subset and hence the probability of at least one being a subset is

$$1 - \sum_{Q\subseteq [q]} (-1)^{|Q|} p^{|\bigcup_{j\in Q} X_j|} = - \sum_{Q\subseteq [q], Q\ne\emptyset} (-1)^{|Q|} p^{|\bigcup_{j\in Q} X_j|}.$$

This is

$$\bbox[5px,border:2px solid #00A000]{ \sum_{Q\subseteq [q], Q\ne\emptyset} (-1)^{|Q|+1} p^{|\bigcup_{j\in Q} X_j|}.}$$

precisely as in the problem statement and we may conclude the argument observing that increasing $p$ will indeed increase the probability of an $X_j$ appearing in $X.$

Remark. We can check the correctness of the formula for $Y(z)$ by comparing the result of enumeration to the closed form. This is shown below, which also includes a plot.

with(combinat);

YX :=
proc(XL)
local XJ, S, q, Q, Y;
option remember;

    S := `union`(op(XL));
    q := nops(XL);

    Y := 0;

    for Q in powerset(q) do
        XJ := `union`(seq(XL[p], p in Q));

        Y := Y + (-1)^nops(Q)
        * (z/(1+z))^nops(XJ);
    od;

    expand(simplify(Y*(1+z)^nops(S)));
end;

YENUM :=
proc(XL)
local S, Y, T, idx;
option remember;

    S := `union`(op(XL));
    Y := 0;

    for T in powerset(S) do
        for idx to nops(XL) do
            if XL[idx] subset T then
                break;
            fi;
        od;

        if idx = nops(XL) + 1 then
            Y := Y + z^nops(T);
        fi;
    od;

    Y;
end;

YPLOT :=
proc(XL)
local S, f;
    S := `union`(op(XL));

    f := expand(simplify(subs(z=p/(1-p), YX(XL))
                         *(1-p)^nops(S)));
    plot(1-f, p=0..1);
end;

Some examples that were tested are:

> YENUM([{1,2,3}, {1,2,3}, {2,5}]);           
                           3      2
                          z  + 5 z  + 4 z + 1

> YX([{1,2,3}, {1,2,3}, {2,5}]);   
                           3      2
                          z  + 5 z  + 4 z + 1

> YENUM([{1,2,3}, {1,2,3}, {4,5,6}, {2,5}]);
                       4       3       2
                    5 z  + 14 z  + 14 z  + 6 z + 1

> YX([{1,2,3}, {1,2,3}, {4,5,6}, {2,5}]);   
                       4       3       2
                    5 z  + 14 z  + 14 z  + 6 z + 1

> YENUM([{1,2,3}, {4,5,6}, {7,8}]);      
                    5       4       3       2
                18 z  + 45 z  + 48 z  + 27 z  + 8 z + 1

> YX([{1,2,3}, {4,5,6}, {7,8}]);   
                    5       4       3       2
                18 z  + 45 z  + 48 z  + 27 z  + 8 z + 1

Here is a sample plot.

> YPLOT([{1,2,3}, {1,2,3}, {4,5,6}, {2,5}]);  

1 +                                                            HHHHHH 
  +                                                        HHHHH      
  +                                                      HHH          
  +                                                   HHH             
0.8                                                 HHH               
  +                                               HHH                 
  +                                             HH                    
  +                                           HH                      
0.6                                         HH                        
  +                                      HHH                          
  +                                    HHH                            
  +                                  HHH                              
0.4                                HHH                                
  +                              HHH                                  
  +                            HHH                                    
  +                         HHH                                       
0.2                      HHHH                                         
  +                   HHHH                                            
  +               HHHHH                                               
  +         HHHHHH                                                    
  **********+--+--+-+--+--+-+--+--+-+--+--+-+--+--+-+--+--+-+--+--+-+ 
0             0.2          0.4           0.6          0.8           1