Minimizing Discrete OR's

45 Views Asked by At

Given a collection of boolean vectors $v_1 ... v_j \in \lbrace 0, 1 \rbrace^n$ (0 = false, 1 = true) How to quickly find the smallest subset of indices $\lbrace a_1 .. a_k \rbrace$ such that $$\bigvee_{i=1}^{k} v_{a_i} = \lbrace 1 \rbrace^n $$

Example: Given $[1 1 1 0 1], [1 0 0 1 1], [0 0 0 1 1] , [0 0 0 1 1 1]$ one acceptable solution would be $[1 1 1 0 1], [1 0 0 1 1]$

Work So Far:

So far the best I can seem to do is to form a tree (each node containing a set of vectors whose element-wise OR is all 1s) , where we start with the root node containing the whole set, and mark the root node as "alive".

From here I have a rule that, if there is a node $N$ marked "alive" then consider its set of vectors $V_n$ then consider each one-element-smaller-subset $V_j$ of $V_n$ such that $\bigvee_{u \in V_j} [u] = \lbrace 1\rbrace^n $ Then make an edge from the node containing $V_n$ to the new node containing $V_j$. Mark the node containing $V_j$ as alive. And once you've processed all the one-element-smaller-subset's of the node containing $V_n$, mark the node containing $V_n$ as dead.

Repeat this until all nodes are dead. Find the node whose set is smallest (if there's more than one pick at random) and return.

Some naive analysis reveals the worst case run time of this algorithm is bounded by $O(j!)$ (since there are worst case j splits from the root, then (j-1) splits from those, etc.... )

How to make this faster?

Work So Far 2:

Another algorithm I considered was to sort the vectors into buckets (that aren't necessarily disjoint) where vectors $v$ are members of bucket $j$ if $v[j] = 1$ (it's j^th component is a 1). Then find the bucket containing the smallest number of vectors. Consider every element in this bucket, place them into nodes. For each node $N$, make a copy of all the buckets $B_N$, let $v_N$ denote its vector, remove all the buckets $k$ that such that $v_N[k] = 1$ . Now for remaining buckets in $B_N$ we find the one containing fewest elements, and create a node (with an edge from leading from $N$ to this node), recursively repeat this procedure on all the child nodes formed.

If instead of doing this "depth first" we build this structure out "breadth first", then we can terminate as soon as we find a path of nodes whose vector's element-wise OR is all 1's. If the length of this path is $T$ then we know the run time depends on $T$. But analyzing this procedure tricky so I'm not sure how to proceed. (And my gut feeling isn't very optimistic) I think it will be something exponential in $T$ on this one since we are building a tree of depth $T$.