I am trying to solve this problem that is similar to set covering but not exactly the same. I have done a fair amount of search but could not find whether it fits to the context of any other problem.
I have a linear sequence of length L. In addition, I have 400 partitionings of sequence L:
Lets say L=100:
Partition 1: [1-11], [12-51], [90-100]
Partition 2: [12-26], [71-76]
...
Partition 400: [27-89], [62-100]
In this example, Partition 1 has 3 subsequences. The subsequences in each partitioning are pairwise disjoint and they do not have to cover [1-100] exactly.
I would like to select the smallest number of partitionings (from 1 to 400) such that a subset of subsequences in these partitionings cover [1-100] exactly. In addition, I do not want any of the subsequences to overlap.
For example the following set of subsequences from 3 partitionings (Partition 1, Partition 2, Partition 400) satisfy this condition:
[1-11]_1, [12-26]_2, [27-89]_400, [90-100]_1
The subscripts indicate which partition the subsequence comes from. As you see, we do not have to use all the subsequences in the partitions.
We can easily check if a set of subsequences satisfy the condition but I do not know if the above solution is the minimum number of partitionings (i.e., are there less than 3 partitionings whose disjoint subsequences cover [1-100]?).
If I am not self-contradicting, this can be rephrased as following: Think of the subsequences (which are given as input) as colored with respect to partition they belong. In the above example, we would have 400 colors. We would like to find a cover for [1-100] using the colored subsequences such that the number of colors we use is minimum and the selected subsequences are pairwise disjoint.
I feel like this looks like knapsack, set cover, coin change, ... problems but I could not incorporate the linearity condition in these. I want to understand if this is NP-complete or NP-hard.
Any help is greatly appreciated, Thank you.
As I described in the comments, my (vaguely) described problem can be rephrased as a "minimum mutually exclusive set cover" problem. To do this, first generate all the non-empty subsets of the partitionings then pool the subsets. For example, Partitioning 1, which has 3 subsequences, would have 7 subsets (2^3-1). These would contain all possible combinations of the 3 subsequences. Similarly, Partition 2 would have 3 subsets, ..., Partition 400 would have 3 subsets. We pool these subsets.
Then we would like to find the minimum number of mutually exclusive subsets that exactly cover, in the example, [1-400] range. Solving this problem would be equivalent to solving the original described problem.
One publication (arxiv.org/pdf/1302.5820.pdf) claims that this problem is NP-hard. This answers my original question about the complexity of the problem.