String with substrings in ascending order

227 Views Asked by At

Suppose you have several substrings, e.g.,

AEBCA
DEA
BDAEBAE
EDEBAE
AED

How would you create a minimal-length string that contains the substrings in order?

The context, real or contrived (don't know), was this: Suppose you have several trucks, each of which must be loaded with items in a specific order. The substrings are the loads for each truck, and the letters are the items.

The trucks load by queuing in front of a line of bins, with each bin having only items of like type. The trucks can move forward only. In what order should the bins be arranged to have the minimum number?

My solution (http://www.excelforum.com/excel-programming-vba-macros/1086058-counting-letters-in-strings.html) was to catenate the substrings (24 characters in this example), and then strike every letter that left the substrings ascending in the reduced string. That gave me the 11-character string below, and the substrings are repeated below it:

AEBDAEBAECA
    AEB  CA
   D    E A
  BDAEBAE  
 E D EBAE  
AE D       

Is this an instance of a more general problem that I could research by name? Can anyone suggest a less naive algorithm that might generate a better result?

Thank you.

1

There are 1 best solutions below

1
On

You're looking for the shortest common supersequence (scs). According to Wikipedia,

For two input sequences, an scs can be formed from a longest common subsequence (lcs) easily. For example, if $X[1..m]=abcbdab$ and $Y[1..n]=bdcaba$, the lcs is $Z[1..r]=bcba$. By inserting the non-lcs symbols while preserving the symbol order, we get the scs: $U[1..t]=abdcabdab$.

It is quite clear that $r+t=m+n$ for two input sequences. However, for three or more input sequences this does not hold. Note also, that the lcs and the scs problems are not dual problems.

For the more general problem of finding a string, $S$ which is a superstring of a set of strings $S_1,S_2,\ldots,S_l$, the problem is NP-Complete. Also, good approximations can be found for the average case but not for the worst case.