Is one regular language subset of another?

7k Views Asked by At

Let $L_1$ and $L_2$ be two regular languages given as regular expressions (in this type of tasks it often happens that $L_1 \subseteq L_2$, but vice versa it is false).

Is there a nice way to prove that $L_1 \subseteq L_2$ ? If yes, than do you think you could explain that algorithm?

I had an idea to construct those languages from regexes using Kleene theorem and than prove that every word from $L_1$ can be a prefix or a suffix of some word w $\in$ $L_2$, where the rest of w can be omitted (i.e. in regex representation it is under $^*$ sign).

OK, another idea is to use brute force - just show step by step that any word from $L_1$ is accepted by the DFA corresponding to $r_2$. However, what if there are too many words in $L(r_1)$?

So I don't think these are good ideas.

Example: $$r_1 = (a+ab+bb)(a+b)^* \\ r_2 = aab^*$$ Obviously $L(r_1) \text { is not a subset of } L(r_2)$, but $L(r_2) \subseteq L(r_1)$.

1

There are 1 best solutions below

2
On BEST ANSWER

The basic idea for an automated proof is:

  • convert both languages to their deterministic finite state machines.
  • Calculate the cross product of the finite state machines with the transitions $F: ([x,y],\sigma) => [F_x(x,\sigma), F_y(y,\sigma)]$
  • Collect the list of reachable states, and their acceptivity status from both languages. If there are any
    • [reject, reject] states, then the union of both languages is not the universal language (there is a word that they both reject).
    • [reject, accept] states, then the latter is not a subset of the former.
    • [accept, reject] states, then the former is not a subset of the latter.
    • [accept, accept] states, then the languages are not disjoint.

Other classes: two languages are equivalent if they are both subsets of each other (however, there are other tests for that); One language is a proper subset of another language if it is its subset but not vice versa.