Find CFGs that generate the regular language for all strings with exactly one a or one b

2.5k Views Asked by At

I have tackled this question by applying the Theorem that states "All regular languages are context-free languages. For any FA, there is a CFG such that the language generated by the grammar is the same as the language accepted by the FA."

So I firstly jotted down the regular expression for this language: $(b*ab*) + (a*ba*)$

I then generated an NFA to parse the regular language:

enter image description here

Finally, I managed to put together a CFG as follows:

S -> A b A | C a C
A -> a B | null
B -> a A | null
C -> b D | null
D -> b C | null

I have managed to parse a handful of sentences that I would expect to be in the language:

b b b a b
b b a b b
b a b b b
b a a a a
a b b b b
a b a a a

I would really just appreciate a bit of feedback as to whether I am on the right track or not.

1) I cannot seem to derive a DFA easily so I have chosen to illustrate an NFA in the answer to this question. Is it okay to do this?

2) Is my CFG in the correct format? In other words, should I be using non-terminals before terminals and NULL so often?

I am about to read up about CNF shortly.

Thanks for taking the time.

2

There are 2 best solutions below

1
On BEST ANSWER

Your question title says "exactly one a and exactly one b". It's worth noting that the language you construct is the strings with "exactly one a" or "exactly one b". Assuming that's what you intended, then...

1) There's a systematic way to derive a DFA from an NFA. The states in the DFA represent possible subsets of the NFA. The state transitions indicate how each of the different states transition as a result. In your case, you start out in what could be either A or C. After seeing an 'a', you can be in either A or D; if instead you saw a 'b' you would be in either C or B. And so on. For each possible set of states in the NFA you encounter, create a single state in the DFA.

2) Whether or not the CFG is in the "correct format" totally depends on what you're using it for. For the purposes of communicating the CFG to other humans, that seems like a totally reasonable format. Some computer software, on the other hand, might have constraints that terminals can only come after the non-terminals, or whatever.

Your CFG seems correct, but it could be described significantly more simply as

S -> A b A | B a B A -> a A | Null B -> b B | Null

0
On

ANSWERS:

1) Using an NFA is fine - unless of course this is homework and the question specifically asks for a DFA.

2) The format of your CFG is fine. For describing a CFL, don't worry about whether terminals precede or follow non-terminals, just put them in the right places to generate the right substrings [in more advanced parsing theory, order matters more, because certain parsing techniques tend to require certain grammar forms]. One minor nit: I don't know what thought process led to the productions for A and B, but they are redundant - those rules generate zero or more 'a'-s, which can be done simpler with a single rule, e.g 'A -> a A | null'. Likewise for the productions for C and D.