ques -
Regular expression for language containing all strings that start and end with different symbols
i just went through some examples where the RE for above question is a(a+b)*b + b(a+b)*a
but i think it should be (a(a+b)*b)* + (b(a+b)*a)*
. Am i correct or incorrect ?
You're almost correct: not quite, because your language accepts the empty string :) The original is correct. If you used
+
(one or more) rather than outermost*
(zero or more), then the languages denoted by the two REs (yours, with that change, and the original) are the same language.Let's alter your RE as mentioned:$$ (a(a+b)^*b)^+ + (b(a+b)^*a)^+ \tag{1} $$
Clearly, (the language of) (1) contains (that of) the original. To see the reverse inclusion, suppose $w$ is in the language of (1). Without loss of generality assume $w \in L((a(a+b)^*b)^+)$. Then for some $n > 0$, $w = x_1 \dots x_n$ with each $x_i \in L(a(a+b)^*b)$. But then $x_1$ begins with $a$, $x_n$ ends with $b$, and everything in between is just a string in $L((a+b)^*)$. Thus $w \in L(a(a+b)^*b)$, so it's in the language of the original RE.