How to show that this language is decidable

167 Views Asked by At

The question :

The language is $L=$ { $< G,w > $ : $G$ is a grammar in normal form of Chomsky and $w$ is a word on the terminal alphabet that can be derivated in $G$ by at least 2 differents derivations trees. }

In other words, find an algorithm that when given a grammar that is in normal form of Chomsky and a word $w$, it concludes after a finite number of steps if yes or no $w$ can be derivated by two derivations trees.

My reflexion and attempt:

Since the grammar is in Chomsky normal form, we know that it will take exactly $2|w|-1$ steps of derivation. So everything is done in a finite time space.

So we have to take our string and be able to derivate it with two derivation tree.

The part i am not sure in the algorithm is how to say that we are switching the derivations rules used ( so we don't get the same derivation tree).

Here's my attempt but it seems odd:

Entry : < G, w >;

  1. Try every derivations in $G$ of length $2|w| -1$.
  2. If two derivations generate w stop and accept < G,w >. Else stop and reject < G,w >.

Thanks a lot for your help.

1

There are 1 best solutions below

5
On

I’m afraid the “try every derivations” part of your attempt is a bit imprecise. The idea is there, but how would you do it? Well, there is a well-known dynamic programming algorithm that “tries every derivation” in time $O(n^3)$: the CYK algorithm.

The algorithm, from Wikipedia, is as follows:

let the input be a string I consisting of n characters: a1 ... an.
let the grammar contain r nonterminal symbols R1 ... Rr, with start symbol R1.
let P[n,n,r] be an array of booleans. Initialize all elements of P to false.

for each s = 1 to n
    for each unit production Rv → as
        set P[1,s,v] = true

for each l = 2 to n -- Length of span
    for each s = 1 to n-l+1 -- Start of span
        for each p = 1 to l-1 -- Partition of span
            for each production Ra → Rb Rc
                if P[p,s,b] and P[l-p,s+p,c] then set P[l,s,a] = true

if P[n,1,1] is true then
    I is member of language
else
    I is not member of language

However, in this basic form, the CYK algorithm is a mere recognizer – it will only tell you if a word belongs to the grammar or not. To get what we’re after, we need to do some bookkeeping. Instead of booleans, we can store the parse trees in the $P[l, s, v]$ array. Here is a modification to the pseudo-code:

let the input be a string I consisting of n characters: a1 ... an.
let the grammar contain r nonterminal symbols R1 ... Rr, with start symbol R1.
let P[n,n,r] be an array of lists. Initialize all elements of P to the empty list.
let TerminalNode(Rv,as) denote a terminal node with production Rv → as
let NonTerminalNode(Ra,Rb,Rc) denote a nonterminal node with production Ra → Rb Rc

for each s = 1 to n
    for each unit production Rv → as
        append TerminalNode(Rv,as) to P[1,s,v]

for each l = 2 to n -- Length of span
    for each s = 1 to n-l+1 -- Start of span
        for each p = 1 to l-1 -- Partition of span
            l_cell = P[p,s,b]
            r_cell = P[l-p,s+p,c]
            for each production Ra → Rb Rc
                l_nodes = [node in l_cell | node.symbol = Rb]
                r_nodes = [node in r_cell | node.symbol = Rc]
                for each l_node in l_nodes
                    for each r_node in r_nodes
                        append NonTerminalNode(Ra,l_node,r_node) to P[l,s,a]

Now, $P[n, 1, 1]$ tells you the entire parse forest that was used to derive the string. From here, you can easily determine if the derivation was ambiguous or not.