How to define a Parse Tree, and the equality of parse tree?

140 Views Asked by At

I understand parse tree is used to describe a derive process of a context-free grammar $G=<V_N,V_T,P,S>$.

Thus in my point of view, it is a rooted tree $R=<V,E>$ with some kind of mapping of $f:V\to(V_N\cup V_T)$.

Furthermore, it also need to express the order of the produced string, so there must be an order relation $\prec$ on node of the tree $V$ .

Thus in my point of view, the parse tree need to be defined as $T=<V,E,f,\prec>$

But I would like to know how to precisely define each of the element within them. And furthermore, how to define the equality of 2 trees.

Because if the equality of tree is not formally defined, the ambiguity of a grammar cannot be defined.

Thanks a lot!

1

There are 1 best solutions below

0
On BEST ANSWER

There are some precise formulations of ordered rooted trees on Wikipedia. The first formalism proposed in that article defines

An ordered tree is a structure $(X, ≤_V, ≤_S)$ where $X$ is a non-empty set of nodes and $≤_V$ and $≤_S$ are relations on $X$ called vertical… order and sibling order, respectively.

For a parse tree, as you say, it is necessary to restrict $X$ to being a finite set, and to add a third relation $f: X\to (V_N \cup V_T)$ in order to label the nodes of a parse tree.

I've left out of the above definition a large number of restrictions on the relations which are required for the structure to be a well-formed parse tree. In addition to the restrictions provided in the linked Wikipedia article, there is a restriction of the labels in order that every node correspond to some production in the grammar. None of these restrictions is crucial to the definition of equality.

With that in place, we can say that two parse trees $(X, ≤_V, ≤_S, f)$ and $(X', ≤'_V, ≤'_S, f')$ are equivalent if there is an isomorphism between $X$ and $X'$ which preserves all of the relations.

But that's not the way that grammatical ambiguity is general defined. It is generally defined in terms of leftmost derivations.

Recall that a derivation of the sentence $\alpha \in V_T^*$ is an ordered sequence $\omega_0\to\omega_1\to\omega_2\to...\to\omega_n$ where each $\omega_i \in(V_N\cup V_T)^*, \omega_0 = S, \omega_n = \alpha$ and each $\omega_i$ is produced from $\omega_{i-1}$ by replacing some symbol $A$ in $V_N$ found in $\omega_{i-1}$ with the right-hand side of a production $A\to\nu\in P$. A derivation is a leftmost derivation if, at each step, the replaced non-terminal is the first non-terminal in the sentential form. (The point of this restriction is that every parse tree corresponds to precisely one leftmost derivation, although it might correspond to a number of different derivations.)

For completeness, two derivations are equal if they are equal symbol-for-symbol.

Then a grammar is unambiguous if there is no sentence which can be derived with two different leftmost derivations.