
1 Syntax Analysis The recognition problem: given a grammar G and a string w, is w ∈ L(G)? The parsing problem: if G is a grammar and w ∈ L(G), how can w be derived in G? For context-free grammars, both of these problems are decidable - that is, there are algorithms which will give a definite (correct) answer for any given instance of the problems. Parsing is important, because understanding the derivation of a structure helps us to understand the meaning of the structure.
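To make the recognition problem concrete, here is a minimal sketch of a recognizer, not the algorithm the slides have in mind. It assumes every grammar symbol is a single character, that no right-hand side is empty, and that there are no cycles of unit productions; the names `recognizes`, `derives` and `seq` are made up for this example. It is applied to the expression grammar G0 that the following slides use.

```python
from functools import lru_cache

# Grammar G0 from the later slides: nonterminal -> tuple of right-hand sides.
# Assumption: every symbol is one character, and no right-hand side is empty.
G0 = {"S": ("S+S", "S*S", "(S)", "a")}


def recognizes(grammar, start, w):
    """Return True if w is in L(grammar), under the assumptions stated above."""

    @lru_cache(maxsize=None)
    def derives(symbol, i, j):
        # Can `symbol` derive the substring w[i:j]?
        if symbol not in grammar:  # terminal symbol
            return j == i + 1 and w[i] == symbol
        return any(seq(rhs, i, j) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def seq(rhs, i, j):
        # Can the string of symbols `rhs` derive w[i:j]?
        if not rhs:
            return i == j
        head, rest = rhs[0], rhs[1:]
        # Each symbol derives at least one character, so leave room for `rest`.
        return any(
            derives(head, i, k) and seq(rest, k, j)
            for k in range(i + 1, j - len(rest) + 1)
        )

    return len(w) > 0 and derives(start, 0, len(w))


print(recognizes(G0, "S", "a+(a*a)"))  # True
print(recognizes(G0, "S", "a+*a"))     # False
```

Because every recursive call works on a strictly shorter substring or a shorter right-hand side, the left-recursive productions of G0 do not cause an infinite loop here.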

2 Derivation Structure Consider the expression a+(a*a) in the language of G0. In order to process this expression, it helps to consider the substring (a*a) as a more significant sub-unit than, for example, a+(a. We can use the derivation of the string to find this structure.
Grammar G0:
  1) S -> S + S
  2) S -> S * S
  3) S -> (S)
  4) S -> a

3 Derivation Structure We can use the derivation of the string a+(a*a) in G0:
  S => S+S => S+(S) => S+(S*S) => S+(S*a) => S+(a*a) => a+(a*a)
This derivation corresponds to the tree:
          S
        / | \
       S  +  S
       |   / | \
       a  (  S  )
           / | \
          S  *  S
          |     |
          a     a

4 Derivation Trees For any derivation, we can construct a derivation tree. The root of the tree will be a node representing the start symbol. Every time we apply a production A -> α, we add a subtree below A: A is the root, and there is a branch for every symbol of α, in the same left-to-right order in which they appear in α. We read the string represented by the derivation tree by reading the "leaf" nodes in left-to-right order. Note: "left-to-right" order means the structural order, that is, a depth-first walk of the tree that always takes the leftmost unvisited branch first, and not any order which may appear in the sketch.

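5 S => S+S => S+(S) => S+(S*S) => S+(S*a) => S+(a*a) => a+(a*a).
The derivation tree grows by one subtree at each step. Writing X[...] for a node X with its children listed in left-to-right order:
  S
  S[ S + S ]
  S[ S + S[ ( S ) ] ]
  S[ S + S[ ( S[ S * S ] ) ] ]
  S[ S + S[ ( S[ S * S[a] ] ) ] ]
  S[ S + S[ ( S[ S[a] * S[a] ] ) ] ]
  S[ S[a] + S[ ( S[ S[a] * S[a] ] ) ] ]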
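The construction described on the previous two slides can be sketched in a few lines. This is an illustrative sketch only: the `Node` class and the helpers `frontier`, `rightmost_open` and `build_tree` are hypothetical names, and symbols are assumed to be single characters. Each step expands the rightmost unexpanded non-terminal, which matches the order used in the derivation above.

```python
class Node:
    """A derivation-tree node; a node with no children is a leaf."""

    def __init__(self, symbol):
        self.symbol = symbol
        self.children = []


def frontier(node):
    """Read the leaves in structural left-to-right order."""
    if not node.children:
        return node.symbol
    return "".join(frontier(child) for child in node.children)


def rightmost_open(node, nonterminals):
    """Return the rightmost leaf labelled with a non-terminal, or None."""
    if not node.children:
        return node if node.symbol in nonterminals else None
    for child in reversed(node.children):
        found = rightmost_open(child, nonterminals)
        if found is not None:
            return found
    return None


def build_tree(start, right_hand_sides, nonterminals):
    """Apply each right-hand side to the rightmost open non-terminal."""
    root = Node(start)
    steps = [frontier(root)]
    for rhs in right_hand_sides:
        leaf = rightmost_open(root, nonterminals)
        if leaf is None:
            raise ValueError("no non-terminal left to expand")
        # One branch per symbol of the right-hand side, in order.
        leaf.children = [Node(symbol) for symbol in rhs]
        steps.append(frontier(root))
    return root, steps


# The productions used in the derivation of a+(a*a), in order of application.
root, steps = build_tree("S", ["S+S", "(S)", "S*S", "a", "a", "a"], {"S"})
print(" => ".join(steps))
```

Reading the frontier after every step reproduces the sentential forms of the derivation, ending in a+(a*a).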

6 Equivalent Derivations Two different derivations can have the same derivation tree. Example: S => S+S => S+a => a+a and S => S+S => a+S => a+a both produce the tree
      S
    / | \
   S  +  S
   |     |
   a     a
In a CFG, the order in which productions are applied is irrelevant, as long as the same production is applied to the same occurrence of each symbol.
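The claim that both derivation orders give one tree can be checked mechanically. The sketch below uses an assumed encoding, not one from the slides: an unexpanded non-terminal is the string "S", a terminal is any other one-character string, and an expanded node is a tuple whose first element is the label and whose remaining elements are its children. The function names `expand` and `derive` are hypothetical.

```python
def expand(tree, rhs, leftmost):
    """Expand the leftmost (or rightmost) unexpanded S; return (tree, done)."""
    if tree == "S":                      # unexpanded non-terminal
        return ("S",) + tuple(rhs), True
    if isinstance(tree, str):            # terminal: nothing to expand
        return tree, False
    kids = list(tree[1:])
    order = range(len(kids)) if leftmost else reversed(range(len(kids)))
    for idx in order:
        kids[idx], done = expand(kids[idx], rhs, leftmost)
        if done:
            return (tree[0],) + tuple(kids), True
    return tree, False


def derive(right_hand_sides, leftmost):
    """Build a tree by applying the productions in the given order."""
    tree = "S"
    for rhs in right_hand_sides:
        tree, _ = expand(tree, rhs, leftmost)
    return tree


left_tree = derive(["S+S", "a", "a"], leftmost=True)    # S => S+S => a+S => a+a
right_tree = derive(["S+S", "a", "a"], leftmost=False)  # S => S+S => S+a => a+a
print(left_tree == right_tree)  # True
```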

7 Multiple Derivation Trees Consider the two derivations below:
  1. S => S+S => S+S*S => S+S*a => S+a*a => a+a*a
  2. S => S*S => S*a => S+S*a => S+a*a => a+a*a
These give essentially different derivation trees for the same final sentence.
Tree 1:
        S
      / | \
     S  +  S
     |   / | \
     a  S  *  S
        |     |
        a     a
Tree 2:
          S
        / | \
       S  *  S
     / | \   |
    S  +  S  a
    |     |
    a     a
This causes problems for our attempt to understand a string by considering its derivation.

8 Ambiguous Grammars A derivation in which at each step the rightmost non-terminal is replaced is a right derivation (also called a rightmost derivation). In a right derivation, the symbol to be replaced is fixed at every step, so two right derivations can differ only in the productions they apply. A string has two different right derivations iff it has two different derivation trees. A CFG is ambiguous if there is at least one string in L(G) having two or more different right derivations (or, equally, two or more different derivation trees).
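One way to see ambiguity in practice is to count the derivation trees of a string: a count of two or more for any string means the grammar is ambiguous. The following is a rough sketch under stated assumptions (single-character symbols, no empty right-hand sides, no unit-production cycles); `count_trees` is a hypothetical helper, not a standard library function.

```python
from functools import lru_cache

# Grammar G0 from the slides, one character per symbol.
G0 = {"S": ("S+S", "S*S", "(S)", "a")}


def count_trees(grammar, start, w):
    """Count the distinct derivation trees of w from `start`."""

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        # Number of trees rooted at `symbol` whose leaves spell w[i:j].
        if symbol not in grammar:  # terminal symbol
            return int(j == i + 1 and w[i] == symbol)
        return sum(count_seq(rhs, i, j) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def count_seq(rhs, i, j):
        # Number of ways the symbols of `rhs` can jointly derive w[i:j].
        if not rhs:
            return int(i == j)
        head, rest = rhs[0], rhs[1:]
        return sum(
            count(head, i, k) * count_seq(rest, k, j)
            for k in range(i + 1, j - len(rest) + 1)
        )

    return count(start, 0, len(w)) if w else 0


print(count_trees(G0, "S", "a+a*a"))    # 2
print(count_trees(G0, "S", "a+(a*a)"))  # 1
```

Note that a count of 1 for some strings proves nothing about the grammar as a whole; a single string with two trees is enough to show ambiguity.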

9 The Problem With Ambiguity By the previous example, the grammar of algebraic expressions, G0, is ambiguous. Why is this a problem? Suppose we are attempting to analyse strings in the language of G0 in order to perform simple arithmetic - the structure of the derivation will tell us which operation to apply when. Problem: 2+2*2 = ? (reading each a as the number 2). Under derivation 1, we get 2+(2*2) = 6. Under derivation 2, we get (2+2)*2 = 8. Which do we select?
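The two readings of 2+2*2 can be reproduced by evaluating every derivation tree that G0 allows. This is a deliberately naive sketch: it assumes the terminal a is replaced by single digits, and the function name `evaluations` is made up for the example. It tries each operator in turn as the root of the tree.

```python
def evaluations(expr):
    """Return the value of every G0 derivation tree for `expr`.

    Assumption: `expr` is a string of single digits, '+', '*', '(' and ')',
    with a digit standing in for the terminal a.
    """
    results = []
    # S -> a (here: a single digit)
    if len(expr) == 1 and expr.isdigit():
        results.append(int(expr))
    # S -> (S)
    if len(expr) >= 3 and expr[0] == "(" and expr[-1] == ")":
        results.extend(evaluations(expr[1:-1]))
    # S -> S + S and S -> S * S: try every operator as the root of the tree.
    for k in range(1, len(expr) - 1):
        if expr[k] in "+*":
            for left in evaluations(expr[:k]):
                for right in evaluations(expr[k + 1:]):
                    results.append(left + right if expr[k] == "+" else left * right)
    return results


print(evaluations("2+2*2"))    # [6, 8]
print(evaluations("2+(2*2)"))  # [6]
```

The unparenthesised string yields both 6 and 8, one value per derivation tree, while the parenthesised string has only one tree and so only one value.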

10 Unambiguous Expressions We are aiming to produce an unambiguous version of G0. Essentially, we want to assign priorities to the operators, and reflect this in the grammar. Also, although it makes no difference to the evaluated expression, we want a+a+a to be (a+a)+a. We will do this by introducing new symbols - a term, T, will represent a product; a factor, F, will represent things that can be multiplied; and S will represent sums, that is, whole expressions. An expression can be a sum of an expression and a term, or simply a term. A term can be a product of a term and a factor, or simply a factor. A factor can be an expression (in parentheses), or simply a symbol.

11 Unambiguous Expressions Example: Grammar G1, an unambiguous version of G0:
  S -> S + T | T
  T -> T * F | F
  F -> (S) | a
Here S is an expression (a sum of an expression and a term, or simply a term), T is a term (a product of a term and a factor, or simply a factor), and F is a factor (an expression in parentheses, or simply a symbol). Because products are built from factors below the level of sums, * takes priority over +; because S -> S + T recurses only on the left, a+a+a is grouped as (a+a)+a.
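The effect of G1 can be seen with a small hand-written parser. This is a sketch, not the method of the slides: the left-recursive rules S -> S + T and T -> T * F are replaced by loops, a common substitution that produces the same left-to-right grouping, and the function `parse` with its helpers is a hypothetical name. It returns the input with every grouping made explicit.

```python
def parse(text):
    """Parse `text` according to G1 and return it fully parenthesised."""
    pos = 0

    def peek():
        # Next unread character, or "" at the end of the input.
        return text[pos:pos + 1]

    def expect(ch):
        nonlocal pos
        if peek() != ch:
            raise SyntaxError(f"expected {ch!r} at position {pos}")
        pos += 1

    def factor():  # F -> (S) | a
        if peek() == "(":
            expect("(")
            inner = expression()
            expect(")")
            return inner
        expect("a")
        return "a"

    def term():  # T -> T * F | F, with the left recursion written as a loop
        left = factor()
        while peek() == "*":
            expect("*")
            left = f"({left}*{factor()})"
        return left

    def expression():  # S -> S + T | T, with the left recursion written as a loop
        left = term()
        while peek() == "+":
            expect("+")
            left = f"({left}+{term()})"
        return left

    result = expression()
    if pos != len(text):
        raise SyntaxError(f"unexpected input at position {pos}")
    return result


print(parse("a+a*a"))  # (a+(a*a))
print(parse("a+a+a"))  # ((a+a)+a)
```

Each input now has exactly one grouping: multiplication binds tighter than addition, and repeated additions associate to the left.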

12 Ambiguity and Decidability The ambiguity we have seen so far has always been a property of the grammar, and not of the language. However, there exist languages for which every grammar defining them is ambiguous. Example: { a^i b^j c^k : i = j or j = k }. A language for which every defining grammar is ambiguous is inherently ambiguous. More importantly, there is no algorithm which will determine whether or not a given grammar is ambiguous: the problem is undecidable.

