Presentation is loading. Please wait.

Presentation is loading. Please wait.

CSE 3302 Programming Languages Chengkai Li, Weimin He Spring 2008 Syntax (cont.) Lecture 4 – Syntax (Cont.), Spring 20081 CSE3302 Programming Languages,

Similar presentations


Presentation on theme: "CSE 3302 Programming Languages Chengkai Li, Weimin He Spring 2008 Syntax (cont.) Lecture 4 – Syntax (Cont.), Spring 20081 CSE3302 Programming Languages,"— Presentation transcript:

1 CSE 3302 Programming Languages Chengkai Li, Weimin He Spring 2008 Syntax (cont.) Lecture 4 – Syntax (Cont.), Spring 20081 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

2 What is Parsing? Given a grammar and a token string: – determine if the grammar can generate the token string? – i.e., is the string a legal program in the language? In other words, to construct a parse tree for the token string. Lecture 4 – Syntax (Cont.), Spring 20082 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

3 What’s significant about parse tree? A parse tree gives a unique syntactic structure Leftmost, rightmost derivation There is only one leftmost derivation for a parse tree, and symmetrically only one rightmost derivation for a parse tree. Lecture 4 – Syntax (Cont.), Spring 20083 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

4 Example expr  expr + expr | expr  expr | ( expr ) | number number  number digit | digit digit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 expr number digit 3 * expr number digit 4 expr number digit 5 expr + parse tree leftmost derivation expr numberdigit 3 numberdigit 4 numberdigit 5 expr + * Lecture 4 – Syntax (Cont.), Spring 20084 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

5 What’s significant about parse tree? A parse tree has a unique meaning, thus provides basis for semantic analysis. (Syntax-directed semantics: semantics are attached to syntactic structure.) expr expr1 expr2 + expr.val = expr1.val + expr2.val Lecture 4 – Syntax (Cont.), Spring 20085 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

6 Relationship among language, grammar, parser Chomsky Hierarchy http://en.wikipedia.org/wiki/Chomsky_hierarchy A language can be described by multiple grammars, while a grammar defines one language. A grammar can be parsed by multiple parsers, while a parser accepts one grammar, thus one language. Should design a language that allows simple grammar and efficient parser For a language, we should construct a grammar that allows fast parsing Given a grammar, we should build an efficient parser Lecture 4 – Syntax (Cont.), Spring 20086 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

7 Ambiguity Ambiguous grammar: There can be multiple parse trees for the same sentence (program) – In other words, multiple leftmost derivations. Why is it bad? – Multiple meanings Lecture 4 – Syntax (Cont.), Spring 20087 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

8 Ambiguity Was this ambiguous? How about this? number  number digit | digit digit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 expr  expr - expr | number Lecture 4 – Syntax (Cont.), Spring 20088 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

9 Deal with Ambiguity Unambiguous Grammar – Rewrite the grammar to avoid ambiguity. Lecture 4 – Syntax (Cont.), Spring 20089 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

10 Example of Ambiguity: Precedence expr  expr + expr | expr  expr | ( expr ) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 expr 3 * 4 5 + parse tree 1 Two different parse tress for expression 3+4*5 expr 5 * parse tree 2 + 3 expr 4 Lecture 4 – Syntax (Cont.), Spring 200810 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

11 Eliminating Ambiguity for Precedence Establish “precedence cascade”: using different structured names for different constructs, adding grammar rules. – Higher precedence : lower in cascade expr  expr + expr | term term  term  term | ( expr ) | number expr  expr + expr | expr  expr | ( expr ) | number Lecture 4 – Syntax (Cont.), Spring 200811 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

12 Example of Ambiguity: Associativity expr  expr - expr | ( expr ) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 expr 5 - 2 1 - parse tree 1 Two different parse tress for expression 5-2-1 expr 1 - parse tree 2 - 5 expr 2 expr.val = 4 expr.val = 2 - is right-associative, which is against common practice in integer arithmetic - is left-associative, which is what we usually assume Lecture 4 – Syntax (Cont.), Spring 200812 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

13 Associativity Left-Associative: + - * / Right-Associative: = What is meant by a=b=c=1? Lecture 4 – Syntax (Cont.), Spring 200813 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

14 Eliminating Ambiguity for Associativity left-associativity: left-recursion expr  expr - term | term term  (expr) | number expr  expr - expr | ( expr ) | number right-associativity: right-recursion expr  expr = expr | a | b | c expr  term = expr | term term  a | b | c Lecture 4 – Syntax (Cont.), Spring 200814 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

15 Putting Together expr  expr - expr | expr / expr | ( expr ) | number number  number digit | digit digit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 expr  expr - term | term term  term / factor | factor factor  ( expr ) | number number  number digit | digit digit  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 We want to make - left-associative and / has precedence over - Lecture 4 – Syntax (Cont.), Spring 200815 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

16 Example of Ambiguity: Dangling-Else stmt  if( expr ) stmt | if( expr ) stmt else stmt | other-stmt Two different parse trees for “if( expr ) if( expr ) stmt else stmt” stmt (expr) if stmt (expr) parse tree 1parse tree 2 stmt else stmt (expr) if stmt (expr)stmt else Lecture 4 – Syntax (Cont.), Spring 200816 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

17 Eliminating Dangling-Else stmt  matched_stmt | unmatched_stmt matched_stmt  if( expr ) matched_stmt else matched_stmt | other-stmt unmatched_stmt  if( expr ) stmt | if( expr ) matched_stmt else un matched_stmt Lecture 4 – Syntax (Cont.), Spring 200817 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

18 EBNF Repetition { } Option [ ] expr  expr - term | term if-stmt  if ( expr ) stmt | if ( expr ) stmt else stmt if-stmt  if ( expr ) stmt [ else stmt ] number  digit | number digitnumber  { digit } signed-number  sign number | number signed-number  [ sign ] number expr  term  term  Lecture 4 – Syntax (Cont.), Spring 200818 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

19 Syntax Diagrams Written from EBNF, not BNF If-statement (more examples on page 101) if else statement expression statement () if-statement Lecture 4 – Syntax (Cont.), Spring 200819 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

20 Parsing Techniques Intuitive analysis and conclusion No formal theorems and rigorous proofs More details: compilers, automata theory Lecture 4 – Syntax (Cont.), Spring 200820 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

21 Parsing Parsing: – Determine if a grammar can generate a given token string. – That is, to construct a parse tree for the token string. Two ways of constructing the parse tree – Top-down (from root towards leaves) Can be constructed more easily by hand – Bottom-up (from leaves towards root) Can handle a larger class of grammars Parser generators tend to use bottom-up methods Lecture 4 – Syntax (Cont.), Spring 200821 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

22 Top-Down Parser Recursive-descent parser: – A special kind of top-down parser: single left-to-right scan, with one lookahead symbol. – Backtracking (trial-and-error) may happen Predictive parser: – The lookahead symbol determines which production to apply, without backtracking Lecture 4 – Syntax (Cont.), Spring 200822 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

23 Recursive-Descent Parser Types in Pascal type  simple | array [ simple ] of type simple  char | integer Input: array [ integer ] of char Parse tree type simple array type [] of integer char simple Lecture 4 – Syntax (Cont.), Spring 200823 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

24 Challenge 1: Top-Down Parser Cannot Handle Left-Recursion expr  expr - term | term term  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 Input: 3 – 4 – 5 Parse tree expr term - exprterm - expr term - Lecture 4 – Syntax (Cont.), Spring 200824 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

25 Eliminating Left-Recursion void expr(void) { term(); while (token == ‘-’) { match(‘-’); term(); } expr  expr - term | term term  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 expr  term { - term } term  0 | 1 | … | 9 ( EBNF ) expr  term expr’ expr’  term expr’ | ε term  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 Lecture 4 – Syntax (Cont.), Spring 200825 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

26 Challenge 2: Backtracking is Inefficient Backtracking: trial-and-error type  simple | array [ simple ] of type simple  char | integer Input: array [ integer ] of char Parse tree simple char type simple array type [] of X integer (Types in Pascal ) Lecture 4 – Syntax (Cont.), Spring 200826 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

27 Challenge 2: Backtracking is Inefficient We cannot avoid backtracking if the grammar has multiple productions to apply, given a lookahead symbol. Solution: – Change the grammar so that there is only one applicable production that is unambiguously determined by lookahead symbol. subscription  term | term.. term term  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 Lecture 4 – Syntax (Cont.), Spring 200827 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

28 Avoiding Backtracking by Left Factoring expr  term | term.. term term  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 expr  term rest rest  term | ε term  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 A   A   A’ A’  Lecture 4 – Syntax (Cont.), Spring 200828 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

29 Left Factoring Using EBNF expr  term @ expr | term expr  term [@ expr] if-statement  if ( expr ) statement | if ( expr ) statement else statement if-statement  if ( expr ) statement [ else statement ] Lecture 4 – Syntax (Cont.), Spring 200829 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008


Download ppt "CSE 3302 Programming Languages Chengkai Li, Weimin He Spring 2008 Syntax (cont.) Lecture 4 – Syntax (Cont.), Spring 20081 CSE3302 Programming Languages,"

Similar presentations


Ads by Google