CSE 3302 Programming Languages


CSE 3302 Programming Languages Syntax (cont.) Lecture 4 – Syntax (Cont.), Spring 2017 CSE3302 Programming Languages, UT-Arlington ©Chengkai Li, Weimin He, Weimin He 2008

What is Parsing?
Given a grammar and a token string, determine whether the grammar can generate the token string, i.e., whether the string is a legal program in the language.
In other words, construct a parse tree for the token string.

What’s significant about a parse tree?
A parse tree gives a unique syntactic structure.
Leftmost and rightmost derivations: each parse tree corresponds to exactly one leftmost derivation and, symmetrically, exactly one rightmost derivation.

Example
    expr → expr + expr | expr * expr | ( expr ) | number
    number → number digit | digit
    digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
[Figure: a parse tree and the corresponding leftmost derivation for an expression over 3, 4, and 5 using + and *]

What’s significant about a parse tree?
A parse tree has a unique meaning, and thus provides the basis for semantic analysis.
(Syntax-directed semantics: semantics are attached to the syntactic structure.)
For a node expr with children expr1 + expr2, the attached rule is:
    expr.val = expr1.val + expr2.val
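
The rule expr.val = expr1.val + expr2.val can be read as a bottom-up walk over the parse tree. The following is a minimal sketch of that idea, not the course's implementation; the Node type and the eval function are hypothetical names introduced only for illustration.

```c
#include <stddef.h>

/* Hypothetical parse-tree node (not from the slides): a leaf holds a
   number; an interior node holds an operator and two subtrees. */
typedef struct Node {
    char op;                   /* '+', '-', '*', or 0 for a number leaf */
    int number;                /* used only when op == 0 */
    const struct Node *left;   /* expr1 */
    const struct Node *right;  /* expr2 */
} Node;

/* Computes expr.val bottom-up: the value of a node is defined by the
   values of its children, mirroring expr.val = expr1.val + expr2.val. */
int eval(const Node *n) {
    if (n->op == 0) return n->number;
    int l = eval(n->left);
    int r = eval(n->right);
    switch (n->op) {
    case '+': return l + r;
    case '-': return l - r;
    case '*': return l * r;
    default:  return 0;        /* unknown operator: not handled in this sketch */
    }
}
```

Because the value is computed from the tree's shape, two different trees for the same token string can yield two different values, which is why ambiguity matters.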

Relationship among language, grammar, parser
Chomsky Hierarchy: http://en.wikipedia.org/wiki/Chomsky_hierarchy
A language can be described by multiple grammars, while a grammar defines one language.
A grammar can be parsed by multiple parsers, while a parser accepts one grammar, and thus one language.
We should design a language that allows a simple grammar and an efficient parser:
    For a language, we should construct a grammar that allows fast parsing.
    Given a grammar, we should build an efficient parser.

Ambiguity
Ambiguous grammar: there can be multiple parse trees for the same sentence (program); in other words, multiple leftmost derivations.
Why is it bad? Multiple parse trees mean multiple meanings.

Ambiguity
Was this ambiguous? How about this?
    expr → expr - expr | number
    number → number digit | digit
    digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Deal with Ambiguity
Unambiguous grammar: rewrite the grammar to avoid ambiguity.

Example of Ambiguity: Precedence
    expr → expr + expr | expr * expr | ( expr ) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Two different parse trees for the expression 3+4*5: one groups it as 3 + (4 * 5), the other as (3 + 4) * 5.
[Figure: parse tree 1 and parse tree 2]

Eliminating Ambiguity for Precedence
Establish a “precedence cascade”: use different structured names for different constructs, adding grammar rules.
Higher precedence: lower in the cascade.
Ambiguous:
    expr → expr + expr | expr * expr | ( expr ) | number
With a precedence cascade:
    expr → expr + expr | term
    term → term * term | ( expr ) | number

Example of Ambiguity: Associativity
    expr → expr - expr | ( expr ) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Two different parse trees for the expression 5-2-1:
    Grouping (5 - 2) - 1 gives expr.val = 2: - is left-associative, which is what we usually assume.
    Grouping 5 - (2 - 1) gives expr.val = 4: - is right-associative, which is against common practice in integer arithmetic.
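
The two values for 5-2-1 can be checked directly. The sketch below folds a list of operands once from the left and once from the right; subtract_left and subtract_right are hypothetical helper names, and both assume at least one operand.

```c
#include <stddef.h>

/* Left-associative reading: ((v0 - v1) - v2) - ...  (assumes n >= 1) */
int subtract_left(const int *v, size_t n) {
    int acc = v[0];
    for (size_t i = 1; i < n; i++) acc -= v[i];
    return acc;
}

/* Right-associative reading: v0 - (v1 - (v2 - ...))  (assumes n >= 1) */
int subtract_right(const int *v, size_t n) {
    int acc = v[n - 1];
    for (size_t i = n - 1; i-- > 0; ) acc = v[i] - acc;
    return acc;
}
```

For the operands 5, 2, 1 the left fold yields 2 and the right fold yields 4, matching the two expr.val annotations.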

Associativity
Left-associative: + - * /
Right-associative: =
What is meant by a=b=c=1?
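
In C, assignment is right-associative, so the chained form has a well-defined reading. A small illustration (chained_assignment is a hypothetical function name):

```c
/* Because = is right-associative, a = b = c = 1 is parsed as
   a = (b = (c = 1)): c is assigned first and its value flows left. */
int chained_assignment(void) {
    int a = 0, b = 0, c = 0;
    a = b = c = 1;
    return 100 * a + 10 * b + c;   /* all three variables now hold 1 */
}
```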

Eliminating Ambiguity for Associativity
Left-associativity: use left-recursion.
    Ambiguous:
        expr → expr - expr | ( expr ) | number
    Unambiguous:
        expr → expr - term | term
        term → ( expr ) | number
Right-associativity: use right-recursion.
    Ambiguous:
        expr → expr = expr | a | b | c
    Unambiguous:
        expr → term = expr | term
        term → a | b | c

Putting Together
We want to make - left-associative and give / precedence over -.
Ambiguous:
    expr → expr - expr | expr / expr | ( expr ) | number
    number → number digit | digit
    digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Unambiguous:
    expr → expr - term | term
    term → term / factor | factor
    factor → ( expr ) | number
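
To see that the unambiguous grammar gives the intended grouping, here is a rough evaluator sketch in C. It is an illustration under assumptions, not the course's code: each left-recursive rule is written as a loop (the EBNF form term { - term }), the input is a plain string with optional spaces, and names such as evaluate, parse_expr, and the global cursor p are hypothetical.

```c
#include <ctype.h>

static const char *p;   /* hypothetical cursor into the input string */

static int parse_expr(void);

static void skip_spaces(void) { while (*p == ' ') p++; }

/* factor -> ( expr ) | number */
static int parse_factor(void) {
    skip_spaces();
    if (*p == '(') {
        p++;
        int v = parse_expr();
        skip_spaces();
        if (*p == ')') p++;   /* sketch: a missing ')' is silently tolerated */
        return v;
    }
    int v = 0;
    while (isdigit((unsigned char)*p)) v = v * 10 + (*p++ - '0');
    return v;
}

/* term -> term / factor | factor, written as factor { / factor } */
static int parse_term(void) {
    int v = parse_factor();
    for (skip_spaces(); *p == '/'; skip_spaces()) {
        p++;
        int d = parse_factor();
        v = d ? v / d : 0;    /* sketch: division by zero yields 0 */
    }
    return v;
}

/* expr -> expr - term | term, written as term { - term } */
static int parse_expr(void) {
    int v = parse_term();
    for (skip_spaces(); *p == '-'; skip_spaces()) {
        p++;
        v -= parse_term();    /* accumulating on the left keeps - left-associative */
    }
    return v;
}

/* Evaluates an expression over non-negative integers, -, /, and parentheses. */
int evaluate(const char *text) {
    p = text;
    return parse_expr();
}
```

Because parse_term sits below parse_expr in the call chain, / binds tighter than -, which is the "lower in the cascade" idea in executable form.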

Example of Ambiguity: Dangling-Else
    stmt → if( expr ) stmt | if( expr ) stmt else stmt | other-stmt
Two different parse trees for “if( expr ) if( expr ) stmt else stmt”: one attaches the else to the inner if, the other to the outer if.
[Figure: parse tree 1 and parse tree 2]

Eliminating Dangling-Else
    stmt → matched_stmt | unmatched_stmt
    matched_stmt → if( expr ) matched_stmt else matched_stmt | other-stmt
    unmatched_stmt → if( expr ) stmt | if( expr ) matched_stmt else unmatched_stmt
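
The matched/unmatched grammar forces each else to pair with the nearest unmatched if, which is also the rule C follows. A small sketch that makes the pairing observable (dangling_else is a hypothetical function name; some compilers warn about the deliberately missing braces):

```c
/* Reports which branch runs for the nested statement
   if (a) if (b) ... else ...; the else pairs with the inner if. */
int dangling_else(int a, int b) {
    int result = 0;          /* 0: neither assignment ran */
    if (a)
        if (b)
            result = 1;      /* inner "then" branch */
        else
            result = 2;      /* else belongs to the inner if */
    return result;
}
```

If the else belonged to the outer if, a call with a == 0 would return 2; with nearest-if pairing it returns 0.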

EBNF
Repetition: { }
    BNF:  number → digit | number digit
    BNF:  expr → expr - term | term
    EBNF: expr → term {- term}
Option: [ ]
    BNF:  signed-number → sign number | number
    EBNF: signed-number → [ sign ] number
    BNF:  if-stmt → if ( expr ) stmt | if ( expr ) stmt else stmt
    EBNF: if-stmt → if ( expr ) stmt [ else stmt ]
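
EBNF maps naturally onto control flow: an option [ ] becomes an if, and a repetition { } becomes a while loop. The sketch below recognizes signed-number under two assumptions that are not in the slides: sign is + or -, and number is read as digit { digit }. The function name parse_signed_number is hypothetical, and overflow is ignored.

```c
#include <ctype.h>

/* signed-number -> [ sign ] number
   number        -> digit { digit }   (assumed EBNF reading of number)
   Returns 1 and stores the value in *out if text is a signed number. */
int parse_signed_number(const char *text, int *out) {
    int negative = 0;
    if (*text == '+' || *text == '-') {            /* [ sign ]: an if */
        negative = (*text == '-');
        text++;
    }
    if (!isdigit((unsigned char)*text)) return 0;  /* need at least one digit */
    int value = 0;
    while (isdigit((unsigned char)*text)) {        /* { digit }: a while loop */
        value = value * 10 + (*text - '0');
        text++;
    }
    if (*text != '\0') return 0;                   /* trailing characters: reject */
    *out = negative ? -value : value;
    return 1;
}
```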

Syntax Diagrams
Written from EBNF, not BNF.
If-statement (more examples on page 101)
[Figure: syntax diagram for if-statement: if ( expression ) statement, optionally followed by else statement]

Parsing Techniques
Intuitive analysis and conclusions.
No formal theorems or rigorous proofs.
More details: compilers, automata theory.

Parsing
Parsing: determine whether a grammar can generate a given token string; that is, construct a parse tree for the token string.
Two ways of constructing the parse tree:
    Top-down (from root towards leaves)
        Can be constructed more easily by hand
    Bottom-up (from leaves towards root)
        Can handle a larger class of grammars
        Parser generators tend to use bottom-up methods
Top-down: Breadth-First
Bottom-up: Depth-First

Top-Down Parser
Recursive-descent parser: a special kind of top-down parser that makes a single left-to-right scan with one lookahead symbol. Backtracking (trial-and-error) may happen.
Predictive parser: the lookahead symbol determines which production to apply, without backtracking.

Recursive-Descent Parser
Types in Pascal:
    type → simple | array [ simple ] of type
    simple → char | integer
Input: array [ integer ] of char
[Figure: parse tree for the input: type expands to array [ simple ] of type, with simple deriving integer and the inner type deriving simple, then char]
Recursion: when something refers to itself. Example in English: “This statement is false.”
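
A recursive-descent parser has one function per nonterminal, and a recursive rule becomes a recursive call. The following recognizer for the Pascal type grammar is a sketch under assumptions: the input is already split into word tokens, and the names tokens, pos, count, match, and parse_type are hypothetical.

```c
#include <string.h>

/* Hypothetical token stream: the input is pre-split into words. */
static const char *const *tokens;
static int pos, count;

/* Consumes the current token if it equals expected; returns 1 on success. */
static int match(const char *expected) {
    if (pos < count && strcmp(tokens[pos], expected) == 0) {
        pos++;
        return 1;
    }
    return 0;
}

/* simple -> char | integer */
static int simple(void) {
    return match("char") || match("integer");
}

/* type -> simple | array [ simple ] of type
   The lookahead token "array" selects the second production, so no
   backtracking is needed. */
static int type(void) {
    if (pos < count && strcmp(tokens[pos], "array") == 0) {
        return match("array") && match("[") && simple() &&
               match("]") && match("of") && type();
    }
    return simple();
}

/* Returns 1 if the whole token sequence can be derived from type. */
int parse_type(const char *const *toks, int n) {
    tokens = toks;
    pos = 0;
    count = n;
    return type() && pos == count;
}
```

The recursive call to type() inside type() is what lets the parser accept nested forms such as array [ char ] of array [ integer ] of integer.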

Challenge 1: Top-Down Parser Cannot Handle Left-Recursion
    expr → expr - term | term
    term → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Input: 3 - 4 - 5
A top-down parser that expands expr → expr - term must expand expr again before consuming any input, so the expansion never terminates.
[Figure: parse tree in which expr keeps expanding to expr - term along its leftmost branch]

Eliminating Left-Recursion
Left-recursive:
    expr → expr - term | term
    term → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Rewritten with right-recursion:
    expr → term expr’
    expr’ → - term expr’ | ε
    term → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Rewritten in EBNF:
    expr → term { - term }
    term → 0 | 1 | … | 9
Code for the EBNF form:
    void expr(void) {
        term();
        while (token == '-') {
            match('-');
            term();
        }
    }
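
The right-recursive form expr → term expr’ can also be coded directly, but a naive translation would group 3-4-5 to the right. One common workaround, sketched here as an assumption rather than the course's method, is to pass the value parsed so far into expr’ so the subtraction still groups to the left. The names cur, failed, expr_rest, and expr_value are hypothetical, and the input is a string of single digits and - with no spaces.

```c
#include <ctype.h>

static const char *cur;   /* hypothetical cursor over the input string */
static int failed;        /* set when a digit was expected but not found */

/* term -> 0 | 1 | ... | 9 ; returns the digit's value (0 on error) */
static int term(void) {
    if (isdigit((unsigned char)*cur)) return *cur++ - '0';
    failed = 1;
    return 0;
}

/* expr' -> - term expr' | ε
   left carries the value of everything parsed so far, so subtraction
   groups to the left even though the rule is right-recursive. */
static int expr_rest(int left) {
    if (*cur == '-') {
        cur++;                        /* consume '-' */
        int right = term();
        return expr_rest(left - right);
    }
    return left;                      /* ε: nothing more to consume */
}

/* expr -> term expr' ; *ok is set to 1 only if all input was consumed
   without error. */
int expr_value(const char *text, int *ok) {
    cur = text;
    failed = 0;
    int v = expr_rest(term());
    *ok = !failed && *cur == '\0';
    return v;
}
```

For the input 3-4-5 this computes (3 - 4) - 5, the same grouping the original left-recursive grammar describes.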

Challenge 2: Backtracking is Inefficient
Backtracking: trial-and-error.
Types in Pascal:
    type → simple | array [ simple ] of type
    simple → char | integer
Input: array [ integer ] of char
[Figure: parse tree in which an attempted expansion is marked X as a failed trial before the array [ simple ] of type production is applied]

Challenge 2: Backtracking is Inefficient
    subscription → term | term .. term
    term → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
We cannot avoid backtracking if the grammar has multiple productions that could apply for a given lookahead symbol.
Solution: change the grammar so that exactly one applicable production is determined unambiguously by the lookahead symbol.

Avoiding Backtracking by Left Factoring
General form: two alternatives that share a common prefix α, A → α β1 | α β2, become
    A → α A’
    A’ → β1 | β2
Example:
    expr → term | term .. term
    term → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
becomes
    expr → term rest
    rest → .. term | ε
    term → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
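
After left factoring, the parser reads the shared term once and only then decides, from the lookahead, whether a ".. term" follows. A minimal recognizer sketch for the factored grammar, assuming the input is a string with no spaces (the function name accepts is hypothetical):

```c
#include <ctype.h>
#include <string.h>

/* term -> 0 | 1 | ... | 9 ; advances *s past one digit on success */
static int term(const char **s) {
    if (isdigit((unsigned char)**s)) {
        (*s)++;
        return 1;
    }
    return 0;
}

/* rest -> .. term | ε ; the lookahead ".." selects the first alternative */
static int rest(const char **s) {
    if (strncmp(*s, "..", 2) == 0) {
        *s += 2;
        return term(s);
    }
    return 1;   /* ε: consume nothing */
}

/* expr -> term rest ; returns 1 if the whole string is accepted */
int accepts(const char *s) {
    return term(&s) && rest(&s) && *s == '\0';
}
```

No input is ever re-read: each function commits to a production by looking only at the next characters, which is what makes the parser predictive.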

Left Factoring Using EBNF
    expr → term @ expr | term
becomes
    expr → term [@ expr]
    if-statement → if ( expr ) statement | if ( expr ) statement else statement
becomes
    if-statement → if ( expr ) statement [ else statement ]