Lesson 3 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.


Outline
- Introduction to parsing
  - Specifying language syntax using CFGs
- Ambiguous grammars

INTRODUCTION TO PARSING

Why use regexps and grammars?
- They give a clear understanding of the language
- Most grammars and regexps can be used more or less directly as input to parser generators
- Grammars can also be used to specify the semantics (e.g., code generation)
- A grammar serves as a clear and compact specification for a recursive top-down parser

Overview of parsing
- The lexical analyzer (or scanner or tokenizer) splits the input into tokens
  - Token = type + attribute
  - This is done by determining membership of strings in regular languages
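To make "token = type + attribute" concrete, here is a minimal sketch of a hand-written scanner in C. The token types, the `Token` struct and the function `next_token` are hypothetical names chosen for illustration (they are not from the course material), and the tiny token set (numbers, ',' and ';') is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token types for a tiny language of numbers, ',' and ';'. */
typedef enum { TOK_NUM, TOK_COMMA, TOK_SEMICOLON, TOK_END, TOK_ERROR } TokenType;

/* A token is a type plus an attribute; only TOK_NUM uses the attribute. */
typedef struct {
    TokenType type;
    int value;
} Token;

/* Reads the next token from *cursor and advances the cursor past it. */
Token next_token(const char **cursor)
{
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;
    Token tok = { TOK_END, 0 };

    while (isspace((unsigned char)*p))
        p++;                               /* whitespace is dropped here */

    if (*p == '\0') {
        tok.type = TOK_END;
    } else if (isdigit((unsigned char)*p)) {
        tok.type = TOK_NUM;
        while (isdigit((unsigned char)*p))
            tok.value = tok.value * 10 + (*p++ - '0');
    } else if (*p == ',') {
        tok.type = TOK_COMMA;
        p++;
    } else if (*p == ';') {
        tok.type = TOK_SEMICOLON;
        p++;
    } else {
        tok.type = TOK_ERROR;              /* character outside the token set */
        p++;
    }
    *cursor = p;
    return tok;
}
```

Calling `next_token` repeatedly on the input "42, 7;" should yield a number token with value 42, a comma, a number token with value 7, a semicolon, and finally the end marker.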

Overview of parsing
- The parser uses the tokens as terminals to build a parse tree
  - Implicitly or explicitly
  - Most often, the parser repeatedly “asks” the scanner for the next token

Overview of parsing
- The parser tries to determine which grammar rules to apply to build the parse tree
  - No suitable rules found = syntax error
- Two main strategies: top-down or bottom-up
  - Top-down parsing starts with the start symbol, i.e., the root of the parse tree
  - Bottom-up parsing starts with the terminals, i.e., the leaves of the parse tree

Examples of grammars
- Lists of space-separated digits
- Possible solution, assuming non-empty lists:
    digit_list → digit digit_list | digit
- Note: digit is a terminal, i.e., the name of a token whose attribute is the actual integer value
- The spaces are assumed to have been removed during lexical analysis, so they do not appear in the grammar

Examples of grammars
- Simple expressions, e.g.,
    id + id + id
    id + id
- Grammar:
    E → E + id
    E → id
- Note: here '+' is a token (terminal), just like id

Examples of grammars
- Grammar for a “begin-end” block in the Pascal language:
    block → begin stmt_list end
    stmt_list → stmt_list ; stmt | stmt
    stmt → assign | if … | … (more statement types)

Exercise (1)
Write a grammar for the language that allows declarations of a single integer array with initialization in C. The initialization list is not allowed to be empty.
Example: int arr[2] = {1, 2, 42};
Note: do not worry about matching the number of elements in the initialization list with the array size.
- What are suitable tokens?
- What change is needed in order to allow the initialization list to be empty?

Top-down parsing
- Also called predictive parsing
- Works as follows:
  - Creates the root of the parse tree
  - Repeatedly expands non-terminal nodes in the parse tree, i.e., adds children to them, until the tree is finished or the parser gets stuck (syntax error)
  - Which grammar rules to apply is predicted by looking at the input
- In lab 1 you will implement a variant known as recursive descent

Recursive descent – example
Grammar:
    S → num C
    C → , S
    C → ;
Example strings:
    3;
    5, 7, 9;
    1, 2, 3, 4, 5;

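Recursive descent – example

    /* Lookahead(), Consume() and the token types NUM, COMMA and
       SEMICOLON are assumed to be provided by the scanner. */
    int ExpectS(void);
    int ExpectC(void);

    int main(void)
    {
        // 1 = OK
        // 0 = syntax error
        return ExpectS();
    }

    /* S → num C */
    int ExpectS(void)
    {
        if (Lookahead() == NUM) {
            Consume();
            return ExpectC();
        }
        return 0;
    }

    /* C → , S | ; */
    int ExpectC(void)
    {
        switch (Lookahead()) {
        case COMMA:
            Consume();
            return ExpectS();
        case SEMICOLON:
            Consume();
            return 1;
        default:
            return 0;
        }
    }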
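The parser on this slide relies on a scanner that is not shown. The following is a minimal self-contained sketch that supplies one, assuming the input is a C string and that numbers are unsigned decimal integers. The scanner state `pos`, the helper `SkipSpaces`, the extra token types `END` and `BAD`, and the entry point `Parse` are hypothetical additions; `Parse` also checks that no input remains after S, which the slide version does not do.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum { NUM, COMMA, SEMICOLON, END, BAD };

static const char *pos;   /* scanner position in the input (assumed global) */

static void SkipSpaces(void)
{
    while (isspace((unsigned char)*pos))
        pos++;
}

/* Returns the type of the next token without consuming it. */
static int Lookahead(void)
{
    SkipSpaces();
    if (*pos == '\0') return END;
    if (isdigit((unsigned char)*pos)) return NUM;
    if (*pos == ',') return COMMA;
    if (*pos == ';') return SEMICOLON;
    return BAD;
}

/* Advances past the token reported by Lookahead(). */
static void Consume(void)
{
    if (Lookahead() == NUM) {
        while (isdigit((unsigned char)*pos))
            pos++;
    } else if (*pos != '\0') {
        pos++;
    }
}

static int ExpectC(void);

/* S -> num C */
static int ExpectS(void)
{
    if (Lookahead() != NUM)
        return 0;
    Consume();
    return ExpectC();
}

/* C -> , S | ; */
static int ExpectC(void)
{
    switch (Lookahead()) {
    case COMMA:
        Consume();
        return ExpectS();
    case SEMICOLON:
        Consume();
        return 1;
    default:
        return 0;
    }
}

/* Returns 1 if input is exactly one S, 0 on a syntax error. */
int Parse(const char *input)
{
    assert(input != NULL);
    pos = input;
    return ExpectS() && Lookahead() == END;
}
```

With this sketch, `Parse("5, 7, 9;")` returns 1, while `Parse("5, 7")` returns 0 because the terminating ';' is missing.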

Using the recursive descent technique
- The previous parser merely determines whether or not the input program is correct
- However, by inserting semantic actions (code segments) into the parser, a syntax-directed translation can be performed during the parse
- We will look at this later
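As a rough illustration of a semantic action, the sketch below parses the same grammar (S → num C, C → , S | ;) and adds up the numbers while parsing. Summing is only an assumed example of a "translation"; the names `pos`, `sum`, `SkipSpaces` and `SumList` are hypothetical, and the scanning is folded into the parsing functions to keep the sketch short.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

static const char *pos;  /* current input position (hypothetical scanner state) */
static int sum;          /* value built up by the semantic actions */

static void SkipSpaces(void)
{
    while (isspace((unsigned char)*pos))
        pos++;
}

static int ExpectC(void);

/* S -> num { sum += num.value } C */
static int ExpectS(void)
{
    int value = 0;
    SkipSpaces();
    if (!isdigit((unsigned char)*pos))
        return 0;
    while (isdigit((unsigned char)*pos))
        value = value * 10 + (*pos++ - '0');
    sum += value;                 /* semantic action */
    return ExpectC();
}

/* C -> , S | ; */
static int ExpectC(void)
{
    SkipSpaces();
    if (*pos == ',') { pos++; return ExpectS(); }
    if (*pos == ';') { pos++; return 1; }
    return 0;
}

/* Returns 1 and stores the sum in *result if the input parses, else 0. */
int SumList(const char *input, int *result)
{
    assert(input != NULL && result != NULL);
    pos = input;
    sum = 0;
    if (!ExpectS())
        return 0;
    *result = sum;
    return 1;
}
```

For the input "5, 7, 9;" the call succeeds and stores 21; the only change compared with a pure recognizer is the single line marked as the semantic action.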

AMBIGUOUS GRAMMARS

Writing parsers from context-free grammars
- Different grammars may describe the same language
- Example:
    S → e S | e
  and
    S → S e | e
  describe the same language, a non-empty sequence of e's
- The preferred form of the grammar depends on the parsing strategy used

Ambiguous grammars
- A grammar is ambiguous if more than one parse tree can be built for some string it produces
  - It is still a valid grammar for the language
- This might make it hard to use the grammar to write a parser
  - The grammar doesn't guide the parsing algorithm in making decisions

Exercise (2)
Show that the following grammar is ambiguous by building two different parse trees for some string produced by the grammar:
    expr → expr + expr | expr - expr | num

Handling ambiguity
- Ignore it
  - Bad for the semantic analysis
- Rewrite the grammar
- Handle it carefully in the parser
- Explicit directives to the parser generator
- Which parse tree is preferred?

Rewriting the expression grammar
- The grammar can be rewritten to an unambiguous form and still describe the same language
- However, the (unique) parse trees should preferably reflect the order in which the operators (+ and -) are applied
  - Application order is specified by operator associativity and operator precedence (described later)

Operator associativity
- Binary operators are often left-associative, e.g., +, -, *, and /
- This means that if an operand is surrounded by two operators of the same type, the left operator should be applied before the right one
- Examples:
    3 - 7 - 9 = (3 - 7) - 9
    a - (b + c) - d = (a - (b + c)) - d

Rewriting the expression grammar
- We rewrite the ambiguous grammar
    expr → expr + expr | expr - expr | num
  as
    expr → expr + num | expr - num | num
- Both grammars describe exactly the same language, but the latter does so unambiguously and also reflects the left associativity
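The effect of the left-associative grammar can be sketched with a small evaluator. Note the swapped-in technique: the sketch does not implement the left-recursive rule directly but uses a loop that accepts the same strings, num followed by any number of "+ num" or "- num", and applies each operator as soon as its right operand has been read. The function names and the restriction to non-negative integer literals are assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Skips spaces and reads one non-negative integer; returns NULL if none. */
static const char *read_num(const char *p, int *value)
{
    while (*p == ' ')
        p++;
    if (!isdigit((unsigned char)*p))
        return NULL;
    *value = 0;
    while (isdigit((unsigned char)*p))
        *value = *value * 10 + (*p++ - '0');
    while (*p == ' ')
        p++;
    return p;
}

/* Evaluates num (('+' | '-') num)* from left to right.
   Returns 1 and stores the value in *result on success, 0 on a syntax error. */
int eval_left_assoc(const char *input, int *result)
{
    assert(input != NULL && result != NULL);
    int acc = 0, operand = 0;
    const char *p = read_num(input, &acc);

    while (p != NULL && (*p == '+' || *p == '-')) {
        char op = *p;
        p = read_num(p + 1, &operand);
        if (p != NULL)   /* apply the operator to everything read so far */
            acc = (op == '+') ? acc + operand : acc - operand;
    }
    if (p == NULL || *p != '\0')
        return 0;
    *result = acc;
    return 1;
}
```

For "3 - 7 - 9" this computes (3 - 7) - 9 = -13; a right-associative reading would instead give 3 - (7 - 9) = 5.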

Rewriting the expression grammar
- In this particular case the ambiguity could be resolved by using operator associativity
- In general we do not aim to express semantics with the grammar
- There is no general method for rewriting ambiguous grammars to unambiguous ones

Operator precedence
- In addition to associativity, operators have a precedence level
- Example: * and / have higher precedence than + and -. This means that
    a + b * c = a + (b * c)
  although both + and * are left-associative
- Operators with higher precedence are always applied before those with lower precedence
- The application order for operators within the same precedence group is given by their associativity

Operator precedence in C (highest precedence first)
    Operator    Associativity
    * /         Left
    + -         Left
    >=          Left
    == !=       Left
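The table can be checked against the C compiler itself. The sketch below is only a small self-check with arbitrarily chosen constants; the function name `check_c_grouping` is hypothetical. Each unparenthesized expression is compared with the value the table predicts, and the alternative grouping is shown to give a different value.

```c
#include <assert.h>

/* Returns 1 after checking that C groups each expression as the table predicts. */
int check_c_grouping(void)
{
    /* - is left-associative: 10 - 4 - 3 means (10 - 4) - 3 */
    assert(10 - 4 - 3 == 3);
    assert(10 - (4 - 3) == 9);

    /* / is left-associative: 8 / 4 / 2 means (8 / 4) / 2 */
    assert(8 / 4 / 2 == 1);
    assert(8 / (4 / 2) == 4);

    /* * has higher precedence than +: 2 + 3 * 4 means 2 + (3 * 4) */
    assert(2 + 3 * 4 == 14);
    assert((2 + 3) * 4 == 20);

    /* + has higher precedence than >=: 1 + 2 >= 3 means (1 + 2) >= 3 */
    assert((1 + 2 >= 3) == 1);
    assert(1 + (2 >= 3) == 1);   /* other grouping: 1 + 0, an int rather than a truth test */

    return 1;
}
```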

Exercise (3)
The previous grammar contained only + and -, which have the same precedence. Let's add * and / to the grammar as well:
    expr → expr + num | expr - num | expr * num | expr / num | num
Rewrite this grammar to reflect the operator precedence (it is already unambiguous, and the associativity is already reflected).
Tip: operators on the same precedence level can be handled identically.

“Dangling-else”
- Grammar for if-else statements:
    stmt → if ( expr ) stmt else stmt
         | if ( expr ) stmt
         | other
- Problematic program (the else can be matched with either if):
    if (expr) if (expr) other else other
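The two parse trees for the problematic program correspond to two programs with different behaviour. In the sketch below, braces make each reading explicit; the function names and the result codes (0 = no branch taken, 1 = then-branch, 2 = else-branch) are assumptions for illustration. C itself attaches an else to the nearest preceding if that lacks one, which corresponds to the first reading.

```c
#include <assert.h>

/* Reading 1: the else belongs to the inner if. */
int else_binds_inner(int a, int b)
{
    int result = 0;                  /* 0 = no branch executed */
    if (a) {
        if (b) result = 1; else result = 2;
    }
    return result;
}

/* Reading 2: the else belongs to the outer if. */
int else_binds_outer(int a, int b)
{
    int result = 0;
    if (a) {
        if (b) result = 1;
    } else {
        result = 2;
    }
    return result;
}

/* The readings disagree, for example, when the outer condition is false. */
void show_difference(void)
{
    assert(else_binds_inner(0, 1) == 0);
    assert(else_binds_outer(0, 1) == 2);
}
```

The two functions also disagree when a is true and b is false, so the choice of parse tree changes the meaning of the program.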

Conclusion
- The parser builds a parse tree (or syntax tree), either explicitly or implicitly, by grouping tokens provided by the scanner using productions of the grammar
- There can be several grammars for the same language
- Ambiguous grammars can sometimes be rewritten as unambiguous grammars

Next time
- Recursive descent parsers
- Left recursion
- Left factoring