CSC 2103: Compiler Construction
Lecture Five: Parsing - Part Two
Joyce Nakatumba-Nabende



Outline
◦ Introduction to Parsing
◦ Context-free Grammars
◦ Parse Trees
◦ Ambiguity in Grammars

Parse Tree
A parse tree pictorially shows how the start symbol of a grammar derives a string in the language. If nonterminal A has a production A → XYZ, then a parse tree may have an interior node labeled A with three children labeled X, Y, and Z, from left to right.

Parse Tree
Pictorially, we can show how the start symbol of a grammar derives a given string.
[Figure: parse tree for A → XYZ, with root A and children X, Y, Z.]

Parsing
Parsing is the problem of taking a string of terminals and figuring out how to derive it from the start symbol of the grammar. If the string cannot be derived from the start symbol, the parser reports syntax errors within the string.

Derivations and parse trees
We can describe a derivation using a graphical representation called a parse tree:
◦ the root is labeled with the start symbol, S
◦ each internal node is labeled with a non-terminal
◦ the children of an internal node A are the symbols on the right-hand side of a production A → α
◦ each leaf is labeled with a terminal
A parse tree has a unique leftmost and a unique rightmost derivation (however, we cannot tell which one was used by looking at the tree).
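The structure described above can be sketched in a few lines of Python. This is only an illustration: the `Node` class and the `tree_yield` function are hypothetical names introduced here, not part of the lecture. The yield of a tree is the string read off its leaves from left to right.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One parse-tree node: a grammar symbol plus its ordered children."""
    symbol: str
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def tree_yield(node: Node) -> str:
    """Concatenate the leaf labels from left to right."""
    if node.is_leaf():
        return node.symbol
    return "".join(tree_yield(child) for child in node.children)


# Interior node A with children X, Y, Z, as for a production A -> XYZ.
tree = Node("A", [Node("X"), Node("Y"), Node("Z")])
print(tree_yield(tree))
```

For the tree built here, the yield is the string XYZ.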

Parse Trees
[Figures: example derivation trees and the strings they yield.]

Parse Tree Example Two
Consider the following grammar:
list → list + digit | list - digit | digit
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Derive 9 - 5 + 2 from these rules.
◦ list ⇒ list + digit ⇒ list - digit + digit ⇒ digit - digit + digit ⇒ 9 - digit + digit ⇒ 9 - 5 + digit ⇒ 9 - 5 + 2

Parse Tree Example Two
The derivation of 9 - 5 + 2 can be represented by this parse tree.
[Figure: parse tree for 9 - 5 + 2.]
The process of finding a parse tree for a given string of terminals is called parsing that string.

Parsing Parse Trees
At each step, we choose a non-terminal to replace.
◦ This choice can lead to different derivations.
Two strategies are especially interesting:
1. Leftmost derivation: replace the leftmost non-terminal at each step
2. Rightmost derivation: replace the rightmost non-terminal at each step
© Oscar Nierstrasz

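Parse Trees
Sometimes the derivation order does not matter: a leftmost derivation and a rightmost derivation can give the same derivation tree.
[Figure: a leftmost derivation, a rightmost derivation, and the derivation tree they share.]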
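To make this concrete, here is a small Python sketch that reads both derivations off one tree. It assumes the list/digit grammar from Parse Tree Example Two and the string 9 - 5 + 2; the tuple encoding of the tree and the names `derivation` and `render` are illustrative choices, not something from the slides.

```python
def render(form):
    """Show a sentential form; tuples are non-terminals, strings are terminals."""
    return " ".join(item[0] if isinstance(item, tuple) else item for item in form)


def derivation(tree, leftmost=True):
    """List the sentential forms obtained by always expanding the
    leftmost (or rightmost) non-terminal, following the given tree."""
    form = [tree]
    steps = [render(form)]
    while True:
        pending = [i for i, item in enumerate(form) if isinstance(item, tuple)]
        if not pending:
            return steps
        i = pending[0] if leftmost else pending[-1]
        _, children = form[i]
        form[i:i + 1] = children          # replace the node by its children
        steps.append(render(form))


def digit(d):
    return ("digit", [d])


# Parse tree for 9 - 5 + 2: a node is (symbol, children), a leaf is a string.
tree = ("list", [
    ("list", [("list", [digit("9")]), "-", digit("5")]),
    "+",
    digit("2"),
])

print(" => ".join(derivation(tree, leftmost=True)))
print(" => ".join(derivation(tree, leftmost=False)))
```

Both calls walk the same tree and end in the same string; only the order in which the non-terminals are expanded differs.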

Outline
◦ Introduction to Parsing
◦ Context-free Grammars
◦ Derivation Trees
◦ Ambiguity in Grammars

Ambiguity in Grammar
Consider the following grammar:
E → E + E | E * E | (E) | a
Derive the string a+a*a from the grammar.
E ⇒ E + E ⇒ a + E ⇒ a + E * E ⇒ a + a * E ⇒ a + a * a

Ambiguity
A left-most derivation from the grammar gives this parse tree.
E ⇒ E + E ⇒ a + E ⇒ a + E * E ⇒ a + a * E ⇒ a + a * a
[Figure: parse tree with + at the root, grouping the string as a + (a * a).]

Ambiguity
Another left-most derivation from the grammar gives this parse tree.
E ⇒ E * E ⇒ E + E * E ⇒ a + E * E ⇒ a + a * E ⇒ a + a * a
[Figure: parse tree with * at the root, grouping the string as (a + a) * a.]

Ambiguity
The grammar E → E + E | E * E | (E) | a has two derivation trees for the string a + a * a.

Ambiguity
The grammar E → E + E | E * E | (E) | a is ambiguous.
◦ The string a+a*a has two parse trees.
◦ The string a+a*a has two left-most derivations.

Ambiguity
A grammar can have more than one parse tree generating a given string of terminals. Such a grammar is said to be ambiguous. To show that a grammar is ambiguous, we find a terminal string that is the yield of more than one parse tree.
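One way to look for such a string is to count parse trees mechanically. The sketch below does this for the grammar E → E + E | E * E | (E) | a only; the function name `count_trees` and the span-counting approach are assumptions made for illustration, and the input is expected without spaces.

```python
from functools import lru_cache


def count_trees(s: str) -> int:
    """Number of parse trees for s under E -> E + E | E * E | ( E ) | a."""

    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:
        # Number of ways E derives the substring s[i:j].
        if j - i == 1:
            return 1 if s[i] == "a" else 0            # E -> a
        total = 0
        if j - i >= 3 and s[i] == "(" and s[j - 1] == ")":
            total += count(i + 1, j - 1)              # E -> ( E )
        for k in range(i + 1, j - 1):
            if s[k] in "+*":                          # E -> E op E, split at s[k]
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(s))


print(count_trees("a+a*a"))
print(count_trees("(a+a)*a"))
```

A count above 1 for any string is enough to show that the grammar is ambiguous: a+a*a gives 2, while the fully parenthesized (a+a)*a gives 1.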

Ambiguity
Since a string with more than one parse tree usually has more than one meaning, we need to design unambiguous grammars for compiling applications, or use ambiguous grammars with additional rules to resolve the ambiguities.

Ambiguity: Example One
Consider the grammar:
S → AB
B → b
A → aA | c
Determine whether the grammar is ambiguous.

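Ambiguity: Example One
Consider the grammar:
S → AB
B → b
A → aA | c
The string aacb has a number of derivations:
◦ Arbitrary order: S ⇒ AB ⇒ aAB ⇒ aAb ⇒ aaAb ⇒ aacb
◦ Leftmost derivation: S ⇒ AB ⇒ aAB ⇒ aaAB ⇒ aacB ⇒ aacb
◦ Rightmost derivation: S ⇒ AB ⇒ Ab ⇒ aAb ⇒ aaAb ⇒ aacb
[Figure: parse tree for aacb. S has children A and B; A expands to a A, the inner A to a A, the innermost A to c; B expands to b.]
All three derivations correspond to this one parse tree. Having several derivations does not make a grammar ambiguous; only having several parse trees for the same string does. Every string in this language has exactly one parse tree, so this grammar is unambiguous.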

Ambiguity: Example Two
Consider the following grammar:
string → string + string | string - string | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Show that the grammar is ambiguous.

Ambiguity: Example Two
Consider the following grammar:
string → string + string | string - string | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Show that the grammar is ambiguous.
The expression 9-5+2 has more than one parse tree with this grammar. The two trees for 9-5+2 correspond to the two ways of parenthesizing the expression: (9-5)+2 and 9-(5+2).

Ambiguity: Example Two
Two parse trees for 9-5+2.
[Figure: one tree groups the expression as (9-5)+2, the other as 9-(5+2).]
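The two groupings also evaluate differently. The following Python sketch enumerates every parse tree of 9-5+2 under the ambiguous grammar of this example and prints the value each one gives. The function name `groupings` and the token-list input are illustrative assumptions, and only single-digit operands with + and - are handled.

```python
def groupings(tokens):
    """Yield (parenthesized text, value) for every parse tree of tokens under
    string -> string + string | string - string | digit."""
    if len(tokens) == 1:
        yield tokens[0], int(tokens[0])
        return
    for k in range(1, len(tokens), 2):        # operators sit at odd positions
        op = tokens[k]
        for left_text, left_value in groupings(tokens[:k]):
            for right_text, right_value in groupings(tokens[k + 1:]):
                if op == "+":
                    value = left_value + right_value
                else:
                    value = left_value - right_value
                yield f"({left_text}{op}{right_text})", value


results = list(groupings(["9", "-", "5", "+", "2"]))
for text, value in results:
    print(text, "=", value)
```

The two trees give 6 for (9-5)+2 and 2 for 9-(5+2), so the choice of tree changes the result.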

Ambiguity: Example Three
Show that the following grammar is ambiguous by producing one string in the language that has two different parse trees:
S → aS | aSbS | ε

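Ambiguity: Example Three
For example, the string aab has two parse trees: one whose root uses S → aS (with S → aSbS for the inner S), and one whose root uses S → aSbS (with S → aS for the first inner S).
[Figure: the two parse trees.]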

Ambiguity: Example Four
Consider the grammar
◦ E → E * E | E + E | ( E ) | id
◦ Build a parse tree for: id * id + id * id
◦ Show that the grammar is ambiguous.

Ambiguity: Example Four
Consider the grammar
◦ E → E * E | E + E | ( E ) | id
[Figure: three different parse trees for id * id + id * id, one with + at the root and two with * at the root.]

Ambiguity Exercises
Is each of these grammars ambiguous or not?
a) S → +SS | -SS | a
b) S → S(S)S | ε
c) S → a | S + S | SS | S* | (S)

Why care about ambiguity?
Consider the string a+a*a. Take a = 3. The string becomes 3+3*3.
[Figures: the two parse trees for 3+3*3, each evaluated bottom-up.]
The tree with + at the root evaluates 3+(3*3) = 12; the tree with * at the root evaluates (3+3)*3 = 18.
The correct result should be 3+3*3 = 12.

Ambiguity in Grammar
Ambiguity implies multiple parse trees.
Ambiguity is bad for programming languages.
◦ It can make parsing more difficult
◦ It can impact the semantics of the language
  ▪ Different parse trees can have different semantic meanings and yield different execution results
Try to eliminate ambiguity.
◦ Sometimes rewriting the grammar can eliminate ambiguity
  ▪ But it may add additional semantics to the language

Ambiguity
Rewrite the grammar to eliminate ambiguity.
◦ There are many ways to rewrite the grammar
  ▪ The new grammar should accept the same language
◦ For each input string, the ambiguous grammar may allow multiple parse trees
◦ Each has a different semantic meaning
◦ Which one do we want?
The rewrite should be based on the desired semantics.
There is no general algorithm to rewrite ambiguous grammars.

Rewrite Ambiguous Grammar
Try to use a single recursive nonterminal in each rule.
◦ When the left-hand symbol appears more than once on the right-hand side, use additional symbols to replace all but one occurrence
  ▪ This forces a single possible expansion

Rewrite Ambiguous Grammar
The example grammar is ambiguous. Change it to a new unambiguous grammar.
[Figures: the example grammar and the rewritten grammar.]
For the string a+a*a, the new grammar gives a single derivation and a unique derivation tree.
[Figures: the new derivation and the unique derivation tree for a+a*a.]
The new grammar is unambiguous.
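As an illustration of such a rewrite, a common textbook replacement for E → E + E | E * E | (E) | a is E → E + T | T, T → T * F | F, F → (E) | a. Whether this is the exact grammar used on the slides is an assumption. The Python sketch below parses with it and returns one tree per input. The left-recursive rules are implemented with loops rather than direct recursion, and the result is a simplified nested-tuple tree, not a full parse tree.

```python
def parse(text: str):
    """Parse text with the (assumed) grammar
         E -> E + T | T
         T -> T * F | F
         F -> ( E ) | a
    and return a nested tuple (operator, left, right), or "a" for a leaf."""
    tokens = list(text.replace(" ", ""))
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r} at position {pos}")
        pos += 1

    def expr():                      # E -> E + T | T, written as a loop
        node = term()
        while peek() == "+":
            expect("+")
            node = ("+", node, term())
        return node

    def term():                      # T -> T * F | F, written as a loop
        node = factor()
        while peek() == "*":
            expect("*")
            node = ("*", node, factor())
        return node

    def factor():                    # F -> ( E ) | a
        if peek() == "(":
            expect("(")
            node = expr()
            expect(")")
            return node
        expect("a")
        return "a"

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected {peek()!r} at position {pos}")
    return tree


print(parse("a+a*a"))
print(parse("(a+a)*a"))
```

For a+a*a the only tree puts + at the root with a*a as its right operand, which is the grouping a+(a*a). Getting the other grouping requires writing the parentheses explicitly.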

Associativity of Operators
How will you evaluate 9-5-2? Will '5' go with the '-' on the left or the one on the right?
◦ If it goes with the one on the left, (9-5)-2, we say that the operator '-' is left-associative.
◦ If it goes with the one on the right, 9-(5-2), we say that the operator '-' is right-associative.

Associativity of Operators
How to express associativity in production rules?
Left associative, (9-5)-2:
term → term - digit | digit
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Right associative, 9-(5-2):
term → digit - term | digit
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
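The effect of the two rule shapes can be sketched in Python. The functions below follow the left-recursive and right-recursive forms of the term rule, with a base alternative `| digit` assumed so that the recursion can stop. The names `parse_left`, `parse_right` and `evaluate` are hypothetical.

```python
def parse_left(tokens):
    """term -> term - digit | digit (left recursive: '-' groups to the left)."""
    if len(tokens) == 1:
        return int(tokens[0])
    *rest, op, last = tokens            # the last digit is the right operand
    if op != "-":
        raise SyntaxError(f"expected '-', got {op!r}")
    return (parse_left(rest), "-", int(last))


def parse_right(tokens):
    """term -> digit - term | digit (right recursive: '-' groups to the right)."""
    if len(tokens) == 1:
        return int(tokens[0])
    first, op, *rest = tokens           # the first digit is the left operand
    if op != "-":
        raise SyntaxError(f"expected '-', got {op!r}")
    return (int(first), "-", parse_right(rest))


def evaluate(tree):
    """Evaluate a tree of the form (left, '-', right) or a plain integer."""
    if isinstance(tree, int):
        return tree
    left, _, right = tree
    return evaluate(left) - evaluate(right)


tokens = list("9-5-2")
print(parse_left(tokens), evaluate(parse_left(tokens)))
print(parse_right(tokens), evaluate(parse_right(tokens)))
```

The left-recursive rule builds (9-5)-2, which evaluates to 2. The right-recursive rule builds 9-(5-2), which evaluates to 6.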

Precedence of Operators
Associativity applies to occurrences of the same operator. What if the operators are different? How will you evaluate 9-5*2?

Precedence of Operators
For the expression 9+5*2, there are two possible interpretations:
◦ (9+5)*2
◦ 9+(5*2)
The associativity rules for + and * apply to occurrences of the same operator, so they do not resolve this ambiguity.

Precedence of Operators
We say that * has higher precedence than + if * takes its operands before + does. In ordinary arithmetic, multiplication and division have higher precedence than addition and subtraction. So, 5 is taken by * in both 9+5*2 and 9*5+2. The expressions are equivalent to
◦ 9+(5*2)
◦ (9*5)+2
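Precedence and associativity can also be applied directly by the parser from a table, instead of being encoded in the grammar. The sketch below uses precedence climbing, a technique not named in the lecture. The `PRECEDENCE` table, the function name `evaluate_expression`, and the restriction to single-digit operands without parentheses are all assumptions made to keep the example short.

```python
import operator

# Hypothetical table: a higher number binds tighter; all operators here are
# treated as left-associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}
APPLY = {"+": operator.add, "-": operator.sub,
         "*": operator.mul, "/": operator.truediv}


def evaluate_expression(text: str):
    tokens = list(text.replace(" ", ""))    # single-digit operands only
    pos = 0

    def parse(min_prec: int):
        nonlocal pos
        value = int(tokens[pos])            # read one operand
        pos += 1
        while pos < len(tokens) and PRECEDENCE[tokens[pos]] >= min_prec:
            op = tokens[pos]
            pos += 1
            # Left associativity: the right operand may only contain
            # operators that bind strictly tighter than op.
            rhs = parse(PRECEDENCE[op] + 1)
            value = APPLY[op](value, rhs)
        return value

    return parse(1)


print(evaluate_expression("9+5*2"))
print(evaluate_expression("9*5+2"))
print(evaluate_expression("9-5-2"))
```

With this table, 9+5*2 evaluates as 9+(5*2) = 19, 9*5+2 as (9*5)+2 = 47, and 9-5-2 as (9-5)-2 = 2.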

Conclusion
Ambiguous grammars can be a significant problem in practice, because we rely on the parse tree to capture the basic structure of a program.
To avoid the problems of ambiguity, we can try to:
◦ Rewrite the grammar
◦ Use "disambiguating rules" when we implement a parser for the grammar
Most tools allow precedence and associativity declarations to disambiguate grammars.