Fall Compiler Principles Context-free Grammars Refresher

Presentation transcript:

Fall 2017-2018 Compiler Principles Context-free Grammars Refresher Roman Manevich Ben-Gurion University of the Negev

Example grammar
S → S ; S
S → id := E
S → print (L)
E → id
E → num
E → E + E
L → E
L → L, E
S is shorthand for Statement, E for Expression, and L for List (of expressions).

CFG terminology
S → S ; S
S → id := E
S → print (L)
E → id
E → num
E → E + E
L → E
L → L, E
Symbols:
Terminals (tokens): ; := ( ) + , id num print
Non-terminals: S E L
Start non-terminal: S (convention: the non-terminal appearing in the first derivation rule)
Grammar productions (rules): N → α
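The terminology above can be made concrete with a small sketch. The representation (a list of `(lhs, rhs)` pairs) and the helper names `nonterminals`, `terminals`, and `start_symbol` are illustrative assumptions, not part of the original slides.

```python
# Productions of the example grammar, in the order listed on the slide.
# Each production is (left-hand side, right-hand side as a tuple of symbols).
# This encoding is an assumption made for illustration.
PRODUCTIONS = [
    ("S", ("S", ";", "S")),
    ("S", ("id", ":=", "E")),
    ("S", ("print", "(", "L", ")")),
    ("E", ("id",)),
    ("E", ("num",)),
    ("E", ("E", "+", "E")),
    ("L", ("E",)),
    ("L", ("L", ",", "E")),
]


def nonterminals(productions):
    """Non-terminals are the symbols that appear on some left-hand side."""
    return {lhs for lhs, _ in productions}


def terminals(productions):
    """Terminals are the right-hand-side symbols that are not non-terminals."""
    nts = nonterminals(productions)
    return {sym for _, rhs in productions for sym in rhs if sym not in nts}


def start_symbol(productions):
    """Slide convention: the left-hand side of the first production."""
    return productions[0][0]
```

Deriving the terminal set from the productions, rather than listing it by hand, keeps the two from drifting apart when the grammar changes.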

More definitions
Sentential form: a sequence of symbols, both terminals (tokens) and non-terminals
Sentence: a sequence of terminals (tokens) only
Derivation step: given a sentential form αNβ and a rule N → µ, a step is the transition αNβ ⇒ αµβ
Derivation sequence: a sequence of derivation steps ρ1 ⇒ … ⇒ ρk such that each ρi ⇒ ρi+1 is the result of applying one production and ρk is a sentence

Language of a CFG
A word ω is in L(G) (a valid program) if there exists a corresponding derivation sequence:
Start with the start symbol
Repeatedly replace one of the non-terminals by the right-hand side of one of its productions
Stop when the sentential form contains only terminals
ω is in L(G) if S ⇒* ω
Two canonical derivation orders: leftmost derivation and rightmost derivation

Leftmost derivation
Productions:
1. S → S ; S
2. S → id := E
3. S → print (L)
4. E → id
5. E → num
6. E → E + E
7. L → E
8. L → L, E
Input: a := 56 ; b := 7 + 3
Token sequence: id := num ; id := num + num
Derivation (the leftmost non-terminal is expanded at every step; the number of the rule applied is in brackets):
S
⇒ S ; S [1]
⇒ id := E ; S [2]
⇒ id := num ; S [5]
⇒ id := num ; id := E [2]
⇒ id := num ; id := E + E [6]
⇒ id := num ; id := num + E [5]
⇒ id := num ; id := num + num [5]

Rightmost derivation
Productions:
1. S → S ; S
2. S → id := E
3. S → print (L)
4. E → id
5. E → num
6. E → E + E
7. L → E
8. L → L, E
Input: a := 56 ; b := 7 + 3
Token sequence: id := num ; id := num + num
Derivation (the rightmost non-terminal is expanded at every step; the number of the rule applied is in brackets):
S
⇒ S ; S [1]
⇒ S ; id := E [2]
⇒ S ; id := E + E [6]
⇒ S ; id := E + num [5]
⇒ S ; id := num + num [5]
⇒ id := E ; id := num + num [2]
⇒ id := num ; id := num + num [5]

Canonical derivations
Leftmost/rightmost derivations may not be unique, but they allow describing a derivation by the sequence of production rules taken (since the non-terminal to expand is already known)
With the productions numbered 1 to 8 in the order listed in the example grammar:
Leftmost derivation example: 1, 2, 5, 2, 6, 5, 5
Rightmost derivation example: 1, 2, 6, 5, 5, 2, 5
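Because the non-terminal to expand is fixed in a canonical derivation, a list of rule numbers is enough to reconstruct every sentential form. The sketch below replays such a list; it assumes the productions are numbered 1 to 8 in the order they are listed in the example grammar, and the function name `replay` is hypothetical.

```python
# Productions numbered in listing order (an assumption about the slide numbering).
PRODUCTIONS = {
    1: ("S", ["S", ";", "S"]),
    2: ("S", ["id", ":=", "E"]),
    3: ("S", ["print", "(", "L", ")"]),
    4: ("E", ["id"]),
    5: ("E", ["num"]),
    6: ("E", ["E", "+", "E"]),
    7: ("L", ["E"]),
    8: ("L", ["L", ",", "E"]),
}
NONTERMINALS = {"S", "E", "L"}


def replay(rule_numbers, leftmost=True, start="S"):
    """Apply the rules in order, always expanding the leftmost (or rightmost)
    non-terminal, and return every sentential form as a string."""
    form = [start]
    steps = [" ".join(form)]
    for n in rule_numbers:
        lhs, rhs = PRODUCTIONS[n]
        positions = [i for i, sym in enumerate(form) if sym in NONTERMINALS]
        if not positions:
            raise ValueError("no non-terminal left to expand")
        i = positions[0] if leftmost else positions[-1]
        if form[i] != lhs:
            raise ValueError(f"rule {n} expands {lhs}, but the chosen non-terminal is {form[i]}")
        form[i:i + 1] = rhs  # one derivation step: replace N by its right-hand side
        steps.append(" ".join(form))
    return steps


for line in replay([1, 2, 5, 2, 6, 5, 5], leftmost=True):
    print(line)
```

Calling `replay([1, 2, 6, 5, 5, 2, 5], leftmost=False)` replays the rightmost example in the same way.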

Parse trees
Tree nodes are symbols; children are ordered left-to-right
Each internal node is a non-terminal, and its children correspond to one of its productions N → µ1 … µk
The root is the start non-terminal
Leaves are tokens
Yield of a parse tree: the left-to-right walk over its leaves

Parse tree exercise
S → S ; S
S → id := E
S → print (L)
E → id
E → num
E → E + E
L → E
L → L, E
Draw the parse tree for id := num ; id := num + num

Parse tree exercise
The parse tree is an order-independent representation: it does not record the order in which the non-terminals were expanded
[Figure: parse tree for id := num ; id := num + num]
Equivalently, add parentheses labeled by non-terminal names:
(S (S a := (E 56 )E )S ; (S b := (E (E 7 )E + (E 3 )E )E )S )S
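The tree for this exercise can be written down as a data structure, from which both the yield and the labeled-parentheses form follow. This is a minimal sketch; the `Node` class and the helpers `tree_yield` and `bracketed` are names chosen here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    symbol: str
    children: list = field(default_factory=list)  # empty for leaves (tokens)


def tree_yield(node):
    """Left-to-right walk over the leaves."""
    if not node.children:
        return [node.symbol]
    out = []
    for child in node.children:
        out.extend(tree_yield(child))
    return out


def bracketed(node):
    """Parentheses labeled by non-terminal names, as on the slide."""
    if not node.children:
        return node.symbol
    inner = " ".join(bracketed(child) for child in node.children)
    return f"({node.symbol} {inner} ){node.symbol}"


def num():
    return Node("E", [Node("num")])  # E -> num


# Parse tree for: id := num ; id := num + num
first = Node("S", [Node("id"), Node(":="), num()])
second = Node("S", [Node("id"), Node(":="),
                    Node("E", [num(), Node("+"), num()])])
tree = Node("S", [first, Node(";"), second])

print(" ".join(tree_yield(tree)))
print(bracketed(tree))
```

The same tree results from both the leftmost and the rightmost derivation of this input, which is what makes it order-independent.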

Capabilities and limitations of CFGs
CFGs naturally express:
Hierarchical structure: a program is a list of classes, a class is a list of definitions, …
Alternatives: a definition is either a field definition or a method definition
Beginning-end type of constraints: balanced parentheses, S → (S)S | ε
CFGs cannot express:
Correlations between unbounded strings (identifiers), for example that variables are declared before use: ω S ω
These are handled by semantic analysis (attribute grammars), p. 173
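The balanced-parentheses grammar S → (S)S | ε maps directly onto a recursive function, one branch per production. The following is a sketch of such a recognizer; the function names `parse_S` and `balanced` are assumptions made for this example.

```python
def parse_S(s, i=0):
    """Consume a prefix of s[i:] derivable from S and return the index after it.

    S -> ( S ) S   is chosen when the next character is '('
    S -> epsilon   is chosen otherwise
    """
    if i < len(s) and s[i] == "(":
        j = parse_S(s, i + 1)               # inner S
        if j < len(s) and s[j] == ")":
            return parse_S(s, j + 1)        # trailing S
        raise SyntaxError(f"expected ')' at position {j}")
    return i                                # epsilon: consume nothing


def balanced(s):
    """True if the whole string is derivable from S."""
    try:
        return parse_S(s) == len(s)
    except SyntaxError:
        return False


print(balanced("(())()"))  # True
print(balanced("(()"))     # False
```

A "declared before use" check has no comparable grammar-shaped recognizer, which is why it is deferred to semantic analysis.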

Badly-formed grammars
[Image by Oren neu dag (own work), CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0), via Wikimedia Commons]

Badly-formed grammars
A non-terminal N is reachable if S ⇒* αNβ
A non-terminal N is generating if N ⇒* ω for some sentence ω
A grammar G is badly-formed if it contains either unreachable non-terminals or non-generating non-terminals
G1: S → x and N → y (N is unreachable)
G2: S → x | N and N → a N b N (N is non-generating)
Exercise: give an algorithm to test whether a grammar is badly-formed
Theorem: for every grammar G there exists an equivalent well-formed grammar G' (that is, L(G) = L(G'))
Proof: exercise
From now on, we will only handle well-formed grammars
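One possible answer to the exercise is a pair of fixed-point computations: grow the set of generating non-terminals until it stops changing, and walk the productions from the start symbol to find the reachable ones. This is a sketch, not the course's official solution. It assumes a grammar is a list of `(lhs, rhs)` pairs and that every non-terminal has at least one production (a symbol with no production is treated as a terminal).

```python
def generating(productions):
    """Non-terminals that derive at least one sentence."""
    nts = {lhs for lhs, _ in productions}
    gen = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in productions:
            # lhs is generating if every rhs symbol is a terminal or generating
            if lhs not in gen and all(s in gen or s not in nts for s in rhs):
                gen.add(lhs)
                changed = True
    return gen


def reachable(productions, start):
    """Non-terminals that occur in some sentential form derived from start."""
    nts = {lhs for lhs, _ in productions}
    seen = {start}
    work = [start]
    while work:
        n = work.pop()
        for lhs, rhs in productions:
            if lhs == n:
                for s in rhs:
                    if s in nts and s not in seen:
                        seen.add(s)
                        work.append(s)
    return seen


def badly_formed(productions, start):
    nts = {lhs for lhs, _ in productions}
    return reachable(productions, start) != nts or generating(productions) != nts


G1 = [("S", ("x",)), ("N", ("y",))]
G2 = [("S", ("x",)), ("S", ("N",)), ("N", ("a", "N", "b", "N"))]
print(badly_formed(G1, "S"), badly_formed(G2, "S"))  # True True
```

Both loops terminate because each iteration either adds a non-terminal to a finite set or stops.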

Ambiguity in Context-free grammars

Sometimes there are two parse trees
Arithmetic expressions:
E → id
E → num
E → E + E
E → E * E
E → ( E )
The input 1 + 2 + 3, that is num(1) + num(2) + num(3), has two parse trees: one groups it as 1 + (2 + 3), the other as (1 + 2) + 3
Leftmost derivation (tree for 1 + (2 + 3)): E ⇒ E + E ⇒ num + E ⇒ num + E + E ⇒ num + num + E ⇒ num + num + num
Rightmost derivation (tree for (1 + 2) + 3): E ⇒ E + E ⇒ E + num ⇒ E + E + num ⇒ E + num + num ⇒ num + num + num

Is ambiguity a problem for compilers?
Arithmetic expressions:
E → id
E → num
E → E + E
E → E * E
E → ( E )
Depends on semantics
The input 1 + 2 + 3 has two parse trees, 1 + (2 + 3) and (1 + 2) + 3, and both evaluate to 6
Leftmost derivation (tree for 1 + (2 + 3) = 6): E ⇒ E + E ⇒ num + E ⇒ num + E + E ⇒ num + num + E ⇒ num + num + num
Rightmost derivation (tree for (1 + 2) + 3 = 6): E ⇒ E + E ⇒ E + num ⇒ E + E + num ⇒ E + num + num ⇒ num + num + num

Problematic ambiguity example
Arithmetic expressions:
E → id
E → num
E → E + E
E → E * E
E → ( E )
The input 1 + 2 * 3 has two parse trees with different values:
1 + (2 * 3) = 7 (this is what we usually want: * has precedence over +)
(1 + 2) * 3 = 9
Leftmost derivation (tree for 1 + (2 * 3)): E ⇒ E + E ⇒ num + E ⇒ num + E * E ⇒ num + num * E ⇒ num + num * num
Rightmost derivation (tree for (1 + 2) * 3): E ⇒ E * E ⇒ E * num ⇒ E + E * num ⇒ E + num * num ⇒ num + num * num
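To see why this ambiguity matters, the two parse trees can be evaluated directly. The sketch below encodes each tree as nested `(left, operator, right)` tuples, a representation chosen here only for illustration.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}


def evaluate(tree):
    """Evaluate a parse tree given as an int leaf or a (left, op, right) tuple."""
    if isinstance(tree, int):
        return tree
    left, op, right = tree
    return OPS[op](evaluate(left), evaluate(right))


tree_a = (1, "+", (2, "*", 3))   # root production E -> E + E
tree_b = ((1, "+", 2), "*", 3)   # root production E -> E * E

print(evaluate(tree_a))  # 7
print(evaluate(tree_b))  # 9
```

The same token sequence yields two different results, so a compiler has to commit to one tree, here the one in which `*` binds tighter than `+`.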

Ambiguous grammars
A grammar is ambiguous if there exists a word for which there are, equivalently:
Two different leftmost derivations
Two different rightmost derivations
Two different parse trees
Ambiguity is a property of grammars, not of languages
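For one fixed word, the number of parse trees can be counted, and a count above one witnesses ambiguity. The sketch below does this for the arithmetic-expression grammar E → id | num | E + E | E * E | ( E ) only; the function name `count_parse_trees` is hypothetical, and the approach checks a single word rather than deciding ambiguity of a grammar in general.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees deriving the token sequence from E."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        """Number of parse trees deriving tokens[i:j] from E."""
        n = 0
        # E -> id | num
        if j - i == 1 and tokens[i] in ("id", "num"):
            n += 1
        # E -> ( E )
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            n += count(i + 1, j - 1)
        # E -> E + E | E * E, trying every operator position as the root
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                n += count(i, k) * count(k + 1, j)
        return n

    return count(0, len(tokens))


print(count_parse_trees("num + num + num".split()))    # 2
print(count_parse_trees("( num + num ) * num".split()))  # 1
```

Explicit parentheses leave only one tree, which matches the intuition that they resolve the grouping by hand.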

Facts about ambiguous grammars
Some languages are inherently ambiguous: no unambiguous grammars exist for them [Parikh 1961]
Checking whether an arbitrary grammar is ambiguous is undecidable [Hopcroft, Motwani, Ullman, 2001]