Context-free grammars. Roadmap Last time – Regex == DFA – JLex for generating Lexers This time – CFGs, the underlying abstraction for Parsers.

Slides:



Advertisements
Similar presentations
Parsing II : Top-down Parsing
Advertisements

C O N T E X T - F R E E LANGUAGES ( use a grammar to describe a language) 1.
Translator Architecture Code Generator ParserTokenizer string of characters (source code) string of tokens abstract program string of integers (object.
ICE1341 Programming Languages Spring 2005 Lecture #4 Lecture #4 In-Young Ko iko.AT. icu.ac.kr iko.AT. icu.ac.kr Information and Communications University.
CS 330 Programming Languages 09 / 13 / 2007 Instructor: Michael Eckmann.
Chapter 3 Describing Syntax and Semantics Sections 1-3.
Context-Free Grammars Lecture 7
Prof. Bodik CS 164 Lecture 61 Building a Parser II CS164 3:30-5:00 TT 10 Evans.
1 Foundations of Software Design Lecture 23: Finite Automata and Context-Free Grammars Marti Hearst Fall 2002.
Parsing — Part II (Ambiguity, Top-down parsing, Left-recursion Removal)
1 CMPSC 160 Translation of Programming Languages Fall 2002 slides derived from Tevfik Bultan, Keith Cooper, and Linda Torczon Lecture-Module #5 Introduction.
Chapter 2 A Simple Compiler
Compiler Construction CS 606 Sohail Aslam Lecture 2.
Parsing II : Top-down Parsing Lecture 7 CS 4318/5331 Apan Qasem Texas State University Spring 2015 *some slides adopted from Cooper and Torczon.
CPSC Compiler Tutorial 3 Parser. Parsing The syntax of most programming languages can be specified by a Context-free Grammar (CGF) Parsing: Given.
(2.1) Grammars  Definitions  Grammars  Backus-Naur Form  Derivation – terminology – trees  Grammars and ambiguity  Simple example  Grammar hierarchies.
CSE 413 Programming Languages & Implementation Hal Perkins Autumn 2012 Context-Free Grammars and Parsing 1.
EECS 6083 Intro to Parsing Context Free Grammars
8/19/2015© Hal Perkins & UW CSEC-1 CSE P 501 – Compilers Parsing & Context-Free Grammars Hal Perkins Winter 2008.
Lee CSCE 314 TAMU 1 CSCE 314 Programming Languages Syntactic Analysis Dr. Hyunyoung Lee.
Compiler Construction 1. Objectives Given a context-free grammar, G, and the grammar- independent functions for a recursive-descent parser, complete the.
1 Syntax and Semantics The Purpose of Syntax Problem of Describing Syntax Formal Methods of Describing Syntax Derivations and Parse Trees Sebesta Chapter.
2.2 A Simple Syntax-Directed Translator Syntax-Directed Translation 2.4 Parsing 2.5 A Translator for Simple Expressions 2.6 Lexical Analysis.
1 Introduction to Parsing Lecture 5. 2 Outline Regular languages revisited Parser overview Context-free grammars (CFG’s) Derivations.
CPSC 388 – Compiler Design and Construction Parsers – Context Free Grammars.
Introduction to Parsing Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
Introduction to Parsing Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
1 Chapter 3 Describing Syntax and Semantics. 3.1 Introduction Providing a concise yet understandable description of a programming language is difficult.
A sentence (S) is composed of a noun phrase (NP) and a verb phrase (VP). A noun phrase may be composed of a determiner (D/DET) and a noun (N). A noun phrase.
Grammars CPSC 5135.
PART I: overview material
Lesson 3 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Review 1.Lexical Analysis 2.Syntax Analysis 3.Semantic Analysis 4.Code Generation 5.Code Optimization.
Copyright © by Curt Hill Grammar Types The Chomsky Hierarchy BNF and Derivation Trees.
Chapter 2. Design of a Simple Compiler J. H. Wang Sep. 21, 2015.
Introduction to Parsing
CPS 506 Comparative Programming Languages Syntax Specification.
D Goforth COSC Translating High Level Languages.
D Goforth COSC Translating High Level Languages Note error in assignment 1: #4 - refer to Example grammar 3.4, p. 126.
Syntax The Structure of a Language. Lexical Structure The structure of the tokens of a programming language The scanner takes a sequence of characters.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 3: Introduction to Syntactic Analysis.
Chapter 3 Context-Free Grammars and Parsing. The Parsing Process sequence of tokens syntax tree parser Duties of parser: Determine correct syntax Build.
1Computer Sciences Department. Book: INTRODUCTION TO THE THEORY OF COMPUTATION, SECOND EDITION, by: MICHAEL SIPSER Reference 3Computer Sciences Department.
Grammars Hopcroft, Motawi, Ullman, Chap 5. Grammars Describes underlying rules (syntax) of programming languages Compilers (parsers) are based on such.
Grammars CS 130: Theory of Computation HMU textbook, Chap 5.
Top-down Parsing. 2 Parsing Techniques Top-down parsers (LL(1), recursive descent) Start at the root of the parse tree and grow toward leaves Pick a production.
Unit-3 Parsing Theory (Syntax Analyzer) PREPARED BY: PROF. HARISH I RATHOD COMPUTER ENGINEERING DEPARTMENT GUJARAT POWER ENGINEERING & RESEARCH INSTITUTE.
Syntax Analysis – Part I EECS 483 – Lecture 4 University of Michigan Monday, September 17, 2006.
Syntax Analyzer (Parser)
CSC312 Automata Theory Lecture # 26 Chapter # 12 by Cohen Context Free Grammars.
Context-free grammars. Roadmap Last time – Regex == DFA – JLex for generating Lexers This time – CFGs, the underlying abstraction for Parsers.
1 Introduction to Parsing. 2 Outline l Regular languages revisited l Parser overview Context-free grammars (CFG ’ s) l Derivations.
1 Topic #4: Syntactic Analysis (Parsing) CSC 338 – Compiler Design and implementation Dr. Mohamed Ben Othman ( )
Spring 16 CSCI 4430, A Milanova 1 Announcements HW1 will be out this evening Due Monday, 2/8 Submit in HW Server AND at start of class on 2/8 A review.
Chapter 3 – Describing Syntax CSCE 343. Syntax vs. Semantics Syntax: The form or structure of the expressions, statements, and program units. Semantics:
Introduction to Parsing
Chapter 3 – Describing Syntax
A Simple Syntax-Directed Translator
Parsing & Context-Free Grammars
CS510 Compiler Lecture 4.
Introduction to Parsing (adapted from CS 164 at Berkeley)
Syntax Specification and Analysis
What does it mean? Notes from Robert Sebesta Programming Languages
Context-free grammars (CFGs)
Programming Language Syntax 2
Lecture 7: Introduction to Parsing (Syntax Analysis)
R.Rajkumar Asst.Professor CSE
Parsing & Context-Free Grammars Hal Perkins Autumn 2005
COMPILER CONSTRUCTION
Faculty of Computer Science and Information System
Presentation transcript:

Context-free grammars

Roadmap Last time – Regex == DFA – JLex for generating Lexers This time – CFGs, the underlying abstraction for Parsers

RegExs Are Great! Perfect for tokenizing a language They do have some limitations – Limited class of language that cannot specify all programming constructs we need – No notion of structure Let’s explore both of these issues

Limitations of RegExs Cannot handle “matching” – Eg: language of balanced parentheses L = { ( x ) x where x > 1} cannot be matched – Intuition: An FSM can only handle a finite depth of parentheses that we can handle let’s see a diagram…

Limitations of RegExs: Balanced Parens Assume F is an FSM that recognized L. Let N be the number of states in F’. Feed N+1 left parens into N By the pigeonhole principle, we must have revisited some state s on two input characters i and j. By the definition of F, there must be a path from s to a final state. But this means that it accepts some suffix of closed parens at input i and j, but both cannot be correct ( ( ( ( …

Limitations of RegEx: Structure Our Enhanced-RegEx scanner can emit a stream of tokens: X = Y + Z … but this doesn’t really enforce any order of operations IDASSIGN IDPLUSID

The Chomsky Hierarchy Regular Context-Free Context-Sensitive Recursively enumerable power efficiency LANGUAGE CLASS: FSM Turing machine Happy medium? Noam Chomsky

Context Free Grammars (CFGs) A set of (recursive) rewriting rules to generate patterns of strings Can envision a “parse tree” that keeps structure

CFG: Intuition S → ( S ) A rule that says that you can rewrite S to be an S surrounded by a single set of parens S S S() CFGs recognize the language of trees where all the leaves are terminals After applying ruleBefore applying rule

Context Free Grammars (CFGs)

Placeholder / interior nodes in the parse tree

Context Free Grammars (CFGs) Tokens from scanner Placeholder / interior nodes in the parse tree

Context Free Grammars (CFGs) Tokens from scanner Placeholder / interior nodes in the parse tree Rules for deriving strings

Context Free Grammars (CFGs) Tokens from scanner Placeholder / interior nodes in the parse tree Rules for deriving strings If not otherwise specified, use the non-terminal that appears on the LHS of the first production as the start

Production Syntax Expression: Sequence of terminals and nonterminals LHS → RHS Single nonterminal symbol

Production Shorthand Nonterm → expression Nonterm→ ε equivalently: Nonterm → expression | ε equivalently: Nonterm → expression | ε Sequence of terms and nonterms

Derivations To derive a string: – Start by setting “Current Sequence” to the start symbol – Repeat: Find a Nonterminal X in the Current Sequence Find a production of the form X→α “Apply” the production: create a new “current sequence” in which α replaces X – Stop when there are no more nonterminals

Derivation Syntax

An Example Grammar

Terminals begin end semicolon assign id plus

An Example Grammar Terminals begin end semicolon assign id plus For readability, bold and lowercase

An Example Grammar Terminals begin end semicolon assign id plus Program boundary For readability, bold and lowercase

An Example Grammar Terminals begin end semicolon assign id plus Program boundary Represents “;” Separates statements For readability, bold and lowercase

An Example Grammar Terminals begin end semicolon assign id plus Program boundary Represents “;” Separates statements Represents “=“ statement For readability, bold and lowercase

An Example Grammar Terminals begin end semicolon assign id plus Program boundary Represents “;” Separates statements Represents “=“ statement Identifier / variable name For readability, bold and lowercase

An Example Grammar Terminals begin end semicolon assign id plus Program boundary Represents “;” Separates statements Represents “=“ statement Identifier / variable name Represents “+“ expression For readability, bold and lowercase

An Example Grammar Terminals begin end semicolon assign id plus Nonterminals Prog Stmts Stmt Expr For readability, bold and lowercase

An Example Grammar Terminals begin end semicolon assign id plus Nonterminals Prog Stmts Stmt Expr For readability, bold and lowercase For readability, Italics and UpperCamelCase

An Example Grammar Terminals begin end semicolon assign id plus Nonterminals Prog Stmts Stmt Expr Root of the parse tree For readability, bold and lowercase For readability, Italics and UpperCamelCase

An Example Grammar Terminals begin end semicolon assign id plus Nonterminals Prog Stmts Stmt Expr Root of the parse tree List of statements For readability, bold and lowercase For readability, Italics and UpperCamelCase

An Example Grammar Terminals begin end semicolon assign id plus Nonterminals Prog Stmts Stmt Expr Root of the parse tree List of statements A single statement For readability, bold and lowercase For readability, Italics and UpperCamelCase

An Example Grammar Terminals begin end semicolon assign id plus Nonterminals Prog Stmts Stmt Expr Root of the parse tree List of statements A single statement A mathematical expression For readability, bold and lowercase For readability, Italics and UpperCamelCase

Productions Prog → begin Stmts end Stmts → Stmts semicolon Stmt | Stmt Stmt → id assign Expr Expr→ id | Expr plus id An Example Grammar Terminals begin end semicolon assign id plus Nonterminals Prog Stmts Stmt Expr For readability, bold and lowercase For readability, Italics and UpperCamelCase Defines the syntax of legal programs

Productions Prog → begin Stmts end Stmts → Stmts semicolon Stmt | Stmt Stmt → id assign Expr Expr→ id | Expr plus id An Example Grammar Terminals begin end semicolon assign id plus Nonterminals Prog Stmts Stmt Expr Program boundary Represents “;” Separates statements Represents “=“ statement Identifier / variable name Represents “+“ expression Root of the parse tree List of statements A single statement A mathematical expression Defines the syntax of legal programs For readability, bold and lowercase For readability, Italics and UpperCamelCase

Productions 1.Prog → begin Stmts end 2.Stmts → Stmts semicolon Stmt 3. | Stmt 4.Stmt → id assign Expr 5.Expr → id 6. | Expr plus id

Derivation Sequence Productions 1.Prog → begin Stmts end 2.Stmts → Stmts semicolon Stmt 3. | Stmt 4.Stmt → id assign Expr 5.Expr → id 6. | Expr plus id

Derivation Sequence Productions 1.Prog → begin Stmts end 2.Stmts → Stmts semicolon Stmt 3. | Stmt 4.Stmt → id assign Expr 5.Expr → id 6. | Expr plus id Parse Tree

Derivation Sequence Productions 1.Prog → begin Stmts end 2.Stmts → Stmts semicolon Stmt 3. | Stmt 4.Stmt → id assign Expr 5.Expr → id 6. | Expr plus id Parse Tree Key terminal Nonterminal Rule used

Derivation Sequence Prog Productions 1.Prog → begin Stmts end 2.Stmts → Stmts semicolon Stmt 3. | Stmt 4.Stmt → id assign Expr 5.Expr → id 6. | Expr plus id Prog Parse Tree Key terminal Nonterminal Rule used

Productions 1.Prog → begin Stmts end 2.Stmts → Stmts semicolon Stmt 3. | Stmt 4.Stmt → id assign Expr 5.Expr → id 6. | Expr plus id Prog Parse Tree 1 Key terminal Nonterminal Rule used

Productions 1.Prog → begin Stmts end 2.Stmts → Stmts semicolon Stmt 3. | Stmt 4.Stmt → id assign Expr 5.Expr → id 6. | Expr plus id endbeginStmts Prog Parse Tree 1 Key terminal Nonterminal Rule used

Productions 1.Prog → begin Stmts end 2.Stmts → Stmts semicolon Stmt 3. | Stmt 4.Stmt → id assign Expr 5.Expr → id 6. | Expr plus id StmtStmts endbeginStmts Prog semicolon Parse Tree 1 2 Key terminal Nonterminal Rule used

Productions 1.Prog → begin Stmts end 2.Stmts → Stmts semicolon Stmt 3. | Stmt 4.Stmt → id assign Expr 5.Expr → id 6. | Expr plus id Stmt Stmts endbeginStmts Prog semicolon Parse Tree Key terminal Nonterminal Rule used

Productions 1.Prog → begin Stmts end 2.Stmts → Stmts semicolon Stmt 3. | Stmt 4.Stmt → id assign Expr 5.Expr → id 6. | Expr plus id idExprassign Stmt Stmts endbeginStmts Prog semicolon Parse Tree Key terminal Nonterminal Rule used

Productions 1.Prog → begin Stmts end 2.Stmts → Stmts semicolon Stmt 3. | Stmt 4.Stmt → id assign Expr 5.Expr → id 6. | Expr plus id idExprassign idExprassign Stmt Stmts endbeginStmts Prog semicolon Parse Tree Key terminal Nonterminal Rule used

Productions 1.Prog → begin Stmts end 2.Stmts → Stmts semicolon Stmt 3. | Stmt 4.Stmt → id assign Expr 5.Expr → id 6. | Expr plus id idExprassign idExprassign id Stmt Stmts endbeginStmts Prog semicolon Parse Tree Key terminal Nonterminal Rule used

Productions 1.Prog → begin Stmts end 2.Stmts → Stmts semicolon Stmt 3. | Stmt 4.Stmt → id assign Expr 5.Expr → id 6. | Expr plus id idExprassign Exprplusid Exprassign id Stmt Stmts endbeginStmts Prog semicolon Parse Tree Key terminal Nonterminal Rule used

Productions 1.Prog → begin Stmts end 2.Stmts → Stmts semicolon Stmt 3. | Stmt 4.Stmt → id assign Expr 5.Expr → id 6. | Expr plus id idExprassign Exprplusid Exprassign id Stmt Stmts endbeginStmts Prog semicolon Parse Tree Key terminal Nonterminal Rule used id

Makefiles: Motivation Typing the series of commands to generate our code can be tedious – Multiple steps that depend on each other – Somewhat complicated commands – May not need to rebuild everything Makefiles solve these issues – Record a series of commands in a script-like DSL – Specify dependency rules and Make generates the results

Makefiles: Basic Structure : Example Example.class: Example.java IO.class javac Example.java IO.class: IO.java javac IO.java (tab)

Makefiles: Basic Structure : Example Example.class: Example.java IO.class javac Example.java IO.class: IO.java javac IO.java (tab)

Makefiles: Basic Structure : Example Example.class: Example.java IO.class javac Example.java IO.class: IO.java javac IO.java (tab) Example.class depends on example.java and IO.class

Makefiles: Basic Structure : Example Example.class: Example.java IO.class javac Example.java IO.class: IO.java javac IO.java (tab) Example.class depends on example.java and IO.class Example.class is generated by javac Example.java

Makefiles: Dependencies Example Example.class: Example.java IO.class javac Example.java IO.class: IO.java javac IO.java Example.class Example.java IO.class IO.java

Makefiles: Dependencies Example Example.class: Example.java IO.class javac Example.java IO.class: IO.java javac IO.java Example.class Example.java IO.class IO.java Internal Dependency graph

Makefiles: Dependencies Example Example.class: Example.java IO.class javac Example.java IO.class: IO.java javac IO.java Example.class Example.java IO.class IO.java Internal Dependency graph A file is rebuilt if one of it’s dependencies changes

Makefiles: Variables You can thread common configuration values through your makefile

Makefiles: Variables You can thread common configuration values through your makefile Example JC = /s/std/bin/javac JFLAGS = -g

Makefiles: Variables You can thread common configuration values through your makefile Example JC = /s/std/bin/javac JFLAGS = -g Build for debug

Makefiles: Variables You can thread common configuration values through your makefile Example JC = /s/std/bin/javac JFLAGS = -g Example.class: Example.java IO.class $(JC) $(JFLAGS) Example.java IO.class: IO.java $(JC) $(JFLAGS) IO.java Build for debug

Makefiles: Phony Targets You can run commands through make. – Write a target with no dependencies (called phony) – Will cause it to execute the command every time Example clean: rm –f *.class test: java –cp. Test.class

Makefiles: Phony Targets You can run commands through make. – Write a target with no dependencies (called phony) – Will cause it to execute the command every time Example clean: rm –f *.class test: java –cp. Test.class

Makefiles: Phony Targets You can run commands through make. – Write a target with no dependencies (called phony) – Will cause it to execute the command every time Example clean: rm –f *.class test: java –cp. Test.class

Recap We’ve defined context-free grammars – More powerful than regular grammars Submit P1 P2 will come out tonight Next time we’ll look at grammars in more detail