5. Context-Free Grammars and Languages

5. Context-Free Grammars and Languages CIS 5513 - Automata and Formal Languages – Pei Wang

Languages and grammars
Regular expression: constants and operators.
Grammar: variables and rewriting rules.
Difference: whether a pattern can be given a name.
Example: binary palindromes do not form a regular language, but can be specified as
P → ε | 0 | 1 | 0P0 | 1P1
where ‘P’ is a variable, ‘→’ is the production symbol, and ‘|’ separates alternatives.
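
As a rough illustration of how the palindrome grammar can be read as a recursive definition, the sketch below checks membership by trying the productions of P in reverse. The function name `derives_palindrome` is a hypothetical helper, not part of the slides.

```python
def derives_palindrome(s: str) -> bool:
    """Return True if s can be derived from P with P → ε | 0 | 1 | 0P0 | 1P1."""
    # Only the terminals 0 and 1 may appear in a derived string.
    if any(c not in "01" for c in s):
        return False
    # Base cases: P → ε, P → 0, P → 1.
    if len(s) <= 1:
        return True
    # Recursive cases: P → 0P0 or P → 1P1. The outer symbols must match
    # and the inner part must itself be derivable from P.
    return s[0] == s[-1] and derives_palindrome(s[1:-1])


if __name__ == "__main__":
    for word in ["", "0", "0110", "10101", "01"]:
        print(repr(word), derives_palindrome(word))
```

Each recursive call corresponds to undoing one application of P → 0P0 or P → 1P1, which is exactly the nesting that a regular expression cannot name.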

Context-free grammar
A Context-Free Grammar (CFG) G is defined as G = (V, T, P, S):
V: the set of variables (non-terminals, or syntactic categories, each denoting a language).
T: the set of terminal symbols (the alphabet).
P: the set of productions (rules), each with a variable (the head) and a string of variables and terminals (the body).
S: the start symbol (denoting the whole language).
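
One possible way to hold the four components G = (V, T, P, S) in a program is sketched below. The class `CFG`, its field names, and the method `is_well_formed` are assumptions made for illustration; the example instance is the palindrome grammar from the earlier slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    variables: frozenset   # V
    terminals: frozenset   # T
    productions: tuple     # P, as (head, body) pairs; body is a tuple of symbols
    start: str             # S

    def is_well_formed(self) -> bool:
        """Check the basic side conditions of the definition."""
        symbols = self.variables | self.terminals
        return (
            self.variables.isdisjoint(self.terminals)
            and self.start in self.variables
            and all(
                head in self.variables and all(sym in symbols for sym in body)
                for head, body in self.productions
            )
        )


# The palindrome grammar P → ε | 0 | 1 | 0P0 | 1P1; the empty tuple stands for ε.
palindromes = CFG(
    variables=frozenset({"P"}),
    terminals=frozenset({"0", "1"}),
    productions=(
        ("P", ()),
        ("P", ("0",)),
        ("P", ("1",)),
        ("P", ("0", "P", "0")),
        ("P", ("1", "P", "1")),
    ),
    start="P",
)

if __name__ == "__main__":
    print(palindromes.is_well_formed())
```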

Example of CFG
A simple arithmetic expression consists of identifiers connected by ‘+’ and ‘*’ operators:
E → I | E + E | E * E | (E)
I → a | b | Ia | Ib | I0 | I1
The rules can also be written individually, without ‘|’.
In E → E + E, the three E’s may represent different strings.
The effect of the star operator is achieved by recursion.
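
To make the last point concrete, the sketch below compares the recursive rules for I with the regular expression (a|b)(a|b|0|1)*, written here in Python syntax as `[ab][ab01]*`. The helper names are hypothetical, and the comparison is only a bounded check over short strings, not a proof.

```python
import re
from itertools import product


def derives_identifier(s: str) -> bool:
    """Return True if s is derivable from I with I → a | b | Ia | Ib | I0 | I1."""
    # Base rules: I → a, I → b.
    if s in ("a", "b"):
        return True
    # Recursive rules: I → Ia | Ib | I0 | I1 (strip the last symbol and recurse).
    return len(s) > 1 and s[-1] in "ab01" and derives_identifier(s[:-1])


# The same set of strings written with the star operator.
IDENTIFIER = re.compile(r"[ab][ab01]*")


def agree_up_to(max_len: int) -> bool:
    """Check that grammar and regular expression agree on all strings up to max_len."""
    for n in range(max_len + 1):
        for chars in product("ab01", repeat=n):
            s = "".join(chars)
            if derives_identifier(s) != bool(IDENTIFIER.fullmatch(s)):
                return False
    return True


if __name__ == "__main__":
    print(agree_up_to(5))
```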

Derivation using a CFG
A CFG defines a language that consists of the strings of terminals derived from the start symbol using the production rules.
Derivation: from the start symbol to the terminals (heads are replaced by bodies).
Recursive inference: from the terminals to the start symbol (bodies are reduced to heads).

Example of recursive inference

Example of derivation
Here ‘⇒’ means “derives in one step”. With a ‘*’ above it, it means “derives in any number of steps (including zero)”; with a ‘G’ below it, it means “derives by grammar G”.

Leftmost/rightmost derivation
A leftmost (rightmost) derivation restricts each step to replacing the leftmost (rightmost) variable of the current string.
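
A leftmost derivation can be replayed mechanically: at each step, find the leftmost variable and replace it with a chosen body. The sketch below does this for the expression grammar E → I | E + E | E * E | (E), I → a | b | Ia | Ib | I0 | I1, written without spaces so that every symbol is one character. The dictionary `GRAMMAR` and the function `leftmost_derivation` are illustrative names, not taken from the slides.

```python
# Each variable maps to its list of production bodies.
GRAMMAR = {
    "E": ["I", "E+E", "E*E", "(E)"],
    "I": ["a", "b", "Ia", "Ib", "I0", "I1"],
}


def leftmost_derivation(start: str, bodies: list) -> list:
    """Apply the given bodies, in order, to the leftmost variable each time.

    Returns the list of sentential forms, starting with the start symbol.
    """
    form = start
    steps = [form]
    for body in bodies:
        # Position of the leftmost variable in the current sentential form.
        pos = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
        head = form[pos]
        if body not in GRAMMAR[head]:
            raise ValueError(f"{head} → {body} is not a production")
        form = form[:pos] + body + form[pos + 1:]
        steps.append(form)
    return steps


if __name__ == "__main__":
    steps = leftmost_derivation(
        "E", ["E*E", "I", "a", "(E)", "E+E", "I", "a", "I", "b"]
    )
    print(" ⇒ ".join(steps))
```

The run above derives a*(a+b) in nine steps; a rightmost derivation of the same string would use the same productions in a different order.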

Context-free language
L(G), the set of terminal strings derivable from S, is called a context-free language (CFL) since G is a context-free grammar.
Any string of variables and terminals derived from S is a “sentential form”, which is a “left” (or “right”) sentential form if produced by a leftmost (or rightmost) derivation.

CFG and regular language
A CFG specifies a regular language if it is in one of the following two forms:
Right-linear: all of its rules have the form P → ε, P → a, or P → aQ.
Left-linear: all of its rules have the form P → ε, P → a, or P → Qa.
The former maps directly to an ε-NFA, while the latter, with each rule body reversed, gives a right-linear grammar for the reversed language.
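
The mapping from a right-linear grammar to an automaton can be sketched as follows: variables become states, P → aQ becomes a transition from P to Q on a, P → a becomes a transition to an extra final state, and P → ε makes P accepting. The function `accepts` and the sample grammar S → 0S | 1S | 1 (binary strings ending in 1) are assumptions for illustration; the sketch treats ε-rules as acceptance directly instead of building explicit ε-transitions.

```python
def accepts(rules: dict, start: str, s: str) -> bool:
    """Simulate the NFA induced by a right-linear grammar on input s.

    rules maps each variable to a list of bodies, each of the form
    "" (for ε), "a" (one terminal), or "aQ" (terminal followed by a variable).
    """
    FINAL = object()          # extra accepting state reached by rules P → a
    current = {start}
    for ch in s:
        nxt = set()
        for state in current:
            if state is FINAL:
                continue      # the final state has no outgoing transitions
            for body in rules[state]:
                if len(body) == 1 and body == ch:
                    nxt.add(FINAL)        # P → a
                elif len(body) == 2 and body[0] == ch:
                    nxt.add(body[1])      # P → aQ
        current = nxt
    # Accept in FINAL, or in any variable that has an ε-rule.
    return any(st is FINAL or "" in rules[st] for st in current)


# Hypothetical example grammar: S → 0S | 1S | 1.
ENDS_IN_ONE = {"S": ["0S", "1S", "1"]}

if __name__ == "__main__":
    for word in ["0101", "10", "", "1"]:
        print(repr(word), accepts(ENDS_IN_ONE, "S", word))
```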

Exercises for Section 5.1
5.1.1(a): define a CFG for { 0^n 1^n | n ≥ 1 }.
5.1.1(b): define a CFG for { a^i b^j c^k | i ≠ j or j ≠ k }.
Solutions: http://infolab.stanford.edu/~ullman/ialcsols/sol5.html#sol51
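
For part (a), one commonly used grammar is S → 01 | 0S1. The sketch below, with hypothetical helper names, derives the string for a given n and checks membership by undoing S → 0S1 from the outside in; it is offered as one possible answer, not necessarily the one in the linked solutions.

```python
def derive(n: int) -> str:
    """Derive 0^n 1^n by applying S → 0S1 (n - 1) times, then S → 01."""
    if n < 1:
        raise ValueError("n must be at least 1")
    form = "S"
    for _ in range(n - 1):
        form = form.replace("S", "0S1")
    return form.replace("S", "01")


def in_language(s: str) -> bool:
    """Return True if s is derivable from S with S → 01 | 0S1."""
    if s == "01":                      # S → 01
        return True
    # S → 0S1: strip the outer 0 and 1, then recurse on the middle.
    return len(s) >= 4 and s[0] == "0" and s[-1] == "1" and in_language(s[1:-1])


if __name__ == "__main__":
    print(derive(3))
    print(in_language("0011"), in_language("0101"))
```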

Parse trees
A derivation can be expressed as a parse tree.

Equivalent statements about CFG
The sequence of leaves of a parse tree, read from left to right, is the yield of the tree; for a complete tree rooted at the start symbol, the yield is the terminal string derived from the start symbol.
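
A small sketch of a parse tree and its yield, assuming a tree is represented as a (label, children) pair and a leaf as a plain string. The representation and the name `tree_yield` are choices made here for illustration.

```python
# Parse tree for a+b under E → E + E, E → I, I → a | b.
TREE = (
    "E",
    [
        ("E", [("I", ["a"])]),
        "+",
        ("E", [("I", ["b"])]),
    ],
)


def tree_yield(node) -> str:
    """Concatenate the leaves of the tree from left to right."""
    if isinstance(node, str):      # a leaf holds a terminal
        return node
    _label, children = node        # an interior node holds a variable
    return "".join(tree_yield(child) for child in children)


if __name__ == "__main__":
    print(tree_yield(TREE))
```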

Parsers
Parsing, or syntactic analysis, is the process of analyzing a string of symbols according to the rules of a formal grammar.
A parser is a program that generates parse trees from input strings according to a given grammar.
In UNIX, the YACC command takes a CFG as input and outputs C code for a parser that can generate a parse tree.

Ambiguity in CFG
A CFG is “ambiguous” if some string is the yield of two or more different parse trees.
For example, the grammar of arithmetic expressions allows E + E * E to be parsed in two ways, corresponding to the two orders of applying the operators.
The mere existence of different derivations does not imply ambiguity, since several derivations can correspond to the same parse tree.
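
Ambiguity can be observed by counting parse trees. The sketch below counts the parse trees that the grammar E → I | E + E | E * E | (E) assigns to a string written without spaces, assuming identifiers follow I → a | b | Ia | Ib | I0 | I1 (each identifier has exactly one tree). The function name `count_parse_trees` is hypothetical.

```python
import re
from functools import lru_cache

# Strings derivable from I; each has exactly one parse tree.
IDENTIFIER = re.compile(r"[ab][ab01]*")


@lru_cache(maxsize=None)
def count_parse_trees(s: str) -> int:
    """Count parse trees rooted at E whose yield is s."""
    total = 0
    # E → I
    if IDENTIFIER.fullmatch(s):
        total += 1
    # E → (E)
    if len(s) >= 2 and s[0] == "(" and s[-1] == ")":
        total += count_parse_trees(s[1:-1])
    # E → E+E and E → E*E: try every operator as the root of the tree.
    for k, ch in enumerate(s):
        if ch in "+*":
            total += count_parse_trees(s[:k]) * count_parse_trees(s[k + 1:])
    return total


if __name__ == "__main__":
    print(count_parse_trees("a+a*a"))    # two trees: the grammar is ambiguous
    print(count_parse_trees("(a+a)*a"))  # parentheses leave a single tree
```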

Removing ambiguity
There is no algorithm that can decide whether an arbitrary CFG is ambiguous, nor one that can remove all ambiguity.
Some ambiguity can be removed by revising the CFG, such as giving + and * separate precedence levels in expressions.
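
The revised grammar shown on the slide is not preserved in this transcript. A common way to separate the two operators, taken here as an assumption, is E → T | E + T, T → F | T * F, F → I | (E). The sketch below is a small hand-written parser for that layered grammar; it returns nested tuples so that the single resulting tree is visible. All names in it are illustrative.

```python
import re

# Tokens: identifiers per I → a | b | Ia | Ib | I0 | I1, plus operators and parentheses.
TOKEN = re.compile(r"[ab][ab01]*|[+*()]")


def parse(text: str):
    """Parse text with E → T | E+T, T → F | T*F, F → I | (E) (assumed grammar)."""
    tokens = TOKEN.findall(text)
    if "".join(tokens) != text:
        raise ValueError("input contains characters outside the grammar")
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():                      # F → I | (E)
        tok = peek()
        if tok == "(":
            take()
            node = expr()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            take()
            return node
        if tok is None or tok in "+*)":
            raise ValueError("identifier or '(' expected")
        return take()

    def term():                        # T → F | T*F, left-associative
        node = factor()
        while peek() == "*":
            take()
            node = ("*", node, factor())
        return node

    def expr():                        # E → T | E+T, left-associative
        node = term()
        while peek() == "+":
            take()
            node = ("+", node, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise ValueError("unexpected trailing input")
    return tree


if __name__ == "__main__":
    print(parse("a+b*a"))
    print(parse("(a+b)*a"))
```

Because every + is introduced above every * unless parentheses intervene, a string such as a+b*a gets only the tree in which * is applied first.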

Unique derivation
In an unambiguous grammar, the leftmost derivation of each string is unique, and so is the rightmost derivation.
Therefore, though a variable can have more than one production rule, only one of them can be applied at each step when deriving a given string.
For a given CFG, a string has two distinct parse trees if and only if it has two distinct leftmost derivations from the start symbol.

Inherent ambiguity
A CFL is “inherently ambiguous” if all its grammars are ambiguous.
Example: L = { a^n b^n c^m d^m } ∪ { a^n b^m c^m d^n }, where m and n are positive integers.
It is easy to get a CFG that generates the two types of strings separately, but it will give the string “aabbccdd” two leftmost derivations, as well as two parse trees.
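
The source of the trouble is that the two halves of the union overlap. The sketch below, with hypothetical helper names, tests membership in each half separately and shows that “aabbccdd” belongs to both, so a grammar built as "first half or second half" has two ways to produce it. It illustrates the overlap only; it does not prove inherent ambiguity.

```python
import re

# Every string in L has the shape a+ b+ c+ d+.
BLOCKS = re.compile(r"(a+)(b+)(c+)(d+)")


def block_lengths(s: str):
    """Return the lengths of the a, b, c and d blocks, or None if the shape is wrong."""
    m = BLOCKS.fullmatch(s)
    if m is None:
        return None
    return tuple(len(m.group(i)) for i in range(1, 5))


def in_first_half(s: str) -> bool:
    """Membership in { a^n b^n c^m d^m }."""
    lengths = block_lengths(s)
    return lengths is not None and lengths[0] == lengths[1] and lengths[2] == lengths[3]


def in_second_half(s: str) -> bool:
    """Membership in { a^n b^m c^m d^n }."""
    lengths = block_lengths(s)
    return lengths is not None and lengths[0] == lengths[3] and lengths[1] == lengths[2]


if __name__ == "__main__":
    for word in ["aabbccdd", "abccdd", "aabcdd"]:
        print(word, in_first_half(word), in_second_half(word))
```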

Inherent ambiguity: example

Exercises for Section 5.4
Exercise 5.4.3: Find an unambiguous grammar for the above language.
Solutions: http://infolab.stanford.edu/~ullman/ialcsols/sol5.html#sol54

Applications of CFG
Examples: mathematical language, logical language, markup language, programming language, natural language.