Simplifications of Context-Free Grammars

Slides:



Advertisements
Similar presentations
Simplifications of Context-Free Grammars
Advertisements

Fall 2006Costas Buch - RPI1 Simplifications of Context-Free Grammars.
CPSC Compiler Tutorial 4 Midterm Review. Deterministic Finite Automata (DFA) Q: finite set of states Σ: finite set of “letters” (input alphabet)
C O N T E X T - F R E E LANGUAGES ( use a grammar to describe a language) 1.
Closure Properties of CFL's
CYK Parser Von Carla und Cornelia Kempa. Overview Top-downBottom-up Non-directional methods Unger ParserCYK Parser.
Exercise 1: Balanced Parentheses Show that the following balanced parentheses grammar is ambiguous (by finding two parse trees for some input sequence)
Simplifying CFGs There are several ways in which context-free grammars can be simplified. One natural way is to eliminate useless symbols those that cannot.
Fall 2004COMP 3351 Simplifications of Context-Free Grammars.
Prof. Busch - LSU1 Simplifications of Context-Free Grammars.
Courtesy Costas Busch - RPI1 Positive Properties of Context-Free languages.
Courtesy Costas Buch - RPI1 Simplifications of Context-Free Grammars.
Courtesy Costas Busch - RPI1 NPDAs Accept Context-Free Languages.
Costas Buch - RPI1 Simplifications of Context-Free Grammars.
1 Normal Forms for Context-free Grammars. 2 Chomsky Normal Form All productions have form: variable and terminal.
Costas Busch - RPI1 Context-Free Languages. Costas Busch - RPI2 Regular Languages.
1 Simplifications of Context-Free Grammars. 2 A Substitution Rule substitute B equivalent grammar.
1 Normal Forms for Context-free Grammars. 2 Chomsky Normal Form All productions have form: variable and terminal.
1 Simplifications of Context-Free Grammars. 2 A Substitution Rule Substitute Equivalent grammar.
1 Compilers. 2 Compiler Program v = 5; if (v>5) x = 12 + v; while (x !=3) { x = x - 3; v = 10; } Add v,v,0 cmp v,5 jmplt ELSE THEN: add x, 12,v.
Cs466(Prasad)L8Norm1 Normal Forms Chomsky Normal Form Griebach Normal Form.
Fall 2004COMP 3351 Context-Free Languages. Fall 2004COMP 3352 Regular Languages.
1 Applications of Regular Closure. 2 The intersection of a context-free language and a regular language is a context-free language context free regular.
CS 3813 Introduction to Formal Languages and Automata Chapter 6 Simplification of Context-free Grammars and Normal Forms These class notes are based on.
1 CD5560 FABER Formal Languages, Automata and Models of Computation Lecture 7 Mälardalen University 2010.
1 CD5560 FABER Formal Languages, Automata and Models of Computation Lecture 5 Mälardalen University 2005.
Context-Free Grammars Normal Forms Chapter 11. Normal Forms A normal form F for a set C of data objects is a form, i.e., a set of syntactically valid.
Normal Forms for Context-Free Grammars Definition: A symbol X in V  T is useless in a CFG G=(V, T, P, S) if there does not exist a derivation of the form.
1 CDT314 FABER Formal Languages, Automata and Models of Computation Lecture 5 Mälardalen University 2010.
Context Free Grammar. Introduction Why do we want to learn about Context Free Grammars?  Used in many parsers in compilers  Yet another compiler-compiler,
CYK Algorithm for Parsing General Context-Free Grammars
1 Lex. 2 Lex is a lexical analyzer Var = ; if (test > 20) temp = 0; else while (a < 20) temp++; Lex Ident: Var Integer: 12 Oper: + Integer: 9 Semicolumn:
Chapter 6 Simplification of Context-free Grammars and Normal Forms These class notes are based on material from our textbook, An Introduction to Formal.
1 Simplification of Context-Free Grammars Some useful substitution rules. Removing useless productions. Removing -productions. Removing unit-productions.
Closure Properties Lemma: Let A 1 and A 2 be two CF languages, then the union A 1  A 2 is context free as well. Proof: Assume that the two grammars are.
11 CDT314 FABER Formal Languages, Automata and Models of Computation Lecture 7 School of Innovation, Design and Engineering Mälardalen University 2012.
Context-Free Languages. Regular Languages Context-Free Languages.
Costas Busch - LSU1 Parsing. Costas Busch - LSU2 Compiler Program File v = 5; if (v>5) x = 12 + v; while (x !=3) { x = x - 3; v = 10; } Add v,v,5.
Grammar Set of variables Set of terminal symbols Start variable Set of Production rules.
About Grammars Hopcroft, Motawi, Ullman, Chap 7.1, 6.3, 5.4.
lec02-parserCFG May 8, 2018 Syntax Analyzer
Closed book, closed notes
Normal Forms for CFG’s Eliminating Useless Variables Removing Epsilon
Context-Free Grammars: an overview
G. Pullaiah College of Engineering and Technology
CS510 Compiler Lecture 4.
Chomsky Normal Form CYK Algorithm
PDAs Accept Context-Free Languages
Parsing — Part II (Top-down parsing, left-recursion removal)
7. Properties of Context-Free Languages
Simplifications of Context-Free Grammars
PDAs Accept Context-Free Languages
NPDAs Accept Context-Free Languages
Parsing Techniques.
7. Properties of Context-Free Languages
Chapter 6 Simplification of Context-free Grammars and Normal Forms
CHAPTER 2 Context-Free Languages
R.Rajkumar Asst.Professor CSE
Deterministic PDAs - DPDAs
Parsing Costas Busch - LSU.
Closure Properties of Context-Free languages
Decidable Problems of Regular Languages
Teori Bahasa dan Automata Lecture 9: Contex-Free Grammars
BNF 9-Apr-19.
… NPDAs continued.
Applications of Regular Closure
lec02-parserCFG May 27, 2019 Syntax Analyzer
Normal Forms for Context-free Grammars
COP 4620 / 5625 Programming Language Translation / Compiler Writing Fall 2003 Lecture 2, 09/04/2003 Prof. Roy Levow.
Context-Free Languages
Presentation transcript:

Simplifications of Context-Free Grammars

Assumption The language discussed here does not contain the empty string. Let L be any context-free language, let G = (V, T, S, P) be a context-free grammar for L – { }. Then the grammar obtain by adding to V the new variable S0, Making the start variable, and adding to P the productions: generate L.

A Substitution Rule Equivalent grammar Substitute

A Substitution Rule Substitute Equivalent grammar

In general: Substitute equivalent grammar

Nullable Variables Nullable Variable:

Removing Nullable Variables Example Grammar: Nullable variable

Why? Language generated by the grammar? (ab*)(ab*)* How many derivation steps for the string aaaa?

Language generated by the grammar? (ab*)(ab*)* How many derivation steps for the string aaaa?

Final Grammar Substitute

Algorithm to construct the set of nullable variables Input: context-free grammar G = (V, T, S, P) 1. 2. Repeat 2.1. PREV := NULL 2.2 for each variable do if there is an A rule and , then until NULL = PREV

Example Iteration NULL PREV {B, C} 1 {A, B, C} 2

Once the set of nullable variables has been found, a new set of production rules P’ is constructed. Construction: For each production in the grammar G - the production is put into P’ - as well as all those generated by replacing nullable variables with in all possible combinations - if all are nullable, the production is not added

Example Nullable variables: A, B, C New Productions:

Unit-Productions Unit Production: (a single variable in both sides)

Removing Unit Productions Observation: Is removed immediately

How about:

Example Grammar:

Substitute

Remove

Substitute

Remove repeated productions Final grammar

Algorithm to remove unit productions Draw a dependency graph with an edge (A, B) whenever the grammar has a unit production The new grammar G’ is generated by first putting into P’ all non-unit productions of P. Then for all A and B satisfying , we add to P’ Where is the set of all rules in P’ with B on the left

Another Example Add original non-unit productions: Use A->a|bc to substitute: Use B->bb to substitute:

Useless Productions Useless Production Some derivations never terminate...

Another grammar: Useless Production Not reachable from S

then variable is useful In general: contains only terminals if then variable is useful otherwise, variable is useless

A production is useless if any of its variables is useless Productions useless Variables useless

Removing Useless Productions Example Grammar:

First: find all variables that can produce strings with only terminals Round 1: Round 2:

Keep only the variables that produce terminal symbols: (the rest variables are useless) Remove useless productions

Second: Find all variables reachable from Use a Dependency Graph not

Keep only the variables reachable from S (the rest variables are useless) Final Grammar Remove useless productions

Removing All Step 1: Remove Nullable Variables Step 2: Remove Unit-Productions Step 3: Remove Useless Variables

Normal Forms for Context-free Grammars

Chomsky Normal Form Each productions has form: or variable variable terminal

Examples: Chomsky Normal Form Not Chomsky Normal Form

Conversion to Chomsky Normal Form Example: Not Chomsky Normal Form

Introduce variables for terminals:

Introduce intermediate variable:

Introduce intermediate variable:

Final grammar in Chomsky Normal Form: Initial grammar

In general: From any context-free grammar (which doesn’t produce ) not in Chomsky Normal Form we can obtain: An equivalent grammar in Chomsky Normal Form

The Procedure First remove: Nullable variables Unit productions Useless productions

Then, for every symbol : Add production In productions: replace with New variable:

Replace any production with New intermediate variables:

Theorem: For any context-free grammar (which doesn’t produce ) there is an equivalent grammar in Chomsky Normal Form

Observations Chomsky normal forms are good for parsing and proving theorems It is very easy to find the Chomsky normal form for any context-free grammar

Greinbach Normal Form All productions have form: symbol variables

Examples: Greinbach Normal Form Not Greinbach Normal Form

Conversion to Greinbach Normal Form:

Theorem: For any context-free grammar (which doesn’t produce ) there is an equivalent grammar in Greinbach Normal Form

Observations Greinbach normal forms are very good for parsing It is hard to find the Greinbach normal form of any context-free grammar

Compilers

Add v,v,0 cmp v,5 jmplt ELSE THEN: add x, 12,v ELSE: WHILE: cmp x,3 ... Machine Code Program v = 5; if (v>5) x = 12 + v; while (x !=3) { x = x - 3; v = 10; } ...... Compiler

Compiler Lexical analyzer parser input output machine code program

A parser knows the grammar of the programming language

Parser PROGRAM STMT_LIST STMT_LIST STMT; STMT_LIST | STMT; STMT EXPR | IF_STMT | WHILE_STMT | { STMT_LIST } EXPR EXPR + EXPR | EXPR - EXPR | ID IF_STMT if (EXPR) then STMT | if (EXPR) then STMT else STMT WHILE_STMT while (EXPR) do STMT

The parser finds the derivation of a particular input derivation Parser input E => E + E => E + E * E => 10 + E*E => 10 + 2 * E => 10 + 2 * 5 E -> E + E | E * E | INT 10 + 2 * 5

derivation tree derivation E E => E + E => E + E * E => 10 + E*E => 10 + 2 * E => 10 + 2 * 5 E + E 10 E E * 2 5

derivation tree E machine code E + E mult a, 2, 5 add b, 10, a 10 E E * 2 5

Parsing

Parser input string derivation grammar

Example: Parser derivation input ?

Exhaustive Search Phase 1: Find derivation of All possible derivations of length 1

Phase 2 Phase 1

Phase 2 Phase 3

Final result of exhaustive search (top-down parsing) Parser input derivation

Time complexity of exhaustive search Suppose there are no productions of the form Number of phases for string :

For grammar with rules Time for phase 1: possible derivations

Time for phase 2: possible derivations

Time for phase : possible derivations

Extremely bad!!! Total time needed for string : phase 1 phase 2|w|

There exist faster algorithms for specialized grammars S-grammar: symbol string of variables Pair appears once

S-grammar example: Each string has a unique derivation

For S-grammars: In the exhaustive search parsing there is only one choice in each phase Time for a phase: Total time for parsing string :

For general context-free grammars: There exists a parsing algorithm that parses a string in time (we will show it in the next class)

The CYK Parser

The CYK Membership Algorithm Input: Grammar in Chomsky Normal Form String Output: find if

The Algorithm Input example: Grammar : String :

Therefore: Time Complexity: Observation: The CYK algorithm can be easily converted to a parser (bottom up parser)