Compiler Chapter 5. Context-free Grammar Dept. of Computer Engineering, Hansung University, Sung-Dong Kim.


Outline
- Context-free grammar
  - Specifies the syntactic structure of programming languages
  - Has efficient, well-defined algorithms
- Features of context-free grammars
- Grammar conversion
- Push-down automata
(2011-1) Compiler

1. Introduction (1)
- Token structure → regular expression (regular grammar)
- Structure of programming languages → context-free grammar
  - Simple and easy to understand
  - A recognizer can be implemented automatically from the grammar
  - Easy to translate

1. Introduction (2)
- CFG form: N. Chomsky's type 2 grammar
    A → α, where A ∈ V_N and α ∈ V*
- Notational conventions
  - Terminal symbols
    - Lowercase letters (a, b, c, …) and digits (0, 1, …, 9)
    - Operator symbols (+, -), comma, semicolon, parentheses, …
    - Symbols enclosed in ' ' ('if', 'then')

1. Introduction (3)
  - Nonterminal symbols
    - Uppercase letters (A, B, C, …)
    - Start symbol: S
    - Symbols enclosed in < >
  - Unless stated otherwise, the left-hand nonterminal of the first production is the start symbol
  - Alternation: A → α_1 and A → α_2 are written together as A → α_1 | α_2

1. Introduction (4)
- Other symbols
  - X, Y, Z: a terminal or a nonterminal (X, Y, Z ∈ V)
  - u, v, z, ω: strings of terminals (ω ∈ V_T*)
  - α, β, γ: strings of grammar symbols (α, β, γ ∈ V*)

1. Introduction (5)
Example
    E → E OP E | (E) | -E | id
    OP → + | - | * | / | ↑
    V_N = {E, OP}
    V_T = {(, ), -, id, +, *, /, ↑}
For a production written with keywords such as 'if' and 'then':
    V_N: symbols enclosed in < >
    V_T: symbols enclosed in ' '

2.1 Derivation (1)
- Derivation: α_1 ⇒ α_2
  - The process of rewriting from the start symbol to a string
- Definition 5.1
  - Leftmost derivation: replace the leftmost nonterminal at each step; every intermediate string is a left-sentential form
  - Rightmost derivation: replace the rightmost nonterminal at each step; every intermediate string is a right-sentential form

2.1 Derivation (2)
Example 4
    E → E + E | E * E | (E) | a
- Leftmost derivation: E ⇒ (E) ⇒ (E+E) ⇒ (a+E) ⇒ (a+a)
- Rightmost derivation: E ⇒ (E) ⇒ (E+E) ⇒ (E+a) ⇒ (a+a)
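
The two derivations in Example 4 can be reproduced mechanically. The sketch below is only an illustration: the helper name `derive` is hypothetical, nonterminals are assumed to be single uppercase letters (the slides' convention), and the function does not check that each right-hand side is a production of the grammar.

```python
def derive(start, rhs_sequence, leftmost=True):
    """Return the sentential forms produced by rewriting one nonterminal per step.

    Assumption: nonterminals are single uppercase letters. Each step replaces
    the leftmost (or rightmost) nonterminal with the next right-hand side.
    """
    forms = [start]
    current = start
    for rhs in rhs_sequence:
        positions = [i for i, ch in enumerate(current) if ch.isupper()]
        if not positions:
            raise ValueError("no nonterminal left to rewrite")
        i = positions[0] if leftmost else positions[-1]
        current = current[:i] + rhs + current[i + 1:]
        forms.append(current)
    return forms


# Grammar of Example 4: E -> E+E | E*E | (E) | a
steps = ["(E)", "E+E", "a", "a"]
left = derive("E", steps, leftmost=True)
right = derive("E", steps, leftmost=False)
print(" => ".join(left))   # E => (E) => (E+E) => (a+E) => (a+a)
print(" => ".join(right))  # E => (E) => (E+E) => (E+a) => (a+a)
```

Both derivations apply the same productions and reach the same sentence; only the intermediate sentential forms differ.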

2.1 Derivation (3)
Definition 5.2
- Left parse: the sequence of productions applied in the leftmost derivation (top-down parsing)
- Right parse: the sequence of productions applied in the rightmost derivation, in reverse order (bottom-up parsing)

2.1 Derivation (4)
Example 5: (a+a)*a
    1. E → E + E
    2. E → E * E
    3. E → (E)
    4. E → a
- Left parse: 2 3 1 4 4 4
- Right parse: 4 4 1 3 4 2 (the rightmost derivation applies 2 4 3 1 4 4)
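
As a rough check of Example 5, the production numbers can be read off the derivation tree of (a+a)*a. This is a sketch under assumptions: the tree is written by hand as nested `(production_number, children)` tuples, the function names are hypothetical, and the right parse follows the common convention of reversing the rightmost derivation.

```python
# A node is (production_number, [subtrees for the nonterminals of its RHS, left to right]).
# Productions: 1. E -> E+E   2. E -> E*E   3. E -> (E)   4. E -> a

def left_parse(node):
    """Production numbers in leftmost-derivation order (preorder, left to right)."""
    number, children = node
    order = [number]
    for child in children:
        order += left_parse(child)
    return order


def rightmost_order(node):
    """Production numbers in rightmost-derivation order (preorder, right to left)."""
    number, children = node
    order = [number]
    for child in reversed(children):
        order += rightmost_order(child)
    return order


# Derivation tree of (a+a)*a: E*E at the root, (E) on the left, a on the right.
tree = (2, [(3, [(1, [(4, []), (4, [])])]), (4, [])])

print(left_parse(tree))                        # [2, 3, 1, 4, 4, 4]
print(rightmost_order(tree))                   # [2, 4, 3, 1, 4, 4]
print(list(reversed(rightmost_order(tree))))   # [4, 4, 1, 3, 4, 2]
```

The reversed list is the order in which a bottom-up parser would perform its reductions.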

2.2 Derivation Tree (1)
- Derivation tree
  - Represents the steps of a sentence derivation
  - Consists of a root, interior nodes, and terminal (leaf) nodes
  - Shows the hierarchical structure of the sentence

2.2 Derivation Tree (2)
Definition 5.3
Derivation tree for a context-free grammar G = (V_T, V_N, P, S):
    1. Root node: S
    2. Interior node: a nonterminal symbol
    3. Terminal node: a terminal symbol or ε
    4. If the production A → A_1 A_2 … A_k is applied at node A, the nodes A_1, A_2, …, A_k become the children of A

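2.2 Derivation Tree (3)
- A derivation tree is an ordered tree
  - A → X Y Z gives a node A with children X, Y, Z in left-to-right order
- Example 6: leftmost derivation for (a + a), shown as a growing tree (square brackets enclose the children of a node)
    1. E[ ( E ) ]
    2. E[ ( E[E + E] ) ]
    3. E[ ( E[E[a] + E] ) ]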

2.2 Derivation Tree (4)
- Two different derivation trees for a+a*a (square brackets enclose the children of a node)
    Tree 1: E[ E[a] + E[ E[a] * E[a] ] ]
    Tree 2: E[ E[ E[a] + E[a] ] * E[a] ]
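
One way to confirm that a+a*a really has two derivation trees is to count them. The function `count_trees` below is a hypothetical helper, not part of the slides. It assumes single-character symbols, no ε-productions, and no cyclic single productions, so that every recursive call works on a strictly smaller problem.

```python
from functools import lru_cache


def count_trees(grammar, start, text):
    """Count the derivation trees of `text` rooted at `start`.

    grammar maps each nonterminal to a list of right-hand-side strings.
    Assumptions: one character per symbol, no ε-productions, no unit cycles.
    """
    @lru_cache(maxsize=None)
    def symbol(x, i, j):
        if x not in grammar:  # terminal: must match exactly one character
            return 1 if j == i + 1 and text[i] == x else 0
        return sum(sequence(rhs, i, j) for rhs in grammar[x])

    @lru_cache(maxsize=None)
    def sequence(rhs, i, j):
        if len(rhs) == 1:
            return symbol(rhs[0], i, j)
        # The first symbol covers text[i:k]; every later symbol needs >= 1 character.
        return sum(symbol(rhs[0], i, k) * sequence(rhs[1:], k, j)
                   for k in range(i + 1, j - len(rhs) + 2))

    return symbol(start, 0, len(text))


ambiguous = {"E": ["E+E", "E*E", "(E)", "a"]}
print(count_trees(ambiguous, "E", "a+a*a"))    # 2
print(count_trees(ambiguous, "E", "(a+a)*a"))  # 1
```

A count above 1 for any sentence shows that the grammar is ambiguous; a count of 1 for one sentence says nothing about the others.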

2.3 Ambiguity (1)
Definition 5.4
    If a sentence generated by G has two or more derivation trees, the grammar G is ambiguous.
Example 7: if b then if b then a else a
    Tree 1: S[ if C[b] then S[ if C[b] then S[a] ] else S[a] ]
    Tree 2: S[ if C[b] then S[ if C[b] then S[a] else S[a] ] ]

2.3 Ambiguity (2)
- Deterministic parsing requires an unambiguous grammar
  - ambiguous ⇒ nondeterministic: holds (O)
  - nondeterministic ⇒ ambiguous: does not hold (X)
- Ambiguous → unambiguous
  - Introduce a new nonterminal
  - Apply precedence and associativity rules

2.3 Ambiguity (3)
Example
    E → E + E | E * E | (E) | a
- Operator precedence: + < *
- Left associativity
- Steps
  - The most basic operand is F (factor): F → (E) | a
  - Introduce T (term) for factors combined with *: T → T * F | F
  - An expression E is composed of terms combined with +

Resulting grammar:
    E → E + T | T
    T → T * F | F
    F → (E) | a
Derivation tree for a + a * a: E[ E[T[F[a]]] + T[ T[F[a]] * F[a] ] ]
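
The effect of the rewritten grammar can be seen with a small hand-written parser. This is a sketch, not the slides' method: the left-recursive rules E → E + T and T → T * F are implemented as loops, which is a common way to keep left associativity in a recursive-descent parser, and the tuple-based tree shape is an assumption made for the example.

```python
def parse(text):
    """Parse `text` with E -> E+T | T, T -> T*F | F, F -> (E) | a.

    Returns a nested tuple (operator, left, right), or "a" for an operand.
    """
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else None

    def eat(ch):
        nonlocal pos
        if peek() != ch:
            raise SyntaxError(f"expected {ch!r} at position {pos}")
        pos += 1

    def expr():           # E -> T { + T }, grouped to the left
        node = term()
        while peek() == "+":
            eat("+")
            node = ("+", node, term())
        return node

    def term():           # T -> F { * F }, grouped to the left
        node = factor()
        while peek() == "*":
            eat("*")
            node = ("*", node, factor())
        return node

    def factor():         # F -> (E) | a
        if peek() == "(":
            eat("(")
            node = expr()
            eat(")")
            return node
        eat("a")
        return "a"

    tree = expr()
    if pos != len(text):
        raise SyntaxError(f"unexpected input at position {pos}")
    return tree


print(parse("a+a*a"))  # ('+', 'a', ('*', 'a', 'a'))
print(parse("a+a+a"))  # ('+', ('+', 'a', 'a'), 'a')
```

The first result shows * binding tighter than +, and the second shows left association; each sentence yields exactly one tree.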

2.3 Ambiguity (4)
- Ambiguous productions
  - Production: A → AA
  - Sentential form: AAA
  - Two trees: A[ A[A A] A ] and A[ A A[A A] ]

3. Grammar Conversion (1)
- Grammar conversion
  - Done for efficient syntactic analysis
  - Techniques: substitution, expansion
- Definition 5.6
    If L(G_1) = L(G_2), the grammars G_1 and G_2 are equivalent.

3. Grammar Conversion (2)
- Substitution
  - Remove a specific production
  - Add the corresponding productions
    Given A → αBγ (B ∈ V_N; α, γ ∈ V*) and B → β_1 | … | β_n,
    replace A → αBγ with A → αβ_1γ | αβ_2γ | … | αβ_nγ

3. Grammar Conversion (3)
Example 10
    P = { S → aT | bT, T → S | Sb | c }
- Remove S → aT
- Add S → aS | aSb | ac
    P' = { S → aS | aSb | ac | bT, T → S | Sb | c }
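
Substitution is mechanical enough to sketch in a few lines. The function name `substitute` and the dictionary representation are assumptions made for illustration; symbols are single characters, and only the first occurrence of the target nonterminal is expanded.

```python
def substitute(productions, lhs, rhs, target):
    """Replace the production lhs -> rhs by expanding `target` inside rhs.

    productions maps each nonterminal to a list of right-hand-side strings.
    Only the first occurrence of `target` in rhs is expanded.
    """
    i = rhs.index(target)  # raises ValueError if target does not occur in rhs
    expanded = [rhs[:i] + alt + rhs[i + 1:] for alt in productions[target]]
    result = {a: list(alts) for a, alts in productions.items()}
    k = result[lhs].index(rhs)
    result[lhs][k:k + 1] = expanded  # splice the new alternatives in place
    return result


P = {"S": ["aT", "bT"], "T": ["S", "Sb", "c"]}
P_new = substitute(P, "S", "aT", "T")
print(P_new)  # {'S': ['aS', 'aSb', 'ac', 'bT'], 'T': ['S', 'Sb', 'c']}
```

The input grammar is copied rather than modified, so P and P' can be compared side by side.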

3. Grammar Conversion (4)
- Expansion
  - Split a production by introducing a new nonterminal symbol X
    A → αβ becomes A → αX, X → β
    or A → Xβ, X → α

3.1 Remove Useless Production (1)
- Useless production
  - A production that can never be applied when generating a sentence, so it is removed
  - Caused by a non-terminating nonterminal symbol
  - Caused by an inaccessible symbol

3.1 Remove Useless Production (2)
- Definition 5.7
    X is a useless symbol if there is no derivation S ⇒* uXv ⇒* ω, ω ∈ V_T*
- Definition 5.8
  - Terminating nonterminal: A such that A → α, α ⇒* ω, and ω ∈ V_T*
  - Accessible symbol: X such that S ⇒* α_1 X α_2, with α_1, α_2 ∈ V*

3.1 Remove Useless Production (3)
- Removal methods
  - Remove productions that contain a non-terminating nonterminal
  - Remove productions that contain an inaccessible symbol
- Algorithm for terminating nonterminals

    Algorithm terminating;
    begin
      V_N' := {A | A → ω ∈ P, ω ∈ V_T*};
      repeat
        V_N' := V_N' ∪ {A | A → α ∈ P, α ∈ (V_N' ∪ V_T)*}
      until no change
    end.

3.1 Remove Useless Production (4)
Example 11: P = {S → A, S → B, B → a}
- V_N' = {B}, then V_N' = {B, S}
- V_N - V_N' = {A}
- P' = {S → B, B → a}
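
The terminating algorithm is a fixed-point iteration, which the following sketch reproduces for Example 11. The function name and the representation (a list of `(lhs, rhs)` pairs with one character per symbol) are assumptions made for illustration.

```python
def terminating(productions, terminals):
    """Return the nonterminals that can derive a string of terminals.

    productions is a list of (lhs, rhs) pairs; rhs is a string of
    single-character symbols (the empty string stands for ε).
    """
    found = set()
    changed = True
    while changed:  # repeat ... until no change
        changed = False
        for lhs, rhs in productions:
            if lhs not in found and all(s in terminals or s in found for s in rhs):
                found.add(lhs)
                changed = True
    return found


P = [("S", "A"), ("S", "B"), ("B", "a")]
T = {"a"}
good = terminating(P, T)
# Keep only productions whose symbols are all terminals or terminating nonterminals.
P_new = [(l, r) for l, r in P
         if l in good and all(s in T or s in good for s in r)]
print(sorted(good))  # ['B', 'S']
print(P_new)         # [('S', 'B'), ('B', 'a')]
```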

3.1 Remove Useless Production (5)
- Algorithm for accessible symbols

    Algorithm accessible;
    begin
      V' := {S};
      repeat
        V' := V' ∪ {X | some A → αXβ ∈ P, A ∈ V'}
      until no change
    end.

- Example 12
    G: S → aB, A → aB, A → aC, B → C, C → b
    V' = {S}
    V' = {S, a, B}
    V' = {S, a, B, C}
    V' = {S, a, B, C, b}
    V - V' = {A}
    P' = {S → aB, B → C, C → b}
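
Accessibility is a reachability computation from the start symbol. The sketch below replays Example 12; the function name and the `(lhs, rhs)` pair representation with single-character symbols are assumptions made for illustration.

```python
def accessible(productions, start):
    """Return every symbol (terminal or nonterminal) reachable from `start`."""
    reached = {start}
    changed = True
    while changed:  # repeat ... until no change
        changed = False
        for lhs, rhs in productions:
            if lhs in reached:
                new_symbols = set(rhs) - reached
                if new_symbols:
                    reached |= new_symbols
                    changed = True
    return reached


G = [("S", "aB"), ("A", "aB"), ("A", "aC"), ("B", "C"), ("C", "b")]
reach = accessible(G, "S")
P_new = [(l, r) for l, r in G if l in reach]
print(sorted(reach))  # ['B', 'C', 'S', 'a', 'b']
print(P_new)          # [('S', 'aB'), ('B', 'C'), ('C', 'b')]
```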

3.1 Remove Useless Production (6)
- Steps of removing useless productions
    Context-free productions → [Terminating nonterminal] → [Accessible symbol] → Useful productions

3.1 Remove Useless Production (7)
Example 13: P = {S → aS, S → A, S → B, A → aA, B → a, C → aa}
- Get the terminating nonterminals
  - V_N' = {B, C}, then V_N' = {B, C, S}
  - Non-terminating nonterminals = {A}
  - P' = {S → aS, S → B, B → a, C → aa}
- Get the accessible symbols
  - V' = {S}, then V' = {S, a, B}
  - Inaccessible symbols = {C}
  - P'' = {S → aS, S → B, B → a}
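
The two passes can be chained into a single routine, in the order the slides give: terminating nonterminals first, accessible symbols second. The sketch below replays Example 13; all function names and the `(lhs, rhs)` representation are assumptions made for illustration.

```python
def terminating(productions, terminals):
    """Nonterminals that can derive a terminal string (fixed-point iteration)."""
    found, changed = set(), True
    while changed:
        changed = False
        for lhs, rhs in productions:
            if lhs not in found and all(s in terminals or s in found for s in rhs):
                found.add(lhs)
                changed = True
    return found


def accessible(productions, start):
    """Symbols reachable from the start symbol (fixed-point iteration)."""
    reached, changed = {start}, True
    while changed:
        changed = False
        for lhs, rhs in productions:
            if lhs in reached and not set(rhs) <= reached:
                reached |= set(rhs)
                changed = True
    return reached


def remove_useless(productions, terminals, start):
    """Step 1: drop non-terminating productions. Step 2: drop inaccessible ones."""
    good = terminating(productions, terminals)
    step1 = [(l, r) for l, r in productions
             if l in good and all(s in terminals or s in good for s in r)]
    reach = accessible(step1, start)
    return [(l, r) for l, r in step1 if l in reach]


P = [("S", "aS"), ("S", "A"), ("S", "B"), ("A", "aA"), ("B", "a"), ("C", "aa")]
useful = remove_useless(P, {"a"}, "S")
print(useful)  # [('S', 'aS'), ('S', 'B'), ('B', 'a')]
```

The order matters: running the accessibility pass first can leave symbols that only become inaccessible after the non-terminating productions are dropped.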

3.2 Remove ε-Production (1)
Definition 5.9
A grammar is ε-free if either
    (1) P has no ε-production, or
    (2) only S has an ε-production, and S does not appear on the right-hand side of any production

3.2 Remove ε-Production (2)
- Algorithm for converting to an ε-free grammar

    Algorithm ε-free;
    begin
      P' := P - {A → ε | A ∈ V_N};
      V_Nε := {A | A ⇒+ ε, A ∈ V_N};
      for A → α_0 B_1 α_1 B_2 … B_k α_k ∈ P', where α ≠ ε and B_i ∈ V_Nε do
        if (a B-production exists in P')
          add to P' every production obtainable from A → α_0 X_1 α_1 X_2 … X_k α_k
          by choosing X_i = ε or X_i = B_i in all combinations
        else
          add to P' the production obtained from A → α_0 B_1 α_1 B_2 … B_k α_k with X_i = ε
      end for
      if S ∈ V_Nε then P' := P' ∪ {S' → ε | S}
    end.

3.2 Remove ε-Production (3)
  - P': the set of productions without ε-productions
  - V_Nε: the set of nonterminals that can derive ε
- Nullable nonterminal A: A ⇒* ε

3.2 Remove ε-Production (4)
- Computing V_Nε
  - From the productions directly
  - From derivations

    Algorithm Compute_V_Nε;
    begin
      V_Nε := {A | A → ε ∈ P};
      repeat
        V_Nε := V_Nε ∪ {B | B → α ∈ P, α ∈ V_Nε*}
      until no change
    end.

3.2 Remove ε-Production (5)
Example 14
    S → aSbS | bSaS | ε
- P' = {S → aSbS | bSaS}, V_Nε = {S}
- S → aSbS gives S → aSbS | abS | aSb | ab
- S → bSaS gives S → bSaS | baS | bSa | ba
- P' = {S → aSbS | abS | aSb | ab | bSaS | baS | bSa | ba}
- Result: S' → S | ε, S → aSbS | abS | aSb | ab | bSaS | baS | bSa | ba
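
Example 14 can be replayed with a short sketch. It is a simplification, not the slides' exact algorithm: it skips the check for whether a nullable nonterminal still has a production in P', so a nonterminal whose only production is ε would be left dangling. Function names are hypothetical, symbols are single characters, and ε is written as the empty string.

```python
from itertools import product


def nullable(productions):
    """Compute V_Nε: nonterminals that can derive ε (ε is the empty string)."""
    found, changed = set(), True
    while changed:
        changed = False
        for lhs, rhs in productions:
            if lhs not in found and all(s in found for s in rhs):
                found.add(lhs)
                changed = True
    return found


def epsilon_free(productions, start):
    """Return (new productions, new start symbol) without ε-productions."""
    nul = nullable(productions)
    result = []
    for lhs, rhs in productions:
        # A nullable symbol may be kept or dropped; any other symbol is kept.
        choices = [(s, "") if s in nul else (s,) for s in rhs]
        for combo in product(*choices):
            new_rhs = "".join(combo)
            if new_rhs and (lhs, new_rhs) not in result:
                result.append((lhs, new_rhs))
    if start in nul:  # add S' -> S | ε
        new_start = start + "'"
        result = [(new_start, start), (new_start, "")] + result
        start = new_start
    return result, start


P = [("S", "aSbS"), ("S", "bSaS"), ("S", "")]
P_new, S_new = epsilon_free(P, "S")
for lhs, rhs in P_new:
    print(lhs, "->", rhs or "ε")
```

The printed productions match the result of Example 14 up to ordering: S' → S | ε, followed by the eight alternatives for S.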

3.3 Remove Single Production (1)
- Single production
  - Has exactly one nonterminal on the right-hand side: A → B
  - Causes unnecessary derivation steps and slows parsing, so it is removed

3.3 Remove Single Production (2)
- Algorithm for removing single productions

    Algorithm Remove_Single_Production
    begin
      P' := P - {A → B | A, B ∈ V_N};
      for each A ∈ V_N do
        V_NA := {B | A ⇒+ B};
        for each B ∈ V_NA do
          for each B → α ∈ P' do   (* not a single production *)
            P' := P' ∪ {A → α}
      end for
    end.

3.3 Remove Single Production (3)
- Algorithm for computing V_NA

    Algorithm Compute_V_NA
    begin
      V_NA := {B | A → B ∈ P};
      repeat
        V_NA := V_NA ∪ {C | B → C ∈ P, B ∈ V_NA}
      until no change
    end.

3.3 Remove Single Production (4)
Example 15
    E → E+T | T
    T → T*F | F
    F → (E) | a
- P' = {E → E+T, T → T*F, F → (E), F → a}
- V_NE = {T, F}, so P' = {E → E+T | T*F | (E) | a, T → T*F, F → (E), F → a}
- V_NT = {F}, so P' = {E → E+T | T*F | (E) | a, T → T*F | (E) | a, F → (E), F → a}
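
The removal of single productions can be sketched as follows and checked against Example 15. The function name and the `(lhs, rhs)` representation with single-character symbols are assumptions made for illustration; V_NA is computed here with a simple worklist rather than the repeat-until loop.

```python
def remove_single(productions, nonterminals):
    """Eliminate single productions A -> B (B a nonterminal).

    productions is a list of (lhs, rhs) pairs with one character per symbol.
    """
    def is_single(rhs):
        return len(rhs) == 1 and rhs in nonterminals

    result = [(l, r) for l, r in productions if not is_single(r)]
    for a in nonterminals:
        # V_NA: nonterminals reachable from a through single productions only.
        reach, worklist = set(), [a]
        while worklist:
            x = worklist.pop()
            for l, r in productions:
                if l == x and is_single(r) and r not in reach:
                    reach.add(r)
                    worklist.append(r)
        # Copy every non-single production of each reachable B up to a.
        for b in reach:
            for l, r in productions:
                if l == b and not is_single(r) and (a, r) not in result:
                    result.append((a, r))
    return result


P = [("E", "E+T"), ("E", "T"), ("T", "T*F"), ("T", "F"), ("F", "(E)"), ("F", "a")]
P_new = remove_single(P, ["E", "T", "F"])
for lhs in ["E", "T", "F"]:
    print(lhs, "->", " | ".join(sorted(r for l, r in P_new if l == lhs)))
```

The alternatives are sorted before printing because the order in which copied productions are added is not fixed.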

3.3 Remove Single Production (5)
Definition 5.10
- Cycle-free: for all A ∈ V_N, there is no derivation A ⇒+ A
- Proper grammar
    (1) cycle-free
    (2) ε-free
    (3) no useless symbols

3.4 Normal Form (1)
Definition 5.11
Normal form grammar (CNF: Chomsky Normal Form): every production has one of the forms
    (1) A → BC (A, B, C ∈ V_N)
    (2) A → a (a ∈ V_T)
    (3) If ε ∈ L(G), then S → ε, and S must not appear on the right-hand side

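3.4 Normal Form (2)
- Context-free grammar → CNF
  - Start from an ε-free grammar
  - For a production A → α with |α| > 2, rewrite it so that each right-hand side has 2 symbols:
      A → X_1' <X_2…X_k>
      <X_2…X_k> → X_2' <X_3…X_k>
      …
      <X_{k-1}X_k> → X_{k-1}' X_k'
    where X_i' is X_i itself if X_i is a nonterminal, and a new nonterminal with X_i' → X_i if X_i is a terminal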

3.4 Normal Form (3)
Example 16
    S → aAB | BA
    A → BBB | a
    B → AS | b
- S → a'<AB>, a' → a, <AB> → AB, S → BA
- A → B<BB>, <BB> → BB, A → a
- B → AS | b
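
The conversion in Example 16 can be sketched as below. The naming scheme (`a'` for a lifted terminal, `<AB>` for a new tail nonterminal) follows the example, while the function name and the tuple representation are assumptions. The sketch expects a grammar that is already ε-free and has no single productions.

```python
def to_cnf(productions, nonterminals):
    """Convert productions to Chomsky Normal Form.

    productions is a list of (lhs, rhs) pairs, rhs being a tuple of symbols.
    Assumes the grammar is ε-free and has no single productions.
    """
    result = []

    def add(lhs, rhs):
        if (lhs, rhs) not in result:
            result.append((lhs, rhs))

    def lift(symbol):
        # A terminal inside a long right-hand side gets its own nonterminal.
        if symbol in nonterminals:
            return symbol
        add(symbol + "'", (symbol,))
        return symbol + "'"

    for lhs, rhs in productions:
        if len(rhs) == 1:          # A -> a is already in normal form
            add(lhs, rhs)
            continue
        while len(rhs) > 2:        # peel off one symbol at a time
            tail = "<" + "".join(rhs[1:]) + ">"
            add(lhs, (lift(rhs[0]), tail))
            lhs, rhs = tail, rhs[1:]
        add(lhs, (lift(rhs[0]), lift(rhs[1])))
    return result


P = [("S", ("a", "A", "B")), ("S", ("B", "A")),
     ("A", ("B", "B", "B")), ("A", ("a",)),
     ("B", ("A", "S")), ("B", ("b",))]
cnf = to_cnf(P, {"S", "A", "B"})
for lhs, rhs in cnf:
    print(lhs, "->", " ".join(rhs))
```

Every production in the output has either two nonterminals or a single terminal on the right-hand side.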

4. CFG Notation (1)
- BNF (Backus-Naur Form)
  - Nonterminal symbol: enclosed in < >
  - Terminal symbol: enclosed in ' '
  - → is written ::=
- Example 17
    E → E+T | T
    T → T*F | F
    F → (E) | a
  in BNF:
    <E> ::= <E> '+' <T> | <T>
    <T> ::= <T> '*' <F> | <F>
    <F> ::= '(' <E> ')' | 'a'

4. CFG Notation (2)
- EBNF (extended BNF)
  - Easy to read and concise
  - Meta symbols represent repetitive and optional parts simply
    { } : repetition
    [ ] : optional part

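4. CFG Notation (3)
  - Repetition: <A> ::= <A>, <B> | <B> becomes <A> ::= <B> {, <B>}
  - A maximum and minimum number of repetitions can be specified
  - Optional part: <S> ::= if <C> then <S> [else <S>]
  - BNF: <A> ::= <B> | <B> '[' <C> ']'
  - EBNF: <A> ::= <B> [ '[' <C> ']' ]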

4. CFG Notation (4)
- Parentheses and alternation give a simpler representation
    <E> ::= <E> + <E> | <E> - <E> | <E> * <E> | <E> / <E>
  becomes
    <E> ::= <E> (+|-|*|/) <E>

4. CFG Notation (5)
- Syntax diagram
  - Shows the grammar as a figure, which makes the syntactic structure easy to understand
  - Notation
    - Nonterminal: rectangle
    - Terminal: circle or ellipse
    - Arc: link between symbols
  [Figure: a rectangle labeled A and a circle labeled a]

4. CFG Notation (6)
  - Example
    - A ::= X_1 X_2 … X_n
  [Figure: X_1, X_2, …, X_n linked in sequence, drawn once with rectangles (nonterminals) and once with circles (terminals)]

4. CFG Notation (7)
  - A ::= α_1 | α_2 | ... | α_n
    [Figure: parallel branches, one per alternative α_1, α_2, ..., α_n]
  - A ::= {α}
    [Figure: loop through α]
  - A ::= [α]
    [Figure: path through α with a bypass]

4. CFG Notation (8)
  - A ::= (α_1 | α_2) β
    [Figure: branches α_1 and α_2 joining before β]
- Example 22
    A ::= a | (B)
    B ::= AC
    C ::= {+A}
  [Figure: one syntax diagram each for A, B, and C]

4. CFG Notation (9)
  - Synthesis: substituting B and C into A gives a single diagram
    [Figure: syntax diagram for A ::= a | ( A {+A} )]

4. CFG Notation (10)
Example 24: integer variable declaration in C
- Format: keyword int, then a comma-separated variable list, then a semicolon
- BNF
    <int_dcl> ::= int <var_list> ;
    <var_list> ::= <var_list>, <id> | <id>

4. CFG Notation (11)
- EBNF
    <int_dcl> ::= int <id> {, <id>} ;
- Syntax diagram
  [Figure: diagram for int_dcl: int, then id, with a loop back through ',' for further ids, then ';']
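
The EBNF rule maps almost directly onto code: the braces become a loop. The recognizer below is a sketch under assumptions not stated in the slides: identifiers follow the usual C pattern (a letter or underscore followed by letters, digits, or underscores), `int` is not accepted as an identifier, and the function names are hypothetical.

```python
import re

TOKEN = re.compile(r"\s*([A-Za-z_]\w*|[,;])")
IDENT = re.compile(r"[A-Za-z_]\w*")


def tokenize(text):
    """Split text into identifiers/keywords, commas, and semicolons."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"bad character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def is_int_dcl(text):
    """Recognize <int_dcl> ::= int <id> {, <id>} ;"""
    try:
        toks = tokenize(text)
    except SyntaxError:
        return False

    def is_id(tok):
        return IDENT.fullmatch(tok) is not None and tok != "int"

    if len(toks) < 3 or toks[0] != "int" or not is_id(toks[1]):
        return False
    i = 2
    # { , <id> } : zero or more repetitions
    while i + 1 < len(toks) and toks[i] == "," and is_id(toks[i + 1]):
        i += 2
    return i == len(toks) - 1 and toks[i] == ";"


print(is_int_dcl("int a, b, c;"))  # True
print(is_int_dcl("int a,;"))       # False
```

The same loop structure is what the syntax diagram expresses with its backward arc through the comma.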