
1 Compiler Chapter 5. Context-free Grammar Dept. of Computer Engineering, Hansung University, Sung-Dong Kim

2 Outline
Context-free grammar
- Specifies the syntactic structure of programming languages
- Efficient and well-defined algorithms
Context-free grammar's features
Grammar conversion
Push-down automata
(2011-1) Compiler

3 1. Introduction (1)
Token structure → regular expression (regular grammar)
Structure of programming languages → context-free grammar
- Simple and easy to understand
- A recognizer can be implemented automatically from the grammar
- Easy to translate

4 1. Introduction (2)
CFG form: N. Chomsky's type 2 grammar
  A → α, where A ∈ V_N and α ∈ V*
Notational convention
- Terminal symbols
  - Lowercase letters (a, b, c, …) and digits (0, 1, …, 9)
  - Operator symbols (+, -), comma, semicolon, parentheses, …
  - Symbols enclosed in ' ' ('if', 'then')

5 1. Introduction (3)
- Nonterminal symbols
  - Uppercase letters (A, B, C, …)
  - Start symbol: S
  - Symbols enclosed in < >
- Unless stated otherwise, the left-hand-side nonterminal of the first production is the start symbol
- Alternation: A → α1, A → α2 is written A → α1 | α2

6 1. Introduction (4)
Other symbols
- X, Y, Z: a terminal or a nonterminal (X, Y, Z ∈ V)
- u, v, z, ω: strings of terminals (∈ V_T*)
- α, β, γ: strings of grammar symbols (α, β, γ ∈ V*)

7 1. Introduction (5)
Example
  E → E OP E | (E) | -E | id
  OP → + | - | * | / | ↑
  V_N = {E, OP}
  V_T = {(, ), -, id, +, *, /, ↑}
In a production that uses 'if' and 'then':
  V_N: symbols enclosed in < >
  V_T: symbols enclosed in ' '

8 2.1 Derivation (1)
Derivation: α1 ⇒ α2
- The process that leads from the start symbol to a string
Definition 5.1
- Leftmost derivation: always substitute the leftmost nonterminal → left-sentential form
- Rightmost derivation: always substitute the rightmost nonterminal → right-sentential form

9 2.1 Derivation (2)
Example 4: E → E + E | E * E | (E) | a
- Leftmost derivation: E ⇒ (E) ⇒ (E+E) ⇒ (a+E) ⇒ (a+a)
- Rightmost derivation: E ⇒ (E) ⇒ (E+E) ⇒ (E+a) ⇒ (a+a)

10 2.1 Derivation (3)
Definition 5.2
- Left parse: the sequence of production numbers applied in the leftmost derivation (top-down parsing)
- Right parse: the sequence of production numbers applied in the rightmost derivation, listed in reverse order (bottom-up parsing)

11 2.1 Derivation (4)
Example 5: (a+a)*a
  1. E → E + E
  2. E → E * E
  3. E → (E)
  4. E → a
- Left parse: 2 3 1 4 4 4
- Right parse: 4 4 1 3 4 2
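The two parses of Example 5 can be checked by replaying them. The sketch below is only an illustration: the names `PRODUCTIONS` and `replay` are made up here, and it relies on E being the only nonterminal, so a plain string search finds the nonterminal to expand. The right parse is replayed in reverse, because it lists the rightmost derivation backwards.

```python
# Productions numbered as in Example 5: 1. E -> E+E  2. E -> E*E  3. E -> (E)  4. E -> a
PRODUCTIONS = {1: "E+E", 2: "E*E", 3: "(E)", 4: "a"}


def replay(numbers, leftmost=True):
    """Apply the numbered productions in order and return every sentential form."""
    form = "E"
    forms = [form]
    for n in numbers:
        # E is the only nonterminal, so find/rfind locate the one to expand.
        pos = form.find("E") if leftmost else form.rfind("E")
        if pos < 0:
            raise ValueError("no nonterminal left to expand")
        form = form[:pos] + PRODUCTIONS[n] + form[pos + 1:]
        forms.append(form)
    return forms


left_parse = [2, 3, 1, 4, 4, 4]
right_parse = [4, 4, 1, 3, 4, 2]

left_forms = replay(left_parse, leftmost=True)
# The right parse is the rightmost derivation read backwards, so reverse it first.
right_forms = replay(list(reversed(right_parse)), leftmost=False)

print(" => ".join(left_forms))
print(" => ".join(right_forms))
```

Both runs end in (a+a)*a, which is a quick way to confirm that a parse really describes the sentence.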

12 2.2 Derivation Tree (1)
Derivation tree
- Represents the steps of a sentence derivation
- Root, interior, terminal, leaf
- Shows the hierarchical structure of the sentence

13 2.2 Derivation Tree (2)
Definition 5.3: Derivation tree for a context-free grammar G = (V_T, V_N, P, S)
1. Root node: S
2. Interior node: nonterminal symbol
3. Terminal node: terminal symbol or ε
4. If a production A → A1 A2 … Ak is applied, nodes A1, A2, …, Ak become the children of A

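14 2.2 Derivation Tree (3)
Derivation tree (ordered tree)
- For A → X Y Z, node A has the children X, Y, Z in left-to-right order
Example 6: leftmost derivation for (a + a)
[Figure: the tree grown step by step: E with children ( E ); the inner E expanded to E + E; the leftmost E expanded to a]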

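15 2.2 Derivation Tree (4)
Ambiguous tree: a+a*a
[Figure: two derivation trees for a+a*a. In one the root applies E → E + E and the right operand expands to E * E; in the other the root applies E → E * E and the left operand expands to E + E]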

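16 2.3 Ambiguity (1)
Definition 5.4: If a sentence generated by G has two or more derivation trees, grammar G is ambiguous.
Example 7: if b then if b then a else a
[Figure: two derivation trees. In one the root applies S → if C then S else S, so the else belongs to the outer if; in the other the root applies S → if C then S and the inner statement is if C then S else S, so the else belongs to the inner if]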

17 2.3 Ambiguity (2)
Deterministic parsing requires an unambiguous grammar
  ambiguous ⇒ nondeterministic (O)
  nondeterministic ⇒ ambiguous (X)
Ambiguous → unambiguous
- Introduce a new nonterminal
- Apply precedence and associativity rules

18 2.3 Ambiguity (3)
Example: E → E + E | E * E | (E) | a
- Operator precedence: + < *
- Left associativity
- Steps
  - The most basic operand F (factor): F → (E) | a
  - Introduce T (term) for factors combined with *: T → T * F | F
  - Expression E is composed of terms combined with +

19
  E → E + T | T
  T → T * F | F
  F → (E) | a
[Figure: the derivation tree for a + a * a: the root applies E → E + T, the left E derives a through T and F, and the right T applies T → T * F]
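One way to see the effect of the rewrite is to count derivation trees for a+a*a under both grammars. The sketch below is an illustration, not part of the lecture: `count_trees` is a hypothetical helper that counts trees span by span with memoization, and it assumes the grammar has no ε-productions and no cycles of single productions (true for both grammars here).

```python
from functools import lru_cache


def count_trees(grammar, start, sentence):
    """Count the derivation trees of `sentence` (a sequence of terminals).

    `grammar` maps each nonterminal to a list of right-hand sides (tuples).
    Assumes no epsilon-productions and no cycles of single productions.
    """
    tokens = tuple(sentence)

    @lru_cache(maxsize=None)
    def sym(x, i, j):
        # Number of trees in which symbol x derives tokens[i:j].
        if x not in grammar:  # terminal: must match exactly one token
            return 1 if j == i + 1 and tokens[i] == x else 0
        return sum(seq(rhs, i, j) for rhs in grammar[x])

    @lru_cache(maxsize=None)
    def seq(rhs, i, j):
        # Number of ways the symbol sequence rhs derives tokens[i:j].
        if len(rhs) == 1:
            return sym(rhs[0], i, j)
        total = 0
        # The first symbol takes tokens[i:k]; every other symbol needs >= 1 token.
        for k in range(i + 1, j - len(rhs) + 2):
            left = sym(rhs[0], i, k)
            if left:
                total += left * seq(rhs[1:], k, j)
        return total

    return sym(start, 0, len(tokens))


ambiguous = {"E": [("E", "+", "E"), ("E", "*", "E"), ("(", "E", ")"), ("a",)]}
unambiguous = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("a",)],
}

print(count_trees(ambiguous, "E", "a+a*a"))    # 2
print(count_trees(unambiguous, "E", "a+a*a"))  # 1
```

A count above 1 for any sentence shows that a grammar is ambiguous; a count of 1 for one sentence does not by itself prove the grammar unambiguous.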

20 2.3 Ambiguity (4)
Ambiguous productions
- Production: A → AA
- Sentential form: AAA
- 2 trees
[Figure: the two derivation trees for AAA]

21 3. Grammar Conversion (1)
Grammar conversion
- For efficient syntactic analysis
- Substitution, expansion
Definition 5.6: If L(G1) = L(G2), grammars G1 and G2 are equivalent

22 3. Grammar Conversion (2)
Substitution
- Remove a specific production
- Add the corresponding productions
  Given A → αBγ (B ∈ V_N; α, γ ∈ V*) and B → β1 | … | βn,
  replace A → αBγ with A → αβ1γ | αβ2γ | … | αβnγ

23 3. Grammar Conversion (3)
Example 10
  P = { S → aT | bT, T → S | Sb | c }
- Remove S → aT
- Add S → aS | aSb | ac
  P' = { S → aS | aSb | ac | bT, T → S | Sb | c }
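Example 10 can be reproduced mechanically. The following is a minimal sketch, assuming every grammar symbol is a single character so that a right-hand side can be stored as a string; the function name `substitute` is made up for this illustration.

```python
def substitute(P, A, rhs, B):
    """Replace the production A -> rhs, which contains the nonterminal B,
    by one production per alternative of B (the first occurrence of B is used)."""
    i = rhs.index(B)
    alpha, gamma = rhs[:i], rhs[i + 1:]
    new_P = {X: list(alts) for X, alts in P.items()}
    new_P[A] = []
    for alt in P[A]:
        if alt == rhs:
            # A -> alpha B gamma becomes A -> alpha beta_i gamma for every B -> beta_i
            new_P[A].extend(alpha + beta + gamma for beta in P[B])
        else:
            new_P[A].append(alt)
    return new_P


P = {"S": ["aT", "bT"], "T": ["S", "Sb", "c"]}
P_prime = substitute(P, "S", "aT", "T")
print(P_prime)
```

The productions of T are left in place, as on the slide; whether T is still needed is a separate question answered by useless-production removal.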

24 3. Grammar Conversion (4)
Expansion
- Split a production by introducing a new nonterminal symbol
  A → αβ becomes
    A → αX, X → β
  or
    A → Xβ, X → α

25 3.1 Remove Useless Production (1)
Useless production
- A production that can never be applied when generating a sentence → remove
- Non-terminating nonterminal symbol
- Inaccessible symbol

26 3.1 Remove Useless Production (2)
Definition 5.7: If there is no derivation S ⇒* uXv ⇒* ω with ω ∈ V_T*, X is a useless symbol
Definition 5.8
- Terminating nonterminal: A such that A → γ, γ ⇒* ω and ω ∈ V_T*
- Accessible symbol: X such that S ⇒* ω1 X ω2, with ω1, ω2 ∈ V*

27 3.1 Remove Useless Production (3)
Removal methods
- Remove productions that contain a non-terminating nonterminal
- Remove productions that contain an inaccessible symbol
Algorithm for terminating nonterminals
  Algorithm terminating;
  begin
    V_N' := {A | A → ω ∈ P, ω ∈ V_T*};
    repeat
      V_N' := V_N' ∪ {A | A → α ∈ P, α ∈ (V_N' ∪ V_T)*}
    until no change
  end.

28 3.1 Remove Useless Production (4)
Example 11: P = {S → A, S → B, B → a}
- V_N' = {B} → V_N' = {B, S}
- V_N − V_N' = {A}
- P' = {S → B, B → a}
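The terminating-nonterminal computation is a fixed-point iteration, which the following sketch runs on Example 11. The representation is an assumption made for the illustration: productions are (left-hand side, right-hand side) pairs, every symbol is one character, and uppercase letters are nonterminals, following the notational convention of the introduction.

```python
def terminating(productions):
    """Return the set of nonterminals that can derive a string of terminals."""
    term = set()
    changed = True
    while changed:  # "repeat ... until no change"
        changed = False
        for lhs, rhs in productions:
            # A -> alpha qualifies when alpha uses only terminals and known members.
            if lhs not in term and all(not X.isupper() or X in term for X in rhs):
                term.add(lhs)
                changed = True
    return term


def keep_terminating(productions):
    """Drop every production that mentions a non-terminating nonterminal."""
    term = terminating(productions)
    return [
        (lhs, rhs)
        for lhs, rhs in productions
        if lhs in term and all(not X.isupper() or X in term for X in rhs)
    ]


P = [("S", "A"), ("S", "B"), ("B", "a")]
print(terminating(P))       # the set {B, S}
print(keep_terminating(P))  # S -> B and B -> a remain
```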

29 3.1 Remove Useless Production (5)
Algorithm for accessible symbols
  Algorithm accessible;
  begin
    V' := {S};
    repeat
      V' := V' ∪ {X | some A → αXβ ∈ P, A ∈ V'}
    until no change
  end.
Example 12
  G: S → aB, A → aB, A → aC, B → C, C → b
  V' = {S} → {S, a, B} → {S, a, B, C} → {S, a, B, C, b}
  V − V' = {A}
  P' = {S → aB, B → C, C → b}
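The accessible-symbol algorithm can be sketched the same way and run on Example 12. As an assumption for this illustration, productions are (left-hand side, right-hand side) pairs with one character per symbol; the function names are hypothetical.

```python
def accessible(productions, start="S"):
    """Return every grammar symbol reachable from the start symbol."""
    reach = {start}
    changed = True
    while changed:  # "repeat ... until no change"
        changed = False
        for lhs, rhs in productions:
            if lhs in reach:
                new = set(rhs) - reach  # symbols on the right-hand side not seen yet
                if new:
                    reach |= new
                    changed = True
    return reach


def keep_accessible(productions, start="S"):
    """Keep only the productions whose left-hand side is accessible."""
    reach = accessible(productions, start)
    return [(lhs, rhs) for lhs, rhs in productions if lhs in reach]


G = [("S", "aB"), ("A", "aB"), ("A", "aC"), ("B", "C"), ("C", "b")]
print(accessible(G))       # the set {S, a, B, C, b}
print(keep_accessible(G))  # S -> aB, B -> C, C -> b remain
```

Filtering on the left-hand side is enough here: once a left-hand side is accessible, the loop has already added every symbol of its right-hand sides.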

30 3.1 Remove Useless Production (6)
Steps of removing useless productions
  Context-free productions → [Terminating Nonterminal] → [Accessible Symbol] → Useful productions

31 3.1 Remove Useless Production (7)
Example 13: P = {S → aS, S → A, S → B, A → aA, B → a, C → aa}
- Get the terminating nonterminals
  - V_N' = {B, C} → V_N' = {B, C, S}
  - Non-terminating nonterminal = {A}
  - P' = {S → aS, S → B, B → a, C → aa}
- Accessible symbols
  - V' = {S} → V' = {S, a, B}
  - Inaccessible symbol = {C}
  - P'' = {S → aS, S → B, B → a}
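Example 13 chains the two passes in the order given above: terminating nonterminals first, accessible symbols second. A compact sketch follows, assuming one character per symbol and uppercase letters for nonterminals; all function names are made up for the illustration.

```python
def terminating(P):
    """Fixed point of: A is terminating if some A -> alpha uses only terminals
    and terminating nonterminals."""
    term = set()
    while True:
        new = {A for A, rhs in P if all(not X.isupper() or X in term for X in rhs)}
        if new == term:
            return term
        term = new


def accessible(P, start="S"):
    """Fixed point of: every symbol on the right of an accessible A is accessible."""
    reach = {start}
    while True:
        new = reach | {X for A, rhs in P if A in reach for X in rhs}
        if new == reach:
            return reach
        reach = new


def remove_useless(P, start="S"):
    term = terminating(P)
    # Pass 1: drop productions that mention a non-terminating nonterminal.
    P1 = [(A, rhs) for A, rhs in P
          if A in term and all(not X.isupper() or X in term for X in rhs)]
    # Pass 2: drop productions whose left-hand side is inaccessible.
    reach = accessible(P1, start)
    return [(A, rhs) for A, rhs in P1 if A in reach]


P = [("S", "aS"), ("S", "A"), ("S", "B"), ("A", "aA"), ("B", "a"), ("C", "aa")]
print(remove_useless(P))  # S -> aS, S -> B, B -> a remain
```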

32 3.2 Remove ε-Production (1)
Definition 5.9: A grammar is ε-free if
(1) P has no ε-production, or
(2) only S has an ε-production and S does not appear on the right-hand side of any production

33 3.2 Remove ε-Production (2)
Algorithm for converting to an ε-free grammar
  Algorithm ε-free;
  begin
    P' := P − {A → ε | A ∈ V_N};
    V_Nε := {A | A ⇒+ ε, A ∈ V_N};
    for A → α0 B1 α1 B2 … Bk αk ∈ P', where α ≠ ε and Bi ∈ V_Nε do
      if (a B-production exists in P') then
        add to P' every production that can be obtained from A → α0 X1 α1 X2 … Xk αk by choosing Xi = ε or Xi = Bi
      else
        add to P' the production obtained from A → α0 X1 α1 X2 … Xk αk with Xi = ε
    end for
    if S ∈ V_Nε then P' := P' ∪ {S' → ε | S}
  end.

34 3.2 Remove ε-Production (3)
- P': set of productions without ε-productions
- V_Nε: set of nonterminals that can derive ε
Nullable nonterminal A: A ⇒* ε

35 3.2 Remove ε-Production (4)
Get V_Nε
- From the productions
- From the derivations
  Algorithm Compute_V_Nε;
  begin
    V_Nε := {A | A → ε ∈ P};
    repeat
      V_Nε := V_Nε ∪ {B | B → α ∈ P, α ∈ V_Nε*}
    until no change
  end.

36 3.2 Remove ε-Production (5)
Example 14: S → aSbS | bSaS | ε
- P' = {S → aSbS | bSaS}, V_Nε = {S}
- S → aSbS: S → aSbS | abS | aSb | ab
- S → bSaS: S → bSaS | baS | bSa | ba
- P' = {S → aSbS | abS | aSb | ab | bSaS | baS | bSa | ba}
- S' → S | ε, S → aSbS | abS | aSb | ab | bSaS | baS | bSa | ba
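Example 14 can be reproduced with a short sketch of the ε-free conversion. The representation is an assumption for this illustration: the grammar is a dict from nonterminal to a list of right-hand-side strings, one character per symbol, with the empty string standing for ε. The names `nullable_set` and `eps_free` are hypothetical.

```python
from itertools import product


def nullable_set(P):
    """V_N-epsilon: nonterminals that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for A, alternatives in P.items():
            # An empty rhs satisfies all(...) trivially, so A -> epsilon is covered too.
            if A not in nullable and any(
                all(X in nullable for X in rhs) for rhs in alternatives
            ):
                nullable.add(A)
                changed = True
    return nullable


def eps_free(P, start="S"):
    nullable = nullable_set(P)
    # P' := P minus the epsilon-productions
    base = {A: [rhs for rhs in alts if rhs] for A, alts in P.items()}
    result = {}
    for A, alternatives in base.items():
        new_alts = set()
        for rhs in alternatives:
            choices = []
            for X in rhs:
                if X not in nullable:
                    choices.append([X])      # must stay
                elif base[X]:
                    choices.append([X, ""])  # keep it or replace it by epsilon
                else:
                    choices.append([""])     # X only had X -> epsilon: always dropped
            for combo in product(*choices):
                candidate = "".join(combo)
                if candidate:                # never re-introduce A -> epsilon
                    new_alts.add(candidate)
        result[A] = new_alts
    if start in nullable:
        result[start + "'"] = {start, ""}    # new start symbol: S' -> S | epsilon
    return result


P = {"S": ["aSbS", "bSaS", ""]}
G = eps_free(P)
print(sorted(G["S"]))
print(sorted(G["S'"]))
```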

37 3.3 Remove Single Production (1)
Single production
- Exactly one nonterminal on the right-hand side: A → B
- Causes unnecessary derivation steps → slow parsing → remove

38 3.3 Remove Single Production (2)
Algorithm for removing single productions
  Algorithm Remove_Single_Production;
  begin
    P' := P − {A → B | A, B ∈ V_N};
    for each A ∈ V_N do
      V_NA := {B | A ⇒+ B};
      for each B ∈ V_NA do
        for each B → α ∈ P' do (* not a single production *)
          P' := P' ∪ {A → α}
        end for
      end for
    end for
  end.

39 3.3 Remove Single Production (3)
Algorithm for computing V_NA
  Algorithm Compute_V_NA;
  begin
    V_NA := {B | A → B ∈ P};
    repeat
      V_NA := V_NA ∪ {C | B → C ∈ P, B ∈ V_NA}
    until no change
  end.

40 3.3 Remove Single Production (4)
Example 15: E → E+T | T, T → T*F | F, F → (E) | a
- P' = {E → E+T, T → T*F, F → (E), F → a}
- V_NE = {T, F} → P' = {E → E+T | T*F | (E) | a, T → T*F, F → (E), F → a}
- V_NT = {F} → P' = {E → E+T | T*F | (E) | a, T → T*F | (E) | a, F → (E), F → a}
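A minimal sketch of single-production removal, run on Example 15. It assumes the grammar is a dict from nonterminal to a list of right-hand-side strings with one character per symbol; `remove_single` is a name made up for this illustration.

```python
def remove_single(P):
    nonterminals = set(P)

    def is_single(rhs):
        return rhs in nonterminals  # the rhs is exactly one nonterminal

    # P' := P minus the single productions
    base = {A: [rhs for rhs in alts if not is_single(rhs)] for A, alts in P.items()}
    result = {A: list(alts) for A, alts in base.items()}

    for A in P:
        # V_NA: nonterminals reachable from A through single productions only.
        reach = [rhs for rhs in P[A] if is_single(rhs)]
        for B in reach:  # the list grows while it is scanned
            for rhs in P[B]:
                if is_single(rhs) and rhs not in reach:
                    reach.append(rhs)
        # Copy every non-single production B -> alpha up to A.
        for B in reach:
            for rhs in base[B]:
                if rhs not in result[A]:
                    result[A].append(rhs)
    return result


P = {"E": ["E+T", "T"], "T": ["T*F", "F"], "F": ["(E)", "a"]}
G = remove_single(P)
for A, alts in G.items():
    print(A, "->", " | ".join(alts))
```

The copied productions always come from `base`, the grammar before any copying, so the result does not depend on the order in which the nonterminals are visited.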

41 3.3 Remove Single Production (5)
Definition 5.10
- Cycle-free: for every A ∈ V_N, there is no derivation A ⇒+ A
- Proper grammar
  (1) cycle-free
  (2) ε-free
  (3) no useless symbols

42 3.4 Normal Form (1)
Definition 5.11: Normal form grammar (CNF: Chomsky Normal Form)
(1) A → BC (A, B, C ∈ V_N)
(2) A → a (a ∈ V_T)
(3) If ε ∈ L(G), then S → ε and S must not appear on the right-hand side of any production

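43 3.4 Normal Form (2)
Context-free grammar → CNF
- Start from an ε-free grammar
- A production A → α with |α| > 2 is split so that every right-hand side has 2 symbols:
  A → X1' <X2 … Xk>
  <X2 … Xk> → X2' <X3 … Xk>
  …
  <Xk-1 Xk> → Xk-1' Xk'
  where Xi' is Xi itself if Xi is a nonterminal, and otherwise a new nonterminal with Xi' → Xi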

44 3.4 Normal Form (3)
Example 16: S → aAB | BA, A → BBB | a, B → AS | b
- S → a'<AB>, a' → a, <AB> → AB, S → BA
- A → B<BB>, <BB> → BB, A → a
- B → AS | b
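The splitting step can be sketched in a few lines and checked against Example 16. This is an illustration under stated assumptions, not a full CNF converter: the input is taken to be ε-free and free of single productions, each symbol is one character, and the naming scheme for new nonterminals (`a'` for a terminal a, `<AB>` for the tail A B) simply mirrors the example. The function name `to_cnf` is made up.

```python
def to_cnf(P):
    """Split long right-hand sides into pairs (input: epsilon-free, no single productions)."""
    nonterminals = set(P)
    out = {}

    def add(lhs, rhs):
        out.setdefault(lhs, [])
        if rhs not in out[lhs]:
            out[lhs].append(rhs)

    for A, alternatives in P.items():
        for rhs in alternatives:
            if len(rhs) == 1:  # A -> a already has CNF shape
                add(A, (rhs,))
                continue
            # Replace each terminal a by a new nonterminal a' with a' -> a.
            symbols = []
            for X in rhs:
                if X in nonterminals:
                    symbols.append(X)
                else:
                    add(X + "'", (X,))
                    symbols.append(X + "'")
            # Peel off one symbol at a time: A -> X1' <X2...Xk>, and so on.
            lhs = A
            while len(symbols) > 2:
                tail = "<" + "".join(s.rstrip("'") for s in symbols[1:]) + ">"
                add(lhs, (symbols[0], tail))
                lhs, symbols = tail, symbols[1:]
            add(lhs, tuple(symbols))
    return out


P = {"S": ["aAB", "BA"], "A": ["BBB", "a"], "B": ["AS", "b"]}
G = to_cnf(P)
for lhs, alts in G.items():
    print(lhs, "->", " | ".join(" ".join(rhs) for rhs in alts))
```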

45 4. CFG Notation (1)
BNF (Backus-Naur Form)
- Nonterminal symbol: enclosed in < >
- Terminal symbol: enclosed in ' '
- → is written ::=
Example 17
  E → E+T | T      <E> ::= <E> '+' <T> | <T>
  T → T*F | F      <T> ::= <T> '*' <F> | <F>
  F → (E) | a      <F> ::= '(' <E> ')' | 'a'

46 4. CFG Notation (2)
EBNF (extended BNF)
- Simple and easy to read
- Meta symbols give a compact notation for repeated and optional parts: { } for repetition, [ ] for an optional part

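47 4. CFG Notation (3)
- Repetition: <…> ::= <…>, <…> | <…> becomes <…> ::= <…> {, <…>}
- A maximum and minimum number of repetitions can be given
- Optional part: <…> ::= if <…> then <…> [else <…>]
- BNF: <…> ::= <…> | <…> '[' <…> ']'
- EBNF: <…> ::= <…> [ '[' <…> ']' ]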

48 4. CFG Notation (4)
Parentheses and alternation: a simpler representation
  <…> ::= + | - | * | /
  <…> ::= (+|-|*|/)

49 4. CFG Notation (5)
Syntax diagram
- Shows the grammar as a figure, which makes the syntactic structure easy to understand
- Notation
  - Nonterminal: rectangle
  - Terminal: circle or ellipse
  - Arc: link
[Figure: nonterminal A in a rectangle, terminal a in a circle]

50 4. CFG Notation (6)
- Example
  - A ::= X1 X2 … Xn
[Figure: X1, X2, …, Xn linked in sequence by arcs]

51 4. CFG Notation (7)
- A ::= α1 | α2 | ... | αn
- A ::= {α}
- A ::= [α]
[Figure: syntax diagrams for alternation, repetition, and option]

52 4. CFG Notation (8)
- A ::= (α1 | α2 | …)
Example 22
  A ::= a | (B)
  B ::= AC
  C ::= {+A}
[Figure: syntax diagrams for the grouped alternation and for A, B, and C]

53 4. CFG Notation (9)
- Synthesis: substituting B and C into A gives A ::= a | ( A {+ A} )
[Figure: the combined syntax diagram for A]

54 4. CFG Notation (10)
Example 24: integer variable declaration in C
- Format: keyword int → variable list (separated by commas) → semicolon
- BNF
  - <int_dcl> ::= int <…> ;
  - <…> ::= <…>, <id> | <id>

55 4. CFG Notation (11)
- EBNF
  - <int_dcl> ::= int <id> {, <id>} ;
- Syntax diagram
[Figure: syntax diagram for int_dcl: int, then id, with a loop back through a comma, then ;]
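The EBNF form maps almost directly onto a hand-written recognizer: the { } repetition becomes a loop. The sketch below is only an illustration of that correspondence. The token pattern for identifiers (a letter or underscore followed by word characters) and the function names `tokenize` and `parse_int_dcl` are assumptions, not part of the lecture.

```python
import re

# Token pattern (assumed): keyword int, identifier, comma, semicolon.
TOKEN = re.compile(r"\s*(int\b|[A-Za-z_]\w*|,|;)")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_int_dcl(text):
    """Recognize  int id {, id} ;  and return the declared identifiers."""
    tokens = tokenize(text)
    pos = 0

    def expect_id():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] in ("int", ",", ";"):
            raise SyntaxError("identifier expected")
        name = tokens[pos]
        pos += 1
        return name

    if not tokens or tokens[0] != "int":
        raise SyntaxError("keyword int expected")
    pos = 1
    names = [expect_id()]
    while pos < len(tokens) and tokens[pos] == ",":  # the {, id} repetition
        pos += 1
        names.append(expect_id())
    if pos != len(tokens) - 1 or tokens[pos] != ";":
        raise SyntaxError("';' expected at the end of the declaration")
    return names


print(parse_int_dcl("int a, b, c;"))  # ['a', 'b', 'c']
```

The left-recursive BNF version would need a different parsing strategy or a rewrite before it could be coded this directly, which is one practical reason for preferring the EBNF form.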

