1 Contents
- Introduction
- A Simple Compiler
- Scanning – Theory and Practice
- Grammars and Parsing
- LL(1) Parsing
- LR Parsing
- Lex and yacc
- Semantic Processing
- Symbol Tables
- Run-time Storage Organization
- Code Generation and Local Code Optimization
- Global Optimization
2 Chapter 4 Grammars and Parsing
3 Outline
- Context-Free Grammars
- Errors in Context-Free Grammars
- Transforming Extended BNF Grammars
- Parsers and Recognizers
- Grammar Analysis Algorithms
4 Context-Free Grammars: Concepts and Notation
A context-free grammar G = (V_t, V_n, S, P) consists of:
- A finite terminal vocabulary V_t: the token set produced by the scanner.
- A finite nonterminal vocabulary V_n: intermediate symbols.
- A start symbol S ∈ V_n that starts all derivations; also called the goal symbol.
- P, a finite set of productions (rewriting rules) of the form A → X_1 X_2 ... X_m, where A ∈ V_n, X_i ∈ V_n ∪ V_t, 1 ≤ i ≤ m.
- A → λ is a valid production.
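The four components can be held in a small record. The following is only a minimal sketch: the class name Grammar, its field names, the check() method, and the sample grammar S → a S b | λ are all illustrative assumptions, not part of the slides. The disjointness test on V_t and V_n is the usual convention and is also assumed here.

```python
from dataclasses import dataclass

LAMBDA = ()  # the empty right-hand side, written λ in the slides


@dataclass(frozen=True)
class Grammar:
    terminals: frozenset      # V_t
    nonterminals: frozenset   # V_n
    start: str                # S
    productions: tuple        # P, as (lhs, rhs) pairs; rhs is a tuple of symbols

    def check(self):
        """Raise ValueError if the four components are inconsistent."""
        if self.terminals & self.nonterminals:
            raise ValueError("V_t and V_n must be disjoint")
        if self.start not in self.nonterminals:
            raise ValueError("start symbol must be in V_n")
        vocabulary = self.terminals | self.nonterminals  # V = V_n ∪ V_t
        for lhs, rhs in self.productions:
            if lhs not in self.nonterminals:
                raise ValueError(f"left-hand side {lhs!r} is not in V_n")
            for symbol in rhs:
                if symbol not in vocabulary:
                    raise ValueError(f"unknown symbol {symbol!r}")


# Hypothetical example grammar: S → a S b | λ
g = Grammar(
    terminals=frozenset({"a", "b"}),
    nonterminals=frozenset({"S"}),
    start="S",
    productions=(("S", ("a", "S", "b")), ("S", LAMBDA)),
)
g.check()  # passes silently: every production has the required form
```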
5 Context-Free Grammars: Concepts and Notation (Cont’d.)
Other notations
- The vocabulary V of a CFG is the set of terminal and nonterminal symbols: V = V_n ∪ V_t.
- L(G), the set of terminal strings derivable from S, is the context-free language of grammar G.
Notational conventions
- a, b, c, ... denote symbols in V_t
- A, B, C, ... denote symbols in V_n
- U, V, W, ... denote symbols in V
- α, β, γ, ... denote strings in V*
- u, v, w, ... denote strings in V_t*
6 Context-Free Grammars: Concepts and Notation (Cont’d.)
Derivation
- One-step derivation (⇒): if A → γ, then αAβ ⇒ αγβ.
- ⇒+ denotes a derivation of one or more steps.
- ⇒* denotes a derivation of zero or more steps.
- If S ⇒* β, then β is said to be a sentential form of the CFG.
- SF(G) is the set of sentential forms of grammar G.
- L(G) = {x ∈ V_t* | S ⇒+ x}
- L(G) = SF(G) ∩ V_t*; that is, the language of G is simply those sentential forms of G that are terminal strings.
7 Context-Free Grammars: Concepts and Notation (Cont’d.)
Leftmost derivation
- Always expands the leftmost nonterminal; this is the order used by top-down parsers.
- Notation: ⇒lm, ⇒lm+, ⇒lm*
- A sentential form produced via a leftmost derivation sequence is called a left sentential form.
Example: leftmost derivation of F(V+V) using grammar G0
G0:
E → Prefix(E)
E → V Tail
Prefix → F
Prefix → λ
Tail → +E
Tail → λ
E ⇒lm Prefix(E) ⇒lm F(E) ⇒lm F(V Tail) ⇒lm F(V+E) ⇒lm F(V+V Tail) ⇒lm F(V+V)
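The leftmost derivation of F(V+V) can be replayed mechanically. This is a rough sketch under assumptions: the list encoding of G0, the production numbering 0 to 5, and the function name leftmost_derivation are illustrative choices, not taken from the slides.

```python
# Grammar G0, one production per entry; an empty list stands for λ.
G0 = [
    ("E", ["Prefix", "(", "E", ")"]),  # 0
    ("E", ["V", "Tail"]),              # 1
    ("Prefix", ["F"]),                 # 2
    ("Prefix", []),                    # 3: Prefix → λ
    ("Tail", ["+", "E"]),              # 4
    ("Tail", []),                      # 5: Tail → λ
]
NONTERMINALS = {"E", "Prefix", "Tail"}


def leftmost_derivation(start, production_indices):
    """Apply each production to the leftmost nonterminal; return every sentential form."""
    form = [start]
    forms = [list(form)]
    for index in production_indices:
        lhs, rhs = G0[index]
        # Locate the leftmost nonterminal in the current sentential form.
        position = next(i for i, s in enumerate(form) if s in NONTERMINALS)
        if form[position] != lhs:
            raise ValueError(f"production {index} does not rewrite {form[position]}")
        form[position:position + 1] = rhs
        forms.append(list(form))
    return forms


steps = leftmost_derivation("E", [0, 2, 1, 4, 1, 5])
for form in steps:
    print(" ".join(form))
```

Each printed line is a left sentential form; the last one is the terminal string F ( V + V ).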
8 Context-Free Grammars: Concepts and Notation (Cont’d.)
Rightmost derivation (canonical derivation)
- Always expands the rightmost nonterminal; this is the order associated with bottom-up parsers.
- Notation: ⇒rm, ⇒rm+, ⇒rm*
- A sentential form produced via a rightmost derivation sequence is called a right sentential form.
Example: rightmost derivation of F(V+V) using grammar G0
G0:
E → Prefix(E)
E → V Tail
Prefix → F
Prefix → λ
Tail → +E
Tail → λ
E ⇒rm Prefix(E) ⇒rm Prefix(V Tail) ⇒rm Prefix(V+E) ⇒rm Prefix(V+V Tail) ⇒rm Prefix(V+V) ⇒rm F(V+V)
The number of steps is the same as in the leftmost derivation, but the order differs.
9 Context-Free Grammars: Concepts and Notation (Cont’d.)
A parse tree
- Is rooted by the start symbol.
- Its leaves are grammar symbols or λ.
10 Context-Free Grammars: Concepts and Notation (Cont’d.)
- A phrase of a sentential form is a sequence of symbols descended from a single nonterminal in the parse tree.
- Simple or prime phrase: a simple phrase is a sequence of symbols directly derived from a nonterminal.
- The handle of a sentential form is the leftmost simple phrase.
11 Errors in Context-Free Grammars
- CFGs are a definitional mechanism. They may have errors, just as programs may.
- Flawed CFG: useless nonterminals
  - Unreachable
  - Derive no terminal string
Example (S is the start symbol):
S → A | B
A → a
B → B b
C → c
Nonterminal C cannot be reached from S.
Nonterminal B derives no terminal string.
Do exercise 7.
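Both kinds of useless nonterminal can be found with simple marking passes. The sketch below is one possible way to do it, not necessarily the method the course has in mind; the function names unreachable and nonproductive and the list encoding of the grammar are assumptions made for illustration.

```python
def unreachable(productions, nonterminals, start):
    """Nonterminals that never appear in any form derivable from the start symbol."""
    reached = {start}
    worklist = [start]
    while worklist:
        current = worklist.pop()
        for lhs, rhs in productions:
            if lhs != current:
                continue
            for symbol in rhs:
                if symbol in nonterminals and symbol not in reached:
                    reached.add(symbol)
                    worklist.append(symbol)
    return nonterminals - reached


def nonproductive(productions, nonterminals):
    """Nonterminals that derive no terminal string."""
    productive = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in productions:
            if lhs in productive:
                continue
            # A right-hand side is productive when every nonterminal in it already is.
            if all(s in productive for s in rhs if s in nonterminals):
                productive.add(lhs)
                changed = True
    return nonterminals - productive


# The flawed grammar from the slide: S → A | B, A → a, B → B b, C → c
PRODUCTIONS = [
    ("S", ["A"]), ("S", ["B"]),
    ("A", ["a"]),
    ("B", ["B", "b"]),
    ("C", ["c"]),
]
NONTERMINALS = {"S", "A", "B", "C"}

print(unreachable(PRODUCTIONS, NONTERMINALS, "S"))   # {'C'}
print(nonproductive(PRODUCTIONS, NONTERMINALS))      # {'B'}
```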
12 Errors in Context-Free Grammars (Cont’d.)
Ambiguous
- Grammars that allow different parse trees for the same terminal string.
- It is impossible to decide whether a given CFG is ambiguous.
13 Errors in Context-Free Grammars (Cont’d.)
- It is impossible to decide whether a given CFG is ambiguous. For certain grammar classes, however, we can prove that constituent grammars are unambiguous.
- Wrong language: the grammar may define a language other than the intended one. A general comparison algorithm applicable to all CFGs is known to be impossible.
14 Transforming Extended BNF Grammars
- Extended BNF → BNF
- Extended BNF allows
  - Square brackets [ ] for optional items
  - Braces { } for optional lists
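One common way to remove the extended constructs is to introduce a fresh nonterminal N for each bracketed item: [ β ] becomes N with N → β and N → λ, and { β } becomes N with N → β N and N → λ. The sketch below follows that scheme. The tuple encoding ("opt", ...) and ("rep", ...), the function name ebnf_to_bnf, and the generated names N1, N2, ... are assumptions for illustration, and the code assumes those names do not clash with existing symbols.

```python
import itertools


def ebnf_to_bnf(productions):
    """Rewrite [ ... ] and { ... } items into plain BNF using fresh nonterminals."""
    counter = itertools.count(1)
    result = []
    pending = list(productions)
    while pending:
        lhs, rhs = pending.pop(0)
        new_rhs = []
        for item in rhs:
            if isinstance(item, str):         # ordinary grammar symbol
                new_rhs.append(item)
                continue
            kind, body = item
            fresh = f"N{next(counter)}"       # hypothetical naming scheme
            new_rhs.append(fresh)
            if kind == "opt":                 # [ body ]:  N → body | λ
                pending.append((fresh, list(body)))
            elif kind == "rep":               # { body }:  N → body N | λ
                pending.append((fresh, list(body) + [fresh]))
            else:
                raise ValueError(f"unknown EBNF item kind {kind!r}")
            pending.append((fresh, []))       # the λ alternative
        result.append((lhs, new_rhs))
    return result


# Hypothetical EBNF production: A → a [ b ] { c }
bnf = ebnf_to_bnf([("A", ["a", ("opt", ["b"]), ("rep", ["c"])])])
for lhs, rhs in bnf:
    print(lhs, "→", " ".join(rhs) if rhs else "λ")
```

Because newly created productions go back onto the pending queue, nested brackets are expanded as well.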
15 Parsers and Recognizers
- Recognizer: an algorithm that performs a Boolean-valued test. Is this input syntactically valid?
- Parser: answers more general questions. Is this input valid? And, if it is, what is its structure (parse tree)?
16 Parsers and Recognizers (Cont’d.)
Two general approaches to parsing
Top-down parser
- Expands the parse tree (via predictions) in a depth-first manner
- Corresponds to a preorder traversal of the parse tree
- Predictive in nature
- Traces a leftmost derivation (⇒lm)
- Examples: LL parser, recursive descent
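A recursive-descent parser makes the top-down idea concrete. The sketch below handles the small grammar E → Prefix(E) | V Tail, Prefix → F | λ, Tail → +E | λ, with one function per nonterminal and one token of lookahead. The names parse_g0 and ParseError, the use of "$" as an end marker, and the choice to return production names as strings are assumptions for illustration only.

```python
class ParseError(Exception):
    pass


def parse_g0(tokens):
    """Return the productions applied, in leftmost-derivation order, or raise ParseError."""
    tokens = list(tokens) + ["$"]   # "$" marks end of input
    pos = 0
    used = []

    def peek():
        return tokens[pos]

    def match(expected):
        nonlocal pos
        if tokens[pos] != expected:
            raise ParseError(f"expected {expected!r}, found {tokens[pos]!r}")
        pos += 1

    def parse_e():
        if peek() in ("F", "("):          # predict E → Prefix(E)
            used.append("E → Prefix(E)")
            parse_prefix()
            match("(")
            parse_e()
            match(")")
        elif peek() == "V":               # predict E → V Tail
            used.append("E → V Tail")
            match("V")
            parse_tail()
        else:
            raise ParseError(f"unexpected {peek()!r}")

    def parse_prefix():
        if peek() == "F":                 # predict Prefix → F
            used.append("Prefix → F")
            match("F")
        else:                             # predict Prefix → λ
            used.append("Prefix → λ")

    def parse_tail():
        if peek() == "+":                 # predict Tail → +E
            used.append("Tail → +E")
            match("+")
            parse_e()
        else:                             # predict Tail → λ
            used.append("Tail → λ")

    parse_e()
    match("$")
    return used


for production in parse_g0("F(V+V)"):
    print(production)
```

The productions are recorded when each one is predicted, so the output order is a preorder walk of the parse tree.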
17 Parsers and Recognizers (Cont’d.)
Bottom-up parser
- Begins at the bottom of the parse tree (the leaves, which are terminal symbols) and determines the productions used to generate the leaves
- Corresponds to a postorder traversal of the parse tree
- Traces a rightmost derivation (⇒rm) in reverse
- Examples: LR parser, shift-reduce parser
18 Parsers and Recognizers (Cont’d.)
To parse: begin SimpleStmt; SimpleStmt; end $
21 Parsers and Recognizers (Cont’d.)
Naming of parsing techniques
- Top-down: LL
- Bottom-up: LR
- The first letter names the way the token sequence is parsed: L, left to right.
- The second letter names the derivation traced: L for leftmost, R for rightmost.
22 Grammar Analysis Algorithms
Goal of this section: discuss a number of important analysis algorithms for grammars.
23 Grammar Analysis Algorithms (Cont’d.)
The data structure of a grammar G
24 Grammar Analysis Algorithms (Cont’d.)
What nonterminals can derive λ?
- A nonterminal may need several steps to do so, for example A ⇒ BCD ⇒ BC ⇒ B ⇒ λ.
- An iterative marking algorithm finds them all.
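An iterative marking pass can be sketched as follows: mark every nonterminal with a λ production, then keep marking any nonterminal that has a right-hand side made up entirely of marked symbols, until nothing changes. The function name derives_lambda and the sample grammar, in which A derives λ only through B, C and D, are hypothetical and chosen for illustration.

```python
def derives_lambda(productions):
    """Return the set of nonterminals A with A ⇒* λ, by iterative marking."""
    marked = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in productions:
            # An empty rhs is λ itself; otherwise every rhs symbol must already be marked.
            # Terminals are never marked, so any rhs containing one fails the test.
            if lhs not in marked and all(symbol in marked for symbol in rhs):
                marked.add(lhs)
                changed = True
    return marked


# Hypothetical grammar; an empty list stands for λ.
PRODUCTIONS = [
    ("A", ["B", "C", "D"]),
    ("B", []),
    ("C", []),
    ("D", []),
    ("D", ["d"]),
    ("X", ["x", "A"]),
]
print(derives_lambda(PRODUCTIONS))
```

A is marked only on the second pass, after B, C and D have been marked; X is never marked because its only right-hand side starts with a terminal.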
26 Grammar Analysis Algorithms (Cont’d.)
First(α)
- The set of all the terminal symbols that can begin a sentential form derivable from α.
- If α is the right-hand side of a production, then First(α) contains the terminal symbols that begin strings derivable from α.
First(α) = {a ∈ V_t | α ⇒* aβ} ∪ {if α ⇒* λ then {λ} else ∅}
27 By definition, the set FIRST(X) can be computed with the following three steps:
1. If X ∈ T, then FIRST(X) = {X}.
2. If X ∈ N and X → λ, then add λ to FIRST(X).
3. If X ∈ N and X → Y_1 Y_2 ... Y_n, then add all non-λ elements of FIRST(Y_1) to FIRST(X); if λ ∈ FIRST(Y_1), then add all non-λ elements of FIRST(Y_2) to FIRST(X); ...; and if λ ∈ FIRST(Y_n) as well, then add λ to FIRST(X).
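The three steps can be iterated until no FIRST set grows any further. The following is a minimal sketch, assuming every nonterminal has at least one production; the function name first_sets, the string "λ" used as the empty-string marker, and the list encoding of the expression grammar are illustrative choices.

```python
LAMBDA = "λ"


def first_sets(productions, terminals):
    """Compute FIRST(X) for every symbol X by iterating the three steps to a fixed point."""
    first = {t: {t} for t in terminals}                 # step 1
    for lhs, _ in productions:
        first.setdefault(lhs, set())
    changed = True
    while changed:
        changed = False
        for lhs, rhs in productions:
            before = len(first[lhs])
            all_nullable = True
            for symbol in rhs:                          # step 3
                first[lhs] |= first[symbol] - {LAMBDA}
                if LAMBDA not in first[symbol]:
                    all_nullable = False
                    break
            if all_nullable:                            # step 2, and the end of step 3
                first[lhs].add(LAMBDA)
            if len(first[lhs]) != before:
                changed = True
    return first


# Expression grammar: E → T E', E' → + T E' | λ, T → F T', T' → * F T' | λ, F → ( E ) | id
TERMINALS = {"+", "*", "(", ")", "id"}
PRODUCTIONS = [
    ("E", ["T", "E'"]),
    ("E'", ["+", "T", "E'"]), ("E'", []),
    ("T", ["F", "T'"]),
    ("T'", ["*", "F", "T'"]), ("T'", []),
    ("F", ["(", "E", ")"]), ("F", ["id"]),
]
FIRST = first_sets(PRODUCTIONS, TERMINALS)
for name in ["E", "E'", "T", "T'", "F"]:
    print(name, sorted(FIRST[name]))
```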
28 Grammar G is defined as follows:
E → TE’
E’ → +TE’ | λ
T → FT’
T’ → *FT’ | λ
F → (E) | id
Its FIRST sets are:
Nonterminal   FIRST
E             { (, id }
E’            { +, λ }
T             { (, id }
T’            { *, λ }
F             { (, id }
29 Follow(A)
- A is any nonterminal.
- Follow(A) is the set of terminals that may follow A in some sentential form.
Follow(A) = {a ∈ V_t | S ⇒* αAaβ} ∪ {if S ⇒+ αA then {λ} else ∅}
30 By definition, the set FOLLOW(X) can be computed with the following three steps:
1. Put $ into FOLLOW(S).
2. For each A → αBβ, add all non-λ elements of FIRST(β) to FOLLOW(B).
3. For each A → αB, or A → αBβ where λ ∈ FIRST(β), add all of FOLLOW(A) to FOLLOW(B).
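The three FOLLOW steps can likewise be repeated until no set changes. The sketch below is self-contained, so it recomputes FIRST before computing FOLLOW. All function names, the "λ" and "$" markers, and the list encoding of the grammar are assumptions for illustration, and every nonterminal is assumed to have at least one production.

```python
LAMBDA, END = "λ", "$"


def first_of_string(symbols, first):
    """FIRST of a symbol string, given FIRST of each single symbol."""
    result = set()
    for symbol in symbols:
        result |= first[symbol] - {LAMBDA}
        if LAMBDA not in first[symbol]:
            return result
    result.add(LAMBDA)          # every symbol (or none at all) can derive λ
    return result


def first_sets(productions, terminals):
    first = {t: {t} for t in terminals}
    for lhs, _ in productions:
        first.setdefault(lhs, set())
    changed = True
    while changed:
        changed = False
        for lhs, rhs in productions:
            new = first_of_string(rhs, first) - first[lhs]
            if new:
                first[lhs] |= new
                changed = True
    return first


def follow_sets(productions, terminals, start):
    first = first_sets(productions, terminals)
    follow = {lhs: set() for lhs, _ in productions}
    follow[start].add(END)                                  # step 1
    changed = True
    while changed:
        changed = False
        for lhs, rhs in productions:
            for i, symbol in enumerate(rhs):
                if symbol in terminals:
                    continue
                beta = first_of_string(rhs[i + 1:], first)
                new = beta - {LAMBDA}                       # step 2
                if LAMBDA in beta:                          # step 3 (β empty or nullable)
                    new = new | follow[lhs]
                if not new <= follow[symbol]:
                    follow[symbol] |= new
                    changed = True
    return follow


# Expression grammar: E → T E', E' → + T E' | λ, T → F T', T' → * F T' | λ, F → ( E ) | id
TERMINALS = {"+", "*", "(", ")", "id"}
PRODUCTIONS = [
    ("E", ["T", "E'"]),
    ("E'", ["+", "T", "E'"]), ("E'", []),
    ("T", ["F", "T'"]),
    ("T'", ["*", "F", "T'"]), ("T'", []),
    ("F", ["(", "E", ")"]), ("F", ["id"]),
]
FOLLOW = follow_sets(PRODUCTIONS, TERMINALS, "E")
for name in ["E", "E'", "T", "T'", "F"]:
    print(name, sorted(FOLLOW[name]))
```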
31 Grammar G is defined as follows:
E → TE’
E’ → +TE’ | λ
T → FT’
T’ → *FT’ | λ
F → (E) | id
Its FIRST and FOLLOW sets are:
Nonterminal   FIRST       FOLLOW
E             { (, id }   { $, ) }
E’            { +, λ }    { $, ) }
T             { (, id }   { +, $, ) }
T’            { *, λ }    { +, $, ) }
F             { (, id }   { *, +, $, ) }
32 Grammar Analysis Algorithms (Cont’d.)
Definition of C data structures and subroutines
- first_set[X] contains terminal symbols and λ; X is any single vocabulary symbol.
- follow_set[A] contains terminal symbols and λ; A is a nonterminal symbol.
33 It is a subroutine of fill_first_set()
35 The execution of fill_first_set() using grammar G0
G0:
E → Prefix(E)
E → V Tail
Prefix → F
Prefix → λ
Tail → +E
Tail → λ
37 The execution of fill_follow_set() using grammar G0
G0:
E → Prefix(E)
E → V Tail
Prefix → F
Prefix → λ
Tail → +E
Tail → λ
Resulting follow sets: Follow(E) = {$, )}, Follow(Prefix) = {(}, Follow(Tail) = {$, )}
38 More examples
S → aSe
S → B
B → bBe
B → C
C → cCe
C → d
The execution of fill_first_set() and fill_follow_set()
Follow sets: Follow(S) = {$, e}, Follow(B) = {$, e}, Follow(C) = {$, e}
39 More examples
S → ABc
A → a
A → λ
B → b
B → λ
The execution of fill_first_set() and fill_follow_set()
Follow sets: Follow(S) = {$}, Follow(A) = {b, c}, Follow(B) = {c}