PL Lab, DongGuk University Compiler Lecture Note, LL Parsing. Introduction to Compilers, Chapter 7: LL Parsing.


3 Contents
I. Deterministic Parsing
II. Recursive-descent Parser
III. Predictive Parser
IV. Construction of the Predictive Parsing Table
V. Strong LL(k) Grammars and LL(k) Grammars

4 I. Deterministic Parsing
▶ Deterministic top-down parsing ::= deterministic selection of the production rule to be applied at each step of top-down syntax analysis.
▶ One pass, no backup:
1. The input string is scanned once from left to right.
2. The parsing process is deterministic.
▶ Top-down parsing with no backup ::= deterministic top-down parsing, called LL parsing: "Left to right scanning and Left parse".

5 ▶ How to decide which production is to be applied:
sentential form : ω₁ω₂…ωᵢ₋₁Xα
input string : ω₁ω₂…ωᵢ₋₁ωᵢωᵢ₊₁…ωₙ
When X → α₁ | α₂ | … | αₖ ∈ P, the X-production to apply must be determined uniquely by looking at ωᵢ.
⇒ The condition for no backtracking (the LL condition) requires FIRST and FOLLOW.

6 FIRST
▶ FIRST(α) ::= the set of terminals that begin the strings derived from α. If α ⇒* ε, then ε is also in FIRST(α).
FIRST(A) ::= { a ∈ V_T ∪ {ε} | A ⇒* aβ, β ∈ V* }.
▶ Computation of FIRST(X), where X ∈ V.
1) if X ∈ V_T, then FIRST(X) = {X}
2) if X ∈ V_N and X → aα ∈ P, then FIRST(X) = FIRST(X) ∪ {a}
   if X → ε ∈ P, then FIRST(X) = FIRST(X) ∪ {ε}
3) if X → Y₁Y₂…Yₖ ∈ P and Y₁Y₂…Yᵢ₋₁ ⇒* ε, then FIRST(X) = FIRST(X) ∪ ((FIRST(Y₁) ∪ … ∪ FIRST(Yᵢ)) − {ε})
   if Y₁Y₂…Yₖ ⇒* ε, then FIRST(X) = FIRST(X) ∪ {ε}.
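The three rules above amount to a fixed-point iteration: keep applying them until no FIRST set grows. The following is a minimal Python sketch under assumed conventions that are not part of the lecture note: a grammar is a dictionary from nonterminals to lists of right-hand sides, `[]` stands for an ε-production, and the names `compute_first`, `EPS` and `expr_grammar` are illustrative.

```python
EPS = "ε"  # marker for the empty string (assumed representation)


def compute_first(grammar):
    """Return FIRST(X) for every nonterminal X of `grammar`.

    `grammar` maps each nonterminal to a list of right-hand sides; each
    right-hand side is a list of symbols, and [] is an ε-production.
    Any symbol that is not a key of `grammar` is treated as a terminal.
    """
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:  # repeat until no FIRST set grows (fixed point)
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                before = len(first[lhs])
                nullable_prefix = True
                for sym in rhs:
                    if sym not in grammar:  # terminal: rules 1 and 2
                        first[lhs].add(sym)
                        nullable_prefix = False
                        break
                    first[lhs] |= first[sym] - {EPS}  # rule 3
                    if EPS not in first[sym]:
                        nullable_prefix = False
                        break
                if nullable_prefix:  # the whole right-hand side derives ε
                    first[lhs].add(EPS)
                if len(first[lhs]) != before:
                    changed = True
    return first


# The expression grammar used in the lecture's examples.
expr_grammar = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
first = compute_first(expr_grammar)
```

Iterating to a fixed point avoids having to order the nonterminals by dependency; a worklist would be faster on large grammars but is not needed for a sketch.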

7 ex1) E → TE'
     E' → +TE' | ε
     T → FT'
     T' → *FT' | ε
     F → (E) | id
FIRST(E) = FIRST(T) = FIRST(F) = {(, id}
FIRST(E') = {+, ε}
FIRST(T') = {*, ε}
ex2) PROGRAM → begin d semi X end
     X → d semi X
     X → s Y
     Y → semi s Y | ε
FIRST(PROGRAM) = {begin}
FIRST(X) = {d, s}
FIRST(Y) = {semi, ε}
Text p.268

8 Exercise 7.4 (1) - p.299
Compute FIRST.
(1) S → aRTb | bRR
    R → cRd | ε
    T → RS | TaT

9 ▶ Left-dependency graph: the vertices are the terminal and nonterminal symbols, and the arcs go from X to Y if and only if X → X₁...XₙYα, where n ≥ 0 and each of X₁,...,Xₙ can produce the empty string.
ex) S → AB
    A → aA | ε
    B → bB | ε
(Figure: left-dependency graph over the vertices S, A, B, a, b.)
FIRST(S) = {a, ε, b}
FIRST(A) = {a, ε}
FIRST(B) = {b, ε}

10 ★ In general, for A → A₁A₂...Aₙ:
if A₁ is non-nullable: arc from A to A₁
if A₁ is nullable: arcs from A to A₁ and A₂
if A₁A₂ is nullable: arcs from A to A₁, A₂ and A₃

11 FOLLOW
▶ FOLLOW(A) ::= the set of terminals that can appear immediately to the right of A in some sentential form. If A can be the rightmost symbol in some sentential form, then $ is in FOLLOW(A). $ is the right end marker of the input.
::= { a ∈ V_T ∪ {$} | S ⇒* αAaβ, α, β ∈ V* }.
▶ Computation of FOLLOW(A)
1) FOLLOW(S) = {$}
2) if A → αBβ ∈ P and β ≠ ε, then FOLLOW(B) = FOLLOW(B) ∪ (FIRST(β) − {ε})
3) if A → αB ∈ P, or A → αBβ and β ⇒* ε, then FOLLOW(B) = FOLLOW(B) ∪ FOLLOW(A).
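The FOLLOW rules can be applied in the same fixed-point style as FIRST. The sketch below is one possible Python rendering, not the lecture's own code; the grammar representation (dictionary of right-hand sides, `[]` for ε) and the names `first_of`, `first_sets`, `follow_sets`, `EPS` and `END` are assumptions made for illustration.

```python
EPS, END = "ε", "$"  # assumed markers for the empty string and end of input


def first_of(symbols, first, grammar):
    """FIRST of a symbol string, using the FIRST sets computed so far."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in grammar else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol was nullable, or the string is empty
    return result


def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of(rhs, first, grammar)
                if not new <= first[lhs]:
                    first[lhs] |= new
                    changed = True
    return first


def follow_sets(grammar, start):
    first = first_sets(grammar)
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)  # rule 1
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue  # FOLLOW is only defined for nonterminals
                    beta_first = first_of(rhs[i + 1:], first, grammar)
                    new = beta_first - {EPS}  # rule 2
                    if EPS in beta_first:  # rule 3: the rest can vanish
                        new |= follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


expr_grammar = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
follow = follow_sets(expr_grammar, "E")
```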

12 ex) E → TE'
    E' → +TE' | ε
    T → FT'
    T' → *FT' | ε
    F → (E) | id
Nullable = { E', T' }
FIRST(E) = FIRST(T) = FIRST(F) = {(, id}
FIRST(E') = {+, ε}
FIRST(T') = {*, ε}
FOLLOW(E) = {), $}
FOLLOW(E') = {), $}
FOLLOW(T) = {+, ), $}
FOLLOW(T') = {+, ), $}
FOLLOW(F) = {*, +, ), $}
Text p.271

13 Exercise 7.4 (3) - p.299
Compute FOLLOW.
(3) S → aAa | ε
    A → abS | c

14 ▶ LL condition ::= no backup condition ::= the condition for deterministic parsing with the top-down method.
input : ω₁ω₂...ωᵢ₋₁ωᵢ...ωₙ
derived string : ω₁ω₂...ωᵢ₋₁Xα
X → α₁ | α₂ | ... | αₘ
⇒ By looking at ωᵢ, the rule used to expand X is selected deterministically among the X-productions.
★ ∀ A → α | β ∈ P,
1. FIRST(α) ∩ FIRST(β) = ∅
2. if β ⇒* ε, FOLLOW(A) ∩ FIRST(α) = ∅

15 ex) A → aBc | Bc | dAa
    B → bB | ε
FIRST(A) = {a, b, c, d}    FOLLOW(A) = {$, a}
FIRST(B) = {b, ε}          FOLLOW(B) = {c}
1) For A → aBc | Bc | dAa: FIRST(aBc) = {a}, FIRST(Bc) = {b, c} and FIRST(dAa) = {d} are pairwise disjoint.
2) For B → bB | ε: FIRST(bB) ∩ FOLLOW(B) = {b} ∩ {c} = ∅.
By 1) and 2), the grammar satisfies the LL condition.

16 II. Recursive-descent Parser
▶ Recursive-descent parsing ::= a top-down method that uses a set of recursive procedures to recognize its input with no backtracking.
▶ Create a procedure for each nonterminal.
ex) G : S → aA | bB
        A → aA | c
        B → bB | d
procedure pS;
begin
  if nextsymbol = qa then begin get_nextsymbol; pA end
  else if nextsymbol = qb then begin get_nextsymbol; pB end
  else error
end;

17 procedure pA;
begin
  if nextsymbol = qa then begin get_nextsymbol; pA end
  else if nextsymbol = qc then get_nextsymbol
  else error
end;
procedure pB; ...
(* main *)
begin
  get_nextsymbol;
  pS;
  if nextsymbol = '$' then accept else error
end.
ω = aac$
⇒ Procedure call sequence ::= leftmost derivation
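For readers who want to run the example, here is a hedged Python transcription of the procedures for G : S → aA | bB, A → aA | c, B → bB | d. It assumes single-character tokens and an appended `$` end marker; the class `Parser`, the exception `ParseError` and the helper `accepts` are illustrative names, and `pB` is filled in by analogy with `pA` since the slide elides it.

```python
class ParseError(Exception):
    """Raised when the input is not a sentence of the grammar."""


class Parser:
    def __init__(self, text):
        self.text = text + "$"  # append the end marker
        self.pos = 0

    @property
    def nextsymbol(self):
        return self.text[self.pos]

    def get_nextsymbol(self):
        self.pos += 1  # plays the role of the scanner

    def pS(self):  # S -> aA | bB
        if self.nextsymbol == "a":
            self.get_nextsymbol()
            self.pA()
        elif self.nextsymbol == "b":
            self.get_nextsymbol()
            self.pB()
        else:
            raise ParseError("S: expected 'a' or 'b'")

    def pA(self):  # A -> aA | c
        if self.nextsymbol == "a":
            self.get_nextsymbol()
            self.pA()
        elif self.nextsymbol == "c":
            self.get_nextsymbol()
        else:
            raise ParseError("A: expected 'a' or 'c'")

    def pB(self):  # B -> bB | d (assumed by analogy with pA)
        if self.nextsymbol == "b":
            self.get_nextsymbol()
            self.pB()
        elif self.nextsymbol == "d":
            self.get_nextsymbol()
        else:
            raise ParseError("B: expected 'b' or 'd'")

    def parse(self):
        self.pS()
        if self.nextsymbol != "$":
            raise ParseError("end of input expected")


def accepts(text):
    try:
        Parser(text).parse()
        return True
    except ParseError:
        return False
```

Calling `accepts("aac")` follows the call sequence pS, pA, pA, which mirrors the leftmost derivation S ⇒ aA ⇒ aaA ⇒ aac.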

18 ▶ The main problem in constructing a recursive-descent syntax analyzer is the choice of production when a procedure is first entered. To resolve this problem, we can compute the lookahead of each production.
LOOKAHEAD
▶ LOOKAHEAD of a production
Definition : LOOKAHEAD(A → α) = FIRST({ ω | S ⇒* μAγ ⇒ μαγ ⇒* μω ∈ V_T* }).
Meaning : the set of terminals that can begin a string generated by α; if α ⇒* ε, then FOLLOW(A) is added to the set.
Computing formula : LOOKAHEAD(A → X₁X₂...Xₙ) = FIRST(X₁X₂...Xₙ) ⊕ FOLLOW(A), where P ⊕ Q = P if ε ∉ P, and (P − {ε}) ∪ Q otherwise.

19 ex) S → aSA | ε
    A → c
Nullable set = {S}
FIRST(S) = {a, ε}    FOLLOW(S) = {$, c}
FIRST(A) = {c}       FOLLOW(A) = {$, c}
LOOKAHEAD(S → aSA) = FIRST(aSA) ⊕ FOLLOW(S) = {a}
LOOKAHEAD(S → ε) = FIRST(ε) ⊕ FOLLOW(S) = {$, c}
LOOKAHEAD(A → c) = FIRST(c) ⊕ FOLLOW(A) = {c}
Order of computation : Nullable ⇒ FIRST ⇒ FOLLOW ⇒ LOOKAHEAD

20 ▶ Strong LL condition
Definition : ∀ A → α | β ∈ P, LOOKAHEAD(A → α) ∩ LOOKAHEAD(A → β) = ∅.
Meaning : for each distinct pair of productions with the same left-hand side, the parser can select the unique alternative that derives a string beginning with the current input symbol.
Definition : the grammar G is said to be strong LL(1) if it satisfies the strong LL condition.
ex) G : S → aSA | ε
        A → c
LOOKAHEAD(S → aSA) = {a}
LOOKAHEAD(S → ε) = FOLLOW(S) = {$, c}
LOOKAHEAD(S → aSA) ∩ LOOKAHEAD(S → ε) = ∅
⇒ G is strong LL(1).
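The LOOKAHEAD formula and the strong LL condition are small enough to check mechanically. The sketch below takes the FIRST and FOLLOW sets of the example grammar as given inputs rather than computing them; `ring_sum`, `lookahead` and `is_strong_ll1` are hypothetical helper names, and the ε-replacing union is written out explicitly.

```python
from itertools import combinations

EPS = "ε"  # assumed marker for the empty string


def ring_sum(p, q):
    """Return p if it has no ε; otherwise replace ε in p by the set q."""
    return set(p) if EPS not in p else (p - {EPS}) | q


def lookahead(first_of_rhs, follow_of_lhs):
    """LOOKAHEAD(A -> alpha) from FIRST(alpha) and FOLLOW(A)."""
    return ring_sum(first_of_rhs, follow_of_lhs)


def is_strong_ll1(lookaheads_by_lhs):
    """True if the LOOKAHEAD sets of each nonterminal are pairwise disjoint."""
    return all(
        not (x & y)
        for sets in lookaheads_by_lhs.values()
        for x, y in combinations(sets, 2)
    )


# Example grammar: S -> aSA | ε, A -> c (FIRST/FOLLOW values from the slide).
follow = {"S": {"$", "c"}, "A": {"$", "c"}}
lookaheads = {
    "S": [lookahead({"a"}, follow["S"]), lookahead({EPS}, follow["S"])],
    "A": [lookahead({"c"}, follow["A"])],
}
```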

21 ▶ Implementation of a recursive-descent parser
If a grammar is strong LL(1), we can construct a parser for sentences of the grammar using the following scheme.
∀ a ∈ V_T,
procedure pa; (* get_nextsymbol = scanner *)
begin
  if nextsymbol = qa then get_nextsymbol
  else error
end;
get_nextsymbol : the scanner routine; it reads one token from the input stream and assigns it to the variable nextsymbol.

22 ∀ A ∈ V_N,
procedure pA;
var i: integer;
begin
  case nextsymbol of
    LOOKAHEAD(A → X₁X₂...Xₘ): for i := 1 to m do pXᵢ;
    LOOKAHEAD(A → Y₁Y₂...Yₙ): for i := 1 to n do pYᵢ;
      :
    LOOKAHEAD(A → Z₁Z₂...Zᵣ): for i := 1 to r do pZᵢ;
    LOOKAHEAD(A → ε): ;
    otherwise: error
  end (* case *)
end;
Text p.278
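The scheme above can also be interpreted generically instead of writing one procedure per symbol by hand. The following Python sketch assumes single-character tokens and takes the productions together with their LOOKAHEAD sets as input; `make_parser` and its parameters are illustrative names, not part of the lecture note. It returns the left parse as a list of production numbers.

```python
def make_parser(productions, lookaheads, start):
    """Build a recursive-descent recognizer from numbered productions.

    productions: list of (lhs, rhs) pairs, numbered from 1 in list order.
    lookaheads:  list of LOOKAHEAD sets, in the same order.
    """
    nonterminals = {lhs for lhs, _ in productions}

    def parse(text):
        tokens = list(text) + ["$"]
        pos = 0
        output = []  # left parse: production numbers in order of use

        def p(sym):
            nonlocal pos
            if sym not in nonterminals:  # terminal procedure "pa"
                if tokens[pos] != sym:
                    raise SyntaxError(f"expected {sym!r}, found {tokens[pos]!r}")
                pos += 1
                return
            # nonterminal procedure "pA": case on nextsymbol
            for number, ((lhs, rhs), la) in enumerate(
                zip(productions, lookaheads), start=1
            ):
                if lhs == sym and tokens[pos] in la:
                    output.append(number)
                    for x in rhs:
                        p(x)
                    return
            raise SyntaxError(f"no production of {sym} for {tokens[pos]!r}")

        p(start)
        if tokens[pos] != "$":
            raise SyntaxError("end of input expected")
        return output

    return parse


# 1. S -> aSA   2. S -> ε   3. A -> c, with the LOOKAHEAD sets of the example.
parse = make_parser(
    [("S", ["a", "S", "A"]), ("S", []), ("A", ["c"])],
    [{"a"}, {"$", "c"}, {"c"}],
    "S",
)
```

For the input `aacc`, the returned list corresponds to the leftmost derivation S ⇒ aSA ⇒ aaSAA ⇒ aaAA ⇒ aacA ⇒ aacc.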

23 ▶ Improving the efficiency and structure of a recursive-descent parser
1) Eliminating terminal procedures ::= In practice it is better not to write a procedure for each terminal. Instead, the action of advancing the input marker can always be initiated by the nonterminal procedures. In this way many redundant tests can be eliminated. ex) text p.279 [Example 9]
2) BNF ⇒ EBNF : reduces the number of productions and nonterminals.
① repetitive part : { }
② optional part : [ ]
③ alternation : ( | )

24 ex) [Example 10] --- text p.281
<IF> ::= 'if' <C> 'then' <S> [ 'else' <S> ]
procedure pIF;
begin
  if nextsymbol = qif then begin
    get_nextsymbol;
    pC;
    if nextsymbol = qthen then begin get_nextsymbol; pS end
    else error(10)
  end
  else error(20);
  if nextsymbol = qelse then begin get_nextsymbol; pS end
end;

25 ex) [Example 11] --- text p.281
<ID_LIST> ::= 'id' { ',' 'id' }
procedure pID_LIST;
begin
  if nextsymbol = qid then begin
    get_nextsymbol;
    while (nextsymbol = qcomma) do begin
      get_nextsymbol;
      if nextsymbol = qid then get_nextsymbol
      else error
    end
  end
  else error
end;
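The EBNF repetition `{ ... }` maps directly onto a `while` loop. As a small runnable illustration, the Python sketch below recognizes an identifier list given as a list of token strings; the token spelling (`"id"`, `","`) and the function name `parse_id_list` are assumptions made for the example.

```python
def parse_id_list(tokens):
    """Recognize 'id' { ',' 'id' } at the start of `tokens`.

    Returns the number of tokens consumed; raises SyntaxError otherwise.
    """
    pos = 0

    def nextsymbol():
        # Past the end of the token list, behave as if '$' had been read.
        return tokens[pos] if pos < len(tokens) else "$"

    if nextsymbol() != "id":
        raise SyntaxError("identifier expected")
    pos += 1
    while nextsymbol() == ",":  # the repetitive part { ',' 'id' }
        pos += 1
        if nextsymbol() != "id":
            raise SyntaxError("identifier expected after ','")
        pos += 1
    return pos
```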

26 [Exercise 7.8 (2)] --- Text p.300
Convert the following grammar to extended BNF and write the corresponding procedure for a recursive-descent parser.
<D> ::= 'label' <L> | 'integer' <L>
<L> ::= <id> <R>
<R> ::= ';' | ',' <L>
⇒ <L> ::= <id> ( ',' <id> )* ';'
⇒ <D> ::= ( 'label' | 'integer' ) <id> { ',' <id> } ';'

27 procedure pD;
begin
  if nextsymbol in [qlabel, qinteger] then begin
    get_nextsymbol;
    if nextsymbol = qid then begin
      get_nextsymbol;
      while (nextsymbol = qcomma) do begin
        get_nextsymbol;
        if nextsymbol = qid then get_nextsymbol
        else error(3)
      end
    end
    else error(2);
    if nextsymbol = qsemi then get_nextsymbol
    else error(4)
  end
  else error(1)
end;

28 Programming Assignment #1
Implement a recursive-descent syntax analyzer for the grammar given in exercise 5.30 (text p.224).
Problem specifications
- input : an SPL program that finds a minimum and a maximum.
- output : left parse
- methods :
  (1) write the get_nextsymbol routine.
  (2) compute LOOKAHEADs for each production.
  (3) create a procedure for each nonterminal.
  (4) assemble the procedures with the main program.
(Figure: a set of productions → computation of LOOKAHEADs → LOOKAHEADs for each nonterminal.)

29 III. Predictive Parsing
▶ Predictive parsing ::= a deterministic parsing method using a stack. The stack contains a sequence of grammar symbols.
▶ Model of a predictive parser
(Figure: an input buffer holding ω$, a stack with $ at the bottom, the driver routine, the parsing table, and the output.)

30 Parsing is driven by the relation between the current input symbol and the stack top symbol.
The input buffer contains the string to be parsed, followed by $.
Initial configuration : STACK $S, INPUT ω$
Parsing table (LL) : determines the parsing action.
※ M[X,a] = r : when the stack top symbol is X and the current input symbol is a, expand by production r. The rows of the table are nonterminals and the columns are terminals.

31 ▶ Parsing Actions
X : stack top symbol, a : current input symbol
1. if X = a = $, then accept.
2. if X = a, then pop X and advance input.
3. if X ∈ V_N, then if M[X,a] = r (X → α), then replace X by α, else error.

32 ▶ Predictive parsing algorithm
set ip to point to the first symbol of ω$;
repeat
  let X be the top stack symbol and a the symbol pointed to by ip;
  if X is a terminal or $ then
    if X = a then pop X from the stack and advance ip
    else error(1)
  else /* X is nonterminal */
    if M[X,a] = X → Y₁Y₂...Yₖ then begin
      pop X from the stack;
      push Yₖ, Yₖ₋₁, ..., Y₁ onto the stack, with Y₁ on top;
      output the production X → Y₁Y₂...Yₖ
    end
    else error(2)
until X = $ /* stack is empty */
Text p.284

33 ex) G : 1. S → aSb
        2. S → bA
        3. A → aA
        4. A → b
string : aabbbb
Parsing Table (rows: nonterminals, columns: terminals):
      a    b
S     1    2
A     3    4

34 STACK     INPUT      ACTION               OUTPUT
$S        aabbbb$    expand 1             1
$bSa      aabbbb$    pop a and advance
$bS       abbbb$     expand 1             1
$bbSa     abbbb$     pop a and advance
$bbS      bbbb$      expand 2             2
$bbAb     bbbb$      pop b and advance
$bbA      bbb$       expand 4             4
$bbb      bbb$       pop b and advance
$bb       bb$        pop b and advance
$b        b$         pop b and advance
$         $          accept
※ How to construct a predictive parsing table for the grammar.
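The trace above can be reproduced with a short driver. This is a minimal Python sketch of the stack algorithm, assuming single-character tokens; the function name `predictive_parse` and the dictionary encodings of the productions and of the table M are illustrative choices, not the lecture's own data structures.

```python
def predictive_parse(table, productions, start, text):
    """Table-driven LL(1) driver; returns the left parse as a list."""
    nonterminals = {lhs for lhs, _ in productions.values()}
    stack = ["$", start]  # the top of the stack is the end of the list
    tokens = list(text) + ["$"]
    ip = 0
    output = []
    while True:
        x, a = stack[-1], tokens[ip]
        if x == "$" and a == "$":
            return output  # accept
        if x not in nonterminals:  # terminal or $ on top of the stack
            if x != a:
                raise SyntaxError(f"expected {x!r}, found {a!r}")
            stack.pop()
            ip += 1
        else:
            r = table.get((x, a))
            if r is None:  # blank table entry
                raise SyntaxError(f"no entry M[{x},{a}]")
            stack.pop()
            stack.extend(reversed(productions[r][1]))  # leftmost symbol on top
            output.append(r)


productions = {
    1: ("S", ["a", "S", "b"]),
    2: ("S", ["b", "A"]),
    3: ("A", ["a", "A"]),
    4: ("A", ["b"]),
}
table = {("S", "a"): 1, ("S", "b"): 2, ("A", "a"): 3, ("A", "b"): 4}
left_parse = predictive_parse(table, productions, "S", "aabbbb")
```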

35 IV. Construction of the Predictive Parsing Table
▶ Main idea : If A → α is a production with a in FIRST(α), then the parser will expand A by α when the current input symbol is a. And if α ⇒* ε, then we should again expand A by α when the current input symbol is in FOLLOW(A).
▶ Parsing table (LL) :
M[X,a] = r : expand X with the r-th production
blank : error
Rows are indexed by X ∈ V_N and columns by a ∈ V_T.

36 ▶ Algorithm : for each production A → α,
1. ∀ a ∈ FIRST(α), M[A,a] := A → α.
2. if α ⇒* ε, then ∀ b ∈ FOLLOW(A), M[A,b] := A → α.
ex) G : 1. E → TE'
        2. E' → +TE'
        3. E' → ε
        4. T → FT'
        5. T' → *FT'
        6. T' → ε
        7. F → (E)
        8. F → id
FIRST(E) = FIRST(T) = FIRST(F) = { (, id }
FIRST(E') = { +, ε }
FIRST(T') = { *, ε }
FOLLOW(E) = FOLLOW(E') = { ), $ }
FOLLOW(T) = FOLLOW(T') = { +, ), $ }
FOLLOW(F) = { +, *, ), $ }

37 Parsing Table (rows: nonterminals, columns: terminals):
       id    +    *    (    )    $
E      1               1
E'           2              3    3
T      4               4
T'           6    5         6    6
F      8               7
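The two steps of the table-construction algorithm translate almost line for line into code. The Python sketch below takes the FIRST and FOLLOW sets of the expression grammar as given inputs and stores a list of production numbers in each cell so that multiply-defined entries stay visible; `first_of`, `build_table` and `is_ll1` are hypothetical helper names.

```python
EPS = "ε"  # assumed marker for the empty string


def first_of(symbols, first):
    """FIRST of a symbol string; symbols without an entry are terminals."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol was nullable, or the string is empty
    return result


def build_table(productions, first, follow):
    """Return M as a dict mapping (A, a) to a list of production numbers."""
    table = {}
    for number, (lhs, rhs) in enumerate(productions, start=1):
        alpha_first = first_of(rhs, first)
        targets = alpha_first - {EPS}  # step 1
        if EPS in alpha_first:  # step 2
            targets |= follow[lhs]
        for a in targets:
            table.setdefault((lhs, a), []).append(number)
    return table


def is_ll1(table):
    """A grammar is LL(1) if no table entry is multiply defined."""
    return all(len(entry) == 1 for entry in table.values())


productions = [
    ("E", ["T", "E'"]), ("E'", ["+", "T", "E'"]), ("E'", []),
    ("T", ["F", "T'"]), ("T'", ["*", "F", "T'"]), ("T'", []),
    ("F", ["(", "E", ")"]), ("F", ["id"]),
]
first = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}
follow = {
    "E": {")", "$"}, "E'": {")", "$"},
    "T": {"+", ")", "$"}, "T'": {"+", ")", "$"},
    "F": {"+", "*", ")", "$"},
}
table = build_table(productions, first, follow)
```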

38 ▶ LL(1) grammar ::= a grammar whose parsing table has no multiply-defined entries.
If an entry is multiply defined, the parser cannot decide which rule to expand by, so the input cannot be parsed deterministically.
▶ LL(1) condition : ∀ A → α | β,
1. FIRST(α) ∩ FIRST(β) = ∅.
2. if β ⇒* ε, then FOLLOW(A) ∩ FIRST(α) = ∅.
ex) G : 1. S → iCtSS'
        2. S → a
        3. S' → eS
        4. S' → ε
        5. C → b
FIRST(S) = {i, a}     FOLLOW(S) = {$, e}
FIRST(S') = {e, ε}    FOLLOW(S') = {$, e}
FIRST(C) = {b}        FOLLOW(C) = {t}

39 Parsing Table:
       a    b    e      i    t    $
S      2                1
S'               3,4              4
C           5
M[S',e] := {3, 4}, so the entry is multiply defined. When the stack top is S' and the input symbol is e, the parser cannot tell whether to expand by rule 3 or by rule 4. Therefore G is not an LL(1) grammar.
ex) [Example 15] --- text p.291
G : S → aA | abA
    A → Ab | a
ω : abab

40 V. Strong LL(k) and LL(k) Grammars
▶ FIRSTₖ(α) = { ω | α ⇒* ωβ and |ω| = k, or α ⇒* ω and |ω| < k }
strong LL(k)
▶ G is said to be strong LL(k), for some fixed integer k > 0, if whenever there are two leftmost derivations
1. S ⇒* ωAγ ⇒ ωαγ ⇒* ωx ∈ V_T*, and
2. S ⇒* ω′Aγ′ ⇒ ω′βγ′ ⇒* ω′y ∈ V_T*
such that
3. FIRSTₖ(x) = FIRSTₖ(y),
it follows that
4. α = β.
▶ Meaning : Suppose we consider any state of the parse in which A is the nonterminal currently being parsed and FIRSTₖ(x) is the k-lookahead at the current point. Then, if the k-lookahead is the same, the two productions A → α and A → β are identical. Any other information provided by the closed portion and the open portion of the current state of the parse is disregarded.

41 ▶ S ⇒* ωAγ, ω : closed portion, γ : open portion
▶ Two states of the parse:
S ⇒* ωAγ ⇒ ωαγ ⇒* ωx
S ⇒* ω′Aγ′ ⇒ ω′βγ′ ⇒* ω′y
FIRSTₖ(x) = FIRSTₖ(y) ===> α = β.

42 ▶ Def) LL(k) grammar : whenever there are two leftmost derivations
1. S ⇒* ωAγ ⇒ ωαγ ⇒* ωx ∈ V_T*, and
2. S ⇒* ωAγ ⇒ ωβγ ⇒* ωy ∈ V_T*
such that
3. FIRSTₖ(x) = FIRSTₖ(y),
it follows that
4. α = β.
ex) S → aAaa | bAba
    A → b | ε
(Figure: derivation trees for S → aAaa and S → bAba, each with A → b and A → ε.)
When the lookahead is ba, which of A → b and A → ε should be chosen? If the symbol already seen is a, choose A → b; if it is b, choose A → ε. Therefore the grammar is not SLL(2), but it is LL(2).
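The difference between the two definitions can be made concrete for this example with k = 2. The sketch below is hand-specialized to the grammar S → aAaa | bAba, A → b | ε and is not a general LL(k) test: the strings derivable from each A-alternative and the terminal strings that follow A in each left context are written down by hand, and the names `concat_k`, `alternatives`, `right_contexts`, `strong_ll_k` and `ll_k` are illustrative.

```python
K = 2


def concat_k(xs, ys, k=K):
    """k-truncated concatenation of two sets of terminal strings."""
    return {(x + y)[:k] for x in xs for y in ys}


# Terminal strings derived by each alternative of A.
alternatives = {"A -> b": {"b"}, "A -> ε": {""}}

# What follows A, keyed by the closed portion already read:
# S -> aAaa leaves "aa" after A, S -> bAba leaves "ba" after A.
right_contexts = {"a": {"aa"}, "b": {"ba"}}


def strong_ll_k():
    """Strong LL(k): lookaheads are compared over all contexts merged."""
    follow_k = set().union(*right_contexts.values())
    la_b, la_eps = (concat_k(s, follow_k) for s in alternatives.values())
    return not (la_b & la_eps)


def ll_k():
    """LL(k): lookaheads are compared separately within each context."""
    for context in right_contexts.values():
        la_b, la_eps = (concat_k(s, context) for s in alternatives.values())
        if la_b & la_eps:
            return False
    return True
```

With the contexts merged, both alternatives admit the lookahead `ba`, which is the conflict described above; keeping the contexts apart removes it.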

43 ▶ SLL(k) and LL(k)
(Figure: relationship between the SLL(k) and LL(k) grammar classes.)
▶ strong LL(1) ⇔ LL(1)
Proof) (⇒) clear!
(⇐) Suppose that G is not strong LL(1). Then, by definition, there are two distinct productions A → α and A → β such that
S ⇒* ω₁Aγ₁ ⇒ ω₁αγ₁ ⇒* ω₁x₁γ₁ ⇒* ω₁x₁y₁
S ⇒* ω₂Aγ₂ ⇒ ω₂βγ₂ ⇒* ω₂x₂γ₂ ⇒* ω₂x₂y₂
and FIRST(x₁y₁) = FIRST(x₂y₂).

44 Now we must prove that G is not LL(1).
1) x₁ = x₂ = ε : G is not LL(1). Indeed, it is ambiguous.
2) One (or both) of x₁ and x₂ is not ε, say x₁ ≠ ε. Then FIRST₁(x₁y₁) = FIRST₁(x₁) = FIRST₁(x₂y₂). But then
S ⇒* ω₂Aγ₂ ⇒ ω₂αγ₂ ⇒* ω₂x₁γ₂ ⇒* ω₂x₁y₂
S ⇒* ω₂Aγ₂ ⇒ ω₂βγ₂ ⇒* ω₂x₂γ₂ ⇒* ω₂x₂y₂
satisfy the property FIRST₁(x₁y₂) = FIRST₁(x₁) = FIRST₁(x₂y₂). Thus, by definition, G is not LL(1).

