Fall Compiler Principles Lecture 4: Parsing part 3

1 Fall 2015-2016 Compiler Principles Lecture 4: Parsing part 3
Roman Manevich Ben-Gurion University

2 Tentative syllabus
- Front End: Scanning, Top-down Parsing (LL), Bottom-up Parsing (LR)
- Intermediate Representation: Operational Semantics, Lowering
- Optimizations: Dataflow Analysis, Loop Optimizations
- Code Generation: Register Allocation, Instruction Selection

3 Previously
- Top-down parsing
- Recursive descent
- Handling conflicts
- LL(k) via pushdown automata

4 Agenda
- Shift-reduce (LR) parsing model
- Building the LR parsing table
- Types of conflicts

5 Shift-reduce parsing

6 Some terminology
- The opposite of derivation is called reduction
- Let A → α be a production rule
- Let βAµ be a sentential form
- A reduction replaces α with A, turning βαµ into βAµ
- A handle is a substring of a sentential form that is reduced during a series of steps in a rightmost derivation

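7 Using shift and reduce to parse
Grammar: E → E + (E), E → i
Input: 1 + (2) + (3)

Stack      Input            Action
           1 + (2) + (3)    shift
1          + (2) + (3)      reduce
E          + (2) + (3)      shift
E +        (2) + (3)        shift
E + (      2) + (3)         shift
E + (2     ) + (3)          reduce
E + (E     ) + (3)          shift
E + (E)    + (3)            reduce
E          + (3)            shift
E +        (3)              shift
E + (      3)               shift
E + (3     )                reduce
E + (E     )                shift
E + (E)                     reduce
E                           accept

On each step we either:
- shift a symbol from the input to the stack, or
- reduce symbols on the stack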
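The trace above can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: the function names are made up for illustration, every digit is treated as the token i, and the parser reduces greedily whenever the top of the stack matches a right-hand side. Greedy reduction happens to work for this small grammar; a real LR parser consults a table instead.

```python
# Grammar from the slide: E -> E + ( E ) | i
RULES = [("E", ["E", "+", "(", "E", ")"]), ("E", ["i"])]

def tokenize(text):
    # Assumption: every digit stands for the token i; blanks are skipped.
    return ["i" if ch.isdigit() else ch for ch in text if not ch.isspace()]

def shift_reduce(text):
    """Return (accepted, list of actions) for a greedy shift-reduce run."""
    stack, tokens, trace = [], tokenize(text), []
    while True:
        # Look for a rule whose right-hand side is on top of the stack.
        rule = next(((l, r) for l, r in RULES if stack[-len(r):] == r), None)
        if rule:
            lhs, rhs = rule
            del stack[-len(rhs):]      # pop the handle
            stack.append(lhs)          # push the nonterminal
            trace.append("reduce")
        elif tokens:
            stack.append(tokens.pop(0))
            trace.append("shift")
        else:
            break
    accepted = stack == ["E"]
    trace.append("accept" if accepted else "error")
    return accepted, trace

if __name__ == "__main__":
    ok, actions = shift_reduce("1 + (2) + (3)")
    print(ok, actions)
```
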

8 How will the parser know what to do?
- A state will keep the info gathered so far
- A stack will maintain formerly reduced handles and partially reduced handles
- A table will tell it “what to do” based on the current state, the symbol on top of the stack, and the k next tokens (k≥0)

9 Model of an LR parser
[Figure: the LR parsing program reads the input (for example id + … $), maintains a stack whose top holds the current state, consults the parser table, and produces the output.]

10 States and LR(0) items
Grammar: E → E * B | E + B | B, B → 0 | 1
- The state will “remember” the potential derivation rules given the part that was already identified
- For example, if we have already identified E then the state will remember the two alternatives: (1) E → E * B, (2) E → E + B
- Actually, we will also remember where we are in each of them: (1) E → E ● * B, (2) E → E ● + B
- A derivation rule with a location marker is called an LR(0) item
- The state is actually a set of LR(0) items. For example: q13 = { E → E ● * B, E → E ● + B }

11 Intuition
- We gather the input token by token until we find a right-hand side of a rule, and then we replace it with the nonterminal on the left side
- Going over a token and remembering it in the stack is a shift
- Each shift moves to a state that remembers what we’ve seen so far
- A reduce replaces a string in the stack with the nonterminal that derives it

12 Why do we need the stack?
Grammar: E → E + (E), E → i
Input: 1 + (2) + (3)
- Suppose so far we have discovered E → 1 and gather information on “E +”
- In the given grammar this can only mean E → E + ● (E)
- Suppose state q represents this possibility
- Now, the next token is (, and we need to ignore q for a minute, and work on E → 2 to obtain E+(E)
- Therefore, we push q to the stack, and after identifying E, we pop it to continue

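13 LR parser stack
[Figure: the LR parsing program with its input (id + … $), output, and a stack that alternates states and symbols (here 5, T, 2, +, 7). The current state and the next input symbol index the action and goto parts of the table.]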

14 LR parsing table
State 0 is the initial state. Each row is a state; the columns are split into terminals (shift/reduce actions) and non-terminals (goto part).
- sn: shift and move to state n
- rk: reduce by rule k
- gm: goto state m
- acc: accept
- empty entry: error

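15 LR(0) parser table example
Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )

        action                        goto
STATE   id    +     (     )     $     E     T
0       s5          s7                g1    g6
1             s3                s2
2       acc   acc   acc   acc   acc
3       s5          s7                      g4
4       r3    r3    r3    r3    r3
5       r4    r4    r4    r4    r4
6       r2    r2    r2    r2    r2
7       s5          s7                g8    g6
8             s3          s9
9       r5    r5    r5    r5    r5

A reduce state always has an entire row of rk. Any other state has an entire row of shifts and gotos (possibly accept).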

16 LR parser moves

17 Shift move
If action[q, t] = sn then push t, push n (q is the current state, t is the next input token, n is the next state).

18 Result of shift
After action[q, t] = sn, the stack ends with q t n: the token t has moved from the input to the stack, and n is the current state.

19 Reduce move
If action[qn, t] = rk, where rule (k) is A → σ1 … σn:
- The top of the stack looks like q σ1 q1 … σn qn
- Pop the 2·n entries σ1 q1 … σn qn, exposing state q
- If goto[q, A] = q’ then push A, push q’

20 Result of reduce move
After reducing by rule (k) A → σ1 … σn, the stack ends with q A q’, where q’ = goto[q, A]. The input is unchanged.

21 Accept move
If action[q, t] = accept then parsing is completed.

22 Error move
If action[q, t] = error then parsing has discovered a syntactic error.

23 Example of Shift-reduce parser run

24–35 Parsing id+id$
Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )
Initialize with state 0, then follow the parser table:

Stack              Input        Action
0                  id + id $    s5
0 id 5             + id $       r4
0 T 6              + id $       r2
0 E 1              + id $       s3
0 E 1 + 3          id $         s5
0 E 1 + 3 id 5     $            r4
0 E 1 + 3 T 4      $            r3
0 E 1              $            s2
0 E 1 $ 2                       acc

Each reduce is carried out in two steps. For the first r4: pop id 5, then push T 6 (the goto entry for state 0 and T is g6).
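The moves from the previous slides and the run on id+id$ can be put together in a small table-driven driver. The sketch below is illustrative only: the dictionary encoding of the table and the names ACTION, GOTO, REDUCE_STATES and parse are assumptions made here, and the entries are the ones of the LR(0) table for this grammar (states 4, 5, 6, 9 reduce regardless of the input, state 2 accepts).

```python
# Rule number -> (left-hand side, length of right-hand side)
RULES = {2: ("E", 1), 3: ("E", 3), 4: ("T", 1), 5: ("T", 3)}

# Shift entries of the action table: state -> {token: "s<n>"}
ACTION = {
    0: {"id": "s5", "(": "s7"},
    1: {"+": "s3", "$": "s2"},
    3: {"id": "s5", "(": "s7"},
    7: {"id": "s5", "(": "s7"},
    8: {"+": "s3", ")": "s9"},
}
# LR(0) reduce states: the whole row is r<k>, so no lookahead is consulted.
REDUCE_STATES = {4: 3, 5: 4, 6: 2, 9: 5}
GOTO = {0: {"E": 1, "T": 6}, 3: {"T": 4}, 7: {"E": 8, "T": 6}}
ACCEPT_STATE = 2

def parse(tokens):
    """Return (accepted, list of actions taken)."""
    stack = [0]                      # state, symbol, state, symbol, ...
    pos, trace = 0, []
    while True:
        state = stack[-1]
        if state == ACCEPT_STATE:
            trace.append("acc")
            return True, trace
        if state in REDUCE_STATES:
            k = REDUCE_STATES[state]
            lhs, n = RULES[k]
            del stack[-2 * n:]       # pop n symbols and n states
            stack += [lhs, GOTO[stack[-1]][lhs]]
            trace.append(f"r{k}")
            continue
        token = tokens[pos] if pos < len(tokens) else None
        act = ACTION.get(state, {}).get(token)
        if act is None:              # empty table entry
            trace.append("error")
            return False, trace
        stack += [token, int(act[1:])]   # push t, push n
        pos += 1
        trace.append(act)

if __name__ == "__main__":
    print(parse(["id", "+", "id", "$"]))
```

The driver itself never looks at the grammar's right-hand sides; it only needs each rule's length and left-hand side, which is why RULES stores nothing else.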

36 Constructing an LR(0) parsing table

37 Overall process
- Construct a (determinized) transition diagram from LR(0) items
- If there are conflicts, stop: the grammar is not LR(0)
- Otherwise, fill table entries from the diagram

38 LR(0) item
N → α●β
- α: already matched
- β: to be matched against the input
Hypothesis about αβ being a possible handle: so far we’ve matched α, expecting to see β

39 LR(0) items
- N → α●β: shift item
- N → αβ●: reduce item

40 LR(0) items enumeration example
Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )
All items can be obtained by placing a dot at every position for every production:
1: S → ●E$
2: S → E●$
3: S → E$●
4: E → ●T
5: E → T●
6: E → ●E + T
7: E → E● + T
8: E → E +●T
9: E → E + T●
10: T → ●id
11: T → id●
12: T → ●(E)
13: T → (●E)
14: T → (E●)
15: T → (E)●
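The enumeration can be sketched in a few lines. The representation below, an item as a tuple (left-hand side, right-hand side, dot position), is an assumption chosen for illustration, as are the names lr0_items and show.

```python
# Grammar from the slide, one (lhs, rhs) pair per production.
GRAMMAR = [
    ("S", ("E", "$")),
    ("E", ("T",)),
    ("E", ("E", "+", "T")),
    ("T", ("id",)),
    ("T", ("(", "E", ")")),
]

def lr0_items(grammar):
    """Place a dot at every position of every production."""
    return [(lhs, rhs, dot)
            for lhs, rhs in grammar
            for dot in range(len(rhs) + 1)]

def show(item):
    lhs, rhs, dot = item
    return f"{lhs} → {' '.join(rhs[:dot] + ('●',) + rhs[dot:])}"

if __name__ == "__main__":
    for number, item in enumerate(lr0_items(GRAMMAR), start=1):
        print(f"{number}: {show(item)}")
```

A production with n symbols on its right-hand side yields n + 1 items, so this grammar has 3 + 2 + 4 + 2 + 4 = 15 items.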

41 Operations for transition diagram construction
- Initial = {S’ → ●S$}
- For an item set I solve: Closure(I) = Closure(I ∪ {X → ●µ | X → µ is in the grammar and N → α●Xβ is in I})
- Goto(I, σ) = {N → ασ●β | N → α●σβ in I}, where σ is either a terminal or a nonterminal
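A minimal sketch of Closure and Goto, assuming items are encoded as (left-hand side, right-hand side, dot position) tuples and using the expression grammar of this lecture. The function names and the worklist formulation are choices made here for illustration; the worklist simply keeps adding X → ●µ items until nothing new appears, which is one way of solving the Closure equation.

```python
GRAMMAR = [
    ("S", ("E", "$")),
    ("E", ("T",)),
    ("E", ("E", "+", "T")),
    ("T", ("id",)),
    ("T", ("(", "E", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    """Add X -> ●µ for every nonterminal X that appears right after a dot."""
    result = set(items)
    work = list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for lhs2, rhs2 in GRAMMAR:
                new_item = (lhs2, rhs2, 0)
                if lhs2 == rhs[dot] and new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return frozenset(result)

def goto(items, symbol):
    """Move the dot over `symbol` in every item where it follows the dot."""
    return frozenset((lhs, rhs, dot + 1)
                     for lhs, rhs, dot in items
                     if dot < len(rhs) and rhs[dot] == symbol)

if __name__ == "__main__":
    q0 = closure({("S", ("E", "$"), 0)})
    print(sorted(q0))
    print(sorted(goto(q0, "E")))
```

Item sets are returned as frozensets so that they can later serve as dictionary keys or set members when states are compared.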

42 Initial example
Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )
Initial = {S → ●E $}

43 Closure example
Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )
Initial = {S → ●E $}
Closure({S → ●E $}) = { S → ●E $, E → ●T, E → ●E + T, T → ●id, T → ●( E ) }

44 Goto example
Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )
Initial = {S → ●E $}
Closure({S → ●E $}) = { S → ●E $, E → ●T, E → ●E + T, T → ●id, T → ●( E ) }
Goto({S → ●E $, E → ●E + T, T → ●id}, E) = {S → E●$, E → E●+ T}

45 Constructing the transition diagram
- Start with state 0 containing the item set Closure({S → ●E $})
- Repeat until no new states are discovered: for every state p containing item set Ip, and every symbol N, compute state q containing item set Iq = Closure(Goto(Ip, N))
- Why does it terminate?
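The construction loop can be sketched as a worklist over item sets. Everything named here (build_states, the tuple encoding of items, the order in which states are numbered) is an assumption for illustration, so the state numbers it produces need not match the q-numbers used in the slides. As a hint for the termination question: every state is a subset of the finite set of LR(0) items, so only finitely many distinct states can ever be discovered.

```python
GRAMMAR = [
    ("S", ("E", "$")),
    ("E", ("T",)),
    ("E", ("E", "+", "T")),
    ("T", ("id",)),
    ("T", ("(", "E", ")")),
]

def closure(grammar, items):
    nonterminals = {lhs for lhs, _ in grammar}
    result, work = set(items), list(items)
    while work:
        _, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in nonterminals:
            for lhs2, rhs2 in grammar:
                new_item = (lhs2, rhs2, 0)
                if lhs2 == rhs[dot] and new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return frozenset(result)

def goto(items, symbol):
    return frozenset((l, r, d + 1) for l, r, d in items
                     if d < len(r) and r[d] == symbol)

def build_states(grammar):
    """Return (list of item sets, {(state, symbol): target state})."""
    symbols = sorted({s for _, rhs in grammar for s in rhs})
    lhs0, rhs0 = grammar[0]
    states = [closure(grammar, {(lhs0, rhs0, 0)})]   # state 0
    transitions = {}
    p = 0
    while p < len(states):            # states appended below are visited too
        for symbol in symbols:
            moved = goto(states[p], symbol)
            if not moved:
                continue
            target = closure(grammar, moved)
            if target not in states:
                states.append(target)  # a newly discovered state
            transitions[(p, symbol)] = states.index(target)
        p += 1
    return states, transitions

if __name__ == "__main__":
    states, transitions = build_states(GRAMMAR)
    print(len(states), "states,", len(transitions), "transitions")
```
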

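46 LR(0) automaton example
Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )
States:
- q0: S → ●E$, E → ●T, E → ●E + T, T → ●id, T → ●(E)
- q1: S → E●$, E → E●+ T
- q2: S → E$●
- q3: E → E+●T, T → ●id, T → ●(E)
- q4: E → E + T●
- q5: T → id●
- q6: E → T●
- q7: T → (●E), E → ●T, E → ●E + T, T → ●id, T → ●(E)
- q8: T → (E●), E → E●+T
- q9: T → (E)●
Transitions:
- q0: E to q1, T to q6, id to q5, ( to q7
- q1: + to q3, $ to q2
- q3: T to q4, id to q5, ( to q7
- q7: E to q8, T to q6, id to q5, ( to q7
- q8: ) to q9, + to q3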

47 LR(0) automaton construction example
Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )
Initialize: q0 = { S → ●E$ }

48 LR(0) automaton construction example
Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )
Apply Closure: q0 = { S → ●E$, E → ●T, E → ●E + T, T → ●id, T → ●(E) }

49 LR(0) automaton construction example
Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )
q0 = { S → ●E$, E → ●T, E → ●E + T, T → ●id, T → ●(E) } and its successors:
- on T: q6 = { E → T● }
- on (: q7 = { T → (●E), E → ●T, E → ●E + T, T → ●id, T → ●(E) }
- on id: q5 = { T → id● }
- on E: q1 = { S → E●$, E → E●+ T }

50 LR(0) automaton construction example
Grammar: (1) S → E $  (2) E → T  (3) E → E + T  (4) T → id  (5) T → ( E )
[Figure: the complete LR(0) automaton, states q0 to q9, annotated as follows.]
- A non-terminal transition corresponds to a goto action in the parse table
- A terminal transition corresponds to a shift action in the parse table
- A single reduce item corresponds to a reduce action
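The three correspondences can be turned into a short table-filling routine. The sketch below hard-codes the transitions and reduce states of the automaton for this grammar; that encoding, and the names TRANSITIONS, REDUCE_ITEMS and fill_table, are assumptions made for illustration. The state holding S → E$● is treated as the accept state.

```python
TERMINALS = ["id", "+", "(", ")", "$"]

# (state, symbol) -> target state, as in the LR(0) automaton for this grammar
TRANSITIONS = {
    (0, "E"): 1, (0, "T"): 6, (0, "id"): 5, (0, "("): 7,
    (1, "+"): 3, (1, "$"): 2,
    (3, "T"): 4, (3, "id"): 5, (3, "("): 7,
    (7, "E"): 8, (7, "T"): 6, (7, "id"): 5, (7, "("): 7,
    (8, ")"): 9, (8, "+"): 3,
}
# state -> number of the rule of its single reduce item
REDUCE_ITEMS = {2: 1, 4: 3, 5: 4, 6: 2, 9: 5}

def fill_table(transitions, reduce_items):
    action, goto = {}, {}
    for (state, symbol), target in transitions.items():
        if symbol in TERMINALS:
            action[(state, symbol)] = f"s{target}"   # terminal edge: shift
        else:
            goto[(state, symbol)] = f"g{target}"     # non-terminal edge: goto
    for state, rule in reduce_items.items():
        for terminal in TERMINALS:                   # LR(0): the entire row
            action[(state, terminal)] = "acc" if rule == 1 else f"r{rule}"
    return action, goto

if __name__ == "__main__":
    action, goto = fill_table(TRANSITIONS, REDUCE_ITEMS)
    for state in range(10):
        row = [action.get((state, t), "") for t in TERMINALS]
        row += [goto.get((state, n), "") for n in ("E", "T")]
        print(state, row)
```
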

51 LR(0) conflicts

52 Conflicts
- Can construct a diagram for every grammar, but some may introduce conflicts
- shift-reduce conflict: an item set contains at least one shift item and one reduce item
- reduce-reduce conflict: an item set contains two reduce items
- What about shift-shift conflicts?

53 Shift-reduce conflict example
Grammar: S → E $, E → T, E → E + T, T → id, T → ( E ), T → id[E]
- q0: S → ●E$, E → ●T, E → ●E + T, T → ●id, T → ●(E), T → ●id[E]
- q0 on id: q5 = { T → id●, T → id●[E] }
Shift/reduce conflict in q5: T → id● is a reduce item and T → id●[E] is a shift item.

54 Reduce-reduce conflict example
Grammar: S → E $, E → T, E → V, E → E + T, T → id, V → id, T → ( E )
- q0: S → ●E$, E → ●T, E → ●V, E → ●E + T, T → ●id, V → ●id, T → ●(E)
- q0 on id: q5 = { T → id●, V → id● }
Reduce/reduce conflict in q5: both T → id● and V → id● are reduce items.
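Both kinds of conflict can be detected by inspecting a single item set. The sketch below follows the definitions given on the Conflicts slide (a shift item has the dot before the end, a reduce item has the dot at the end); the tuple encoding of items and the function name conflicts are assumptions for illustration. The two sample sets are the q5 states of the two examples above.

```python
def conflicts(item_set):
    """Classify an LR(0) item set; items are (lhs, rhs, dot) tuples."""
    shift_items = [i for i in item_set if i[2] < len(i[1])]
    reduce_items = [i for i in item_set if i[2] == len(i[1])]
    found = []
    if shift_items and reduce_items:
        found.append("shift-reduce")
    if len(reduce_items) > 1:
        found.append("reduce-reduce")
    return found

# q5 of the shift-reduce example: T -> id●  and  T -> id●[E]
Q5_SHIFT_REDUCE = {("T", ("id",), 1), ("T", ("id", "[", "E", "]"), 1)}
# q5 of the reduce-reduce example: T -> id●  and  V -> id●
Q5_REDUCE_REDUCE = {("T", ("id",), 1), ("V", ("id",), 1)}

if __name__ == "__main__":
    print(conflicts(Q5_SHIFT_REDUCE))
    print(conflicts(Q5_REDUCE_REDUCE))
```
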

55 LR(0) conflicts
Any grammar with an ε-rule cannot be LR(0): it has an inherent shift/reduce conflict
- A → ε● is a reduce item
- P → α●Aβ is a shift item
- A → ε● can always be predicted from P → α●Aβ
Similar to FIRST-FOLLOW conflicts in LL(1) parsing, with a similar solution

56 LR parsing variants

57 LR variants
- LR(0) – what we’ve seen so far
- SLR – removes infeasible reduce actions via FOLLOW set reasoning
- LR(1) – LR(0) with one lookahead token in items
- LALR(1) – LR(1) with merging of states with the same LR(0) component

58 SLR parsing

59 SLR parsing
- A handle should not be reduced to a non-terminal N if the lookahead is a token that cannot follow N
- A reduce item N → α● is applicable only when the lookahead is in FOLLOW(N)
- If b is not in FOLLOW(N), there is no terminating derivation S ⇒* βNb, and thus it is safe to remove the reduce item from the conflicted state
- Differs from LR(0) only on the ACTION table
- Now a row in the parsing table may contain both shift actions and reduce actions, and we need to consult the current token to decide which one to take
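To make the FOLLOW reasoning concrete, the sketch below computes FOLLOW sets for the conflicted grammar with both T → id and T → id[E], and then picks an action for the state { T → id●, T → id●[E] }. It is a simplified illustration: the function names are made up, and the FIRST computation assumes a grammar without ε-rules, which holds for this grammar but not in general.

```python
GRAMMAR = [
    ("S", ("E", "$")),
    ("E", ("T",)),
    ("E", ("E", "+", "T")),
    ("T", ("id",)),
    ("T", ("(", "E", ")")),
    ("T", ("id", "[", "E", "]")),
]

def follow_sets(grammar):
    nonterminals = {lhs for lhs, _ in grammar}
    # FIRST: without ε-rules only the first right-hand-side symbol matters.
    first = {n: set() for n in nonterminals}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            new = first[rhs[0]] if rhs[0] in nonterminals else {rhs[0]}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    # FOLLOW: what can appear right after each nonterminal.
    follow = {n: set() for n in nonterminals}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            for i, symbol in enumerate(rhs):
                if symbol not in nonterminals:
                    continue
                if i + 1 < len(rhs):
                    nxt = rhs[i + 1]
                    new = first[nxt] if nxt in nonterminals else {nxt}
                else:
                    new = follow[lhs]      # symbol ends the production
                if not new <= follow[symbol]:
                    follow[symbol] |= new
                    changed = True
    return follow

FOLLOW = follow_sets(GRAMMAR)

def action_in_conflicted_state(lookahead):
    """SLR decision for the item set { T -> id●, T -> id●[E] }."""
    if lookahead == "[":
        return "shift"
    if lookahead in FOLLOW["T"]:
        return "reduce T -> id"
    return "error"

if __name__ == "__main__":
    print(sorted(FOLLOW["T"]))
    print(action_in_conflicted_state("["), action_in_conflicted_state("+"))
```

Because [ is not in FOLLOW(T), the reduce by T → id is never chosen on lookahead [, so the shift and the reduce no longer compete for the same table entry.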

60 SLR action table
Grammar as before, with both T → id and T → id[E].
- LR(0) – no look-ahead: one action per state (q0: shift, q4: E → E+T, q5: T → id, q6: E → T, q9: T → (E))
- SLR – use 1 token look-ahead: one action per state and lookahead token from the input (columns id + ( ) [ ] $); the entries shown for state 5 are r5, s6
- [ is not in FOLLOW(T)

61 Next lecture: SLR/LR(1)/LALR(1)/Parser generation

