# CH4.1 CSE244 More on LR Parsing Aggelos Kiayias Computer Science & Engineering Department The University of Connecticut 191 Auditorium Road, Box U-155.


CH4.1 CSE244 More on LR Parsing Aggelos Kiayias Computer Science & Engineering Department The University of Connecticut 191 Auditorium Road, Box U-155 Storrs, CT 06269-3155 akiayias@cse.uconn.edu http://www.cse.uconn.edu/~akiayias

CH4.2 CSE244 Picture So Far

- SLR construction: based on the canonical collection of LR(0) items, which gives rise to the SLR parsing table.
- No multiply defined entries in the table => the grammar is called “SLR(1)”.
- More general class: LR(1) grammars, using the notion of an LR(1) item and the canonical LR(1) parsing table.

CH4.3 CSE244 LR(1) Items

- DEF. An LR(1) item is a production with a marker, together with a terminal. E.g. [S → aA.Be, c]. Intuition: it indicates how much of a certain production we have seen already (aA), what we could expect next (Be), and a lookahead that agrees with what should follow in the input if we ever reduce by the production S → aABe. By incorporating such lookahead information into the item concept we will make wiser reduce decisions.
- The lookahead of an LR(1) item is used directly only when considering reduce actions (i.e. when the marker is at the right end).
- The core of an LR(1) item [S → aA.Be, c] is the LR(0) item S → aA.Be.
- Different LR(1) items may share the same core.

CH4.4 CSE244 Usefulness of LR(1) items

- E.g. if we have two LR(1) items of the form [A → α., a] and [B → α., b], we will take advantage of the lookahead to decide which reduction to use (the same setting would perhaps produce a reduce/reduce conflict in the SLR approach).
- How the notion of validity changes: an item [A → α1.α2, a] is valid for a viable prefix βα1 if there is a rightmost derivation that yields βAaw, which in one step yields βα1α2aw.

CH4.5 CSE244 Constructing the Canonical Collection of LR(1) items

- Initial item: [S’ → .S, \$]
- Closure (more refined): if [A → α.Bβ, a] belongs to the set of items, and B → γ is a production of the grammar, then we add the item [B → .γ, b] for all b ∈ FIRST(βa).
- Goto (the same): a state containing [A → α.Xβ, a] will move to a state containing [A → αX.β, a] with label X.
- Every state is closed according to Closure.
- Every state has transitions according to Goto.
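The Closure and Goto rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: items are encoded as tuples `(head, body, dot, lookahead)`, the helper names `first_of_symbol`, `closure` and `goto` are made up for this sketch, and the grammar is assumed to have no ε-productions, so FIRST(βa) is FIRST of the first symbol of β, or {a} when β is empty. The grammar used is S’ → S, S → CC, C → cC | d, which the slides use as Example I.

```python
# Grammar: nonterminal -> list of right-hand sides (tuples of symbols).
GRAMMAR = {
    "S'": [("S",)],
    "S": [("C", "C")],
    "C": [("c", "C"), ("d",)],
}

def first_of_symbol(sym, seen=None):
    """FIRST of a single symbol. Assumption: no epsilon-productions."""
    if sym not in GRAMMAR:          # a terminal is its own FIRST set
        return {sym}
    seen = seen or set()
    if sym in seen:                 # guard against left recursion
        return set()
    seen = seen | {sym}
    result = set()
    for body in GRAMMAR[sym]:
        result |= first_of_symbol(body[0], seen)
    return result

def closure(items):
    """For [A -> alpha.B beta, a] add [B -> .gamma, b] for all b in FIRST(beta a)."""
    items = set(items)
    work = list(items)
    while work:
        head, body, dot, la = work.pop()
        if dot == len(body) or body[dot] not in GRAMMAR:
            continue                # marker at the end, or in front of a terminal
        rest = body[dot + 1:]
        lookaheads = first_of_symbol(rest[0]) if rest else {la}
        for prod in GRAMMAR[body[dot]]:
            for b in lookaheads:
                new = (body[dot], prod, 0, b)
                if new not in items:
                    items.add(new)
                    work.append(new)
    return frozenset(items)

def goto(items, symbol):
    """Move the marker over `symbol`, keep the lookahead, then close the result."""
    moved = {(h, b, d + 1, la) for (h, b, d, la) in items
             if d < len(b) and b[d] == symbol}
    return closure(moved)

I0 = closure({("S'", ("S",), 0, "$")})
for item in sorted(I0):
    print(item)
```

Repeatedly applying `goto` to every state and every grammar symbol until no new item set appears would give the full canonical collection.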

CH4.6 CSE244 Constructing the LR(1) Parsing Table

- Shift actions (same): if [A → α.bβ, a] is in state I_k and I_k moves to state I_m with label b, then we add the action action[k, b] = “shift m”.
- Reduce actions (more refined): if [A → α., a] is in state I_k, then we add the action “Reduce A → α” into action[k, a]. Observe that we don’t use information from FOLLOW(A) anymore.
- The goto part of the table is as before.

CH4.7 CSE244 Example I

Grammar:

- S’ → S
- S → C C
- C → c C | d

FIRST sets:

| Symbol | FIRST |
|---|---|
| S | c, d |
| C | c, d |

Next: the construction for this grammar.

CH4.8 CSE244 Example II

Grammar:

- S’ → S
- S → L = R | R
- L → * R | id
- R → L

FIRST sets:

| Symbol | FIRST |
|---|---|
| S | *, id |
| L | *, id |
| R | *, id |
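The FIRST table of Example II can be checked mechanically. The following is a minimal sketch of the usual fixed-point computation; the function name `first_sets`, the dictionary encoding of the grammar and the `EPS` marker are assumptions made for this illustration (the grammar itself has no ε-productions, the marker is kept only so the routine stays general).

```python
EPS = "ε"  # marker for the empty string; unused by Example II

def first_sets(grammar, terminals):
    """Iterate to a fixed point. grammar: nonterminal -> list of bodies (tuples)."""
    first = {t: {t} for t in terminals}
    first.update({nt: set() for nt in grammar})
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                before = len(first[head])
                nullable_prefix = True
                for sym in body:
                    first[head] |= first[sym] - {EPS}
                    if EPS not in first[sym]:
                        nullable_prefix = False
                        break
                if nullable_prefix:      # every symbol of the body can vanish
                    first[head].add(EPS)
                changed |= len(first[head]) != before
    return first

GRAMMAR = {
    "S'": [("S",)],
    "S": [("L", "=", "R"), ("R",)],
    "L": [("*", "R"), ("id",)],
    "R": [("L",)],
}
FIRST = first_sets(GRAMMAR, {"=", "*", "id"})
for nt in ("S", "L", "R"):
    print(nt, sorted(FIRST[nt]))
```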

CH4.9 CSE244 LR(1) is more general than SLR(1)

Grammar: S’ → S, S → L = R | R, L → * R | id, R → L

- I0 = { [S’ → .S, \$], [S → .L = R, \$], [S → .R, \$], [L → .* R, = / \$], [L → .id, = / \$], [R → .L, \$] }
- I1 = { [S’ → S., \$] }
- I2 = { [S → L. = R, \$], [R → L., \$] }
- I3 = { [S → R., \$] }
- I4 = { [L → *.R, = / \$], [R → .L, = / \$], [L → .* R, = / \$], [L → .id, = / \$] }
- I5 = { [L → id., = / \$] }
- I6 = { [S → L =. R, \$], [R → .L, \$], [L → .* R, \$], [L → .id, \$] }
- I7 = { [L → *R., = / \$] }
- I8 = { [R → L., = / \$] }
- I9 = { [L → *.R, \$], [R → .L, \$], [L → .* R, \$], [L → .id, \$] }
- I10 = { [L → *R., \$] }
- I11 = { [L → id., \$] }
- I12 = { [R → L., \$] }

action[2, =] = s6 (because of [S → L. = R, \$]). The item [R → L., \$] in I2 asks for a reduction only on \$, so there is no conflict anymore.

CH4.10 CSE244 LALR Parsing

- Canonical sets of LR(1) items:
  - The number of states is much larger than in the SLR construction.
  - LR(1): on the order of thousands of states for a standard programming language.
  - SLR(1): on the order of hundreds of states for a standard programming language.
- LALR(1) (lookahead-LR) is a tradeoff:
  - Collapse the states of the LR(1) table that have the same core (the “LR(0)” part of each state).
  - LALR never introduces a shift/reduce conflict if LR(1) doesn’t have one.
  - It might introduce a reduce/reduce conflict (that did not exist in the LR(1) table)...
  - Still much better than SLR(1) (it handles a larger class of grammars)...
  - ...but its table is smaller than the LR(1) table, actually about the size of the SLR(1) table.
  - It is what Yacc and most compilers employ.

CH4.11 CSE244 Collapsing states with the same core

- E.g., if I3 and I6 collapse into I36, then whenever the LALR(1) parser puts I36 onto the stack, the LR(1) parser would have either I3 or I6.
- A shift/reduce conflict would not be introduced by the LALR “collapse”.
- Indeed, if the LALR(1) table has a shift/reduce conflict, this conflict should also exist in the LR(1) version: this is because two states with the same core have the same outgoing arrows.
- On the other hand, a reduce/reduce conflict may be introduced.
- LALR(1) is still preferred: its table size is proportional to that of SLR(1).
- Direct construction is also possible.
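The collapse step can be sketched as grouping LR(1) item sets by their core and taking the union of the items in each group. The encoding of items as `(head, body, dot, lookahead)` tuples and the names `core`, `merge_by_core` and `reduce_reduce_conflicts` are assumptions of this sketch. The two states `X` and `Y` below are not from the slides: they come from the standard textbook grammar S → aAd | bBd | aBe | bAe, A → c, B → c, which is LR(1) but not LALR(1), and they show how a reduce/reduce conflict can appear only after merging.

```python
from collections import defaultdict

def core(state):
    """The LR(0) part of a state: drop the lookahead of every item."""
    return frozenset((h, b, d) for (h, b, d, _la) in state)

def merge_by_core(states):
    """Union all LR(1) states that share the same core."""
    groups = defaultdict(set)
    for state in states:
        groups[core(state)] |= set(state)
    return [frozenset(items) for items in groups.values()]

def reduce_reduce_conflicts(state):
    """Lookaheads on which two different completed items ask for a reduction."""
    by_lookahead = defaultdict(set)
    for head, body, dot, la in state:
        if dot == len(body):
            by_lookahead[la].add((head, body))
    return {la for la, prods in by_lookahead.items() if len(prods) > 1}

# Two states with the same core {A -> c., B -> c.} but different lookaheads.
X = frozenset({("A", ("c",), 1, "d"), ("B", ("c",), 1, "e")})
Y = frozenset({("A", ("c",), 1, "e"), ("B", ("c",), 1, "d")})

merged = merge_by_core([X, Y])
print(reduce_reduce_conflicts(X), reduce_reduce_conflicts(Y))  # no conflict before merging
print(sorted(reduce_reduce_conflicts(merged[0])))              # conflict on d and e after merging
```

No shift/reduce conflict can arise this way, because the shift moves depend only on the core, which merging leaves unchanged.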

CH4.12 CSE244 Error Recovery in LR Parsing

- An error is detected when, for a given stack \$...I_i and input symbols s…s’…\$, it holds that action[i, s] = empty.
- Panic-mode error recovery.

CH4.13 CSE244 Panic Recovery Strategy I

- Scan down the stack until a state I_j is found such that:
  - I_j moves with the non-terminal A to some state I_k
  - I_k moves with s’ to some state I_k’
- Proceed as follows:
  - Pop all states down to I_j
  - Push A and state I_k
  - Discard all symbols from the input up to s’
- There may be many choices as above.
- [Essentially, the parser in this way determines that a string produced by A has an error; it assumes the string is correct and advances.]
- Error message: construct of type “A” has error at location X

CH4.14 CSE244 Panic Recovery Strategy II

- Scan down the stack until a state I_j is found such that:
  - I_j moves with the terminal t to some state I_k
  - I_k with s’ has a valid action
- Proceed as follows:
  - Pop all states down to I_j
  - Push t and state I_k
  - Discard all symbols from the input up to s’
- There may be many choices as above.
- Error message: “missing t”

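CH4.15 CSE244 Example

Grammar: E’ → E, E → E + E | E * E | ( E ) | id

| State | id | + | * | ( | ) | \$ | E |
|---|---|---|---|---|---|---|---|
| 0 | s3 | e1 | e1 | s2 | e2 | e1 | 1 |
| 1 | e3 | s4 | s5 | e3 | e2 | acc | |
| 2 | s3 | e1 | e1 | s2 | e2 | e1 | 6 |
| 3 | r4 | r4 | r4 | r4 | r4 | r4 | |
| 4 | s3 | e1 | e1 | s2 | e2 | e1 | 7 |
| 5 | s3 | e1 | e1 | s2 | e2 | e1 | 8 |
| 6 | e3 | s4 | s5 | e3 | s9 | e4 | |
| 7 | r1 | r1 | s5 | r1 | r1 | r1 | |
| 8 | r2 | r2 | r2 | r2 | r2 | r2 | |
| 9 | r3 | r3 | r3 | r3 | r3 | r3 | |

Columns id through \$ form the action part; column E is the goto part.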

CH4.16 CSE244 Collection of LR(0) items

Grammar: E’ → E, E → E + E | E * E | ( E ) | id

- I0: E’ → .E, E → .E + E, E → .E * E, E → .( E ), E → .id
- I1: E’ → E., E → E. + E, E → E. * E
- I2: E → (. E ), E → .E + E, E → .E * E, E → .( E ), E → .id
- I3: E → id.
- I4: E → E +. E, E → .E + E, E → .E * E, E → .( E ), E → .id
- I5: E → E *. E, E → .E + E, E → .E * E, E → .( E ), E → .id
- I6: E → ( E. ), E → E. + E, E → E. * E
- I7: E → E + E., E → E. + E, E → E. * E
- I8: E → E * E., E → E. + E, E → E. * E
- I9: E → ( E ).

Follow(E’) = { \$ }, Follow(E) = { +, *, ), \$ }

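CH4.17 CSE244 The parsing table

| State | id | + | * | ( | ) | \$ | E |
|---|---|---|---|---|---|---|---|
| 0 | s3 | | | s2 | | | 1 |
| 1 | | s4 | s5 | | | acc | |
| 2 | s3 | | | s2 | | | 6 |
| 3 | | r4 | r4 | | r4 | r4 | |
| 4 | s3 | | | s2 | | | 7 |
| 5 | s3 | | | s2 | | | 8 |
| 6 | | s4 | s5 | | s9 | | |
| 7 | | s4/r1 | s5/r1 | | r1 | r1 | |
| 8 | | s4/r2 | s5/r2 | | r2 | r2 | |
| 9 | | r3 | r3 | | r3 | r3 | |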

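CH4.18 CSE244 Error-handling

The parsing table with the first error entry (e1, in state 0) added. Each row lists its non-empty entries from left to right under the columns id, +, *, (, ), \$, E:

- 0: s3, e1, s2, 1
- 1: s4, s5, acc
- 2: s3, s2, 6
- 3: r4, r4, r4, r4
- 4: s3, s2, 7
- 5: s3, s2, 8
- 6: s4, s5, s9
- 7: s4/r1, s5/r1, r1, r1
- 8: s4/r2, s5/r2, r2, r2
- 9: r3, r3, r3, r3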

CH4.19 CSE244 Error-handling

- I0: E’ → .E, E → .E + E, E → .E * E, E → .( E ), E → .id
- I2: E → (. E ), E → .E + E, E → .E * E, E → .( E ), E → .id
- I5: E → E *. E, E → .E + E, E → .E * E, E → .( E ), E → .id
- I8: E → E * E., E → E. + E, E → E. * E

Two ways of defining e1 are shown:

- e1: push E onto the stack and move to state 1; report “missing operand”.
- e1: push id onto the stack and change to state 3; report “missing operand”.

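CH4.20 CSE244 Error-handling

The parsing table with more e1 entries added in state 0. Each row lists its non-empty entries from left to right under the columns id, +, *, (, ), \$, E:

- 0: s3, e1, e1, s2, e1, 1
- 1: s4, s5, acc
- 2: s3, s2, 6
- 3: r4, r4, r4, r4
- 4: s3, s2, 7
- 5: s3, s2, 8
- 6: s4, s5, s9
- 7: s4/r1, s5/r1, r1, r1
- 8: s4/r2, s5/r2, r2, r2
- 9: r3, r3, r3, r3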

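CH4.21 CSE244 Error-handling

The parsing table with e2 entries added. Each row lists its non-empty entries from left to right under the columns id, +, *, (, ), \$, E:

- 0: s3, e1, e1, s2, e2, e1, 1
- 1: s4, s5, e2, acc
- 2: s3, s2, 6
- 3: r4, r4, r4, r4
- 4: s3, e1, s2, 7
- 5: s3, s2, 8
- 6: s4, s5, s9
- 7: s4/r1, s5/r1, r1, r1
- 8: s4/r2, s5/r2, r2, r2
- 9: r3, r3, r3, r3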

CH4.22 CSE244 Error-handling

- e2: remove “)” from the input; report “unbalanced right parenthesis”.
- Try the input id+)

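CH4.23 CSE244 Error-handling state 1

The parsing table with an e3 entry added in state 1. Each row lists its non-empty entries from left to right under the columns id, +, *, (, ), \$, E:

- 0: s3, e1, e1, s2, e2, e1, 1
- 1: e3, s4, s5, acc
- 2: s3, s2, 6
- 3: r4, r4, r4, r4
- 4: s3, s2, 7
- 5: s3, s2, 8
- 6: s4, s5, s9
- 7: s4/r1, s5/r1, r1, r1
- 8: s4/r2, s5/r2, r2, r2
- 9: r3, r3, r3, r3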

CH4.24 CSE244 Error-Handling

- I1: E’ → E., E → E. + E, E → E. * E
- I3: E → id.
- I4: E → E +. E, E → .E + E, E → .E * E, E → .( E ), E → .id
- I6: E → ( E. ), E → E. + E, E → E. * E
- I7: E → E + E., E → E. + E, E → E. * E
- I9: E → ( E ).

e3: push + onto the stack and change to state 4; report “missing operator”.
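To see the error routines at work, here is a minimal sketch of an LR driver that uses the complete table from slide CH4.15 together with e1, e2 and e3 as described in these slides. Several things are assumptions of the sketch: the stack holds state numbers only (so "push id and change to state 3" becomes "push 3"), the names `ROWS`, `ACTION`, `GOTO_E` and `parse` are invented, and the behaviour of e4, which appears in the table but is not described in the transcript, follows the usual textbook treatment (push state 9 and report a missing right parenthesis).

```python
# Production number -> (head, length of body):
# 1: E -> E + E, 2: E -> E * E, 3: E -> ( E ), 4: E -> id
PRODUCTIONS = {1: ("E", 3), 2: ("E", 3), 3: ("E", 3), 4: ("E", 1)}
COLUMNS = ["id", "+", "*", "(", ")", "$"]
ROWS = {
    0: "s3 e1 e1 s2 e2 e1",
    1: "e3 s4 s5 e3 e2 acc",
    2: "s3 e1 e1 s2 e2 e1",
    3: "r4 r4 r4 r4 r4 r4",
    4: "s3 e1 e1 s2 e2 e1",
    5: "s3 e1 e1 s2 e2 e1",
    6: "e3 s4 s5 e3 s9 e4",
    7: "r1 r1 s5 r1 r1 r1",
    8: "r2 r2 r2 r2 r2 r2",
    9: "r3 r3 r3 r3 r3 r3",
}
ACTION = {s: dict(zip(COLUMNS, row.split())) for s, row in ROWS.items()}
GOTO_E = {0: 1, 2: 6, 4: 7, 5: 8}

def parse(tokens):
    """Parse tokens (without the final '$') and return the diagnostics issued."""
    tokens = list(tokens) + ["$"]
    stack, pos, errors = [0], 0, []
    while True:
        act = ACTION[stack[-1]][tokens[pos]]
        if act == "acc":
            return errors
        if act[0] == "s":                       # shift
            stack.append(int(act[1:]))
            pos += 1
        elif act[0] == "r":                     # reduce
            _head, length = PRODUCTIONS[int(act[1:])]
            del stack[-length:]
            stack.append(GOTO_E[stack[-1]])
        elif act == "e1":                       # pretend an id was seen
            errors.append("missing operand")
            stack.append(3)
        elif act == "e2":                       # drop the ")" from the input
            errors.append("unbalanced right parenthesis")
            pos += 1
        elif act == "e3":                       # pretend a + was seen
            errors.append("missing operator")
            stack.append(4)
        elif act == "e4":                       # assumption: textbook e4
            errors.append("missing right parenthesis")
            stack.append(9)

print(parse(["id", "+", ")"]))
# ['unbalanced right parenthesis', 'missing operand']
```

On the suggested input id+) the driver first discards the stray ")" in state 4, then, facing the end of input in state 4, supplies the missing operand and finishes the parse.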

CH4.25 CSE244 Intro to Translation

- Side-effects and translation schemes.
- Do the construction as before, but:
  - A side-effect in front of a symbol is executed in a state when we make the move following that symbol to another state.
  - Side-effects at the right end of a production are executed during reduce actions.

Grammar with side-effects:

- E’ → E
- E → E + E {print(+)}
- E → E * E {print(*)}
- E → {parenthesis++} ( E ) {parenthesis--}
- E → id { print(id); print(parenthesis); }

Try, for example, id*(id+id)\$. Side-effects are attached to the symbols to the right of them.
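The suggested run on id*(id+id) can be simulated with a small driver. This is a sketch under assumptions: it uses the expression table from slide CH4.15 with the error entries replaced by "--", keeps only state numbers on the stack, collects the printed values in a list instead of printing them, and the names `translate`, `ROWS` and `BODY_LENGTH` are invented here. The side-effect in front of "(" runs when "(" is shifted; the ones at the right end of a production run on the corresponding reduce.

```python
COLUMNS = ["id", "+", "*", "(", ")", "$"]
ROWS = {  # "--" marks an error entry
    0: "s3 -- -- s2 -- --",
    1: "-- s4 s5 -- -- acc",
    2: "s3 -- -- s2 -- --",
    3: "-- r4 r4 -- r4 r4",
    4: "s3 -- -- s2 -- --",
    5: "s3 -- -- s2 -- --",
    6: "-- s4 s5 -- s9 --",
    7: "-- r1 s5 -- r1 r1",
    8: "-- r2 r2 -- r2 r2",
    9: "-- r3 r3 -- r3 r3",
}
ACTION = {s: dict(zip(COLUMNS, row.split())) for s, row in ROWS.items()}
GOTO_E = {0: 1, 2: 6, 4: 7, 5: 8}
# 1: E -> E + E, 2: E -> E * E, 3: E -> ( E ), 4: E -> id
BODY_LENGTH = {1: 3, 2: 3, 3: 3, 4: 1}

def translate(tokens):
    """Return the sequence of values the side-effects would print."""
    tokens = list(tokens) + ["$"]
    out, parenthesis = [], 0
    stack, pos = [0], 0
    while True:
        act = ACTION[stack[-1]][tokens[pos]]
        if act == "acc":
            return out
        if act == "--":
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        if act[0] == "s":
            if tokens[pos] == "(":
                parenthesis += 1            # {parenthesis++} in front of "("
            stack.append(int(act[1:]))
            pos += 1
        else:
            n = int(act[1:])
            if n == 1:
                out.append("+")             # {print(+)}
            elif n == 2:
                out.append("*")             # {print(*)}
            elif n == 3:
                parenthesis -= 1            # {parenthesis--}
            else:
                out += ["id", parenthesis]  # {print(id); print(parenthesis)}
            del stack[-BODY_LENGTH[n]:]
            stack.append(GOTO_E[stack[-1]])

print(translate(["id", "*", "(", "id", "+", "id", ")"]))
# ['id', 0, 'id', 1, 'id', 1, '+', '*']
```

The operators come out in postfix order, and each id is followed by the parenthesis depth at which it was reduced.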
