Presentation is loading. Please wait.

Presentation is loading. Please wait.

CPSC46001 Bottom-up Parsing Reading Sections 4.5 and 4.7 from ASU.

Similar presentations


Presentation on theme: "CPSC46001 Bottom-up Parsing Reading Sections 4.5 and 4.7 from ASU."— Presentation transcript:

1 CPSC46001 Bottom-up Parsing Reading Sections 4.5 and 4.7 from ASU

2 CPSC46002 Predictive Parsing Summary  First and Follow sets are used to construct predictive tables For non-terminal A and input t, use a production A  a where t  First( a ) For non-terminal A and input t, if e  First(A) and t  Follow( a ), then use a production A  a where e  First( a )  Recursive-descent without backtracking do not need the parse table explicitly

3 CPSC46003 Bottom-Up Parsing(1)  Bottom-up parsing is more general than top-down parsing And just as efficient Builds on ideas in top-down parsing  Bottom-up is the preferred method in practice

4 CPSC46004 Bottom-Up Parsing(2)  Table-driven using an explicit stack (non-recursive)  Stack can be viewed as containing both terminals and nonterminals  Basic operations: Shift: Move terminals from input stream to the stack until the right-hand side of an appropriate production rule has been identified in the stack Reduce: Replace the sentential form appearing on the stack (considered from top that matched the right-hand side of an appropriate production rule) with the nonterminal appearing on the left-hand side of the production.

5 CPSC46005 An Introductory Example  Bottom-up parsers don ’ t need left-factored grammars  Hence we can revert to the “ natural ” grammar for our example: E  T + E | T T  num * T | num | (E)  Consider the string: num * num + num

6 CPSC46006 The Idea Bottom-up parsing reduces a string to the start symbol by inverting productions: E E  T + E T + E E  T T + T T  num T + num T  num * T num * T + num T  num num * num + num InputProductions Used

7 CPSC46007 Right-most Derivation  In a right-most derivation, the rightmost nonterminal of a sentential form is replaced at each derivation step.  Question: find the rightmost derivation of the string num* num + num

8 CPSC46008 Observation  Read the productions found by bottom-up parse in reverse (i.e., from bottom to top)  This is a rightmost derivation! E E  T + E T + E E  T T + T T  num T + num T  num * T num * T + num T  num num * num + num

9 CPSC46009 Important Facts A bottom-up parser traces a rightmost derivation in reverse

10 CPSC460010 A Bottom-up Parse E T + E T + T T + num num * T + num num * num + num E TE + num * T T

11 CPSC460011 A Bottom-up Parse in Detail (1) + num * num * num + num

12 CPSC460012 A Bottom-up Parse in Detail (2) num * T + num num * num + num + num * T

13 CPSC460013 A Bottom-up Parse in Detail (3) T + num num * T + num num * num + num T + num * T

14 CPSC460014 A Bottom-up Parse in Detail (4) T + T T + num num * T + num num * num + num T + num * T T

15 CPSC460015 A Bottom-up Parse in Detail (5) T + E T + T T + num num * T + num num * num + num TE +num * T T

16 CPSC460016 A Bottom-up Parse in Detail (6) E T + E T + T T + num num * T + num num * num + num E TE + num * T T

17 CPSC460017 Bottom-up Parsing A trivial bottom-up parsing algorithm Let I = input string repeat pick a non-empty substring  of I where X   is a production if no such , backtrack replace one  by X in I until I = “ S ” (the start symbol) or all possibilities are exhausted

18 CPSC460018 Observations  The termination of the algorithm (when/if)  Running time of the algorithm  If there are more than one choices for the sub-string to be replaced (reduce) which one to choose?

19 CPSC460019 Where Do Reductions Happen Recall A bottom-up parser traces a rightmost derivation in reverse Let  be a rightmost sentential form Assume the next reduction is by X   Then  is a string of terminals Why? Because  X    is a step in a right-most derivation

20 CPSC460020 Shift-Reduce Parsing Bottom-up parsing uses only two kinds of actions: Shift Reduce

21 CPSC460021 Shift  Shift: Move # (marking the part of the input that has been processed) one place to the right Shifts a terminal to the left string ABC#xyz  ABCx#yz

22 CPSC460022 Reduce  Apply an inverse production at the right end of the left string If A  xy is a production, then Cbxy#ijk  CbA#ijk

23 CPSC460023 The Example with Shift-Reduce Parsing reduce T  num T + num # shiftT + # num shiftnum # * num + num shiftnum * # num + num shift#num * num + num E # reduce E  T + E T + E # reduce E  T T + T # shiftT # + num reduce T  num * T num * T # + num reduce T  num num * num # + num

24 CPSC460024 A Shift-Reduce Parse in Detail (1) + num *  #num * num + num

25 CPSC460025 A Shift-Reduce Parse in Detail (2) + num *  num # * num + num #num * num + num

26 CPSC460026 A Shift-Reduce Parse in Detail (3) + num *  num # * num + num num * # num + num #num * num + num

27 CPSC460027 A Shift-Reduce Parse in Detail (4) + num *  num # * num + num num * # num + num #num * num + num num * num # + num

28 CPSC460028 A Shift-Reduce Parse in Detail (5) + num * T num # * num + num num * # num + num #num * num + num num * T # + num num * num # + num 

29 CPSC460029 A Shift-Reduce Parse in Detail (6) T + num * T num # * num + num num * # num + num #num * num + num T # + num num * T # + num num * num # + num 

30 CPSC460030 A Shift-Reduce Parse in Detail (7) T + num * T T + # num num # * num + num num * # num + num #num * num + num T # + num num * T # + num num * num # + num 

31 CPSC460031 A Shift-Reduce Parse in Detail (8) T + num * T T + num # T + # num num # * num + num num * # num + num #num * num + num T # + num num * T # + num num * num # + num 

32 CPSC460032 A Shift-Reduce Parse in Detail (9) T +num * T T T + num # T + # num num # * num + num num * # num + num #num * num + num T + T # T # + num num * T # + num num * num # + num 

33 CPSC460033 A Shift-Reduce Parse in Detail (10) TE +num * T T T + num # T + # num num # * num + num num * # num + num #num * num + num T + E # T + T # T # + num num * T # + num num * num # + num 

34 CPSC460034 A Shift-Reduce Parse in Detail (11) E TE +num * T T T + num # T + # num num # * num + num num * # num + num #num * num + num E # T + E # T + T # T # + num num * T # + num num * num # + num 

35 CPSC460035 The Stack  Left string can be implemented by a stack Top of the stack is the #  Shift pushes a terminal on the stack  Reduce pops 0 or more symbols off of the stack (production rhs) and pushes a non-terminal on the stack (production lhs)

36 CPSC460036 Key Issue (will be resolved by algorithms)  How do we decide when to shift or reduce? Consider step: num # * num + num We could reduce by T  num giving T # * num + num A fatal mistake: No way to reduce to the start symbol E

37 CPSC460037 Conflicts  Generic shift-reduce strategy: If there is a handle on top of the stack, reduce Otherwise, shift  But what if there is a choice? If it is legal to shift or reduce, there is a shift-reduce conflict If it is legal to reduce by two different productions, there is a reduce-reduce conflict

38 CPSC460038 Conflict Example Consider the ambiguous grammar: num| (E)| E * E| E + E  E

39 CPSC460039 One Shift-Reduce Parse E # reduce E  E + E E + E #... reduce E  E * E E * E # + num shift#num * num + num reduce E  num E + num# shiftE + # num shiftE # + num InputAction

40 CPSC460040 Another Shift-Reduce Parse E # reduce E  E * E E * E #... shiftE * E # + num shift#num * num + num reduce E  E + E E * E + E# reduce E  num E * E + num # shiftE * E + # num Input Action

41 CPSC460041 Observations  In the second step E * E # + num we can either shift or reduce by E  E * E  Choice determines associativity of + and *  As noted previously, grammar can be rewritten to enforce precedence  Precedence declarations are an alternative

42 CPSC460042 Overview  LR(k) parsing L: scan input Left to right R: produce rightmost derivation k tokens of lookahead  LR(0) zero tokens of look-ahead  SLR Simple LR: like LR(0) but uses FOLLOW sets to build more “ precise ” parsing tables

43 CPSC460043 Basic Terminologies  Handle A substring that matches the right side of a production whose reduction with that production ’ s left side constitutes one step of the rightmost derivation of the string from the start nonterminal of the grammar

44 CPSC460044 Model of Shift-Reduce Parsing  Stack + input = current right-sentential form.  Locate the handle during parsing: shift zero or more terminals (tokens) onto the stack until a handle  is on top of the stack.  Replace the handle with a proper non-terminal (Handle Pruning): reduce  to A where A 

45 CPSC460045 Model of an LR Parser

46 CPSC460046 Problem: when to shift, when to reduce?  Recall grammar: E  T + E | T T  num * T | num | (E)  how to know when to reduce and when to shift?

47 CPSC460047 Model of Shift-Reduce Parsing  Stack + input = current right-sentential form.  Locate the handle during the parsing: shift zero or terminals onto the stack until a handle is  on top of the stack.  Replace the handle with a proper non-terminal (Handle Pruning)

48 CPSC460048 What we need to know to do LR parsing  LR(0) states describe states in which the parser can be Note: LR(0) states are used by both LR(0) and SLR parsers  Parsing tables transitions between LR(0) states, actions to take at transition:  shift, reduce, accept, error  How to construct LR(0) states  How to construct parsing tables  How to drive the parser

49 CPSC460049 An LR(0) state = a set of LR(0) items  An LR(0) item [X --> a.b] says that the parser is looking for an X it has an a  on top of the stack expects to find in the input a string derived from b.  Notes: [X --> a.ab] means that if a is on the input, it can be shifted. That is:  a is a correct token to see on the input, and  shifting a would not “ over-shift ” (still a viable prefix). [X -->a.] means that we could reduce X

50 CPSC460050 LR(0) states S’ . E E . T E .T + E T .(E) T .num * T T .num S’  E. E  T. E  T. + E T  num. * T T  num. T  (. E) E .T E .T + E T .(E) T .num * T T .num E  T + E. E  T +. E E .T E .T + E T .(E) T .num * T T .num T  num *.T T .(E) T .num * T T .num T  num * T. T  (E.) T  (E). E T ( num * ) E E T ( ( T  (

51 CPSC460051 SLR Parsing  Remember the state of the automaton on each prefix of the stack  Change stack to contain pairs  Symbol, DFA State 

52 CPSC460052 SLR Parsing (Contd.)  For a stack  sym 1, state 1 ...  sym n, state n  state n is the final state of the DFA on sym 1 … sym n  Detail: The bottom of the stack is  any,start  where any is any dummy state start is the start state of the DFA

53 CPSC460053 Goto Table  Define Goto[i,A] = j if state i  A state j where A is a nonterminal  Goto is just the transition function of the DFA One of two parsing tables

54 CPSC460054 Parser Moves  Shift x Push a, x on the stack a is current input x is a DFA state  Reduce A   As before  Accept  Error

55 CPSC460055 Action Table For each state s i and terminal a If s i has item X  .a  and there is a transition on terminal a from state i to state j then Action[i,a] = shift j If s i has item X  . and a  Follow(X) and X != S’ then Action[i,a] = reduce X   If s i has item S ’  S. then action[i,$] = accept Otherwise, action[i,a] = error

56 CPSC460056 SLR Parsing Algorithm Let I = w$ be initial input Let j = 0 Let DFA state 1 have item S ’ .S Let stack =  dummy, 1  repeat case action[top_state(stack),I[j]] of shift k: push  I[j++], k  reduce X  A: pop |A| pairs, I[--j] = X // prepend X to input accept: halt normally error: halt and report error

57 CPSC460057 Notes on SLR Parsing Algorithm  Note that the algorithm uses only the DFA states and the input The stack symbols are never used!  However, we still need the symbols for semantic actions

58 CPSC460058 The Compiler So Far  Lexical analysis Detects inputs with illegal tokens  Parsing Detects inputs with ill-formed parse trees  Semantic analysis Last “ front end ” phase Catches all remaining errors

59 CPSC460059 Typical Semantic Errors multiple declarations: a variable should be declared (in the same scope) at most once undeclared variable: a variable should not be used before being declared. type mismatch: type of the left-hand side of an assignment should match the type of the right-hand side. wrong arguments: methods should be called with the right number and types of arguments.

60 CPSC460060 Sample Semantic Analyzer For each scope in the program:  process the declarations add new entries to the symbol table (or a similar structure) and report any variables that are multiply declared  process the statements find uses of undeclared variables,  use the symbol-table information to determine the type of each expression, and to find type errors.

61 CPSC460061 Scope Rules for Pascal- Rule 6.1: All constants, types, variables, and procedures definedin the same block must have different names Rule 6.2: A constant, type, or variable defined in a block is normallyknown from the end of its declaration to the end of the block. A procedure defined in a block B is normally known from the beginning of the procedure to the end of the block B Rule 6.3: Consider a block Q that defines an object x. If Q contains a block R that defines another object named x, the first object is unknown in the scope of the second object.

62 CPSC460062 Pascal- Program (1) { 0 Begin Standard Block} 1 program P; 2 type T = array[1..100] of integer; 3 var x: T; 4 5 procedure Q(x: integer); 6 const c = 13; 7 begin... x... end{Q}; 8 9 procedure R; 10 var b, c: Boolean; 11 begin... x...end{R}; 12 13 begin... end.{P} 14 {End Standard block}

63 CPSC460063 Pascal- Program (2) {Constant = Numeral | ConstantName.} procedure Constant(Stop: Symbols); begin if Symbol = Numeral1 then Expect(Numeral, Stop) else if Symbol = Name1 then begin Find(Argument); Expect(Name1, Stop) end else SyntaxError(Stop) end;

64 CPSC460064 Pascal- Program (3) {ConstantDefinition = ConstantName '=' Constant ';'.} procedure ConstantDefinition(stop: Symbols); begin ExpectName(Name, Symbols[Equal1, Semicolon1] + ConstantSymbols + Stop); Expect(Equal1, ConstantSymbols + Symbols[Semicolon1] + Stop); Constant(Symbols[Semicolon1] + Stop); Define(Name); Expect(Semicolon1, Stop) end;

65 CPSC460065 Pascal- Program (4) {Program = 'program' ProgramName ';' BlockBody '.'} procedure Programx(Stop: Symbols); begin Expect(Program1, Symbols[Name1, Semicolon1, Period1] + BlockSymbols + Stop); Expect(Name1, Symbols[Semicolon1, Period1] + BlockSymbols + Stop); Expect(Semicolon1, Symbols[Period1] + BlockSymbols + Stop); NewBlock; BlockBody(Symbols[Period1] + Stop); EndBlock; Expect(Period1, Stop) end;

66 CPSC460066 Pascal- Program (5-1) {Constant = Numeral | ConstantName.} procedure Constant(var Value: integer; var Typex: Pointer; Stop: Symbols); begin if Symbol = Numeral1 then begin Value := Argument; Typex := TypeInteger; Expect(Numeral, Stop) end else if Symbol = Name1 then begin Find(Argument, Object); if Object@.Kind = Constantx then begin Value := Object@.ConstValue; Typex := Object@.ConstType; end

67 CPSC460067 Pascal- Program (5-2) else begin KindError(object); Value := 0; Typex := TypeUniversal; end; Expect(Name1, Stop) end else begin SyntaxError(Stop); Value := 0; Typex := TypeUniversal; end;

68 CPSC460068 Pascal- Program (6) {ConstantDefinition = ConstantName '=' Constant ';'.} procedure ConstantDefinition(stop: Symbols); var Name, Value: integer; Constx, Typex: Pointer; begin ExpectName(Name, Symbols[Equal1, Semicolon1] + ConstantSymbols + Stop); Expect(Equal1, ConstantSymbols + Symbols[Semicolon1] + Stop); Constant(Value, Typex, Symbols[Semicolon1] + Stop); Define(Name, Constantx, Constx); Constx@.ConstValue := Value; Constx@.ConstType := Typex; Expect(Semicolon1, Stop) end;

69 CPSC460069 Static and Dynamic Scope #include int main() { int x = 1; char x = ‘ b ’ ; char y = ‘ a ’ ; q(); void p() { return 0 double x = 2.5; } printf( “ %c\n ”,y}; { int y[10]; } } void q() { int y = 42; printf(%d\n ”, x); p(); }


Download ppt "CPSC46001 Bottom-up Parsing Reading Sections 4.5 and 4.7 from ASU."

Similar presentations


Ads by Google