Recap Roman Manevich Mooly Sagiv. Outline Subjects Studied Questions & Answers.


1 Recap Roman Manevich Mooly Sagiv

2 Outline Subjects Studied Questions & Answers

3 Lexical Analysis (Scanning)
Input
– Program text (file)
Output
– Sequence of tokens
Read the input file
Identify language keywords and standard identifiers
Handle include files and macros
Count line numbers
Remove whitespace
Report illegal symbols
[Produce symbol table]
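
As a rough illustration of several of the tasks listed on this slide (keyword identification, line counting, whitespace removal, reporting illegal symbols), here is a minimal hand-written scanner sketch. The token kinds, the keyword list, the punctuation set and the names `Scanner`, `Token` and `next_token` are assumptions made for this example; include files, macros and the symbol table are left out.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

typedef enum { TOK_EOF, TOK_KEYWORD, TOK_IDENT, TOK_NUMBER, TOK_PUNCT, TOK_ILLEGAL } TokenKind;

typedef struct {
    TokenKind kind;
    const char *start;  /* first character of the token in the input */
    size_t len;         /* number of characters in the token */
    int line;           /* line on which the token starts */
} Token;

typedef struct {
    const char *pos;    /* next unread character */
    int line;           /* current line number, starting at 1 */
} Scanner;

/* Hypothetical keyword list, for illustration only. */
static int is_keyword(const char *s, size_t len) {
    static const char *const keywords[] = { "if", "while", "return" };
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strlen(keywords[i]) == len && memcmp(keywords[i], s, len) == 0)
            return 1;
    return 0;
}

Token next_token(Scanner *sc) {
    assert(sc != NULL && sc->pos != NULL);

    /* Remove whitespace and count line numbers. */
    while (isspace((unsigned char)*sc->pos)) {
        if (*sc->pos == '\n') sc->line++;
        sc->pos++;
    }

    Token t = { TOK_EOF, sc->pos, 0, sc->line };
    unsigned char c = (unsigned char)*sc->pos;
    if (c == '\0') return t;

    if (isalpha(c) || c == '_') {
        /* Identifier or keyword. */
        while (isalnum((unsigned char)*sc->pos) || *sc->pos == '_') sc->pos++;
        t.len = (size_t)(sc->pos - t.start);
        t.kind = is_keyword(t.start, t.len) ? TOK_KEYWORD : TOK_IDENT;
    } else if (isdigit(c)) {
        while (isdigit((unsigned char)*sc->pos)) sc->pos++;
        t.len = (size_t)(sc->pos - t.start);
        t.kind = TOK_NUMBER;
    } else {
        /* Single-character token: known punctuation or an illegal symbol. */
        sc->pos++;
        t.len = 1;
        t.kind = strchr("+-*/()<>=;{}", c) ? TOK_PUNCT : TOK_ILLEGAL;
    }
    return t;
}
```

A caller would initialize a `Scanner` with the program text and line 1, then call `next_token` until it returns `TOK_EOF`. In practice a generator such as Lex/Flex would produce this code from regular expressions.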

4 Summary
For most programming languages, lexical analyzers can easily be constructed automatically
Exceptions:
– Fortran
– PL/1
Lex/Flex/JLex are useful beyond compilers

5 Syntax Analysis (Parsing)
Input
– Sequence of tokens
Output
– Abstract syntax tree
Report syntax errors
– Unbalanced parentheses
[Create “symbol table”]
[Create pretty-printed version of the program]
In some cases the tree need not be generated (one-pass compilers)

6 Pushdown Automaton
[Figure: a control unit driven by the parser table, reading the input (u t w $) and operating on a stack (V … $)]

7 Efficient Parsers
Pushdown automata
Deterministic
Report an error as soon as the input is not a prefix of a valid program
Not usable for all context-free grammars
[Figure: CUP takes a context-free grammar and produces a parser, or reports “ambiguity errors”; the parser turns tokens into a parse tree]

8 Kinds of Parsers
Top-Down (Predictive Parsing), LL
– Construct the parse tree in a top-down manner
– Find the leftmost derivation
– For every non-terminal and token, predict the next production
– Preorder tree traversal
Bottom-Up, LR
– Construct the parse tree in a bottom-up manner
– Find the rightmost derivation in reverse order
– For every potential right-hand side and token, decide when a production is found
– Postorder tree traversal

9 Top-Down Parsing
[Figure: parse tree built from the root down over the input tokens t1, t2, …; tree nodes are numbered 1 to 5]

10 Bottom-Up Parsing
[Figure: parse tree built from the leaves up over the input tokens t1, t2, t4, t5, t6, t7, t8; tree nodes are numbered 1 to 3]

11 Example Grammar for Predictive LL Top-Down Parsing
expression → digit | '(' expression operator expression ')'
operator → '+' | '*'
digit → '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'

13 static int Parse_Expression(Expression **expr_p) {
       Expression *expr = *expr_p = new_expression();

       /* try to parse a digit */
       if (Token.class == DIGIT) {
           expr->type = 'D';
           expr->value = Token.repr - '0';
           get_next_token();
           return 1;
       }

       /* try to parse a parenthesized expression */
       if (Token.class == '(') {
           expr->type = 'P';
           get_next_token();
           if (!Parse_Expression(&expr->left)) Error("missing expression");
           if (!Parse_Operator(&expr->oper)) Error("missing operator");
           /* the grammar requires a second operand after the operator;
              the field name "right" is assumed here */
           if (!Parse_Expression(&expr->right)) Error("missing expression");
           if (Token.class != ')') Error("missing )");
           get_next_token();
           return 1;
       }

       /* neither alternative matched */
       return 0;
   }

14 Parsing Expressions
Try every alternative production
– For P → A1 A2 … An | B1 B2 … Bm
– If A1 succeeds
   Call A2
   If A2 succeeds, call A3
   If A2 fails, report an error
– Otherwise try B1
Recursive descent parsing
Can be applied to certain grammars
Generalization: LL(1) parsing

15 int P(...) {
       /* try to parse the alternative P → A1 A2 ... An */
       if (A1(...)) {
           if (!A2()) Error("Missing A2");
           if (!A3()) Error("Missing A3");
           ...
           if (!An()) Error("Missing An");
           return 1;
       }
       /* try to parse the alternative P → B1 B2 ... Bm */
       if (B1(...)) {
           if (!B2()) Error("Missing B2");
           if (!B3()) Error("Missing B3");
           ...
           if (!Bm()) Error("Missing Bm");
           return 1;
       }
       return 0;
   }

16 Predictive Parser for Arithmetic Expressions
Grammar
1 E → E + T
2 E → T
3 T → T * F
4 T → F
5 F → id
6 F → (E)
C code?
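
The grammar on this slide is left-recursive (E → E + T, T → T * F), so it cannot be parsed predictively as written. One possible answer to the “C code?” question, offered only as a sketch: replace the left recursion by iteration (E → T { + T }, T → F { * F }) and write one function per non-terminal. The single-character tokens (with `i` standing for id) and all function names are assumptions made for this illustration; the sketch only recognizes input and builds no tree.

```c
#include <assert.h>
#include <stddef.h>

static int parse_E(const char **p);

/* F → id | ( E ) */
static int parse_F(const char **p) {
    if (**p == 'i') {               /* 'i' stands for the token id */
        (*p)++;
        return 1;
    }
    if (**p == '(') {
        (*p)++;
        if (!parse_E(p)) return 0;
        if (**p != ')') return 0;   /* missing ')' */
        (*p)++;
        return 1;
    }
    return 0;
}

/* T → F { * F }   (T → T * F with the left recursion turned into a loop) */
static int parse_T(const char **p) {
    if (!parse_F(p)) return 0;
    while (**p == '*') {
        (*p)++;
        if (!parse_F(p)) return 0;
    }
    return 1;
}

/* E → T { + T }   (E → E + T with the left recursion turned into a loop) */
static int parse_E(const char **p) {
    if (!parse_T(p)) return 0;
    while (**p == '+') {
        (*p)++;
        if (!parse_T(p)) return 0;
    }
    return 1;
}

/* Returns 1 if the whole string is an expression of the grammar, else 0. */
int parse_input(const char *s) {
    assert(s != NULL);
    const char *p = s;
    return parse_E(&p) && *p == '\0';
}
```

Each function decides what to do by looking only at the next token, which is what makes the parser predictive.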

17 Bottom-Up Syntax Analysis
Input
– A context-free grammar
– A stream of tokens
Output
– A syntax tree or an error
Method
– Construct the parse tree in a bottom-up manner
– Find the rightmost derivation in reverse order
– For every potential right-hand side and token, decide when a production is found
– Report an error as soon as the input is not a prefix of a valid program

18 Constructing the LR(0) Parsing Table
Add a production S’ → S$
Construct a finite automaton accepting “valid stack symbols”
States are sets of items A → α•β
– The states of the automaton become the states of the parsing table
– Determine shift operations
– Determine goto operations
– Determine reduce operations
– Report an error when conflicts arise

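19 LR(0) automaton for the grammar S → E$, E → T | E + T, T → i | (E) [figure; item numbers as printed on the slide]
Initial state: 1: S → •E$, 4: E → •T, 6: E → •E + T, 10: T → •i, 12: T → •(E)
– on T: 5: E → T•
– on i: 11: T → i•
– on E: 2: S → E•$, 7: E → E• + T
– on (: 13: T → (•E), 4: E → •T, 6: E → •E + T, 10: T → •i, 12: T → •(E)
From 2: S → E•$, 7: E → E• + T
– on $: 2: S → E$•
– on +: 7: E → E +•T, 10: T → •i, 12: T → •(E)
From 13: T → (•E), 4: E → •T, 6: E → •E + T, 10: T → •i, 12: T → •(E)
– on (: the same state
– on T and on i: as from the initial state
– on E: 14: T → (E•), 7: E → E• + T
From 14: T → (E•), 7: E → E• + T
– on ): 15: T → (E)•
– on +: 7: E → E +•T, 10: T → •i, 12: T → •(E)
From 7: E → E +•T, 10: T → •i, 12: T → •(E)
– on T: 8: E → E + T•
– on i: 11: T → i•
– on (: 13: T → (•E), 4: E → •T, 6: E → •E + T, 10: T → •i, 12: T → •(E)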

20 Parsing “(i)$”
[Figure: the LR(0) automaton of slide 19, used to trace the parse of the input (i)$]
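
To make the trace of “(i)$” concrete, here is a small shift-reduce recognizer sketch driven by the LR(0) automaton for S → E$, E → T | E + T, T → i | (E). The state numbers 0 to 9 are this sketch's own numbering, not the item numbers on the slides, and the function names are assumptions. Tokens are single characters, with `i` standing for an identifier.

```c
#include <assert.h>
#include <string.h>

enum { MAX_STACK = 256 };

/* Automaton transition on a grammar symbol; -1 means no transition (error).
   States: 0 initial, 1 E→T•, 2 T→i•, 3 {S→E•$, E→E•+T}, 4 after '(',
   5 {T→(E•), E→E•+T}, 6 T→(E)•, 7 after '+', 8 E→E+T•, 9 S→E$• (accept). */
static int next_state(int state, char sym) {
    switch (state) {
    case 0: case 4:
        if (sym == 'T') return 1;
        if (sym == 'i') return 2;
        if (sym == 'E') return state == 0 ? 3 : 5;
        if (sym == '(') return 4;
        break;
    case 3:
        if (sym == '+') return 7;
        if (sym == '$') return 9;
        break;
    case 5:
        if (sym == ')') return 6;
        if (sym == '+') return 7;
        break;
    case 7:
        if (sym == 'T') return 8;
        if (sym == 'i') return 2;
        if (sym == '(') return 4;
        break;
    }
    return -1;
}

/* Returns 1 if input (which must end in '$') is accepted, else 0. */
int lr0_accepts(const char *input) {
    int stack[MAX_STACK];
    int top = 0;
    stack[0] = 0;
    for (;;) {
        int pop = 0;      /* length of the right-hand side to reduce, 0 = shift */
        char lhs = 0;     /* left-hand side of the production being reduced */
        switch (stack[top]) {
        case 1: pop = 1; lhs = 'E'; break;   /* E → T     */
        case 2: pop = 1; lhs = 'T'; break;   /* T → i     */
        case 6: pop = 3; lhs = 'T'; break;   /* T → ( E ) */
        case 8: pop = 3; lhs = 'E'; break;   /* E → E + T */
        case 9: return *input == '\0';       /* accept if nothing follows '$' */
        }
        int next;
        if (pop > 0) {
            /* Reduce: pop the right-hand side, then take the goto on lhs. */
            top -= pop;
            assert(top >= 0);
            next = next_state(stack[top], lhs);
        } else {
            /* Shift: only terminal symbols may come from the input. */
            if (*input == '\0' || strchr("i+()$", *input) == NULL) return 0;
            next = next_state(stack[top], *input++);
        }
        if (next < 0 || top + 1 >= MAX_STACK) return 0;
        stack[++top] = next;
    }
}
```

On “(i)$” the recognizer shifts `(` and `i`, reduces T → i and E → T, shifts `)`, reduces T → (E) and E → T, and finally shifts `$` into the accepting state. Because the grammar is LR(0), every state either shifts or reduces, never both.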

21 Summary (Bottom-Up)
LR is a powerful technique
Generates efficient parsers
Generation tools exist for LALR(1)
– Bison, yacc, CUP
But some grammars need to be tuned
– Shift/reduce conflicts
– Reduce/reduce conflicts
– Efficiency of the generated parser

22 Summary (Parsing)
Context-free grammars provide a natural way to define the syntax of programming languages
Ambiguity may be resolved
Predictive parsing is natural
– Good error messages
– Natural error recovery
– But not expressive enough
LR bottom-up parsing is more expressive

23 Abstract Syntax
Intermediate program representation
Defines a tree
– Preserves program hierarchy
Generated by the parser
Declared using an (ambiguous) context-free grammar (relatively flat)
– Not meant for parsing
Keywords and punctuation symbols are not stored (not relevant once the tree exists)
Big programs can also be handled (possibly via virtual memory)

24 Semantic Analysis
Requirements related to the “context” in which a construct occurs
Examples
– Name resolution
– Scoping
– Type checking
– Escape
Implemented via AST traversals
Guides subsequent compiler phases

25 Abstract Interpretation
Static analysis
Automatically identify program properties
– No user-provided loop invariants
Sound but incomplete methods
– But can be rather precise
Non-standard interpretation of the program operational semantics
Applications
– Compiler optimization
– Code quality tools: identify potential bugs, prove the absence of runtime errors, partial correctness

26 /* c */   L0: a := 0
   /* ac */  L1: b := a + 1
   /* bc */      c := c + b
   /* bc */      a := b * 2
   /* ac */      if c < N goto L1
   /* c */       return c
[Figure: control flow graph with the nodes a := 0; b := a + 1; c := c + b; a := b * 2; c < N goto L1; return c]

27 [Figure: control flow graph with the nodes a := 0; b := a + 1; c := c + b; a := b * 2; c < N goto L1; return c] Sets shown: ∅ ∅ ∅ ∅ ∅ ∅

28 [Figure: same control flow graph] Sets shown: ∅ {c} ∅ ∅ ∅ ∅

29 [Figure: same control flow graph] Sets shown: ∅ {c} ∅ ∅ ∅

30 [Figure: same control flow graph] Sets shown: ∅ {c} {c, b} ∅ ∅

31 [Figure: same control flow graph] Sets shown: ∅ {c} {c, b} ∅

32 [Figure: same control flow graph] Sets shown: ∅ {c} {c, b} {c, a} {c, b}

33 [Figure: same control flow graph] Sets shown: ∅ {c, a} {c, b} {c, a} {c, b}

34 Summary Iterative Procedure
Analyze one procedure at a time
– More precise solutions exist
Construct a control flow graph for the procedure
Initialize the values at every node to the most optimistic value
Iterate until convergence
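
The iterative procedure can be sketched for live-variable analysis on the example program of slide 26. This is an illustration under assumptions: variables are encoded as bits, N is treated as a constant, the graph is hard-coded, and the names `Node`, `cfg` and `liveness` are made up for this example. The most optimistic initial value for liveness is the empty set.

```c
#include <assert.h>

enum { VAR_A = 1, VAR_B = 2, VAR_C = 4 };   /* one bit per variable */
enum { NNODES = 6 };

typedef struct {
    unsigned use;    /* variables read by the statement */
    unsigned def;    /* variables written by the statement */
    int succ[2];     /* successor nodes, -1 = none */
} Node;

/* Control flow graph of the example program. */
static const Node cfg[NNODES] = {
    { 0,             VAR_A, { 1, -1 } },   /* L0: a := 0           */
    { VAR_A,         VAR_B, { 2, -1 } },   /* L1: b := a + 1       */
    { VAR_B | VAR_C, VAR_C, { 3, -1 } },   /*     c := c + b       */
    { VAR_B,         VAR_A, { 4, -1 } },   /*     a := b * 2       */
    { VAR_C,         0,     { 1,  5 } },   /*     if c < N goto L1 */
    { VAR_C,         0,     { -1, -1 } },  /*     return c         */
};

/* Computes the set of variables live before each node. */
void liveness(unsigned live_in[NNODES]) {
    /* Most optimistic value: nothing is live. */
    for (int n = 0; n < NNODES; n++) live_in[n] = 0;

    int changed = 1;
    while (changed) {                       /* iterate until convergence */
        changed = 0;
        for (int n = NNODES - 1; n >= 0; n--) {
            /* live-out = union of live-in over all successors */
            unsigned out = 0;
            for (int k = 0; k < 2; k++) {
                int s = cfg[n].succ[k];
                if (s < 0) continue;
                assert(s < NNODES);
                out |= live_in[s];
            }
            /* live-in = use ∪ (live-out − def) */
            unsigned in = cfg[n].use | (out & ~cfg[n].def);
            if (in != live_in[n]) {
                live_in[n] = in;
                changed = 1;
            }
        }
    }
}
```

The fixed point agrees with the annotations on slide 26: {c} before L0, {a, c} before L1, {b, c} before the next two statements, {a, c} before the branch and {c} before the return. Visiting nodes in reverse order is only a choice that tends to speed up convergence for a backward analysis; any order reaches the same result.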

35 Basic Compiler Phases

36 Overall Structure

37 Techniques Studied
Simple code generation
Basic blocks
Global register allocation
Activation records
Object oriented
Assembler/Linker/Loader

38 Two Phase Solution
Dynamic Programming
Sethi & Ullman
Bottom-up (labeling)
– Compute for every subtree the minimal number of registers needed (weight)
Top-down
– Generate the code using the labeling, preferring “heavier” subtrees (larger labeling)
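
The bottom-up labeling phase can be sketched as follows. This uses the simple variant in which every leaf needs one register; other presentations give a right leaf weight 0. The `Expr` type and the function name `su_weight` are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Expr {
    char op;                     /* operator, or a variable name for a leaf */
    const struct Expr *left;     /* NULL for a leaf */
    const struct Expr *right;    /* NULL for a leaf */
} Expr;

/* Minimal number of registers needed to evaluate e without spilling,
   assuming every leaf must first be loaded into a register. */
int su_weight(const Expr *e) {
    assert(e != NULL);
    if (e->left == NULL && e->right == NULL)
        return 1;                          /* leaf: one register */
    assert(e->left != NULL && e->right != NULL);
    int l = su_weight(e->left);
    int r = su_weight(e->right);
    if (l == r)
        return l + 1;   /* one extra register holds the first result */
    return l > r ? l : r;   /* evaluate the heavier subtree first */
}
```

For (a + b) * (c + d) both subtrees have weight 2, so the product needs 3 registers. For (a + b) + c the weights are 2 and 1, so 2 registers suffice when the heavier subtree is evaluated first, which is the preference the top-down phase applies.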

39 “Global” Register Allocation
Input
– Sequence of machine code instructions (assembly)
– Unbounded number of temporary registers
Output
– Sequence of machine code instructions (assembly)
– Machine registers
– Some MOVE instructions removed
– Missing prologue and epilogue

40 Graph Coloring with Coalescing
Build: Construct the interference graph
Simplify: Recursively remove non-MOVE nodes with fewer than K neighbors; push removed nodes onto the stack
Potential-Spill: Spill some nodes and remove them; push removed nodes onto the stack
Select: Assign actual registers (from the simplify/spill stack)
Actual-Spill: Spill some potential spills and repeat the process
Coalesce: Conservatively merge unconstrained MOVE-related nodes with fewer than K “heavy” neighbors
Freeze: Give up coalescing on some low-degree MOVE-related nodes
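
A reduced sketch of the Simplify and Select phases only, on an already built interference graph. Coalescing, freezing and spilling are deliberately left out: where the full algorithm would mark a potential spill, this version just reports failure. The `Graph` type and the function names are assumptions made for this illustration.

```c
#include <assert.h>

enum { MAX_NODES = 16 };

typedef struct {
    int n;                                    /* number of temporaries */
    unsigned char adj[MAX_NODES][MAX_NODES];  /* adj[u][v] = 1 if u and v interfere */
} Graph;

void add_edge(Graph *g, int u, int v) {
    assert(u != v && u >= 0 && v >= 0 && u < g->n && v < g->n);
    g->adj[u][v] = g->adj[v][u] = 1;
}

/* Tries to color g with k registers. Returns 1 and fills color[0..n-1]
   with values in 0..k-1, or returns 0 if Simplify gets stuck. */
int color_graph(const Graph *g, int k, int color[MAX_NODES]) {
    int removed[MAX_NODES] = { 0 };
    int stack[MAX_NODES];
    int top = 0;

    /* Simplify: repeatedly remove a node with fewer than k remaining neighbors. */
    while (top < g->n) {
        int picked = -1;
        for (int u = 0; u < g->n && picked < 0; u++) {
            if (removed[u]) continue;
            int degree = 0;
            for (int v = 0; v < g->n; v++)
                if (g->adj[u][v] && !removed[v]) degree++;
            if (degree < k) picked = u;
        }
        if (picked < 0) return 0;   /* the full algorithm would spill here */
        removed[picked] = 1;
        stack[top++] = picked;
    }

    /* Select: pop nodes and give each the lowest color unused by its neighbors. */
    while (top > 0) {
        int u = stack[--top];
        int c = 0;
        for (;;) {
            int clash = 0;
            for (int v = 0; v < g->n; v++)
                if (g->adj[u][v] && !removed[v] && color[v] == c) clash = 1;
            if (!clash) break;
            c++;
        }
        assert(c < k);   /* holds because u had fewer than k neighbors when removed */
        color[u] = c;
        removed[u] = 0;  /* u is back in the graph and now colored */
    }
    return 1;
}
```

The order matters: a node is only removed once it has fewer than K neighbors left, so when it is popped again a free color is guaranteed to exist.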

41 A Typical Stack Frame
[Figure: stack layout from higher to lower addresses, showing the previous frame, the current frame and the next frame. Labels: outgoing parameters (argument 1, argument 2), lexical pointer, return address and dynamic link (administrative part), registers, locals, temporaries. The frame pointer, stack pointer and frame size are marked.]

42 Heap Memory Management
Part of the runtime system
Utilities for dynamic memory allocation
Utilities for automatic memory reclamation
– Garbage collection

43 Garbage Collection
Techniques
– Mark and sweep
– Copying collection
– Reference counting
Modes
– Generational
– Incremental vs. stop the world
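
As an illustration of the first technique, here is a toy stop-the-world mark-and-sweep collector over a fixed pool of objects. Everything in it is an assumption made for the example: objects are identified by array index, each has two reference fields, and the names `Heap`, `alloc_object` and `collect` are hypothetical.

```c
#include <assert.h>

enum { HEAP_SIZE = 8, MAX_FIELDS = 2, NIL = -1 };

typedef struct {
    int in_use;              /* 1 if the slot holds an allocated object */
    int marked;              /* set during the mark phase */
    int field[MAX_FIELDS];   /* references to other objects, or NIL */
} Object;

typedef struct { Object obj[HEAP_SIZE]; } Heap;

void heap_init(Heap *h) {
    for (int i = 0; i < HEAP_SIZE; i++) {
        h->obj[i].in_use = 0;
        h->obj[i].marked = 0;
        for (int f = 0; f < MAX_FIELDS; f++) h->obj[i].field[f] = NIL;
    }
}

/* Returns the index of a fresh object, or NIL if the pool is full. */
int alloc_object(Heap *h, int f0, int f1) {
    for (int i = 0; i < HEAP_SIZE; i++) {
        if (!h->obj[i].in_use) {
            h->obj[i].in_use = 1;
            h->obj[i].marked = 0;
            h->obj[i].field[0] = f0;
            h->obj[i].field[1] = f1;
            return i;
        }
    }
    return NIL;
}

/* Mark phase: depth-first traversal from one reference. */
static void mark(Heap *h, int i) {
    if (i == NIL) return;
    assert(i >= 0 && i < HEAP_SIZE);
    if (!h->obj[i].in_use || h->obj[i].marked) return;
    h->obj[i].marked = 1;
    for (int f = 0; f < MAX_FIELDS; f++) mark(h, h->obj[i].field[f]);
}

/* Marks everything reachable from the roots, then sweeps the rest.
   Returns the number of objects reclaimed. */
int collect(Heap *h, const int *roots, int nroots) {
    for (int r = 0; r < nroots; r++) mark(h, roots[r]);
    int freed = 0;
    for (int i = 0; i < HEAP_SIZE; i++) {
        if (h->obj[i].in_use && !h->obj[i].marked) {
            h->obj[i].in_use = 0;   /* sweep: unreachable object */
            freed++;
        }
        h->obj[i].marked = 0;       /* reset for the next collection */
    }
    return freed;
}
```

Unlike plain reference counting, this approach also reclaims unreachable cycles, since an object that only refers to itself is never marked.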

44 Accessing Stack Variables
Use offset from EBP
Remember – stack grows downwards
Above EBP = parameters
Below EBP = locals
Examples
– %ebp + 4 = return address
– %ebp + 8 = first parameter
– %ebp – 4 = first local
[Figure: stack layout with param n … param 1, return address, previous fp, local 1 … local n; FP and SP are marked, with FP+8 at the first parameter and FP-4 at the first local]

45 In main, before foo(argv[2])
[Figure: stack holding argv, argc and the return address, with fp and sp marked; stack addresses run from 1000 down to 975; the data segment at address 5000 holds “abcdefgh” followed by a 0 terminator]

46 Inside foo(argv[2])
[Figure: the stack now also holds the argument 5000, foo’s return address, and buf[2], buf[1], buf[0]]

47 Before strcpy
[Figure: as on slide 46, plus the strcpy arguments str: 5000 and buf: 988]

48 Inside strcpy
[Figure: strcpy’s return address and previous fp are on the stack; buf[0]=a, buf[1]=b, buf[2]=c, and the copy continues past buf[2] into the cells above it (shown as d f g and h 0 X X)]

49 Return from strcpy
[Figure: same stack contents after strcpy returns]

50 Return from foo: where to?
[Figure: same overwritten stack at the point where foo returns]
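
The slides show what happens when strcpy copies a 9-byte string into a smaller buffer. The source of foo is not part of the transcript, so the following is only a sketch of one common mitigation: check the length before copying. The function name `copy_checked` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies src into dst only if it fits, including the terminating 0.
   Returns 1 on success; returns 0 and leaves dst untouched otherwise. */
int copy_checked(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL);
    size_t len = strlen(src);
    if (len >= dst_size)
        return 0;                 /* would overflow: refuse to copy */
    memcpy(dst, src, len + 1);    /* len characters plus the terminator */
    return 1;
}
```

With a small buffer such as `char buf[4]`, the string “abcdefgh” from the slides is rejected instead of overwriting the cells above the buffer, while “abc” is copied normally.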

