Recap Mooly Sagiv. Outline Subjects Studied Questions & Answers.

1 Recap Mooly Sagiv

2 Outline
Subjects Studied
Questions & Answers

3 Lexical Analysis (Scanning)
input
–program text (file)
output
–sequence of tokens
Read input file
Identify language keywords and standard identifiers
Handle include files and macros
Count line numbers
Remove whitespace
Report illegal symbols
[Produce symbol table]

4 The Lexical Analysis Problem
Given
–A set of token descriptions
–An input string
Partition the string into tokens (class, value)
Ambiguity resolution
–The longest matching token
–Between two equal length tokens select the first
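The two ambiguity rules above can be sketched in C. This is a minimal illustration, not how Jlex or any other generator implements them. The token classes, the hand-written matcher functions, and the name next_token are all assumptions made for the example. Rule order stands in for the order of the token descriptions.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token classes; their order doubles as rule priority. */
enum token_class { TOK_IF, TOK_IDENT, TOK_NUMBER, TOK_COUNT, TOK_ERROR };

/* Each matcher returns the length of the longest prefix of s it accepts. */
static size_t match_if(const char *s) {
    return strncmp(s, "if", 2) == 0 ? 2 : 0;
}

static size_t match_ident(const char *s) {
    size_t n = 0;
    if (!isalpha((unsigned char)s[0])) return 0;
    while (isalnum((unsigned char)s[n])) n++;
    return n;
}

static size_t match_number(const char *s) {
    size_t n = 0;
    while (isdigit((unsigned char)s[n])) n++;
    return n;
}

typedef size_t (*matcher)(const char *);
static const matcher rules[TOK_COUNT] = { match_if, match_ident, match_number };

/* Returns the class of the next token and stores its length in *len. */
static enum token_class next_token(const char *s, size_t *len) {
    enum token_class best = TOK_ERROR;
    *len = 0;
    for (int r = 0; r < TOK_COUNT; r++) {
        size_t n = rules[r](s);
        if (n > *len) {  /* strictly longer wins; a tie keeps the earlier rule */
            *len = n;
            best = (enum token_class)r;
        }
    }
    return best;
}
```

With these rules, "if" is classified as TOK_IF because the keyword rule comes first among two matches of equal length, while "iffy" is classified as TOK_IDENT because the identifier match is longer.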

5 Jlex
Input
–regular expressions and actions (Java code)
Output
–A scanner program that reads the input and applies actions when an input regular expression is matched
[Diagram: Jlex turns regular expressions into a scanner; the scanner turns the input program into tokens]

6 Summary
For most programming languages lexical analyzers can be easily constructed automatically
Exceptions:
–Fortran
–PL/1
Lex/Flex/Jlex are useful beyond compilers

7 Syntax Analysis (Parsing)
input
–Sequence of tokens
output
–Abstract Syntax Tree
Report syntax errors
–unbalanced parentheses
[Create “symbol-table”]
[Create pretty-printed version of the program]
In some cases the tree need not be generated (one-pass compilers)

8 Pushdown Automaton
[Figure: pushdown automaton with the components control, parser-table, input, and stack; input and stack both end in $]

9 Efficient Parsers
Pushdown automata
Deterministic
Report an error as soon as the input is not a prefix of a valid program
Not usable for all context free grammars
[Diagram: cup turns a context free grammar into a parser, reporting “Ambiguity errors”; the parser turns tokens into a parse tree]

10 Kinds of Parsers
Top-Down (Predictive Parsing) LL
–Construct parse tree in a top-down manner
–Find the leftmost derivation
–For every non-terminal and token predict the next production
–Preorder tree traversal
Bottom-Up LR
–Construct parse tree in a bottom-up manner
–Find the rightmost derivation in reverse order
–For every potential right hand side and token decide when a production is found
–Postorder tree traversal

11 Top-Down Parsing
[Figure: parse tree over the input tokens t1, t2; tree nodes are numbered 1 to 5]

12 Bottom-Up Parsing
[Figure: parse tree over the input tokens t1, t2, t4, t5, t6, t7, t8; tree nodes are numbered 1 to 3]

13 Example Grammar for Predictive LL Top-Down Parsing
expression → digit | ‘(’ expression operator expression ‘)’
operator → ‘+’ | ‘*’
digit → ‘0’ | ‘1’ | ‘2’ | ‘3’ | ‘4’ | ‘5’ | ‘6’ | ‘7’ | ‘8’ | ‘9’

15
    static int Parse_Expression(Expression **expr_p) {
        Expression *expr = *expr_p = new_expression();
        /* try to parse a digit */
        if (Token.class == DIGIT) {
            expr->type = 'D';
            expr->value = Token.repr - '0';
            get_next_token();
            return 1;
        }
        /* try to parse a parenthesized expression */
        if (Token.class == '(') {
            expr->type = 'P';
            get_next_token();
            if (!Parse_Expression(&expr->left)) Error("missing expression");
            if (!Parse_Operator(&expr->oper)) Error("missing operator");
            /* the grammar requires a second operand here; the field name
               `right` is assumed, the slide omits this step */
            if (!Parse_Expression(&expr->right)) Error("missing expression");
            if (Token.class != ')') Error("missing )");
            get_next_token();
            return 1;
        }
        /* neither alternative applies */
        return 0;
    }

16 Parsing Expressions
Try every alternative production
–For P → A1 A2 … An | B1 B2 … Bm
–If A1 succeeds
  Call A2
  If A2 succeeds
  –Call A3
  If A2 fails report an error
–Otherwise try B1
Recursive descent parsing
Can be applied for certain grammars
Generalization: LL(1) parsing

17
    int P(...) {
        /* try to parse the alternative P → A1 A2 ... An */
        if (A1(...)) {
            if (!A2()) Error("Missing A2");
            if (!A3()) Error("Missing A3");
            ..
            if (!An()) Error("Missing An");
            return 1;
        }
        /* try to parse the alternative P → B1 B2 ... Bm */
        if (B1(...)) {
            if (!B2()) Error("Missing B2");
            if (!B3()) Error("Missing B3");
            ..
            if (!Bm()) Error("Missing Bm");
            return 1;
        }
        /* no alternative applies */
        return 0;
    }

18 Predictive Parser for Arithmetic Expressions
Grammar
1 E → E + T
2 E → T
3 T → T * F
4 T → F
5 F → id
6 F → (E)
C-code?
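One possible answer to the "C-code?" question, as a sketch under stated assumptions. The grammar is left-recursive, so a procedure for E written directly from production 1 would call itself without consuming input. The code below assumes the usual rewrite into iteration (E → T { + T }, T → F { * F }), which the slide does not spell out. It also assumes one character per token, with 'i' standing for id, and it only recognizes the input instead of building a tree. All function names are hypothetical.

```c
/* Hypothetical scanner state: one character per token, 'i' stands for id. */
static const char *input;

static int parse_E(void);

/* F -> id | '(' E ')' */
static int parse_F(void) {
    if (*input == 'i') {
        input++;
        return 1;
    }
    if (*input == '(') {
        input++;
        if (!parse_E()) return 0;
        if (*input != ')') return 0;
        input++;
        return 1;
    }
    return 0;
}

/* T -> F { '*' F }   (left recursion T -> T * F rewritten as a loop) */
static int parse_T(void) {
    if (!parse_F()) return 0;
    while (*input == '*') {
        input++;
        if (!parse_F()) return 0;
    }
    return 1;
}

/* E -> T { '+' T }   (left recursion E -> E + T rewritten as a loop) */
static int parse_E(void) {
    if (!parse_T()) return 0;
    while (*input == '+') {
        input++;
        if (!parse_T()) return 0;
    }
    return 1;
}

/* Returns 1 if the whole string can be derived from E, 0 otherwise. */
static int accepts(const char *s) {
    input = s;
    return parse_E() && *input == '\0';
}
```

Each procedure decides what to do from the current token alone, which is what makes the parser predictive. A tree-building version would have to reassemble left-associative nodes inside the loops.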

19 Bottom-Up Syntax Analysis
Input
–A context free grammar
–A stream of tokens
Output
–A syntax tree or error
Method
–Construct parse tree in a bottom-up manner
–Find the rightmost derivation in reverse order
–For every potential right hand side and token decide when a production is found
–Report an error as soon as the input is not a prefix of a valid program

20 Constructing LR(0) parsing table
Add a production S’ → S$
Construct a finite automaton accepting “valid stack symbols”
States are sets of items A → α • β
–The states of the automaton become the states of the parsing table
–Determine shift operations
–Determine goto operations
–Determine reduce operations
–Report an error when conflicts arise

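21 [Figure: LR(0) automaton for the grammar S → E$, E → T | E + T, T → i | (E). Item sets, each listed with the symbol on its incoming edge; item numbers as on the slide]
start: 1: S → • E$   4: E → • T   6: E → • E + T   10: T → • i   12: T → • (E)
on T: 5: E → T •
on i: 11: T → i •
on E: 2: S → E • $   7: E → E • + T
on (: 13: T → ( • E)   4: E → • T   6: E → • E + T   10: T → • i   12: T → • (E)
on E after (: 14: T → (E • )   7: E → E • + T
on ): 15: T → (E) •
on +: 7: E → E + • T   10: T → • i   12: T → • (E)
on T after +: 8: E → E + T •
on $: 2: S → E $ •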

22 Parsing “(i)$”
[Figure: the LR(0) automaton of the previous slide, traced on the input “(i)$”]
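The trace of “(i)$” can be reproduced with a small table-driven recognizer. This is a sketch that makes several assumptions. The states are numbered 0 to 9 here, which is this example's own numbering and not the slide's item numbers. Tokens are single characters. The shift, goto, and reduce entries were derived by hand from the item sets of the grammar S → E$, E → T | E + T, T → i | (E). Names such as lr_step and lr0_accepts are hypothetical.

```c
#include <string.h>

#define LR_ERR (-1)
#define LR_ACCEPT 9   /* state containing S -> E $ .  */
#define LR_STACK 64

/* Shift/goto table: state reached from `state` on grammar symbol `sym`. */
static int lr_step(int state, char sym) {
    switch (state) {
    case 0:   /* start state */
    case 4:   /* state after '(' */
        if (sym == 'T') return 1;
        if (sym == 'i') return 2;
        if (sym == 'E') return state == 0 ? 3 : 5;
        if (sym == '(') return 4;
        break;
    case 3:   /* S -> E . $   E -> E . + T */
        if (sym == '+') return 7;
        if (sym == '$') return 9;
        break;
    case 5:   /* T -> (E . )  E -> E . + T */
        if (sym == ')') return 6;
        if (sym == '+') return 7;
        break;
    case 7:   /* E -> E + . T */
        if (sym == 'T') return 8;
        if (sym == 'i') return 2;
        if (sym == '(') return 4;
        break;
    }
    return LR_ERR;
}

/* Reduce table: for reduce states, the production's left-hand side and the
   length of its right-hand side; lhs == 0 marks a shift state. */
struct lr_reduction { char lhs; int rhs_len; };
static const struct lr_reduction lr_reduce[10] = {
    [1] = { 'E', 1 },   /* E -> T     */
    [2] = { 'T', 1 },   /* T -> i     */
    [6] = { 'T', 3 },   /* T -> (E)   */
    [8] = { 'E', 3 },   /* E -> E + T */
};

/* Returns 1 if input (terminated by '$') is accepted, 0 otherwise. */
static int lr0_accepts(const char *input) {
    int stack[LR_STACK];
    int top = 0;
    stack[0] = 0;
    for (;;) {
        int s = stack[top];
        int next;
        if (s == LR_ACCEPT) return *input == '\0';
        if (lr_reduce[s].lhs != 0) {
            /* reduce: pop the right-hand side, then take the goto edge */
            top -= lr_reduce[s].rhs_len;
            next = lr_step(stack[top], lr_reduce[s].lhs);
        } else {
            /* shift: only terminals may come from the input */
            if (*input == '\0' || strchr("i()+$", *input) == NULL) return 0;
            if (top + 1 >= LR_STACK) return 0;
            next = lr_step(s, *input++);
        }
        if (next == LR_ERR) return 0;
        stack[++top] = next;
    }
}
```

Because the grammar is LR(0), every state is either a pure shift state or a pure reduce state, so the reduce decision never needs to look at the next token. An input such as "(i$" is rejected at the first token that has no entry in the table.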

23 Summary (Bottom-Up)
LR is a powerful technique
Generates efficient parsers
Generation tools exist
LALR(1)
–Bison, yacc, CUP
But some grammars need to be tuned
–Shift/Reduce conflicts
–Reduce/Reduce conflicts
–Efficiency of the generated parser

24 Summary (Parsing)
Context free grammars provide a natural way to define the syntax of programming languages
Ambiguity may be resolved
Predictive parsing is natural
–Good error messages
–Natural error recovery
–But not expressive enough
But LR bottom-up parsing is more expressive

25 Abstract Syntax
Intermediate program representation
Defines a tree - Preserves program hierarchy
Generated by the parser
Declared using an (ambiguous) context free grammar (relatively flat)
–Not meant for parsing
Keywords and punctuation symbols are not stored (Not relevant once the tree exists)
Big programs can also be handled (possibly via virtual memory)

26 Semantic Analysis
Requirements related to the “context” in which a construct occurs
Examples
–Name resolution
–Scoping
–Type checking
–Escape
Implemented via AST traversals
Guides subsequent compiler phases

27 Abstract Interpretation
Static analysis
Automatically identify program properties
–No user provided loop invariants
Sound but incomplete methods
–But can be rather precise
Non-standard interpretation of the program operational semantics
Applications
–Compiler optimization
–Code quality tools
  Identify potential bugs
  Prove the absence of runtime errors
  Partial correctness

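28 Constant Propagation
z = 3
while (x > 0)
  if (x = 1) y = 7
  else y = z + 4
assert y == 7
[Figure: control flow graph of the program annotated with abstract states, listed in extraction order]
[x ↦ ?, y ↦ ?, z ↦ ?]
[x ↦ ?, y ↦ ?, z ↦ 3]
[x ↦ 1, y ↦ ?, z ↦ 3]
[x ↦ 1, y ↦ 7, z ↦ 3]
[x ↦ ?, y ↦ 7, z ↦ 3]
[x ↦ ?, y ↦ ?, z ↦ 3]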
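The way two abstract states merge where control flow paths meet can be sketched with a small value lattice. The type and function names below are hypothetical. CP_UNKNOWN plays the role of "?" on the slide. CP_UNDEF is an extra element assumed here to represent "no value has reached this point yet", which gives the iteration an optimistic starting value.

```c
/* Abstract value of one variable in constant propagation. */
enum cp_kind { CP_UNDEF, CP_CONST, CP_UNKNOWN };
struct cp_val { enum cp_kind kind; int value; };

static struct cp_val cp_const(int v) { return (struct cp_val){ CP_CONST, v }; }
static struct cp_val cp_unknown(void) { return (struct cp_val){ CP_UNKNOWN, 0 }; }

/* Join of two values at a control flow merge: equal constants stay
   constant, different information collapses to "?". */
static struct cp_val cp_join(struct cp_val a, struct cp_val b) {
    if (a.kind == CP_UNDEF) return b;
    if (b.kind == CP_UNDEF) return a;
    if (a.kind == CP_CONST && b.kind == CP_CONST && a.value == b.value)
        return a;
    return cp_unknown();
}

/* Abstract addition: the result is a constant only if both operands are. */
static struct cp_val cp_add(struct cp_val a, struct cp_val b) {
    if (a.kind == CP_CONST && b.kind == CP_CONST)
        return cp_const(a.value + b.value);
    if (a.kind == CP_UNDEF || b.kind == CP_UNDEF)
        return (struct cp_val){ CP_UNDEF, 0 };
    return cp_unknown();
}
```

For the program above, the branch y = 7 yields the constant 7 and the branch y = z + 4 yields cp_add(3, 4), which is also 7, so their join keeps y at 7 after the if. Joining that with a state in which y is "?" gives "?" again, which is why the analysis cannot prove the final assert on a path that skips the loop body.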

29
            /* c */
    L0: a := 0
            /* ac */
    L1: b := a + 1
            /* bc */
        c := c + b
            /* bc */
        a := b * 2
            /* ac */
        if c < N goto L1
            /* c */
        return c
[Figure: control flow graph with the nodes a := 0; b := a + 1; c := c + b; a := b * 2; c < N goto L1; return c]

30 [Figure: control flow graph with the nodes a := 0; b := a + 1; c := c + b; a := b * 2; c < N goto L1; return c. Sets shown: ∅, ∅, ∅, ∅, ∅, ∅]

31 [Figure: same control flow graph. Sets shown: ∅, {c}, ∅, ∅, ∅, ∅]

32 [Figure: same control flow graph. Sets shown: ∅, {c}, ∅, ∅, ∅]

33 [Figure: same control flow graph. Sets shown: ∅, {c}, {c, b}, ∅, ∅]

34 [Figure: same control flow graph. Sets shown: ∅, {c}, {c, b}, ∅]

35 [Figure: same control flow graph. Sets shown: ∅, {c}, {c, b}, {c, a}, {c, b}]

36 [Figure: same control flow graph. Sets shown: ∅, {c, a}, {c, b}, {c, a}, {c, b}]

37 Summary Iterative Procedure
Analyze one procedure at a time
–More precise solutions exist
Construct a control flow graph for the procedure
Initialize the values at every node to the most optimistic value
Iterate until convergence
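The iterative procedure can be sketched for live-variable analysis on the six-statement program from the preceding slides. This is an illustration under assumptions. Each variable is one bit. N is treated as a constant and not tracked. The empty set is taken as the most optimistic initial value. The CFG encoding and all names are made up for the example.

```c
#include <string.h>

/* One bit per variable. */
enum { VAR_A = 1, VAR_B = 2, VAR_C = 4 };
#define NODES 6

/* A CFG node: variables defined, variables used, up to two successors. */
struct cfg_node { unsigned def, use; int succ[2]; };

static const struct cfg_node cfg[NODES] = {
    { VAR_A, 0,             {  1, -1 } },  /* 0: a := 0           */
    { VAR_B, VAR_A,         {  2, -1 } },  /* 1: b := a + 1       */
    { VAR_C, VAR_C | VAR_B, {  3, -1 } },  /* 2: c := c + b       */
    { VAR_A, VAR_B,         {  4, -1 } },  /* 3: a := b * 2       */
    { 0,     VAR_C,         {  1,  5 } },  /* 4: if c < N goto L1 */
    { 0,     VAR_C,         { -1, -1 } },  /* 5: return c         */
};

/* Computes the set of variables live on entry to every node. */
static void liveness(unsigned live_in[NODES]) {
    int changed = 1;
    memset(live_in, 0, NODES * sizeof live_in[0]);  /* optimistic start */
    while (changed) {                               /* iterate to a fixed point */
        changed = 0;
        for (int n = NODES - 1; n >= 0; n--) {
            unsigned out = 0;
            for (int k = 0; k < 2; k++)
                if (cfg[n].succ[k] >= 0)
                    out |= live_in[cfg[n].succ[k]];
            /* in = use ∪ (out − def) */
            unsigned in = cfg[n].use | (out & ~cfg[n].def);
            if (in != live_in[n]) {
                live_in[n] = in;
                changed = 1;
            }
        }
    }
}
```

Visiting the nodes in reverse order suits a backward analysis such as liveness and usually reduces the number of passes, but any visiting order reaches the same fixed point.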

38 Basic Compiler Phases

39 Overall Structure

40 Techniques Studied
Simple code generation
Basic blocks
Global register allocation
Activation records
Object Oriented
Assembler/Linker/Loader

41 Heap Memory Management
Part of the runtime system
Utilities for dynamic memory allocation
Utilities for automatic memory reclamation
–Garbage Collection

42 Garbage Collection
Techniques
–Mark and sweep
–Copying collection
–Reference counting
Modes
–Generational
–Incremental vs. Stop the world
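Of the techniques listed, mark and sweep is the easiest to sketch in a few lines. The example below is a toy stop-the-world collector over a fixed pool of objects. The object layout, the pool size, and the function names are all assumptions made for illustration, and a real runtime would find its roots in registers, stacks, and globals instead of an explicit array.

```c
#include <stddef.h>

#define POOL 8

/* A heap object with up to two outgoing pointers. */
struct obj {
    int in_use;
    int marked;
    struct obj *child[2];
};

static struct obj pool[POOL];

/* Allocates the first free slot of the pool, or returns NULL if it is full. */
static struct obj *gc_alloc(void) {
    for (int i = 0; i < POOL; i++) {
        if (!pool[i].in_use) {
            pool[i] = (struct obj){ .in_use = 1 };
            return &pool[i];
        }
    }
    return NULL;
}

/* Mark phase: flag everything reachable from o. */
static void mark(struct obj *o) {
    if (o == NULL || o->marked) return;  /* the flag also stops cycles */
    o->marked = 1;
    mark(o->child[0]);
    mark(o->child[1]);
}

/* Marks from the roots, then sweeps the pool. Returns the number of
   objects reclaimed. */
static int collect(struct obj **roots, int nroots) {
    int freed = 0;
    for (int i = 0; i < nroots; i++) mark(roots[i]);
    for (int i = 0; i < POOL; i++) {
        if (pool[i].in_use && !pool[i].marked) {
            pool[i].in_use = 0;
            freed++;
        }
        pool[i].marked = 0;  /* reset for the next collection */
    }
    return freed;
}
```

An object that points only to itself is reclaimed here because nothing reachable from the roots marks it. Plain reference counting would keep such a cycle alive, which is one of the trade-offs between the listed techniques.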

