Recap Mooly Sagiv. Outline Subjects Studied Questions & Answers.

Recap Mooly Sagiv

Outline
- Subjects Studied
- Questions & Answers

Lexical Analysis (Scanning)
- Input: program text (file)
- Output: sequence of tokens
- Read input file
- Identify language keywords and standard identifiers
- Handle include files and macros
- Count line numbers
- Remove whitespace
- Report illegal symbols
- [Produce symbol table]

The Lexical Analysis Problem
- Given
  - A set of token descriptions
  - An input string
- Partition the string into tokens (class, value)
- Ambiguity resolution
  - The longest matching token
  - Between two equal-length tokens, select the first
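The two ambiguity rules can be sketched in C as follows. This is only an illustration: the token classes, the hand-written matcher functions, and the name `next_token` are assumptions made for this sketch and do not come from the slides or from any scanner generator.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token classes, listed in priority order (earlier wins ties). */
enum token_class { TOK_IF, TOK_ID, TOK_NUM, TOK_EQ, TOK_ASSIGN, TOK_ERROR };

/* Each matcher returns the length of the prefix of s it accepts (0 = no match). */
static size_t match_literal(const char *s, const char *lit) {
    size_t n = strlen(lit);
    return strncmp(s, lit, n) == 0 ? n : 0;
}
static size_t match_if(const char *s) { return match_literal(s, "if"); }
static size_t match_eq(const char *s) { return match_literal(s, "=="); }
static size_t match_assign(const char *s) { return match_literal(s, "="); }
static size_t match_id(const char *s) {
    size_t n = 0;
    if (isalpha((unsigned char)s[0])) {
        n = 1;
        while (isalnum((unsigned char)s[n])) n++;
    }
    return n;
}
static size_t match_num(const char *s) {
    size_t n = 0;
    while (isdigit((unsigned char)s[n])) n++;
    return n;
}

typedef size_t (*matcher)(const char *);

/* Indexed by token_class, so the array order is the priority order. */
static const matcher matchers[] = {
    match_if, match_id, match_num, match_eq, match_assign
};

/* Returns the class of the next token at s and stores its length in *len. */
enum token_class next_token(const char *s, size_t *len) {
    enum token_class best = TOK_ERROR;
    assert(s != NULL && len != NULL);
    *len = 0;
    for (int c = TOK_IF; c < TOK_ERROR; c++) {
        size_t n = matchers[c](s);
        /* Strictly longer wins; on equal length the earlier class is kept. */
        if (n > *len) {
            *len = n;
            best = (enum token_class)c;
        }
    }
    return best;
}
```

With these matchers, `if` is classified as the keyword because it ties with the identifier rule and is listed first, while `iffy` is an identifier because the identifier match is longer.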

Jlex
- Input: regular expressions and actions (Java code)
- Output: a scanner program that reads the input and applies actions when an input regular expression is matched
[Figure: regular expressions → Jlex → scanner; input program → scanner → tokens]

Summary
- For most programming languages, lexical analyzers can be easily constructed automatically
- Exceptions:
  - Fortran
  - PL/1
- Lex/Flex/Jlex are useful beyond compilers

Syntax Analysis (Parsing)
- Input: sequence of tokens
- Output: abstract syntax tree
- Report syntax errors
  - Unbalanced parentheses
- [Create "symbol table"]
- [Create pretty-printed version of the program]
- In some cases the tree need not be generated (one-pass compilers)

Pushdown Automaton
[Figure: a control unit driven by a parser table, reading the input (u t w $) and operating on a stack (V ... $)]

Efficient Parsers
- Pushdown automata
- Deterministic
- Report an error as soon as the input is not a prefix of a valid program
- Not usable for all context free grammars
[Figure: context free grammar → cup → parser, or "ambiguity errors"; tokens → parser → parse tree]

Kinds of Parsers
- Top-Down (Predictive Parsing) LL
  - Construct the parse tree in a top-down manner
  - Find the leftmost derivation
  - For every non-terminal and token, predict the next production
  - Preorder tree traversal
- Bottom-Up LR
  - Construct the parse tree in a bottom-up manner
  - Find the rightmost derivation in reverse order
  - For every potential right hand side and token, decide when a production is found
  - Postorder tree traversal

Top-Down Parsing
[Figure: a parse tree built from the root down over the input tokens t1, t2]

Bottom-Up Parsing
[Figure: subtrees numbered 1, 2, 3 built from the leaves up over the input tokens t1, t2, t4, t5, t6, t7, t8]

Example Grammar for Predictive LL Top-Down Parsing
    expression → digit | '(' expression operator expression ')'
    operator → '+' | '*'
    digit → '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'

    static int Parse_Expression(Expression **expr_p) {
        Expression *expr = *expr_p = new_expression();

        /* try to parse a digit */
        if (Token.class == DIGIT) {
            expr->type = 'D';
            expr->value = Token.repr - '0';
            get_next_token();
            return 1;
        }

        /* try to parse a parenthesized expression */
        if (Token.class == '(') {
            expr->type = 'P';
            get_next_token();
            if (!Parse_Expression(&expr->left)) Error("missing expression");
            if (!Parse_Operator(&expr->oper)) Error("missing operator");
            /* the grammar requires a second operand after the operator;
               a field expr->right alongside expr->left is assumed here */
            if (!Parse_Expression(&expr->right)) Error("missing expression");
            if (Token.class != ')') Error("missing )");
            get_next_token();
            return 1;
        }

        /* neither alternative applies */
        return 0;
    }

Parsing Expressions
- Try every alternative production
  - For P → A1 A2 … An | B1 B2 … Bm
  - If A1 succeeds
    - Call A2
    - If A2 succeeds, call A3
    - If A2 fails, report an error
  - Otherwise try B1
- Recursive descent parsing
- Can be applied for certain grammars
- Generalization: LL(1) parsing

    int P(...) {
        /* try to parse the alternative P → A1 A2 ... An */
        if (A1(...)) {
            if (!A2()) Error("Missing A2");
            if (!A3()) Error("Missing A3");
            ...
            if (!An()) Error("Missing An");
            return 1;
        }
        /* try to parse the alternative P → B1 B2 ... Bm */
        if (B1(...)) {
            if (!B2()) Error("Missing B2");
            if (!B3()) Error("Missing B3");
            ...
            if (!Bm()) Error("Missing Bm");
            return 1;
        }
        /* no alternative applies */
        return 0;
    }

Predictive Parser for Arithmetic Expressions
- Grammar
    1. E → E + T
    2. E → T
    3. T → T * F
    4. T → F
    5. F → id
    6. F → (E)
- C-code?
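One possible answer to the "C-code?" question is sketched below. The grammar as written is left-recursive, so a recursive-descent routine for E would call itself without consuming input. The sketch therefore assumes the usual rewriting into loops, E → T { + T } and T → F { * F }, which accepts the same strings. The single-character token encoding (`i` for id) and the names `parse_E`, `parse_T`, `parse_F`, and `accepts` are assumptions of this sketch, not part of the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Current position in the input; each character is one token, 'i' stands for id. */
static const char *cursor;

static int parse_E(void);

/* F -> id | ( E ) */
static int parse_F(void) {
    if (*cursor == 'i') {
        cursor++;
        return 1;
    }
    if (*cursor == '(') {
        cursor++;
        if (!parse_E()) return 0;
        if (*cursor != ')') return 0;
        cursor++;
        return 1;
    }
    return 0;
}

/* T -> F { * F }   (left recursion T -> T * F rewritten as a loop) */
static int parse_T(void) {
    if (!parse_F()) return 0;
    while (*cursor == '*') {
        cursor++;
        if (!parse_F()) return 0;
    }
    return 1;
}

/* E -> T { + T }   (left recursion E -> E + T rewritten as a loop) */
static int parse_E(void) {
    if (!parse_T()) return 0;
    while (*cursor == '+') {
        cursor++;
        if (!parse_T()) return 0;
    }
    return 1;
}

/* Returns 1 if the whole input is derivable from E, 0 otherwise. */
int accepts(const char *input) {
    assert(input != NULL);
    cursor = input;
    return parse_E() && *cursor == '\0';
}
```

For example, `accepts("(i+i)*i")` succeeds, while `accepts("i+")` fails because no T follows the `+`. The sketch only recognizes input; building a tree would need extra bookkeeping to keep `+` and `*` left-associative.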

Bottom-Up Syntax Analysis
- Input
  - A context free grammar
  - A stream of tokens
- Output
  - A syntax tree or error
- Method
  - Construct the parse tree in a bottom-up manner
  - Find the rightmost derivation in reverse order
  - For every potential right hand side and token, decide when a production is found
  - Report an error as soon as the input is not a prefix of a valid program

Constructing LR(0) parsing table
- Add a production S' → S$
- Construct a finite automaton accepting "valid stack symbols"
- States are sets of items A → α • β
  - The states of the automaton become the states of the parsing table
  - Determine shift operations
  - Determine goto operations
  - Determine reduce operations
  - Report an error when conflicts arise

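[Figure: LR(0) automaton for the grammar S → E$, E → T | E + T, T → i | (E). States are item sets; edges are labeled with the symbol read.]
- Start state: 1: S → • E$, 4: E → • T, 6: E → • E + T, 10: T → • i, 12: T → • (E)
- On T: 5: E → T •
- On i: 11: T → i •
- On E from the start state: 2: S → E • $, 7: E → E • + T
- On $: 2: S → E $ •
- On (: 13: T → ( • E), 4: E → • T, 6: E → • E + T, 10: T → • i, 12: T → • (E); a further ( loops back to this state
- On E after (: 14: T → (E • ), 7: E → E • + T
- On ): 15: T → (E) •
- On +: 7: E → E + • T, 10: T → • i, 12: T → • (E)
- On T after +: 8: E → E + T •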

Parsing "(i)$"
[Figure: the LR(0) automaton for the grammar S → E$, E → T | E + T, T → i | (E), used to trace the input "(i)$"]
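The trace of "(i)$" can be reproduced with a small shift-reduce driver. The following is a sketch under stated assumptions: the transitions are hand-encoded from the item sets of the grammar S → E$, E → T | E + T, T → i | (E), the state numbers 0 to 8 are chosen for this sketch and are not the item numbers from the figure, and the names `next_state` and `lr0_parse` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAX_STACK = 256 };

/* Shift and goto transitions of the automaton; -1 means no transition.
 * 0: start             1: S -> E.$ , E -> E.+T    2: E -> T.
 * 3: T -> i.           4: T -> (.E)               5: E -> E+.T
 * 6: T -> (E.) , E -> E.+T    7: E -> E+T.        8: T -> (E).   */
static int next_state(int state, char symbol) {
    switch (state) {
    case 0: case 4:
        switch (symbol) {
        case 'E': return state == 0 ? 1 : 6;
        case 'T': return 2;
        case 'i': return 3;
        case '(': return 4;
        }
        break;
    case 1:
        if (symbol == '+') return 5;
        break;
    case 5:
        switch (symbol) {
        case 'T': return 7;
        case 'i': return 3;
        case '(': return 4;
        }
        break;
    case 6:
        if (symbol == '+') return 5;
        if (symbol == ')') return 8;
        break;
    }
    return -1;
}

/* Returns 1 if input (tokens i + ( ) followed by $) is accepted, else 0. */
int lr0_parse(const char *input) {
    int stack[MAX_STACK];   /* stack of automaton states */
    int top = 0;
    size_t pos = 0;
    assert(input != NULL);
    stack[0] = 0;
    for (;;) {
        int state = stack[top];
        char lhs = 0;
        int rhs_len = 0;
        /* LR(0): a reduce state reduces without looking at the input. */
        switch (state) {
        case 2: lhs = 'E'; rhs_len = 1; break;  /* E -> T     */
        case 3: lhs = 'T'; rhs_len = 1; break;  /* T -> i     */
        case 7: lhs = 'E'; rhs_len = 3; break;  /* E -> E + T */
        case 8: lhs = 'T'; rhs_len = 3; break;  /* T -> ( E ) */
        }
        if (rhs_len > 0) {
            int target;
            top -= rhs_len;                     /* pop the right hand side */
            assert(top >= 0);
            target = next_state(stack[top], lhs);
            if (target < 0) return 0;
            stack[++top] = target;              /* goto on the left hand side */
            continue;
        }
        char tok = input[pos];
        if (state == 1 && tok == '$') return input[pos + 1] == '\0';  /* accept */
        if (tok == '\0' || strchr("i+()", tok) == NULL) return 0;     /* not a token */
        int target = next_state(state, tok);
        if (target < 0 || top + 1 >= MAX_STACK) return 0;             /* syntax error */
        stack[++top] = target;                  /* shift */
        pos++;
    }
}
```

On "(i)$" the driver shifts `(` and `i`, reduces T → i and E → T, shifts `)`, reduces T → (E) and E → T, and accepts on `$`. An input such as "(i$" is rejected as soon as `$` arrives in a state with no transition on it.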

Summary (Bottom-Up)
- LR is a powerful technique
- Generates efficient parsers
- Generation tools exist
  - LALR(1): Bison, yacc, CUP
- But some grammars need to be tuned
  - Shift/Reduce conflicts
  - Reduce/Reduce conflicts
  - Efficiency of the generated parser

Summary (Parsing)
- Context free grammars provide a natural way to define the syntax of programming languages
- Ambiguity may be resolved
- Predictive parsing is natural
  - Good error messages
  - Natural error recovery
  - But not expressive enough
- But LR bottom-up parsing is more expressive

Abstract Syntax
- Intermediate program representation
- Defines a tree
  - Preserves program hierarchy
- Generated by the parser
- Declared using an (ambiguous) context free grammar (relatively flat)
  - Not meant for parsing
- Keywords and punctuation symbols are not stored (not relevant once the tree exists)
- Big programs can also be handled (possibly via virtual memory)

Semantic Analysis
- Requirements related to the "context" in which a construct occurs
- Examples
  - Name resolution
  - Scoping
  - Type checking
  - Escape
- Implemented via AST traversals
- Guides subsequent compiler phases

Abstract Interpretation
- Static analysis
- Automatically identify program properties
  - No user-provided loop invariants
- Sound but incomplete methods
  - But can be rather precise
- Non-standard interpretation of the program operational semantics
- Applications
  - Compiler optimization
  - Code quality tools
    - Identify potential bugs
    - Prove the absence of runtime errors
    - Partial correctness

Constant Propagation
    z = 3
    while (x > 0)
        if (x = 1) y = 7
        else y = z + 4
    assert y == 7
[Figure: control flow graph annotated with abstract states: [x ↦ ?, y ↦ ?, z ↦ ?], [x ↦ ?, y ↦ ?, z ↦ 3], [x ↦ 1, y ↦ ?, z ↦ 3], [x ↦ 1, y ↦ 7, z ↦ 3], [x ↦ ?, y ↦ 7, z ↦ 3], [x ↦ ?, y ↦ ?, z ↦ 3]]
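Why the assertion y == 7 can be established is easier to see with the merge operation written out. The sketch below assumes a common three-level formulation of the constant propagation domain (not yet reached, a known constant, unknown). The slide only shows "?" for unknown, so the `CP_UNREACHED` element and all names here are assumptions of this sketch.

```c
#include <assert.h>

/* Abstract value of one variable. CP_UNKNOWN plays the role of "?" on the slide. */
enum cp_kind { CP_UNREACHED, CP_CONST, CP_UNKNOWN };

struct cp_value {
    enum cp_kind kind;
    int value;              /* meaningful only when kind == CP_CONST */
};

static struct cp_value cp_const(int v) {
    struct cp_value r = { CP_CONST, v };
    return r;
}

static struct cp_value cp_unknown(void) {
    struct cp_value r = { CP_UNKNOWN, 0 };
    return r;
}

/* Merge of the values arriving along two control flow edges. */
struct cp_value cp_join(struct cp_value a, struct cp_value b) {
    if (a.kind == CP_UNREACHED) return b;
    if (b.kind == CP_UNREACHED) return a;
    if (a.kind == CP_CONST && b.kind == CP_CONST && a.value == b.value) return a;
    return cp_unknown();    /* different constants, or one side already unknown */
}

/* Abstract version of "+": constant only when both operands are constant. */
struct cp_value cp_add(struct cp_value a, struct cp_value b) {
    if (a.kind == CP_CONST && b.kind == CP_CONST) return cp_const(a.value + b.value);
    if (a.kind == CP_UNREACHED) return a;
    if (b.kind == CP_UNREACHED) return b;
    return cp_unknown();
}

/* The slide's example: y = 7 on one branch, y = z + 4 with z = 3 on the other. */
void cp_check_slide_example(void) {
    struct cp_value z = cp_const(3);
    struct cp_value y = cp_join(cp_const(7), cp_add(z, cp_const(4)));
    assert(y.kind == CP_CONST && y.value == 7);
}
```

Both branches assign the constant 7 to y, so the merged state keeps y ↦ 7. By contrast, x is 1 on one branch and unknown on the other, so the merge gives x ↦ ?.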

    /* c */
    L0: a := 0
    /* ac */
    L1: b := a + 1
    /* bc */
        c := c + b
    /* bc */
        a := b * 2
    /* ac */
        if c < N goto L1
    /* c */
        return c
[Figure: control flow graph with one node per statement: a := 0; b := a + 1; c := c + b; a := b * 2; c < N (with an edge back to L1); return c]

[Figure sequence: iterative computation on the control flow graph of the program above (a := 0; b := a + 1; c := c + b; a := b * 2; c < N goto L1; return c). Every node starts with the empty set ∅, and the sets grow step by step over the iterations: first {c}, then {c, b}, then {c, a}, until the sets {c, a}, {c, b}, {c, a}, {c, b} are reached.]

Summary Iterative Procedure
- Analyze one procedure at a time
  - More precise solutions exist
- Construct a control flow graph for the procedure
- Initialize the values at every node to the most optimistic value
- Iterate until convergence
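The iterative procedure can be sketched for the six-statement example program, computing the variables that are live before each statement. This is a minimal sketch: the bitmask encoding of {a, b, c}, the `cfg` table, and the name `compute_liveness` are assumptions made here, and N is treated as a constant rather than a variable.

```c
#include <assert.h>

/* Variables encoded as bits of a set. */
enum { VAR_A = 1, VAR_B = 2, VAR_C = 4 };
enum { N_NODES = 6, MAX_SUCC = 2 };

struct node {
    unsigned use;           /* variables read by the statement */
    unsigned def;           /* variables written by the statement */
    int succ[MAX_SUCC];     /* successor nodes, -1 = none */
};

/* Control flow graph of the example program. */
static const struct node cfg[N_NODES] = {
    { 0,             VAR_A, {  1, -1 } },   /* 0: a := 0           */
    { VAR_A,         VAR_B, {  2, -1 } },   /* 1: b := a + 1       */
    { VAR_B | VAR_C, VAR_C, {  3, -1 } },   /* 2: c := c + b       */
    { VAR_B,         VAR_A, {  4, -1 } },   /* 3: a := b * 2       */
    { VAR_C,         0,     {  1,  5 } },   /* 4: if c < N goto L1 */
    { VAR_C,         0,     { -1, -1 } },   /* 5: return c         */
};

/* Fills live_in[n] with the set of variables live before node n. */
void compute_liveness(unsigned live_in[N_NODES]) {
    int changed = 1;
    /* Most optimistic initial value: nothing is live anywhere. */
    for (int n = 0; n < N_NODES; n++) live_in[n] = 0;
    while (changed) {                       /* iterate until convergence */
        changed = 0;
        for (int n = N_NODES - 1; n >= 0; n--) {
            unsigned out = 0;
            for (int s = 0; s < MAX_SUCC; s++) {
                int succ = cfg[n].succ[s];
                if (succ >= 0) {
                    assert(succ < N_NODES);
                    out |= live_in[succ];   /* live after n = union over successors */
                }
            }
            /* live before n = used here, or live after and not overwritten here */
            unsigned in = cfg[n].use | (out & ~cfg[n].def);
            if (in != live_in[n]) {
                live_in[n] = in;
                changed = 1;
            }
        }
    }
}
```

The result agrees with the comments in the example program: {c} before L0, {a, c} before L1, {b, c} before the next two statements, {a, c} before the branch, and {c} before the return. Visiting the nodes in reverse order is a design choice that makes this backward analysis converge in few passes; any order reaches the same fixed point.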

Basic Compiler Phases

Overall Structure

Techniques Studied
- Simple code generation
- Basic blocks
- Global register allocation
- Activation records
- Object Oriented
- Assembler/Linker/Loader

Heap Memory Management
- Part of the runtime system
- Utilities for dynamic memory allocation
- Utilities for automatic memory reclamation
  - Garbage Collection

Garbage Collection
- Techniques
  - Mark and sweep
  - Copying collection
  - Reference counting
- Modes
  - Generational
  - Incremental vs. stop the world
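As an illustration of the first technique, here is a minimal stop-the-world mark-and-sweep sketch over a toy heap. Everything in it is an assumption made for the example: the fixed pool of eight objects, objects with at most two references stored as pool indices, and the names `allocate`, `mark`, and `collect`. A real collector works on raw memory and finds roots in registers and activation records.

```c
#include <assert.h>

enum { HEAP_SIZE = 8, NO_REF = -1 };

struct object {
    int in_use;     /* 1 while the slot holds an allocated object */
    int marked;     /* set during the mark phase */
    int child[2];   /* references to other objects, or NO_REF */
};

static struct object heap[HEAP_SIZE];

/* Returns the index of a fresh object, or NO_REF if the pool is full. */
int allocate(int left, int right) {
    for (int i = 0; i < HEAP_SIZE; i++) {
        if (!heap[i].in_use) {
            heap[i].in_use = 1;
            heap[i].marked = 0;
            heap[i].child[0] = left;
            heap[i].child[1] = right;
            return i;
        }
    }
    return NO_REF;
}

/* Mark phase: flag everything reachable from ref. */
static void mark(int ref) {
    if (ref == NO_REF) return;
    assert(ref >= 0 && ref < HEAP_SIZE);
    if (!heap[ref].in_use || heap[ref].marked) return;  /* already visited */
    heap[ref].marked = 1;
    mark(heap[ref].child[0]);
    mark(heap[ref].child[1]);
}

/* Marks from the roots, then sweeps; returns the number of objects freed. */
int collect(const int *roots, int n_roots) {
    int freed = 0;
    for (int i = 0; i < n_roots; i++) mark(roots[i]);
    for (int i = 0; i < HEAP_SIZE; i++) {
        if (heap[i].in_use && !heap[i].marked) {
            heap[i].in_use = 0;             /* unreachable: reclaim the slot */
            freed++;
        }
        heap[i].marked = 0;                 /* reset for the next collection */
    }
    return freed;
}
```

An object that refers only to itself is reclaimed once no root reaches it, which is the case that plain reference counting cannot handle.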