Elaboration or: Semantic Analysis Compiler Baojian Hua


Front End
  source code -> lexical analyzer -> tokens -> parser -> abstract syntax tree -> semantic analyzer -> IR

Elaboration
  Also known as type-checking, semantic analysis, or context-sensitive analysis
  Checks the well-formedness of programs:
    every variable is declared before use
    every expression has a proper type
    function calls conform to definitions
    all other context-sensitive information (highly language-dependent)
    …
  Translates the AST into intermediate or machine code

Elaboration Example
    void f (int *p)
    {
      x += 4;
      p (23);
      "hello" + "world";
    }
    int main ()
    {
      f () + 5;
    }
  What errors can be detected here?

Terminology
  Scope
  Lifetime
  Storage class
  Name space

Terminology: Scope
    int x;
    int f ()
    {
      if (4) {
        int x;
        x = 6;
      }
      else {
        int x;
        x = 5;
      }
      x = 8;
    }

Terminology: Lifetime
    static int x;
    int f ()
    {
      int x, *p;
      x = 6;
      p = malloc (sizeof (*p));
      if (3) {
        static int x;
        x = 5;
      }
    }

Terminology: Storage class
    extern int x;
    int f ()
    {
      extern int x;
      x = 6;
      if (3) {
        extern int x;
        x = 5;
      }
    }

Terminology: Name space
    struct list {
      int x;
      struct list *list;
    } *list;
    void walk (struct list *list)
    {
    list:
      printf ("%d\n", list->x);
      if (list = list->list)
        goto list;
    }

Moral
  For the purpose of elaboration, we must take care of all of these TOGETHER:
    Scope
    Lifetime
    Storage class
    Name space
    …
  All these details are handled by symbol tables!

Symbol Tables
  To keep track of types and other information, we maintain a finite map from program symbols to that information
    symbols: variables, function names, etc.
  Such a mapping is called a symbol table, or sometimes an environment
  Notation: {x1: t1, x2: t2, …, xn: tn}
    where each xi: ti (1 ≤ i ≤ n) is called a binding

Scope
  How to handle lexical scope?
  It's easy: we insert bindings as we enter a local scope during elaboration, and remove them as we leave it

Scope
    int x;                 σ = {x:int}
    int f ()               σ1 = σ + {f:…} = {x:int, f:…}
    {
      if (4) {
        int x;             σ2 = σ1 + {x:int} = {x:…, f:…, x:…}
        x = 6;
      }                    σ1
      else {
        int x;             σ4 = σ1 + {x:int} = {x:…, f:…, x:…}
        x = 5;
      }                    σ1
      x = 8;
    }                      σ1
  Shadowing: "+" is not commutative!

Implementation
  Must be efficient!
    lots of variables, functions, etc.
  Two basic approaches:
    Functional
      the symbol table is implemented as a functional data structure (e.g., a red-black tree), with no tables ever destroyed or modified
    Imperative
      a single table, modified for every binding added or removed
  This choice is largely independent of the implementation language

Functional Symbol Table
  Basic idea:
    when implementing σ2 = σ1 + {x:t}, create a new table σ2 instead of modifying σ1
    when deleting, restore the old table
  A good data structure for this is a BST or a red-black tree
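
The idea on this slide can be sketched in C as a persistent (unbalanced) binary search tree. This is only an illustration under stated assumptions: the names `Node`, `st_insert` and `st_lookup` are hypothetical, type information is kept as a plain string, and nodes are never freed (a real implementation would balance the tree and rely on garbage collection or reference counting).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A node of an immutable binary search tree keyed by identifier name. */
typedef struct Node {
    const char *key;            /* identifier */
    const char *type;           /* type info, kept as a string for brevity */
    const struct Node *left;
    const struct Node *right;
} Node;

static const Node *mk_node(const char *key, const char *type,
                           const Node *left, const Node *right) {
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->key = key;
    n->type = type;
    n->left = left;
    n->right = right;
    return n;
}

/* Returns a NEW table that also contains key:type. Only the nodes on the
   search path are copied; every other subtree is shared with the old table,
   which stays valid and unchanged. */
const Node *st_insert(const Node *t, const char *key, const char *type) {
    if (t == NULL)
        return mk_node(key, type, NULL, NULL);
    int c = strcmp(key, t->key);
    if (c < 0)
        return mk_node(t->key, t->type, st_insert(t->left, key, type), t->right);
    if (c > 0)
        return mk_node(t->key, t->type, t->left, st_insert(t->right, key, type));
    return mk_node(key, type, t->left, t->right);  /* shadows the old binding */
}

/* Returns the type bound to key, or NULL if key is unbound. */
const char *st_lookup(const Node *t, const char *key) {
    while (t != NULL) {
        int c = strcmp(key, t->key);
        if (c == 0)
            return t->type;
        t = (c < 0) ? t->left : t->right;
    }
    return NULL;
}
```

Leaving a scope then costs nothing: the elaborator simply goes back to using the pointer to the old table.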

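BST Symbol Table
  [Figure: a binary search tree holding the bindings c: int, a: char, b: double and e: int, together with a second c: int node belonging to the updated table]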

Possible Functional Interface
    signature SYMBOL_TABLE =
    sig
      type 'a t
      type key
      val empty: 'a t
      val insert: 'a t * key * 'a -> 'a t
      val lookup: 'a t * key -> 'a option
    end

Imperative Symbol Tables
  The imperative approach almost always involves the use of hash tables
  Entries must be deleted to revert to the previous environment
    this is made simpler because deletes follow a stack discipline
    we can maintain a stack of entered symbols, so that they can later be popped and removed from the hash table
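
A minimal C sketch of this approach, assuming a single global table with fixed sizes. The names `tab_insert`, `tab_lookup`, `tab_begin_scope` and `tab_end_scope` are hypothetical, as are the limits `NBUCKETS` and `MAXSYMS`. A NULL entry on the stack marks where a scope begins.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64     /* assumed sizes for this sketch */
#define MAXSYMS  256

typedef struct Binding {
    const char *key;
    const char *type;
    struct Binding *next;   /* older binding in the same bucket */
} Binding;

static Binding *buckets[NBUCKETS];
static const char *entered[MAXSYMS];  /* stack of keys; NULL marks a scope start */
static int top = 0;

static unsigned hash(const char *s) {
    unsigned h = 0;
    while (*s)
        h = h * 31u + (unsigned char)*s++;
    return h % NBUCKETS;
}

void tab_insert(const char *key, const char *type) {
    Binding *b = malloc(sizeof *b);
    assert(b != NULL && top < MAXSYMS);
    unsigned h = hash(key);
    b->key = key;
    b->type = type;
    b->next = buckets[h];   /* pushed in front, so it shadows older bindings */
    buckets[h] = b;
    entered[top++] = key;
}

/* Returns the innermost type bound to key, or NULL if key is unbound. */
const char *tab_lookup(const char *key) {
    for (Binding *b = buckets[hash(key)]; b != NULL; b = b->next)
        if (strcmp(b->key, key) == 0)
            return b->type;
    return NULL;
}

void tab_begin_scope(void) {
    assert(top < MAXSYMS);
    entered[top++] = NULL;
}

/* Removes every binding entered since the matching tab_begin_scope. */
void tab_end_scope(void) {
    while (top > 0 && entered[top - 1] != NULL) {
        const char *key = entered[--top];
        Binding **p = &buckets[hash(key)];
        while (strcmp((*p)->key, key) != 0)   /* first match is the newest */
            p = &(*p)->next;
        Binding *dead = *p;
        *p = dead->next;
        free(dead);
    }
    if (top > 0)
        top--;              /* drop the scope marker itself */
}
```

Because new bindings are pushed at the front of a bucket and removed in stack order, the first match in a chain is always the binding that has to go.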

Possible Imperative Interface
    signature SYMBOL_TABLE =
    sig
      type 'a t
      type key
      val insert: 'a t * key * 'a -> unit
      val lookup: 'a t * key -> 'a option
      val delete: 'a t * key -> unit
      val beginScope: unit -> unit
      val endScope: unit -> unit
    end

Name Space
  It's trivial to handle name spaces: one symbol table for each name space
  Take C as an example, which has several different name spaces:
    labels
    tags
    variables
  So …

Implementation of Symbols
  For several reasons, it will be useful at some point to represent symbols as elements of a small, densely packed set of identities
    fast comparisons (equality)
    for dataflow analysis, we will want sets of variables and fast set operations
  It will be critically important to use bit strings to represent the sets
    for example, in your liveness analysis algorithm
  More on this later
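
One way to picture this in C: intern each name as a small integer, then represent a set of symbols as the bits of a machine word. This is a sketch under assumptions, not the course's implementation: `symbol_of`, `SymSet` and the `set_*` helpers are hypothetical names, the table is limited to 64 symbols so that a set fits in one `uint64_t`, and interning uses a linear search where a real compiler would hash.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_SYMBOLS 64          /* assumed limit so a set fits in one word */

static const char *names[MAX_SYMBOLS];
static int nsymbols = 0;

/* Maps a name to a small integer id; equal names always get the same id,
   so symbols can afterwards be compared with a single integer comparison. */
int symbol_of(const char *name) {
    for (int i = 0; i < nsymbols; i++)
        if (strcmp(names[i], name) == 0)
            return i;
    assert(nsymbols < MAX_SYMBOLS);
    names[nsymbols] = name;
    return nsymbols++;
}

typedef uint64_t SymSet;        /* bit i is set iff symbol i is a member */

SymSet set_add(SymSet s, int sym)    { return s | (UINT64_C(1) << sym); }
int    set_has(SymSet s, int sym)    { return (int)((s >> sym) & 1u); }
SymSet set_union(SymSet a, SymSet b) { return a | b; }
SymSet set_diff(SymSet a, SymSet b)  { return a & ~b; }
```

Union and difference, the operations a dataflow analysis performs over and over, each become a single bitwise instruction.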

Types
  The representation of types is highly language-dependent
  Some key considerations:
    name vs. structural equivalence
    mutually recursive type definitions
    dealing with errors

Name vs. Structural Equivalence
    struct A { int i; } x;
    struct B { int i; } y;
    x = y;
  In a language with structural equivalence, this program is legal
  But not in a language with name equivalence (e.g., C)
  For name equivalence, we can generate a unique symbol for each defined type
  For structural equivalence, we need to compare the types recursively
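
The two notions of equivalence can be compared side by side in a small C sketch. The representation is an assumption made for illustration: `Type`, `Field`, `name_equal` and `struct_equal` are hypothetical names, each struct definition carries a unique integer `id`, and the structural check also compares field names (whether it should is language-dependent). The recursive check assumes non-recursive types; cyclic types would need a visited set.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { TY_INT, TY_CHAR, TY_STRUCT } Kind;

typedef struct Field Field;

typedef struct Type {
    Kind kind;
    int id;               /* unique per struct definition; 0 for base types */
    const Field *fields;  /* struct fields; NULL for base types */
} Type;

struct Field {
    const char *name;
    const Type *type;
    const Field *next;
};

/* Name equivalence: two struct types are equal only if they come from
   the same definition, i.e. carry the same unique id. */
int name_equal(const Type *a, const Type *b) {
    if (a->kind != b->kind)
        return 0;
    return a->kind != TY_STRUCT || a->id == b->id;
}

/* Structural equivalence: compare the two types field by field,
   recursing into the field types. */
int struct_equal(const Type *a, const Type *b) {
    if (a->kind != b->kind)
        return 0;
    if (a->kind != TY_STRUCT)
        return 1;
    const Field *f = a->fields, *g = b->fields;
    for (; f != NULL && g != NULL; f = f->next, g = g->next)
        if (strcmp(f->name, g->name) != 0 || !struct_equal(f->type, g->type))
            return 0;
    return f == NULL && g == NULL;   /* same number of fields */
}
```

For `struct A` and `struct B` from the slide, `struct_equal` accepts the assignment `x = y` while `name_equal` rejects it.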

Mutually recursive type definitions
  To process recursive and mutually recursive type definitions, we need a placeholder:
    in ML, an option ref
    in C, a pointer
    in Java, a bind method (read Appel)
    struct A {
      int data;
      struct A *next;
      struct B *b;
    };
    struct B {…};

Error Diagnostic
  To recover from errors, it is useful to have an "any" type
    this makes it possible to continue type-checking after an error
  In practice, use "int" or guess one
  Similarly, a "void" type can be used for expressions that return no value
  Source locations are annotated in the AST!

Organization of the Elaborator
  Module structure:
    elabProg: Ast.Program.t -> unit
    elabStm: Ast.Stm.t * tenv * venv -> unit
    elabDec: Ast.Dec.t * venv * tenv -> tenv * venv
    elabTy: Ast.Type.t * tenv -> ty
    elabExp: Ast.Exp.t * venv -> ty
    elabLVal: Ast.Lval.t * venv -> ty
  It will be extended to also do translation. For now, let's concentrate on type-checking

Elaborate Expressions
  Checks that expressions are correctly typed
  Valid expressions are defined in the C specification
  e: t means that e is a valid expression of type t
  venv is a symbol table (environment)

Elaborate Expressions
    fun elabExp (e, venv) =
      case e of
        BinaryExp (PLUS, e1, e2) =>
          let val t1 = elabExp (e1, venv)
              val t2 = elabExp (e2, venv)
          in  case (t1, t2) of
                (Int, Int) => Int
              | (Int, _) => error ("e2 should be int")
              | (_, Int) => error ("e1 should be int")
              | _ => error ("should both be int")
          end
  Typing rule:
    venv ⊢ e1: int    venv ⊢ e2: int
    --------------------------------
          venv ⊢ e1 + e2: int
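
The same check can be sketched in C for a tiny expression language. Everything here is assumed for illustration: the AST type `Exp`, the array-based `venv`, the helper `type_error`, and the recovery type `T_ANY` (standing in for the "any" type from the Error Diagnostic slide) are hypothetical and much simpler than what a real elaborator would use.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum { T_INT, T_STRING, T_ANY } Ty;  /* T_ANY: result after an error */

typedef enum { E_NUM, E_STR, E_VAR, E_PLUS } ExpKind;

typedef struct Exp {
    ExpKind kind;
    const char *name;            /* E_VAR only */
    const struct Exp *e1, *e2;   /* E_PLUS only */
} Exp;

/* A deliberately tiny venv: an array of bindings ended by a NULL name. */
typedef struct { const char *name; Ty ty; } VBinding;

static int error_count = 0;

static Ty type_error(const char *msg) {
    fprintf(stderr, "type error: %s\n", msg);
    error_count++;
    return T_ANY;                /* lets checking continue after the error */
}

Ty elab_exp(const Exp *e, const VBinding *venv) {
    switch (e->kind) {
    case E_NUM:
        return T_INT;
    case E_STR:
        return T_STRING;
    case E_VAR:
        for (; venv->name != NULL; venv++)
            if (strcmp(venv->name, e->name) == 0)
                return venv->ty;
        return type_error("undeclared variable");
    case E_PLUS: {
        Ty t1 = elab_exp(e->e1, venv);
        Ty t2 = elab_exp(e->e2, venv);
        if (t1 == T_ANY || t2 == T_ANY)
            return T_ANY;        /* already reported; avoid cascading errors */
        if (t1 != T_INT)
            return type_error("e1 should be int");
        if (t2 != T_INT)
            return type_error("e2 should be int");
        return T_INT;            /* the typing rule for e1 + e2 */
    }
    }
    return type_error("unknown expression");
}
```

Returning `T_ANY` from a failed subexpression means one mistake produces one message instead of a chain of follow-on errors.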

Elaborate Types
  Elaborating types is straightforward, except for recursive types
  Need to do "knot-tying":
    extend tenv with bindings for all of the new type names
    bind the new names to "dummy" bodies
    process each definition, replacing the dummy bodies with real definitions
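
The knot-tying steps might look as follows in C, where the placeholder is simply a pointer to a table entry whose body is filled in later. This is a sketch under assumptions: `TypeDef`, `declare_type`, `define_type` and `tenv_lookup` are hypothetical names, the type environment is a fixed-size array, and a type body is reduced to the list of its field types.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TYPES  16   /* assumed limits for this sketch */
#define MAX_FIELDS 4

typedef struct TypeDef {
    const char *name;
    int defined;                             /* 0 while the body is a dummy */
    const struct TypeDef *field_ty[MAX_FIELDS];  /* pointers into tenv */
    int nfields;
} TypeDef;

static TypeDef tenv[MAX_TYPES];
static int ntypes = 0;

TypeDef *tenv_lookup(const char *name) {
    for (int i = 0; i < ntypes; i++)
        if (strcmp(tenv[i].name, name) == 0)
            return &tenv[i];
    return NULL;
}

/* Pass 1: bind the new name to a dummy body. */
TypeDef *declare_type(const char *name) {
    assert(ntypes < MAX_TYPES);
    TypeDef *t = &tenv[ntypes++];
    t->name = name;
    t->defined = 0;
    t->nfields = 0;
    return t;
}

/* Pass 2: replace the dummy body with the real one. Field types are looked
   up by name, so they may refer to this type or to types defined later.
   Returns 0 if the type or one of its field types was never declared. */
int define_type(const char *name, const char *const field_types[], int n) {
    TypeDef *t = tenv_lookup(name);
    if (t == NULL || n > MAX_FIELDS)
        return 0;
    for (int i = 0; i < n; i++) {
        const TypeDef *ft = tenv_lookup(field_types[i]);
        if (ft == NULL)
            return 0;
        t->field_ty[i] = ft;
    }
    t->nfields = n;
    t->defined = 1;
    return 1;
}
```

Declaring both `A` and `B` before defining either lets `A` point to `B` even though the body of `B` is not yet known.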

Elaborate Declarations
  elabDec extends the symbol tables with a new binding:
    int a;
  adds {a: int} to the environment
  Remember that environments have to take the scope of variables into account!

Elaborate Statements, Lvals, Programs
  All follow the same structure as expressions or types
  elabProg calls the other functions in order to type-check each component of the program (declarations, statements, expressions, …)

Labs
  For lab #4, your job is to implement an elaborator for C--
    you may go in two steps: first type-checking, and then generating target code
  At every step, check the output carefully to make sure your compiler works correctly

Summary
  Elaboration checks the well-formedness of programs
    it must take care of the semantics of source programs
    and may translate them into lower-level forms
  Usually the largest (most complex) part of a compiler!