Semantic Analysis. - 1 - Semantic Analysis v Lexically and syntactically correct programs may still contain other errors v Lexical and syntax analyses.

Slides:



Advertisements
Similar presentations
CPSC 388 – Compiler Design and Construction
Advertisements

Semantic Analysis and Symbol Tables
Semantics Static semantics Dynamic semantics attribute grammars
Intermediate Code Generation
- Vasvi Kakkad.  Formal -  Tool for mathematical analysis of language  Method for precisely designing language  Well formed model for describing and.
Semantic Analysis Chapter 6. Two Flavors  Static (done during compile time) –C –Ada  Dynamic (done during run time) –LISP –Smalltalk  Optimization.
Programming Languages and Paradigms
Programming Languages and Paradigms The C Programming Language.
1 Mooly Sagiv and Greta Yorsh School of Computer Science Tel-Aviv University Modern Compiler Design.
Winter Compiler Construction T7 – semantic analysis part II type-checking Mooly Sagiv and Roman Manevich School of Computer Science Tel-Aviv.
Overview of Previous Lesson(s) Over View  Front end analyzes a source program and creates an intermediate representation from which the back end generates.
INF 212 ANALYSIS OF PROG. LANGS Type Systems Instructors: Crista Lopes Copyright © Instructors.
Type Checking.
Compiler Construction
ALGOL 60 Design by committee of computer scientists: Naur, Backus, Bauer, McCarthy, van Wijngaarden, Landin, etc. Design by committee of computer scientists:
Functional programming: LISP Originally developed for symbolic computing First interactive, interpreted language Dynamic typing: values have types, variables.
1 Type Type system for a programming language = –set of types AND – rules that specify how a typed program is allowed to behave Why? –to generate better.
Compiler Construction Semantic Analysis I Rina Zviel-Girshin and Ohad Shacham School of Computer Science Tel-Aviv University.
Programming Languages 2nd edition Tucker and Noonan Chapter 6 Types I was eventually persuaded of the need to design programming notations so as to maximize.
Cs164 Prof. Bodik, Fall Symbol Tables and Static Checks Lecture 14.
2.2 A Simple Syntax-Directed Translator Syntax-Directed Translation 2.4 Parsing 2.5 A Translator for Simple Expressions 2.6 Lexical Analysis.
CSc 453 Semantic Analysis Saumya Debray The University of Arizona Tucson.
1 Abstract Syntax Tree--motivation The parse tree –contains too much detail e.g. unnecessary terminals such as parentheses –depends heavily on the structure.
Semantic Analysis CS 671 February 5, CS 671 – Spring The Compiler So Far Lexical analysis Detects inputs with illegal tokens –e.g.: main$
Chapter 1 Introduction Dr. Frank Lee. 1.1 Why Study Compiler? To write more efficient code in a high-level language To provide solid foundation in parsing.
CSC3315 (Spring 2009)1 CSC 3315 Programming Languages Hamid Harroud School of Science and Engineering, Akhawayn University
5-1 Chapter 5: Names, Bindings, Type Checking, and Scopes Variables The Concept of Binding Type Checking Strong Typing Type Compatibility Scope and Lifetime.
Semantics CSE 340 – Principles of Programming Languages Fall 2015 Adam Doupé Arizona State University
March 12, ICE 1341 – Programming Languages (Lecture #6) In-Young Ko Programming Languages (ICE 1341) Lecture #6 Programming Languages (ICE 1341)
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 8: Semantic Analysis and Symbol Tables.
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
Interpretation Environments and Evaluation. CS 354 Spring Translation Stages Lexical analysis (scanning) Parsing –Recognizing –Building parse tree.
Formal Semantics Chapter Twenty-ThreeModern Programming Languages, 2nd ed.1.
Unit-1 Introduction Prepared by: Prof. Harish I Rathod
1 Week 7 Questions / Concerns What’s due: Lab3 next Monday 5/19 Coming up: Lab2 & Lab3 check-off next week Lab3: LL(1) Bottom-Up Parsers LR(1) parsers.
Chapter 2. Design of a Simple Compiler J. H. Wang Sep. 21, 2015.
Chapter 3 Part II Describing Syntax and Semantics.
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
CS536 Semantic Analysis Introduction with Emphasis on Name Analysis 1.
Semantics CSE 340 – Principles of Programming Languages Fall 2015 Adam Doupé Arizona State University
12/9/20151 Programming Languages and Compilers (CS 421) Elsa L Gunter 2112 SC, UIUC Based in part on slides by Mattox.
Structure of Programming Languages Names, Bindings, Type Checking, and Scopes.
CS536 Types 1. Roadmap Back from our LR Parsing Detour Name analysis – Static v dynamic – Scope Today – Type checking 2 Scanner Parser Tokens Semantic.
 In the java programming language, a keyword is one of 50 reserved words which have a predefined meaning in the language; because of this,
Types and Programming Languages Lecture 11 Simon Gay Department of Computing Science University of Glasgow 2006/07.
Semantic Analysis II. Messages Please check lecturer notices in the Moodle Appeals  Legitimate: “I don’t have the bug you mentioned…”  Illegitimate:
1 A Simple Syntax-Directed Translator CS308 Compiler Theory.
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
Semantic Analysis II Type Checking EECS 483 – Lecture 12 University of Michigan Wednesday, October 18, 2006.
1 Structure of Compilers Lexical Analyzer (scanner) Modified Source Program Parser Tokens Semantic Analysis Syntactic Structure Optimizer Code Generator.
CS412/413 Introduction to Compilers Radu Rugina Lecture 13 : Static Semantics 18 Feb 02.
LECTURE 3 Compiler Phases. COMPILER PHASES Compilation of a program proceeds through a fixed series of phases.  Each phase uses an (intermediate) form.
CS412/413 Introduction to Compilers Radu Rugina Lecture 11: Symbol Tables 13 Feb 02.
Overview of Compilation Prepared by Manuel E. Bermúdez, Ph.D. Associate Professor University of Florida Programming Language Principles Lecture 2.
Dr. M. Al-Mulhem Introduction 1 Chapter 6 Type Systems.
LECTURE 10 Semantic Analysis. REVIEW So far, we’ve covered the following: Compilation methods: compilation vs. interpretation. The overall compilation.
Lecture 9 Symbol Table and Attributed Grammars
Chapter 3 – Describing Syntax
Programming Languages and Compilers (CS 421)
A Simple Syntax-Directed Translator
Constructing Precedence Table
Semantic Analysis with Emphasis on Name Analysis
Syntax Errors; Static Semantics
CSE 3302 Programming Languages
Lecture 15 (Notes by P. N. Hilfinger and R. Bodik)
CSE401 Introduction to Compiler Construction
Representation, Syntax, Paradigms, Types
Compiler Construction
Compiler Construction
COMPILER CONSTRUCTION
Presentation transcript:

Semantic Analysis

- 1 - Semantic Analysis v Lexically and syntactically correct programs may still contain other errors v Lexical and syntax analyses are not powerful enough to ensure the correct usage of variables, objects, functions,... v Semantic analysis: Ensure that the program satisfies a set of rules regarding the usage of programming constructs (variables, objects, expressions, statements)

- 2 - Class Problem Classify each error as lexical, syntax, semantic, or correct. int a; a = 1.0; int a; b b = a; { int a; a = 1; } { a = 2; } in a; a = 1; int foo(int a) { foo = 3; } 1int x; x = 2; int foo(int a) { a = 3; }

- 3 - Categories of Semantic Analysis v Examples of semantic rules »Variables must be defined before being used »A variable should not be defined multiple times »In an assignment stmt, the variable and the expression must have the same type »The test expr. of an if statement must have boolean type v 2 major categories »Semantic rules regarding scopes »Semantic rules regarding types

Semantic Analysis I Scope Analysis

- 5 - Scope Information v Characterizes the declaration of identifiers and the portions of the program where it is allowed to use each identifier »Example identifiers: variables, functions, objects, labels v Lexical scope: textual region in the program »Examples: Statement block, formal argument list, object body, function or method body, source file, whole program v Scope of an identifier: The lexical scope its declaration refers to

- 6 - Variable Scope v Scope of variables in statement blocks: v Scope of global variables: current file v Scope of external variables: whole program { int a;... {int b;... }.... } scope of variable a scope of variable b

- 7 - Function Parameter and Label Scope v Scope of formal arguments of functions: v Scope of labels: int foo(int n) {... } void foo() {... goto lab;... lab: i++;... goto lab;... } scope of label lab, Note in Ansi-C all labels have function scope regardless of where they are scope of argument n

- 8 - Scope in Class Declaration v Scope of object fields and methods: class A { public: void f() {x=1;}... private: int x;... } scope of variable x and method f

- 9 - Semantic Rules for Scopes v Main rules regarding scopes: »Rule 1: Use each identifier only within its scope »Rule 2: Do not declare identifier of the same kind with identical names more than once in the same lexical scope class X { int X; void X(int X) { X:... goto X; } int X(int X) { int X; goto X; { int X; X: X = 1; } Are these legal? If not, identify the illegal portion.

NAMEKINDTYPEATTRIBUTES foofuncint,int  intextern margint nargintconst tmpvarcharconst Symbol Tables v Semantic checks refer to properties of identifiers in the program – their scope or type v Need an environment to store the information about identifiers = symbol table v Each entry in the symbol table contains: »Name of an identifier »Additional info about identifier: kind, type, constant?

Scope Information v How to capture the scope information in the symbol table? v Idea: »There is a hierarchy of scopes in the program »Use similar hierarchy of symbol tables »One symbol table for each scope »Each symbol table contains the symbols declared in that lexical scope

Example int x; void f(int m) { float x, y;... {int i, j;....; } {int x; l:...; } } int g(int n) { char t;... ; } xvarint ffuncint  void gfuncint  int margint xvarfloat yvarfloat nargint tvarchar ivarint jvarint xvarint llabel Global symtab func f symtab func g symtab

Identifiers with Same Name v The hierarchical structure of symbol tables automatically solves the problem of resolving name collisions »E.g., identifiers with the same name and overlapping scopes v To find which is the declaration of an identifier that is active at a program point: »Start from the current scope »Go up the hierarchy until you find an identifier with the same name

Class Problem int x; void f(int m) { float x, y;... {int i, j; x=1; } {int x; l: x=2; } } int g(int n) { char t; x=3; } xvarint ffuncint  void gfuncint  int margint xvarfloat yvarfloat nargint tvarchar ivarint jvarint xvarint llabel Global symtab Associate each definition of x with its appropriate symbol table entry

Catching Semantic Errors int x; void f(int m) { float x, y;... {int i, j; x=1; } {int x; l: i=2; } } int g(int n) { char t; x=3; } xvarint ffuncint  void gfuncint  int margint xvarfloat yvarfloat nargint tvarchar ivarint jvarint xvarint llabel Global symtab i=2 Error! undefined variable

Symbol Table Operations v Two operations: »To build symbol tables, we need to insert new identifiers in the table »In the subsequent stages of the compiler we need to access the information from the table: use lookup function v Cannot build symbol tables during lexical analysis »Hierarchy of scopes encoded in syntax v Build the symbol tables: »While parsing, using the semantic actions »After the AST is constructed

Forward References v Use of an identifier within the scope of its declaration, but before it is declared v Any compiler phase that uses the information from the symbol table must be performed after the table is constructed v Cannot type-check and build symbol table at the same time v Example class A { int m() {return n(); } int n() {return 1; } }

Semantic Analysis II Type Checking

Type Information v What are types? »They describe the values computed during the execution of the program »Essentially they are a predicate on values  E.g., “int x” in C means –2^31 <= x < 2^31 v Type Information: Describes what kind of values correspond to different constructs: variables, statements, expressions, functions, etc. »variables:int a;integer »expressions:(a+1) == 2boolean »statements:a = 1.0;floating-point »functions:int pow(int n, int m) int = int,int

Type Checking v Type Errors: improper or inconsistent operations during program execution v Type-safety: absence of type errors v Type Checking: Set of rules which ensures the type consistency of different constructs in the program

How to Ensure Type-Safety v Bind (assign) types, then check types v Type binding: defines type of constructs in the program (e.g., variables, functions) »Can be either explicit (int x) or implicit (x=1) v Type checking: determine if the program correctly uses the type bindings »Consists of a set of type-checking rules

Type Checking v Semantic checks to enforce the type safety of the program v Examples »Unary and binary operators (e.g. +, ==, [ ]) must receive operands of the proper type »Functions must be invoked with the right number and type of arguments »Return statements must agree with the return type »In assignments, assigned value must be compatible with type of variable on LHS »Class members accessed appropriately

Concepts Related to Types/Languages 1. Static vs dynamic checking »When to check types 2. Static vs dynamic typing »When to define types 3. Strong vs weak typing »How many type errors 4. Sound type systems »Statically catch all type errors

Static vs Dynamic Checking v Static type checking »Perform at compile time v Dynamic type checking »Perform at run time (as the program executes) v Examples of dynamic checking »Array bounds checking »Null pointer dereferences

Static vs Dynamic Typing v Static and dynamic typing refer to type definitions (i.e., bindings of types to variables, expressions, etc.) v Static typed language »Types defined at compile-time and do not change during the execution of the program  C, C++, Java, Pascal v Dynamically typed language »Types defined at run-time, as program executes  Lisp, Smalltalk

Strong vs Weak Typing v Refer to how much type consistency is enforced v Strongly typed languages »Guarantee accepted programs are type-safe v Weakly typed languages »Allow programs which contain type errors v These concepts refer to run-time »Can achieve strong typing using either static or dynamic typing

- 27 -

Soundness v Sound type systems: can statically ensure that the program is type-safe v Soundness implies strong typing v Static type safety requires a conservative approximation of the values that may occur during all possible executions »May reject type-safe programs »Need to be expressive: reject as few type-safe programs as possible

Class Problem Strong TypingWeak Typing Static Typing Dynamic Typing Classify the following languages: C, C++, Pascal, Java, Standard ML, Modula-3, Smalltalk C, C++, Pascal, Java, C#, Modula-3 Standard ML, Smalltalk, Lisp Javascript, PHP, Perl 5, Objective-C

Why Static Checking? v Efficient code »Dynamic checks slow down the program v Guarantees that all executions will be safe »Dynamic checking gives safety guarantees only for some execution of the program v But is conservative for sound systems »Needs to be expressive: reject few type-safe programs

Type Systems v What are types? »They describe the values computed during the execution of the program »Essentially they are a predicate on values  E.g., “int x” in C means –2^31 <= x < 2^31 v Type expressions: Describe the possible types in the program »E.g., int, char*, array[], object, etc. v Type system: Defines types for language constructs »E.g., expressions, statements

Type Expressions v Language type systems have basic types (aka: primitive types or ground types) »E.g., int, char*, double v Build type expressions using basic types: »Type constructors  Array types  Structure/object types  Pointer types »Type aliases »Function types

Semantic Analysis III Static Semantics

Static Semantics v Can describe the types used in a program v How to describe type checking v Static semantics: Formal description for the programming language v Is to type checking: »As grammar is to syntax analysis »As regular expression is to lexical analysis v Static semantics defines types for legal ASTs in the language

Type Judgments or Relations v Static semantics = formal notation which describes type judgments: »E : T »means “E is a well-typed expression of type T” »E is typable if there is some type T such that E : T v Type judgment examples: »2 : int »true : bool »2 * (3 + 4) : int »“Hello” : string

Type Judgments for Statements v Statements may be expressions (i.e., represent values) v Use type judgments for statements: »if (b) 2 else 3 : int »x == 10 : bool »b = true, y = 2 : int (result of comma operator is the value of the rightmost expression) v For statements which are not expressions: use a special unit type (void or empty type) »S : unit »means “S is a well-typed statement with no result type”

Class Problem f1 [ 3 ] i = i1 [ i2] while (i < 10) do S1 (i ? 0) 4.0 : 1.0 Whats the type of the following statements? Assume i* are int variables, f* are float variables

Deriving a Judgment v Consider the judgment »if (b) 2 else 3 : int v What do we need to decide that this is a well-typed expression of type int? »b must be a bool (b : bool) »2 must be an int (2 : int) »3 must be an int (3 : int)

Type Judgements v Type judgment notation: A E : T »Means “In the context A, the expression E is a well-typed expression with type T” v Type context is a set of type bindings: id : T »(i.e. type context = symbol table) »b: bool, x: int b: bool »b: bool, x: int if (b) 2 else x : int » : int    

Deriving a Judgment v To show »b: bool, x: int if (b) 2 else x : int v Need to show »b: bool, x: int b : bool »b: bool, x: int 2 : int »b: bool, x: int x : int    

General Rule v For any environment A, expression E, statements S1 and S2, the judgement: »A if (E) S1 else S2 : T v Is true if: »A E : bool »A S1 : T »A S2 : T    

Inference Rules A E : bool A S1 : T A S2 : T  A if (E) S1 else S2 : T  if-rule premises conclusion Holds for any choice of E, S1, S2, T Read as, “if we have established the statements in the premises listed above the line, then we may derive the conclusion below the line”

Why Inference Rules? v Inference rules: compact, precise language for specifying static semantics v Inference rules correspond directly to recursive AST traversal that implements them v Type checking is the attempt to prove type judgments A E : T true by walking backward through the rules 

Meaning of Inference Rule v Inference rule says: »Given the premises are true (with some substitutions for A, E1, E2) »Then, the conclusion is true (with consistent substitution) A E1 : int A E2 : int   (+) A E1 + E2 : int  E1E2 E1E2 :int +

Proof Tree v Expression is well-typed if there exists a type derivation for a type judgment v Type derivation is a proof tree v Example: if A1 b : bool, x : int, then: A1 b : bool A1 !b : bool   A1 2 : int  A1 3 : int  A : int  A1 x : int  b : bool, x : int if (!b) else x : int  

More About Inference Rules v No premises = axiom v A goal judgment may be proved in more than one way v No need to search for rules to apply – they correspond to nodes in the AST A true : bool  A E1 : float  A E2 : float  A E1 + E2 : float  A E1 : float  A E2 : int  A E1 + E2 : float 

Class Problem Given the following syntax for arithmetic expressions: t →true| false| if t then t else t| 0| succ t| pred t| iszero t And the following typing rules for the language: true : bool false : bool t1: bool t2: T t3 : T if t1 then t2 else t3 : T t1 : int succ t1 : int t1 : int pred t1 : int t1 : int iszero t1 : bool Construct a type derivations to show (1) if iszero 0 then 0 else pred 0 : int (2) pred(succ(succ(pred(0)))) : int

Assignment Statements id : T  A A E : T  A id = E : T  (variable-assign) A E3 : T A E2 : int A E1 : array[T]  A E1[E2] = E3 : T  (array-assign)  

If Statements A E : bool A S1 : T A S2 : T  A if (E) S1 else S2 : T  (if-then-else)  If statement as an expression: its value is the value of the clause that is executed  A E : bool A S : T  A if (E) S : unit  (if-then)  If with no else clause, no value, why??

Class Problem 1.Show the inference rule for a while statement, while (E) S 2.Show the inference rule for a variable declaration with initializer, Type id = E 3.Show the inference rule for a question mark/colon operator, E1 ? S1 : S2

Sequence Statements v Rule: A sequence of statements is well- typed if the first statement is well-typed, and the remaining are well-typed as well: A S1 : T1 A (S2;.... ; Sn) : Tn A (S1; S2;.... ; Sn) : Tn    (sequence)

Declarations A id : T [ = E ] : T1 A, id : T (S2;.... ; Sn) : Tn A (id : T [ = E ]; S2;.... ; Sn) : Tn    (declaration) = unit if no E Declarations add entries to the environment (e.g., the symbol table)

Function Calls v If expression E is a function value, it has a type T1 x T2 x... x Tn  Tr v Ti are argument types; Tr is the return type v How to type-check a function call? »E(E1,..., En) A E : T1 x T2 x... Tn  Tr A Ei : Ti (i  1... n)   A E(E1,..., En) : Tr  (function-call)

Function Declarations v Consider a function declaration of the form: »Tr fun (T1 a1,..., Tn an) = E »Equivalent to:  Tr fun (T1 a1,..., Tn an) {return E;} v Type of function body S must match declared return type of function, i.e., E : Tr v But, in what type context?

Add Arguments to Environment v Let A be the context surrounding the function declaration. »The function declaration:  Tr fun (T1 a1,..., Tn an) = E »Is well-formed if  A, a1 : T1,..., an : Tn E : Tr v What about recursion? »Need: fun: T1 x T2 x... x Tn  Tr  A 

Class Problem Recursive function – factorial int fact(int x) = if (x == 0) 1 else x * fact(x-1); Is this well-formed?

Mutual Recursion v Example »int f(int x) = g(x) + 1; »int g(int x) = f(x) – 1; v Need environment containing at least  f: int  int, g: int  int  when checking both f and g v Two-pass approach: »Scan top level of AST picking up all function signatures and creating an environment binding all global identifiers »Type-check each function individually using this global environment

Static Semantics Summary v Static semantics = formal specification of type-checking rules v Concise form of static semantics: typing rules expressed as inference rules v Expression and statements are well-formed (or well-typed) if a typing derivation (proof tree) can be constructed using the inference rules

Review of Semantic Analysis v Check errors not detected by lexical or syntax analysis v Scope errors »Variables not defined »Multiple declarations v Type errors »Assignment of values of different types »Invocation of functions with different number of parameters or parameters of incorrect type »Incorrect use of return statements

Other Forms of Semantic Analysis v One more category that we have not discussed v Control flow errors »Must verify that a break or continue statements are always encosed by a while (or for) stmt »Java: must verify that a break X statement is enclosed by a for loop with label X »Goto labels exist in the proper function »Can easily check control-flow errors by recursively traversing the AST

Where We Are... Lexical Analysis Syntax Analysis Semantic Analysis Intermediate Code Gen Source code (character stream) token stream abstract syntax tree abstract syntax tree + symbol tables, types Intermediate code regular expressions grammars static semantics