Type Systems CS 6340 1. 2 Definitions Program analysis Discovering facts about programs. Dynamic analysis Program analysis by using program executions.

Slides:



Advertisements
Similar presentations
Automated Theorem Proving Lecture 1. Program verification is undecidable! Given program P and specification S, does P satisfy S?
Advertisements

Semantics Static semantics Dynamic semantics attribute grammars
Introducing Formal Methods, Module 1, Version 1.1, Oct., Formal Specification and Analytical Verification L 5.
Compilation 2011 Static Analysis Johnni Winther Michael I. Schwartzbach Aarhus University.
Abstraction and Modular Reasoning for the Verification of Software Corina Pasareanu NASA Ames Research Center.
1 Mooly Sagiv and Greta Yorsh School of Computer Science Tel-Aviv University Modern Compiler Design.
Hoare’s Correctness Triplets Dijkstra’s Predicate Transformers
Chair of Software Engineering From Program slicing to Abstract Interpretation Dr. Manuel Oriol.
Type checking © Marcelo d’Amorim 2010.
1 Chapter Six Algorithms. 2 Algorithms An algorithm is an abstract strategy for solving a problem and is often expressed in English A function is the.
INF 212 ANALYSIS OF PROG. LANGS Type Systems Instructors: Crista Lopes Copyright © Instructors.
A survey of techniques for precise program slicing Komondoor V. Raghavan Indian Institute of Science, Bangalore.
Copyright © 2006 Addison-Wesley. All rights reserved.1-1 ICS 410: Programming Languages Chapter 3 : Describing Syntax and Semantics Axiomatic Semantics.
ISBN Chapter 3 Describing Syntax and Semantics.
CS 355 – Programming Languages
Comp 205: Comparative Programming Languages Semantics of Imperative Programming Languages denotational semantics operational semantics logical semantics.
1 Basic abstract interpretation theory. 2 The general idea §a semantics l any definition style, from a denotational definition to a detailed interpreter.
Bellevue University CIS 205: Introduction to Programming Using C++ Lecture 3: Primitive Data Types.
Constraint Logic Programming Ryan Kinworthy. Overview Introduction Logic Programming LP as a constraint programming language Constraint Logic Programming.
ESC Java. Static Analysis Spectrum Power Cost Type checking Data-flow analysis Model checking Program verification AutomatedManual ESC.
CS 536 Spring Global Optimizations Lecture 23.
CS 330 Programming Languages 09 / 13 / 2007 Instructor: Michael Eckmann.
Data Flow Analysis Compiler Design Nov. 3, 2005.
1 Lecture 7 Halting Problem –Fundamental program behavior problem –A specific unsolvable problem –Diagonalization technique revisited Proof more complex.
4/25/08Prof. Hilfinger CS164 Lecture 371 Global Optimization Lecture 37 (From notes by R. Bodik & G. Necula)
1 Control Flow Analysis Mooly Sagiv Tel Aviv University Textbook Chapter 3
Type Checking in Cool Alex Aiken (Modified by Mooly Sagiv)
Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 18 Program Correctness To treat programming.
1 Program Analysis Mooly Sagiv Tel Aviv University Textbook: Principles of Program Analysis.
Prof. Fateman CS 164 Lecture 221 Global Optimization Lecture 22.
CS 330 Programming Languages 09 / 16 / 2008 Instructor: Michael Eckmann.
Describing Syntax and Semantics
Program Analysis Mooly Sagiv Tel Aviv University Sunday Scrieber 8 Monday Schrieber.
CHAPTER 10 Recursion. 2 Recursive Thinking Recursion is a programming technique in which a method can call itself to solve a problem A recursive definition.
CSC3315 (Spring 2009)1 CSC 3315 Programming Languages Hamid Harroud School of Science and Engineering, Akhawayn University
Analysis of Algorithms1 Running Time Pseudo-Code Analysis of Algorithms Asymptotic Notation Asymptotic Analysis Mathematical facts.
Introduction Algorithms and Conventions The design and analysis of algorithms is the core subject matter of Computer Science. Given a problem, we want.
Aditya V. Nori, Sriram K. Rajamani Microsoft Research India.
ISBN Chapter 3 Describing Semantics -Attribute Grammars -Dynamic Semantics.
CS 363 Comparative Programming Languages Semantics.
Major objective of this course is: Design and analysis of modern algorithms Different variants Accuracy Efficiency Comparing efficiencies Motivation thinking.
Problem Solving Techniques. Compiler n Is a computer program whose purpose is to take a description of a desired program coded in a programming language.
C++ Programming Language Lecture 2 Problem Analysis and Solution Representation By Ghada Al-Mashaqbeh The Hashemite University Computer Engineering Department.
Introduction to Problem Solving. Steps in Programming A Very Simplified Picture –Problem Definition & Analysis – High Level Strategy for a solution –Arriving.
CS Data Structures I Chapter 2 Principles of Programming & Software Engineering.
Propositional Calculus CS 270: Mathematical Foundations of Computer Science Jeremy Johnson.
3.2 Semantics. 2 Semantics Attribute Grammars The Meanings of Programs: Semantics Sebesta Chapter 3.
Chapter 3 Part II Describing Syntax and Semantics.
Semantics In Text: Chapter 3.
Types and Programming Languages Lecture 11 Simon Gay Department of Computing Science University of Glasgow 2006/07.
Random Interpretation Sumit Gulwani UC-Berkeley. 1 Program Analysis Applications in all aspects of software development, e.g. Program correctness Compiler.
CS412/413 Introduction to Compilers Radu Rugina Lecture 13 : Static Semantics 18 Feb 02.
Logical Agents Chapter 7. Outline Knowledge-based agents Propositional (Boolean) logic Equivalence, validity, satisfiability Inference rules and theorem.
CSC3315 (Spring 2009)1 CSC 3315 Languages & Compilers Hamid Harroud School of Science and Engineering, Akhawayn University
CS Class 04 Topics  Selection statement – IF  Expressions  More practice writing simple C++ programs Announcements  Read pages for next.
SOFTWARE TESTING LECTURE 9. OBSERVATIONS ABOUT TESTING “ Testing is the process of executing a program with the intention of finding errors. ” – Myers.
CS314 – Section 5 Recitation 9
Operational Semantics of Scheme
CSE341: Programming Languages Lecture 11 Type Inference
Type Systems CS 6340 {HEADSHOT}
CS 326 Programming Languages, Concepts and Implementation
September 4, 1997 Programming Languages (CS 550) Lecture 6 Summary Operational Semantics of Scheme using Substitution Jeremy R. Johnson TexPoint fonts.
Reasoning About Code.
Reasoning about code CSE 331 University of Washington.
Structural testing, Path Testing
Propositional Calculus: Boolean Algebra and Simplification
Objective of This Course
Programming Languages 2nd edition Tucker and Noonan
The Zoo of Software Security Techniques
CSE341: Programming Languages Lecture 11 Type Inference
Presentation transcript:

Type Systems CS

2 Definitions Program analysis Discovering facts about programs. Dynamic analysis Program analysis by using program executions. Static analysis Program analysis without running the program.

3 History Static analysis is nearly as old as programming First used in compilers –For program optimization –Starting with FORTRAN (1954) Use for software quality nearly as old –Type systems

4 So What? It ’ s a big field –With different approaches –And applications –And lots of terms This lecture aims to sketch basic –concepts –techniques –terminology

5 Laundry List Outline Analysis paradigms –Type systems –Dataflow analysis –Model checking Terminology –Abstract values –Flow insensitive –Flow sensitive –Path sensitive –Local vs. global analysis –Verification vs. bug-finding –Soundness –False positives and false negatives –...

6 Type Systems A Notation for Describing Static Analyses

7 Type Systems Types are the most widely used static analysis Part of nearly all mainstream languages –Widely recognized as important for quality Type notation is useful for discussing concepts –We use type notation to discuss type checking, dataflow analysis, and model checking

8 What is a Type? Consensus –A set of values Examples –Int is the set of all integers –Float is the set of all floats –Bool is the set {true, false}

9 More Examples List(Int) is the set of all lists of integers –List is a type constructor –A function from types to types Foo, in Java, is the set of all objects of class Foo Int -> Int is the set of functions mapping an integer to an integer –E.g., increment, decrement, and many others

10 What is a Type? Consensus –A set of values A type is an example of an abstract value –Represents a set of concrete values –Every static analysis has abstract values In type systems, –Every concrete value is an element of some abstract value –i.e., every concrete value has a type

11 Abstraction All static analyses use abstraction –Represent sets of values as abstract values Why? –Because we can ’ t reason directly about the infinite set of possible concrete values –For performance (even just termination), must make such approximations In type systems, the approximations are called types

12 The Next Step Now we need to compute with types –Actually analyze programs Type systems have a well-developed notation for expressing such computations

13 Rules of Inference Inference rules have the form If Hypothesis is true, then Conclusion is true Type checking computes via reasoning If E 1 and E 2 have certain types, then E 3 has a certain type Rules of inference are a compact notation for “ If-Then ” statements

14 From English to an Inference Rule Start with a simplified system and gradually add features Building blocks –Symbol  is “ and ” –Symbol  is “ if-then ” –x:T is “ x has type T ”

15 From English to an Inference Rule (2) If e 1 has type Int and e 2 has type Int, then e 1 + e 2 has type Int (e 1 has type Int  e 2 has type Int)  e 1 + e 2 has type Int (e 1 : Int  e 2 : Int)  e 1 + e 2 : Int

16 From English to an Inference Rule (3) The statement (e 1 : Int  e 2 : Int)  e 1 + e 2 : Int is a special case of Hypothesis 1 ...  Hypothesis n  Conclusion This is an inference rule.

17 Notation for Inference Rules By tradition inference rules are written |- Hypothesis 1... |- Hypothesis n |- Conclusion Type rules have hypotheses and conclusions |- e : T |- means “ it is provable that... ”

18 Two Rules [Int] [Add] i is an integer |- i : Int |- e 1 : Int |- e 2 : Int |- e 1 + e 2 : Int

19 Two Rules (Cont.) These rules give templates describing how to type integers and + expressions By filling in the templates, we can produce complete typings for expressions Note that –Hypotheses prove facts about subexpressions –Conclusions prove facts about the entire expression

20 Example: is an integer |- 1: Int 2 is an integer |- 2: Int | : Int

21 A Problem What is the type of a variable reference? The rule does not carry enough information to give x a type [Var] |- x: ? x is a variable

22 A Solution Put more information in the rules! An environment gives types for free variables –An environment is a function from variables to types –For other static analyses the environment may map variables to other abstract values –A variable is free in an expression if it is not defined within the expression

23 Type Environments Let A be a function from Variables to Types The sentence A |- e : T is read: Under the assumption that variables have the types given by A, it is provable that the expression e has the type T

24 Modified Rules The type environment is added to all rules: [Add] A |- e 1 : Int A |- e 2 : Int A |- e 1 + e 2 : Int

25 New Rules And we can write new rules: [Var] A |- x: T A(x) = T

26 Summary Describe static analyses using logics of the form: A ’ |- e ’ : T ’ A ’’ |- e ’’ : T ’’ A |- e : T The program (or program fragment) to be analyzed. Assumptions needed for aspects of e that are determined by e ’ s environment. The abstract value computed for e. Analysis of expression is recursively defined using analysis of subexpressions.

27 An Example

28 The Rule of Signs We want to estimate a computation ’ s sign Example: -3 * 4 = -12 Abstraction: - * + = -

29 Abstract Values + = {set of positive integers} 0 = { 0 } - = {set of negative integers} Environment: Variables -> {+, 0, - }

30 Example Rules A |- e 0 : + A |- e 1 : - A |- e 0 * e 1 : - A |- e 0 : 0 A |- e 1 : x A |- e 0 * e 1 : 0 A |- e 0 : + A |- e 1 : + A |- e 0 * e 1 : + A |- e 0 : x A |- e 1 : 0 A |- e 0 * e 1 : 0

31 A Problem A |- e 0 : + A |- e 1 : - A |- e 0 + e 1 : ? We don ’ t have an abstract value that covers this case! Solution: Add abstract values to ensure analysis is closed under all operations: + = { positive integers} 0 = { 0 } - = { negative integers} top = { all integers } bot = {}

32 More Example Rules A |- e 0 : + A |- e 1 : - A |- e 0 + e 1 : top A |- e 0 : 0 A |- e 1 : + A |- e 0 / e 1 : 0 A |- e 0 : + A |- e 1 : + A |- e 0 + e 1 : + A |- e 0 : top A |- e 1 : 0 A |- e 0 / e 1 : bot

33 Back To The Main Story...

34 A More Complex Rule [If-Then-Else] A |- e 0 : Bool A |- e 1 : T 1 A |- e 2 : T 2 T 1 = T 2 A |- if e 0 then e 1 else e 2 : T 1 We ’ ll use this rule to illustrate several ideas...

35 Soundness [If-Then-Else] A |- e 0 : Bool A |- e 1 : T 1 A |- e 2 : T 2 T 1 = T 2 A |- if e 0 then e 1 else e 2 : T 1 An analysis is sound if – Whenever A |- e: T – And if A(x) = T ’ then x has a value v ’ in T ’ – Then e evaluates to a value v in T e 0 is guaranteed to be a Bool e 1 and e 2 are guaranteed to be of the same type

36 Comments on Soundness Sound analyses are extremely useful –If a program has no errors according to a sound system, then no errors of that kind can arise at runtime –We have verified the absence of a class of errors Verification is a very strong guarantee –The property verified always holds

37 Comments on Soundness But soundness has a price in many applications –Spurious errors/warnings due to abstraction –These are false positives Alternative is to use unsound analyses –Allows some control of false positives –Introduces possibility of false negatives Undetected errors Type systems are sound –But most analyses used for detecting bugs are not sound –These are called bug finding analyses

38 Constraints [If-Then-Else] A |- e 0 : Bool A |- e 1 : T 1 A |- e 2 : T 2 T 1 = T 2 A |- if e 0 then e 1 else e 2 : T 1 Side constraints must be solved Many analyses have side conditions Often constraints to be solved All constraints must be satisfied A separate algorithmic problem

39 Another Example Consider a recursive function f(x) = … f(e) … If x: T 1 and e: T 2 then T 2 = T 1 –Can be relaxed to T 2 subseteq T 1 Recursive functions yield recursive constraints –Same with loops –How hard constraints are to solve depends on constraint language, details of application

40 Algorithm [If-Then-Else] A |- e 0 : Bool A |- e 1 : T 1 A |- e 2 : T 2 T 1 = T 2 A |- if e 0 then e 1 else e 2 : T 1 Algorithm: 1. Input: If-expression and A 2. Analyze e 0, check it is of type Bool 3. Analyze e 1 and e 2, giving types T 1 and T 2 4. Solve T 1 = T 2 5. Return T

41 Global Analysis [If-Then-Else] A |- e 0 : Bool A |- e 1 : T 1 A |- e 2 : T 2 T 1 = T 2 A |- if e 0 then e 1 else e 2 : T 1 The first step requires the overall environment A - Only then can we analyze the subexpressions This is global analysis - Requires the entire program - Or we must somehow construct a model of the environment

42 Local Analysis [If-Then-Else] A 1 |- e 0 : Bool A 2 |- e 1 : T 1 A 3 |- e 2 : T 2 T 1 = T 2 A 1 = A 2 = A 3 A 1 |- if e 0 then e 1 else e 2 : T Algorithm: 1. Analyze e 0, inferring environment A 1 check type is Bool 2. Analyze e 1 and e 2, giving types T 1 and T 2 and environments A 2 and A 3 3. Solve T 1 = T 2 and A 1 = A 2 = A 3 4. Return T 1 and A 1

43 Local Analysis [If-Then-Else] A 1 |- e 0 : Bool A 2 |- e 1 : T 1 A 3 |- e 2 : T 2 T 1 = T 2 A 1 = A 2 = A 3 A 1 |- if e 0 then e 1 else e 2 : T In this approach, we first analyze the subexpressions, inferring the needed environment from the subexpression itself. Because the separately computed environments might not agree, they need to be constrained to be equal to get a valid analysis for the entire expression.

44 Local vs. Global Analysis Global analysis –Is usually simpler than local analysis –But may require a lot of extra engineering to construct models of the environment for partial programs Local analysis –Allows analysis of, e.g., a library without the client –Technically more difficult Requires allowing unknown paremeters in environments, which can be solved for later

45 Flow Insensitivity [If-Then-Else] A |- e 0 : Bool A |- e 1 : T 1 A |- e 2 : T 2 T 1 = T 2 A |- if e 0 then e 1 else e 2 : T 1 Subexpressions are independent of each other No ordering required in the analysis of subexpressions, then analysis is flow insensitive Implies statements can be permuted and analysis is unaffected Type systems are generally flow-insensitive

46 Comments on Flow Insensitivity Flow insensitive analyses are often very efficient and scalable No need for modeling a separate state for each subexpression

47 Flow Insensitivity (Again) [If-Then-Else] A |- e 0 : Bool A |- e 1 : T 1 A |- e 2 : T 2 T 1 = T 2 A |- if e 0 then e 1 else e 2 : T 1 Subexpressions are independent of each other

48 Flow Sensitivity [If-Then-Else] A 0 |- e 0 : Bool, A 1 A 1 |- e 1 : T 1, A 2 A 1 |- e 2 : T 2, A 3 T 1 = T 2 A 2 = A 3 A |- if e 0 then e 1 else e 2 : T 1, A 2 Rules produce new environments and analysis of a subexpression cannot take place until its environment is available. Analysis of subexpressions is ordered by environments: flow sensitive Dataflow analysis is flow sensitive analysis The order of statements matters

49 Comments on Flow Sensitivity Example –Rule of signs extended with assignment statements A |- e : +, A A |- x := e, A[x -> +] –A[x -> +] means A modified so that A(x) = + Flow sensitive analysis can be expensive –Each statement has its own model of the state –Polynomial increase in cost over flow-insensitive

50 Path Sensitivity [If-Then-Else] P, A 0 |- e 0 : Bool, A 1 P ∧ e 0, A 1 |- e 1 : T 1, A 2 P ∧ !e 0, A 1 |- e 2 : T 2, A 3 T 1 = T 2 P, A |- if e 0 then e 1 else e 2 : T 1, e 0 ? A 2 : A 3 Part of the environment is a predicate saying under what conditions this expression is executed. Predicate is refined at decision points (e.g., if ’ s) At points where control paths merge, still keep different paths separate in the final environment

51 Comments on Path Sensitivity Model checking is flow- and path-sensitive analysis –In practice, path sensitive analyses are also flow sensitive –Although in theory they are independent ideas Path sensitivity can be very expensive –Exponential number of paths to track –But appears to be necessary in many applications Often implemented using backtracking instead of explicit merges of environment –Explore one path –Backtrack to a decision point, explore another path

52 Summary A very rough taxonomy –Type systems = flow-insensitive –Dataflow analysis = flow-sensitive –Model checking = flow- and path-sensitive But note –These lines have been blurred –E.g., lots of flow-sensitive type systems recently