Prof. Aiken CS 294 Lecture 31 Type-Based Analysis Lecture 3.

Slides:

Advertisements

Similar presentations

Types and Programming Languages Lecture 13 Simon Gay Department of Computing Science University of Glasgow 2006/07.

Advertisements

5 October 2001IFIP WG 2.31 preparation start drscheme set language level to Beginner open in Presentations –length1.ss –planets0.ss –lambda.ss –convert.ss.

Chapter 3 Disks and Formatting Ch 3.

Modern Programming Languages, 2nd ed.

Copyright © 2008 Cengage Learning Understanding Generalist Practice, 5e, Kirst-Ashman/Hull 316.

Software Engineering Principles

Introduction to Compilation of Functional Languages Wanhe Zhang Computing and Software Department McMaster University 16 th, March, 2004.

Type Inference David Walker COS 320. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful.

The Pumping Lemma for CFL’s

Type Checking, Inference, & Elaboration CS153: Compilers Greg Morrisett.

Semantics Static semantics Dynamic semantics attribute grammars

Control-Flow Graphs & Dataflow Analysis CS153: Compilers Greg Morrisett.

1 PROPERTIES OF A TYPE ABSTRACT INTERPRETATER. 2 MOTIVATION OF THE EXPERIMENT § a well understood case l type inference in functional programming à la.

Getting started with ML ML is a functional programming language. ML is statically typed: The types of literals, values, expressions and functions in a.

Flow-Insensitive Points-to Analysis with Term and Set Constraints Presentation by Kaleem Travis Patrick.

INF 212 ANALYSIS OF PROG. LANGS Type Systems Instructors: Crista Lopes Copyright © Instructors.

Compiler Construction

1 Introduction to Computability Theory Lecture3: Regular Expressions Prof. Amos Israeli.

1 Introduction to Computability Theory Lecture12: Decidable Languages Prof. Amos Israeli.

1 Introduction to Computability Theory Lecture12: Reductions Prof. Amos Israeli.

1 Introduction to Computability Theory Lecture4: Regular Expressions Prof. Amos Israeli.

1 Introduction to Computability Theory Lecture3: Regular Expressions Prof. Amos Israeli.

Introduction to Computability Theory

Constraint Logic Programming Ryan Kinworthy. Overview Introduction Logic Programming LP as a constraint programming language Constraint Logic Programming.

6/12/2015Prof. Hilfinger CS164 Lecture 111 Bottom-Up Parsing Lecture (From slides by G. Necula & R. Bodik)

ML: a quasi-functional language with strong typing Conventional syntax: - val x = 5; (*user input *) val x = 5: int (*system response*) - fun len lis =

Type Checking- Contd Compiler Design Lecture (03/02/98) Computer Science Rensselaer Polytechnic.

Catriel Beeri Pls/Winter 2004/5 type reconstruction 1 Type Reconstruction & Parametric Polymorphism  Introduction  Unification and type reconstruction.

Static Program Analysis Xiangyu Zhang The slides are compiled from Alex Aiken’s Michael D. Ernst’s Sorin Lerner’s.

Programming Language Semantics Mooly SagivEran Yahav Schrirber 317Open space html://

Catriel Beeri Pls/Winter 2004/5 inductive-revisited 1 Inductive definitions revisited  Generated and Freely generated sets oPattern match, unification.

Catriel Beeri Pls/Winter 2004/05 types 65  A type-checking algorithm The task: (since we start with empty H, why is the goal not just E?) The rule set.

Type Inference David Walker COS 441. Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs useful.

Parsing — Part II (Ambiguity, Top-down parsing, Left-recursion Removal)

Type Inference David Walker CS 510, Fall Criticisms of Typed Languages Types overly constrain functions & data polymorphism makes typed constructs.

CS 280 Data Structures Professor John Peterson. Lexer Project Questions? Must be in by Friday – solutions will be posted after class The next project.

Normal forms for Context-Free Grammars

Semantics with Applications Mooly Sagiv Schrirber html:// Textbooks:Winskel The.

Prof. Aiken CS 294 Lecture 21 Abstract Interpretation Part 2.

1 Type Type system for a programming language = –set of types AND – rules that specify how a typed program is allowed to behave Why? –to generate better.

Describing Syntax and Semantics

CSE 413 Programming Languages & Implementation Hal Perkins Autumn 2012 Context-Free Grammars and Parsing 1.

1/25 Pointer Logic Changki PSWLAB Pointer Logic Daniel Kroening and Ofer Strichman Decision Procedure.

Induction and recursion

1 Introduction to Parsing Lecture 5. 2 Outline Regular languages revisited Parser overview Context-free grammars (CFG’s) Derivations.

Lesson 3 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.

Type Systems CS Definitions Program analysis Discovering facts about programs. Dynamic analysis Program analysis by using program executions.

1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.

410/510 1 of 18 Week 5 – Lecture 1 Semantic Analysis Compiler Construction.

Introduction to Parsing

Automated Reasoning Early AI explored how to automated several reasoning tasks – these were solved by what we might call weak problem solving methods as.

Implementing a Dependently Typed λ -Calculus Ali Assaf Abbie Desrosiers Alexandre Tomberg.

12/9/20151 Programming Languages and Compilers (CS 421) Elsa L Gunter 2112 SC, UIUC Based in part on slides by Mattox.

Semantic Analysis II Type Checking EECS 483 – Lecture 12 University of Michigan Wednesday, October 18, 2006.

Types and Programming Languages Lecture 14 Simon Gay Department of Computing Science University of Glasgow 2006/07.

CS7120 (Prasad)L13-1-Lambda-Opt1 Typed Lambda Calculus Adapted from Lectures by Profs Aiken and Necula of Univ. of California at Berkeley.

Types and Programming Languages Lecture 3 Simon Gay Department of Computing Science University of Glasgow 2006/07.

COMP 412, FALL Type Systems C OMP 412 Rice University Houston, Texas Fall 2000 Copyright 2000, Robert Cartwright, all rights reserved. Students.

CS5205: Foundation in Programming Languages Type Reconstruction

Manuel Fahndrich Jakob Rehof Manuvir Das

Type Checking and Type Inference

Programming Languages and Compilers (CS 421)

Parsing & Context-Free Grammars

CS510 Compiler Lecture 4.

CS 611: Lecture 9 More Lambda Calculus: Recursion, Scope, and Substitution September 17, 1999 Cornell University Computer Science Department Andrew Myers.

Representation, Syntax, Paradigms, Types

Compiler Construction

Compilers Principles, Techniques, & Tools Taught by Jing Zhang

Compiler Construction

Presentation transcript:

Prof. Aiken CS 294 Lecture 31 Type-Based Analysis Lecture 3

Prof. Aiken CS 294 Lecture 32 Comments on Abstract Interpretation Why is abstract interpretation either forwards or backwards? Answer 1 –Polynomial to compute in one direction –Exponential to compute in the other direction Answer 2 –Abstract functions often implemented as functions –Impossible to invert---they’re code!

Prof. Aiken CS 294 Lecture 33 Outline A language –Lambda calculus Types –Type checking –Type inference Applications to program analysis –Representation analysis –Tagging optimization –Alias analysis

Prof. Aiken CS 294 Lecture 34 The Typed Lambda Calculus Lambda calculus –But types are assigned to bound variables. –Pascal, or C Add integers, addition, if-then-else Note: Not every expression generated by this grammar is a properly typed term.

Prof. Aiken CS 294 Lecture 35 Types Function types Integers Type variables –Stand for definite, but unknown, types

Prof. Aiken CS 294 Lecture 36 Function Types Intuitively, a type  1 !  2 stands for the set of functions that map arguments of type  1 to results of type  2. Placeholder for any other structured datatype –Lists –Trees –Arrays

Prof. Aiken CS 294 Lecture 37 Types are Trees Types are terms Any term can be represented by a tree –The parse tree of the term –Tree representation is important in algorithms (  ! int) !  ! int  int  ! !!

Prof. Aiken CS 294 Lecture 38 Examples We write e:t for the statement “e has type t.”

Prof. Aiken CS 294 Lecture 39 Untypable Terms Some terms have no valid typing. – x.x x – x. y. (x y) + x Focus on first example –Types are finite –Becomes typable if we allow recursive types Recursive types are possibly infinite, regular trees

Prof. Aiken CS 294 Lecture 310 Type Environments To determine whether the types in an expression are correct we perform type checking. But we need types for free variables, too! A type environment is a function from variables to types. The syntax of environments is: The meaning is:

Prof. Aiken CS 294 Lecture 311 Type Checking Rules Type checking is done by structural induction. –One inference rule for each form –Assumptions contain types of free variables –A term is well-typed if ;  e: 

Prof. Aiken CS 294 Lecture 312 Example

Prof. Aiken CS 294 Lecture 313 Type Checking Algorithm There is a simple algorithm for type checking Observe that there is only one possible “shape” of the type derivation –only one inference rule applies to each form.

Prof. Aiken CS 294 Lecture 314 Algorithm (Cont.) Walk the proof tree from the root to the leaves, generating the correct environments. Assumptions are simply gathered from lambda abstractions.

Prof. Aiken CS 294 Lecture 315 Algorithm (Cont.) In a walk from the leaves to the root, calculate the type of each expression. The types are completely determined by the type environment and the types of subexpressions.

Prof. Aiken CS 294 Lecture 316 A Bigger Example

Prof. Aiken CS 294 Lecture 317 What Do Types Mean? Thm. If A  e:  and e ! ¤  d, then A  d:  –Evaluation preserves types. This is the basis of a claim that there can be no runtime type errors –functions applied to data of the wrong type Adding to a function Using an integer as a function

Prof. Aiken CS 294 Lecture 318 Type Inference The type erasure of e is e with all type information removed (i.e., the untyped term). Is an untyped term the erasure of some simply typed term? And what are the types? This is a type inference problem. We must infer, rather than check, the types.

Prof. Aiken CS 294 Lecture 319 Outline We will develop the inference algorithm in the following steps: –recast the type rules in an equivalent form –show typing in the new rules reduces to a constraint satisfaction problem –show the constraint problem is solvable via term unification. We will use this outline again.

Prof. Aiken CS 294 Lecture 320 The Problems There are three problems in developing an algorithm –How do we construct the right type assumptions? –How do we ensure types match in applications? –How do we ensure types match in if-then-else?

Prof. Aiken CS 294 Lecture 321 New Rules Sidestep the problems by introducing explicit unknowns and constraints

Prof. Aiken CS 294 Lecture 322 New Rules Type assumption for variable x is a fresh variable  x

Prof. Aiken CS 294 Lecture 323 New Rules Equality conditions represented as side constraints

Prof. Aiken CS 294 Lecture 324 New Rules Hypotheses are all arbitrary –Can always complete a derivation, pending constraint resolution

Prof. Aiken CS 294 Lecture 325 Notes The introduction of unknowns and constraints works only because the shape of the proof is already known. –This tells us where to put the constraints and unknowns. The revised rules are trivial to implement, except for handling the constraints.

Prof. Aiken CS 294 Lecture 326 Solutions of Constraints The new rules generate a system of type equations. Intuitively, a solution of these equations gives a derivation. A solution is a substitution Vars ! Types such that the equations are satisfied.

Prof. Aiken CS 294 Lecture 327 Example A solution is

Prof. Aiken CS 294 Lecture 328 Solving Type Equations Term equations are a unification problem. –Solvable in near-linear time using a union-find based algorithm. No solutions  = T[  ] are permitted –The occurs check. –The check is omitted if we allow infinite types.

Prof. Aiken CS 294 Lecture 329 Unification Close constraints under four rules. If no inconsistency or occurs check violation found, system has a solution. –int = x ! y

Prof. Aiken CS 294 Lecture 330 Syntax We distinguish solved equations    Each rule manipulates only unsolved equations.

Prof. Aiken CS 294 Lecture 331 Rules 1 and 4 Rules 1 and 4 eliminate trivial constraints. Rule 1 is applied in preference to rule 2 –the only such possible conflict

Prof. Aiken CS 294 Lecture 332 Rule 2 Rule 2 eliminates a variable from all equations but one (which is marked as solved). –Note the variable is eliminated from all unsolved as well as solved equations

Prof. Aiken CS 294 Lecture 333 Rule 3 Rule 3 applies structural equality to non-trivial terms. Note rule 4 is a degenerate case of rule 3 for a type constructor of arity zero.

Prof. Aiken CS 294 Lecture 334 Correctness Each rule preserves the set of solutions. –Rules 1 and 4 eliminate trivial constraints. –Rule 2 substitutes equals for equals. –Rule 3 is the definition of equality on function types.

Prof. Aiken CS 294 Lecture 335 Termination Rules 1 and 4 reduce the number of equations. Rule 2 reduces the number of variables in unsolved equations. Rule 3 decreases the height of terms.

Prof. Aiken CS 294 Lecture 336 Termination (Cont.) Rules 1, 3, and 4 always terminate –because terms must eventually be reduced to height 0. Eventually rule 2 is applied, reducing the number of variables.

Prof. Aiken CS 294 Lecture 337 A Nitpick We really need one more operation.  =  should be flipped to  =  if  is not a variable. –Needed to ensure rule 2 applies whenever possible. –We just assume equations are maintained in this “normal form”.

Prof. Aiken CS 294 Lecture 338 Solutions The final system is a solution. –There is one equation    for each variable. –This is a substitution with all the solutions of the original system Must also perform occurs check to guarantee there are no recursive constraints.

Prof. Aiken CS 294 Lecture 339 Example rewrites

Prof. Aiken CS 294 Lecture 340 An Example of Failure

Prof. Aiken CS 294 Lecture 341 Notes The algorithm produces the most general unifier of the equations. –All solutions are preserved. Less general solutions are all substitution instances of the most general solution.

Prof. Aiken CS 294 Lecture 342 An Efficient Algorithm The algorithm we have sketched is polynomial, but not very efficient. The repeated substitutions on types is slow. Idea: Maintain equivalence classes of types directly.

Prof. Aiken CS 294 Lecture 343 Union/Find Consider sets in which one element is the designated representative. –If int or ! is in a set, then it is the representative –o.w. the representative is arbitrary. Two operations –Union(s,t) union two sets together –Find(s) return the representative of set s Equal types will be put in the same set.

Prof. Aiken CS 294 Lecture 344 Algorithm Rules 1 and 4 Rule 2 Rule 3

Prof. Aiken CS 294 Lecture 345 Example  =  !   =  !   = int  ! 

Prof. Aiken CS 294 Lecture 346 Example  =  !   =  !   = int  !  !

Prof. Aiken CS 294 Lecture 347 Example  =  !   =  !   = int  !  !

Prof. Aiken CS 294 Lecture 348 Example  =  !   =  !   = int  int !  !

Prof. Aiken CS 294 Lecture 349 Example  =  !   =  !   = int  int !  !

Prof. Aiken CS 294 Lecture 350 Notes Any sequence of union and find operations can be made to run in nearly linear time (amortized). The constants are very small, giving excellent performance in practice.

Prof. Aiken CS 294 Lecture 351 Applications

Prof. Aiken CS 294 Lecture 352 Representation Analysis Which values in a program must have the same representation? –Not all values of a type need be represented identically –Shows abstraction boundaries Which values must have the same representation? –Those that are used “together”

Prof. Aiken CS 294 Lecture 353 The Idea Old type language New type language –Every type is a pair: old type x variable

Prof. Aiken CS 294 Lecture 354 Type Inference Rules

Prof. Aiken CS 294 Lecture 355 Example A lambda term: x. y. z. w.if (x + y) (z + 1) w Equivalence classes: x. y. z. w.if (x + y) (z + 1) w

Prof. Aiken CS 294 Lecture 356 Uses “Re-engineering” –Make some values more abstract Find bugs –Every equivalence class with a malloc should have a free Implemented for C in a tool Lackwit –O’Callahan & Jackson

Prof. Aiken CS 294 Lecture 357 Dynamic Tag Optimization Untyped languages need runtime tags –To do runtime type checking –E.g., Lisp, Scheme Consider an untyped version of our language: Every value carries a tag –For us, just 1 bit “function” or “integer”

Prof. Aiken CS 294 Lecture 358 Term Completion View lambda terms as “incomplete” –Still need the tagging/tag checking operations –T! Tags a value as having type T Every operation that constructs a T must invoke T! –T? Checks if a value has the tag for type T Every operation that expects to use a T must invoke T? Example – f. x. f (x + 1) – fun! f.(fun! x. (fun? f) (int! ((int? x) + (int? (int! 1)))))

Prof. Aiken CS 294 Lecture 359 Tagging Optimization Optimization problem: remove pairs of tag/untag operations without changing program semantics fun! f.(fun! x. (fun? f) (int! ((int? x) + (int? (int! 1))))) –fun! f.(fun! x. (fun? f) (int! ((int? x) + 1)))

Prof. Aiken CS 294 Lecture 360 Coercions The tagging/untagging operations are coercions –Functions that change the type –But change it to what? –Introduce type dynamic  to indicate tagged values New types:

Prof. Aiken CS 294 Lecture 361 Coercion Signatures With type dynamic, we can give signatures to the coercions: int!: int ! > int?: > ! int func!: ( > ! > ) ! > func?: > ! ( > ! > ) noop:  !  Problem: Decide whether to insert proper coercions or noop.

Prof. Aiken CS 294 Lecture 362 Type Ordering and Constraints Types are related by tagging operations: int  > > ! >  >    Now the choice of a proper coercion or noop can be captured by a constraint: int   –Says  is either > or int

Prof. Aiken CS 294 Lecture 363 Type Inference Rules

Prof. Aiken CS 294 Lecture 364 Constraint Resolution Rules Note: Arguments of ! and rhs of inequality constraints are always variables

Prof. Aiken CS 294 Lecture 365 Complexity Inequality constraints are generated only by inference rules –No new ones are ever added by resolution All constraint resolution is of equality constraints –Runs at the speed of unification Solution of the constraints shows where to insert coercions

Prof. Aiken CS 294 Lecture 366 Alias Analysis In languages with side effects, want to know which locations may have aliases –More than one “name” –More than one pointer to them E.g., Y = &Z X = Y *X = 3 /* changes the value of *Y */

Prof. Aiken CS 294 Lecture 367 The Types Deal just with pointers and atomic data

Prof. Aiken CS 294 Lecture 368 A Type Rule Consider a C assignment x = y Intuition: x points to whatever y points to

Prof. Aiken CS 294 Lecture 369 A Problem X and Y are always references –They’re variables But what their contents may be atomic: A = 4 X = A Y = A Now x and y are inferred to always “point” to the same thing –But it is obvious there are no pointers here

Prof. Aiken CS 294 Lecture 370 Type Ordering Define an ordering on types:  1   2, (  1 = ? Ç  1 =  2 ) Change the inference rule:

Prof. Aiken CS 294 Lecture 371 Example Inference Rules

Prof. Aiken CS 294 Lecture 372 Constraint Resolution Rules

Prof. Aiken CS 294 Lecture 373 Implementation No new inequality constraints are generated by resolution Keep a list of pending equality constraints for each variable  –These constraints “fire” when  is unified with a ref More generally, unified with a constructor

Prof. Aiken CS 294 Lecture 374 Context Sensitivity: Polymorphic Types Add a new class of types called type schemes: Example: A polymorphic identity function Note: All quantifiers are at top level.

Prof. Aiken CS 294 Lecture 375 A Useful Lemma A variable in a typing proof can be instantiated to something more specific and the proof still works. Proof: Replace by in derivation for e. Show by cases the derivation is still correct.

Prof. Aiken CS 294 Lecture 376 The Key Idea This is called generalization.

Prof. Aiken CS 294 Lecture 377 Instantiation Polymorphic assumptions can be used as usual. But we still need to turn a polymorphic type into a monomorphic type for the other type rules to work.

Prof. Aiken CS 294 Lecture 378 Where is Type Inference Strong? Handles data structures smoothly Works in infinite domains –Set of types is unlimited No forwards/backwards distinction Type polymorphism good fit for context sensitivity –Lexically based –Less sensitive to program edits than call strings

Prof. Aiken CS 294 Lecture 379 Where is Type Inference Weak? No flow sensitivity –Equality-based analysis only gets equivalence classes –“backflow” problem Context-sensitive analyses don’t always scale –Type polymorphism can lead to exponential blowup in constraints