Advanced Type Systems for Low-Level Languages Greg Morrisett Cornell University.

Presentation transcript:

Extensible Systems
Everyone wants extensible systems:
– Web browser: new content, avoid communication
– OS kernel: avoid context switches, copying
– Routers, switches: update protocols dynamically
– Servers, databases: new datatypes (e.g., video)
In systems settings, extensions must be high-performance without violating the integrity of the core service. An extension:
– shouldn’t crash the server (isolation)
– shouldn’t violate invariants (e.g., locking protocols)
– shouldn’t hog resources
– shouldn’t leak the wrong information

Type-Safe Languages
Type-safe high-level languages provide:
– isolation (“memory safety”)
    can’t read/write/execute arbitrary hunks of memory
    really a side effect of...
– guaranteed enforcement of abstraction invariants
    can only apply the right operation to the right data
– unlike OS abstractions (pages, processes, etc.):
    not a fixed set: users can define new datatypes with representation invariants specific to an application
    enforcement is largely static (through type-checking) or at worst procedural
So it appears as though all you need is a type-safe language like Java...

However:
Conventional type systems like Java’s:
– prevent programmers from choosing data representations (e.g., force a level of indirection).
– still require many runtime tests: every array update involves both a dynamic type test and a bounds check.
– rely upon garbage collection to do memory management.
– do not address other integrity issues such as resource limits, deadlocks and starvation, etc.
Result: performance is lacking and there are still integrity issues that must be addressed.
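The two runtime tests behind an array update can be made explicit. The following is a minimal C sketch of what a Java-style runtime does implicitly when storing into an array of references. All names here (`type_tag`, `ref_array`, `array_store`) are hypothetical, and the type test is simplified to tag equality, whereas a real JVM checks subtyping.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical runtime type tags; a real JVM tests subtyping, not equality. */
typedef enum { TAG_STRING, TAG_INTEGER } type_tag;

typedef struct {
    type_tag tag;        /* dynamic class of the object */
} object;

typedef struct {
    type_tag elem_tag;   /* element type the array was created with */
    size_t length;
    object **elems;
} ref_array;

typedef enum { STORE_OK, STORE_OUT_OF_BOUNDS, STORE_BAD_TYPE } store_result;

/* Emulates the two implicit checks behind `a[i] = v` on a reference array. */
store_result array_store(ref_array *a, size_t i, object *v)
{
    if (i >= a->length)                      /* bounds check */
        return STORE_OUT_OF_BOUNDS;
    if (v != NULL && v->tag != a->elem_tag)  /* dynamic type test */
        return STORE_BAD_TYPE;
    a->elems[i] = v;
    return STORE_OK;
}
```

Both branches run on every store, which is the overhead the slide refers to; a richer static type system aims to discharge them at compile time.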

Proof-Carrying Code
General framework that supports arbitrary code and arbitrary integrity constraints:
– basic idea: extension = code + proof of safety
– you can optimize the code all you want!
– you just have to produce a proof…
– if you can’t prove it, insert a dynamic test
Need programmer help to construct proofs.
– Programmers do not like to construct real proofs.
– So trick them into providing enough information that a theorem prover can do the rest.
– Example: more type info makes it easier to prove memory safety.

Type-Safe Low-Level Languages
Ideally, we want languages where:
– as in C, we have control over representations, memory management, timing, etc.
– as in Java, isolation and abstractions are enforced statically by the type system.
– as in operating systems, resource limits and other integrity properties are enforced, but here also by the type system.
– we can use existing theorem provers to automatically produce proofs for PCC.
– programmers don’t have to write too much type information (more info should imply better code).

Some Recent Promising Work
Typed assembly language [Cornell, CMU]
– Type system for Intel x86.
– Fine-grained control over instructions, calling conventions, and data representations.
Phase-split dependent types [CMU]
– Types become more like a general logic.
– Ex: programmer control over array bound checks.
Region-based type systems [DIKU, Berkeley, Cornell]
– Explicit control over memory management.
Resource-bounding type systems [CMU, Cornell]
– Allows bounds to be expressed as a function of input.
Information-flow type systems [Cornell, Bell Labs]
– Prevent high-security data from leaking.

Dependent Types
    int sum[i|i>=0](int{i} s, int a[i]) {
      int r = 0;
      for (int{j|0<=j<=i} x=0; x<s; x++){
        r += a[x];
      }
      return r;
    }
The i is a logical variable used to link the value of s and the size of a.

Dependent Types
    int sum[i|i>=0](int{i} s, int a[i]) {
      int r = 0;
      for (int{j|0<=j<=i} x=0; x<s; x++){
        r += a[x];
      }
      return r;
    }
In Java, this would require a runtime check that 0 <= x < i. Here, the type-checker ensures the property statically.

Dependent Types
    int sum[i|i>=0](int{i} s, int a[i]);
    int m[20];
    sum(10,m);      // Fails to type-check.
    sum(10*2,m);    // Okay.
    sum(z,m);       // Type-checks if z has type int{20}.
Conversely, the programmer has to produce evidence at call-sites.
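Plain C has no dependent types, but the trade-off on these slides can be approximated with dynamic tests. The sketch below contrasts a per-access check (what a Java-style runtime does) with a single test at the call site that stands in for the static evidence. The function names and the explicit `len` parameter are assumptions for illustration, and the precondition is relaxed to `s <= len` rather than the exact equality the slide's type demands.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-access checking, as a Java-style runtime would do implicitly. */
bool sum_checked(size_t s, const int *a, size_t len, int *out)
{
    int r = 0;
    for (size_t x = 0; x < s; x++) {
        if (x >= len)           /* executed on every iteration */
            return false;
        r += a[x];
    }
    *out = r;
    return true;
}

/* Precondition s <= len plays the role of the evidence; no checks inside. */
static int sum_unchecked(size_t s, const int *a)
{
    int r = 0;
    for (size_t x = 0; x < s; x++)
        r += a[x];
    return r;
}

/* Call-site wrapper: one dynamic test replaces the static proof. */
bool sum_at_call_site(size_t s, const int *a, size_t len, int *out)
{
    if (s > len)
        return false;
    *out = sum_unchecked(s, a);
    return true;
}
```

With dependent types the test in `sum_at_call_site` disappears as well whenever the checker can establish the relation between `s` and the array size, as in the `sum(10*2,m)` call.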

Memory Management
Memory management in type-safe settings:
– Why not provide explicit malloc/free?
    The standard proof that strong typing is good enough to ensure memory safety relies upon the fact that the types of heap objects do not change.
    So recycling memory must be “implicit” in these settings, hence garbage collectors.
– But determining “garbage” is undecidable:
    an object is garbage if it’s not touched in the future
    collectors approximate this using [global] reachability
    application-specific techniques are crucial for minimizing footprint and latency, maximizing throughput, etc.
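The reachability approximation can be illustrated with the mark phase of a tracing collector. This is a minimal C sketch under assumed names (`obj`, `mark`, `count_unmarked`) and a fixed two-field object layout; it is not any particular collector's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FIELDS 2   /* assumed fixed number of pointer fields per object */

typedef struct obj {
    bool marked;
    struct obj *fields[MAX_FIELDS];
} obj;

/* Mark everything reachable from o (recursive depth-first traversal). */
void mark(obj *o)
{
    if (o == NULL || o->marked)
        return;
    o->marked = true;
    for (size_t i = 0; i < MAX_FIELDS; i++)
        mark(o->fields[i]);
}

/* Count the objects in a pool that the mark phase did not reach. */
size_t count_unmarked(const obj *pool, size_t n)
{
    size_t garbage = 0;
    for (size_t i = 0; i < n; i++)
        if (!pool[i].marked)
            garbage++;
    return garbage;
}
```

An object that is reachable from a root stays marked even if the program never touches it again, which is why reachability is only a conservative stand-in for "not touched in the future".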

Region-Based Management
[Figure: three regions labeled R1, R2, R3]
Region: a collection of objects.
– Implemented as a list of pages.
– You allocate objects into a region.
– No restriction on references.
– Regions can be dynamically allocated and deallocated.
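A region implemented as a list of pages might look like the following C sketch. The page size, the bump-pointer allocation strategy, and the name `region_alloc` are assumptions made for illustration; `newregion` and `freeregion` follow the names used later in the slides, but their signatures here are guesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SIZE 4096   /* assumed payload size of one page */

typedef struct page {
    struct page *next;
    size_t used;         /* bytes already handed out from data[] */
    _Alignas(max_align_t) unsigned char data[PAGE_SIZE];
} page;

typedef struct {
    page *pages;         /* newest page first */
} region;

region *newregion(void)
{
    region *r = malloc(sizeof *r);
    if (r != NULL)
        r->pages = NULL;
    return r;
}

/* Bump-allocate n bytes; NULL if n is 0, exceeds a page, or malloc fails. */
void *region_alloc(region *r, size_t n)
{
    const size_t align = _Alignof(max_align_t);
    if (n == 0 || n > PAGE_SIZE)
        return NULL;
    n = (n + align - 1) / align * align;   /* keep later objects aligned */
    if (r->pages == NULL || PAGE_SIZE - r->pages->used < n) {
        page *p = malloc(sizeof *p);
        if (p == NULL)
            return NULL;
        p->next = r->pages;
        p->used = 0;
        r->pages = p;
    }
    void *mem = r->pages->data + r->pages->used;
    r->pages->used += n;
    return mem;
}

/* Free every page, then the region itself; returns the page count freed. */
size_t freeregion(region *r)
{
    size_t freed = 0;
    page *p = r->pages;
    while (p != NULL) {
        page *next = p->next;
        free(p);
        p = next;
        freed++;
    }
    free(r);
    return freed;
}
```

Allocation is a pointer bump in the common case, and deallocation costs one `free` per page regardless of how many objects the region holds. Nothing in this sketch stops a caller from using a pointer after `freeregion`; preventing that is the job of the region type system.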

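Region-Based Management
[Figure: regions R1 and R3; R2 is no longer shown]
Unlike with GC, dangling pointers aren’t a problem: they may exist, as long as they are never dereferenced.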

Region Types
Each pointer’s type specifies the region:
    typedef struct{int x; int y;} point;
    *[r2]point copy[r1,r2](*[r1]point a) {
      *[r2]point b = new point[r2];
      b->x = a->x;
      b->y = a->y;
      return b;
    }
Functions are polymorphic in regions.

Region Management
Regions can be [de]allocated at any point in time:
– r = newregion(); … freeregion(r);
The type system tracks when a region is freed using (essentially) data-flow analysis.
Pointers can only be dereferenced if their region has not been freed.
Functions say which regions they allocate and free.
– as with dependent types, makes inter-procedural analysis scalable.
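The rule that a pointer may be dereferenced only while its region is live is enforced statically by the type system. The C sketch below emulates it with a dynamic flag instead, purely to show which accesses would be accepted or rejected. Every name here (`point_ref`, `deref`, `copy_point`, `region_init`, `region_free`) is hypothetical, and the region is a single fixed-size buffer to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define REGION_BYTES 1024   /* assumed fixed capacity */

typedef struct {
    bool live;
    size_t used;
    unsigned char *mem;
} region;

typedef struct { int x; int y; } point;

/* A pointer that carries its region, mirroring the type *[r]point. */
typedef struct {
    region *r;
    point *p;
} point_ref;

bool region_init(region *r)
{
    r->mem = malloc(REGION_BYTES);
    r->used = 0;
    r->live = (r->mem != NULL);
    return r->live;
}

void region_free(region *r)
{
    free(r->mem);
    r->mem = NULL;
    r->live = false;   /* descriptor outlives the memory so refs can be tested */
}

point_ref new_point(region *r)
{
    point_ref ref = { r, NULL };
    if (r->live && REGION_BYTES - r->used >= sizeof(point)) {
        ref.p = (point *)(r->mem + r->used);
        r->used += sizeof(point);
    }
    return ref;
}

/* Dynamic stand-in for the static rule: dereference only in a live region. */
point *deref(point_ref ref)
{
    return (ref.r != NULL && ref.r->live) ? ref.p : NULL;
}

/* Counterpart of copy[r1,r2]: works for any source and destination region. */
point_ref copy_point(point_ref a, region *r2)
{
    point_ref b = { r2, NULL };
    point *src = deref(a);
    if (src == NULL)
        return b;
    b = new_point(r2);
    if (b.p != NULL) {
        b.p->x = src->x;
        b.p->y = src->y;
    }
    return b;
}
```

After `region_free(&r1)`, a `point_ref` into `r1` still exists as a dangling reference, but `deref` refuses it, while a copy made into `r2` beforehand remains usable. In the typed setting the rejected dereference is a compile-time error and the `live` flag has no runtime cost.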

Pros and Cons of Regions
Pros:
– O(1) simple operations for allocation/deallocation
– programmer control over placement and freeing
– supports dangling pointers
Cons:
– requires lots of type information or heavy-duty inference/analysis (or both).
– requires a very careful & tedious coding style
    often have to do your own “little” copying collection
    recursive types (e.g., lists, trees) must live in same region
– shared regions among threads problematic

Summary and Future
Today’s type systems are good, but not enough.
– Limit control over performance.
– Don’t provide everything (e.g., resource bounds).
Recent advances in type systems can overcome many of these shortcomings.
Critical issues for the future:
– Addressing additional integrity/security issues
    adherence to protocols, liveness properties, etc.
– Coherent integration of various systems.
– Advanced analysis, inference, constraint-solving techniques.