Advanced Type Systems for Low-Level Languages
Greg Morrisett
Cornell University

Extensible Systems
Everyone wants extensible systems:
    Web browser: new content, avoid communication
    OS kernel: avoid context switches, copying
    Routers, switches: update protocols dynamically
    Servers, databases: new datatypes (e.g., video)
In systems settings, extensions must be high-performance without violating the integrity of the core service.
  –shouldn’t crash the server (isolation)
  –shouldn’t violate invariants (e.g., locking protocols)
  –shouldn’t hog resources
  –shouldn’t leak the wrong information

Type-Safe Languages
Type-safe high-level languages provide:
  –isolation (“memory safety”)
      can’t read/write/execute arbitrary hunks of memory
      really a side effect of...
  –guaranteed enforcement of abstraction invariants
      can only apply the right operation to the right data
  –unlike OS abstractions (pages, processes, etc.):
      not a fixed set: users can define new datatypes with representation invariants specific to an application
      enforcement is largely static (through type-checking) or at worst procedural
So it appears as though all you need is a type-safe language like Java...

However:
Conventional type systems like Java’s:
  –prevent programmers from choosing data representations (e.g., force a level of indirection).
  –still require many runtime tests: every array update involves both a dynamic type test and a bounds check.
  –rely upon garbage collection to do memory management.
  –do not address other integrity issues such as resource limits, deadlocks, and starvation.
Result: performance is lacking and there are still integrity issues that must be addressed.

Proof-Carrying Code
General framework that supports arbitrary code and arbitrary integrity constraints:
  –basic idea: extension = code + proof of safety
  –you can optimize the code all you want!
  –you just have to produce a proof…
  –if you can’t prove it, insert a dynamic test
Need programmer help to construct proofs.
  –Programmers do not like to construct real proofs.
  –So trick them into providing enough information that a theorem prover can do the rest.
  –Example: more type info makes it easier to prove memory safety.

Type-Safe Low-Level Languages
Ideally, we want languages where:
  –as in C, we have control over representations, memory management, timing, etc.
  –as in Java, isolation and abstractions are enforced statically by the type system.
  –as in operating systems, resource limits and other integrity properties are enforced, but here by the type system.
  –we can use existing theorem provers to automatically produce proofs for PCC.
  –programmers don’t have to write too much type information (more info should imply better code).

Some Recent Promising Work
Typed assembly language [Cornell, CMU]
  –Type system for Intel x86.
  –Fine-grained control over instructions, calling conventions, and data representations.
Phase-split dependent types [CMU]
  –Types become more like a general logic
  –Ex: programmer control over array bound checks
Region-based type systems [DIKU, Berkeley, Cornell]
  –Explicit control over memory management
Resource-bounding type systems [CMU, Cornell]
  –Allows bounds to be expressed as a function of the input
Information-flow type systems [Cornell, Bell Labs]
  –Prevent high-security data from leaking

Dependent Types
    int sum[i|i>=0](int{i} s, int a[i]) {
      int r = 0;
      for (int{j|0<=j<=i} x=0; x<s; x++){
        r += a[x];
      }
      return r;
    }
The i is a logical variable used to link the value of s and the size of a.

Dependent Types
    int sum[i|i>=0](int{i} s, int a[i]) {
      int r = 0;
      for (int{j|0<=j<=i} x=0; x<s; x++){
        r += a[x];
      }
      return r;
    }
In Java, this would require a runtime check that 0 <= x < i. Here, the type-checker ensures the property statically.
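
For comparison, here is a minimal sketch in plain C of the per-access test that the dependent type makes unnecessary. The names checked_get and sum_checked are hypothetical, and the explicit len parameter stands in for the logical variable i, which ordinary C cannot express in a type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: an array read guarded by the test 0 <= x < len.
   Because x is unsigned, the lower bound holds automatically and only
   the upper bound needs a comparison. */
int checked_get(const int *a, size_t len, size_t x) {
    if (x >= len) {
        abort();  /* stand-in for raising an out-of-bounds exception */
    }
    return a[x];
}

/* Sum the first s elements of a, which holds len elements.
   Every iteration pays for the bounds test inside checked_get. */
int sum_checked(size_t s, const int *a, size_t len) {
    int r = 0;
    for (size_t x = 0; x < s; x++) {
        r += checked_get(a, len, x);
    }
    return r;
}
```

With the dependently typed signature, the loop index is known to stay below i, so the comparison inside checked_get could be dropped without losing safety.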

Dependent Types
    int sum[i|i>=0](int{i} s, int a[i]);
    int m[20];
    sum(10,m);      /* Fails to type-check. */
    sum(10*2,m);    /* Okay. */
    sum(z,m);       /* Type-checks if z has type int{20} */
Conversely, the programmer has to produce evidence at call sites.
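
One way a caller might produce that evidence when z is not known statically is a single dynamic test before the call, in the spirit of "if you can’t prove it, insert a dynamic test". The sketch below is plain C, not the dependently typed dialect; M_LEN, sum_unchecked, and sum_if_length_matches are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { M_LEN = 20 };  /* assumed array length, matching int m[20] */

/* Unchecked callee: only correct when a holds at least s elements. */
int sum_unchecked(size_t s, const int *a) {
    int r = 0;
    for (size_t x = 0; x < s; x++) {
        r += a[x];
    }
    return r;
}

/* Hypothetical call site: one test establishes z == M_LEN, which plays
   the role of showing that z has type int{20}. Returns 0 and stores the
   sum in *out on success, or -1 if the evidence cannot be established. */
int sum_if_length_matches(size_t z, const int *m, int *out) {
    if (z != M_LEN) {
        return -1;
    }
    *out = sum_unchecked(z, m);
    return 0;
}
```

The test runs once per call rather than once per array access, which is the trade the slide is pointing at.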

Memory Management
Memory management in type-safe settings:
  –Why not provide explicit malloc/free?
      The standard proof that strong typing is good enough to ensure memory safety relies upon the fact that the types of heap objects do not change.
      So recycling memory must be “implicit” in these settings, hence garbage collectors.
  –But determining “garbage” is undecidable:
      an object is garbage if it’s not touched in the future
      collectors approximate this using [global] reachability
      application-specific techniques are crucial for minimizing footprint and latency and for maximizing throughput

Region-Based Management
[Figure: three regions labeled R3, R2, R1]
Region: a collection of objects
Implemented as a list of pages.
You allocate objects into a region.
No restriction on references.
Regions can be dynamically allocated and deallocated.
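
A region implemented as a list of pages can be sketched roughly as follows in C11. This is an illustrative arena allocator, not the runtime from the talk: PAGE_SIZE and region_alloc are assumptions, and newregion and freeregion simply borrow the names used later in the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_SIZE 4096  /* assumed page payload size for this sketch */

typedef struct page {
    struct page *next;   /* older pages in the same region */
    size_t used;         /* bytes already handed out from data */
    _Alignas(max_align_t) unsigned char data[PAGE_SIZE];
} page;

typedef struct region {
    page *pages;         /* most recently added page first */
} region;

region *newregion(void) {
    region *r = malloc(sizeof *r);
    if (r != NULL) {
        r->pages = NULL;
    }
    return r;
}

/* Bump-allocate n bytes from the region, adding a page when the current
   one is full. Returns NULL for n == 0, n > PAGE_SIZE, or out of memory. */
void *region_alloc(region *r, size_t n) {
    const size_t align = _Alignof(max_align_t);
    if (n == 0 || n > PAGE_SIZE) {
        return NULL;
    }
    size_t need = (n + align - 1) / align * align;  /* keep objects aligned */
    page *p = r->pages;
    if (p == NULL || PAGE_SIZE - p->used < need) {
        p = malloc(sizeof *p);
        if (p == NULL) {
            return NULL;
        }
        p->next = r->pages;
        p->used = 0;
        r->pages = p;
    }
    void *obj = p->data + p->used;
    p->used += need;
    return obj;
}

/* Release every page, and with them every object in the region. */
void freeregion(region *r) {
    page *p = r->pages;
    while (p != NULL) {
        page *next = p->next;
        free(p);
        p = next;
    }
    free(r);
}
```

Objects are never freed individually; the only way to reclaim them is to free the whole region, which is why nothing here stops a pointer from outliving its region.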

Region-Based Management
[Figure: regions R3 and R1; R2 is no longer shown]
Unlike GC, dangling pointers aren’t a problem...

Region Types
Each pointer’s type specifies the region:
    typedef struct{int x; int y;} point;
    *[r2]point copy[r1,r2](*[r1]point a) {
      *[r2]point b = new point[r2];
      b->x = a->x;
      b->y = a->y;
      return b;
    }
Functions are polymorphic in regions.

Region Management
Regions can be [de]allocated at any point in time:
  –r = newregion(); … freeregion(r);
The type system tracks when a region is freed using (essentially) data-flow analysis.
Pointers can only be dereferenced if their region has not been freed.
Functions say which regions they allocate and free.
  –as with dependent types, makes inter-procedural analysis scalable.
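
To make the dereference rule concrete, the sketch below shows a dynamic version of the check that the region type system is described as discharging statically. It is only an analogy in plain C: the region here stores ints only, and region_init, region_free, region_new_int, int_ptr, and checked_read are hypothetical names, not part of any system mentioned in the talk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* A toy region of ints. The header is owned by the caller and stays
   valid after region_free, so the live flag can still be inspected. */
typedef struct {
    int   *cells;
    size_t used, cap;
    bool   live;
} region;

bool region_init(region *r, size_t cap) {
    r->cells = malloc(cap * sizeof *r->cells);
    r->used = 0;
    r->cap = cap;
    r->live = (r->cells != NULL);
    return r->live;
}

void region_free(region *r) {
    free(r->cells);
    r->cells = NULL;
    r->live = false;
}

/* A pointer that remembers its region, mirroring *[r]int in the slides. */
typedef struct {
    region *owner;
    size_t  index;
} int_ptr;

bool region_new_int(region *r, int value, int_ptr *out) {
    if (!r->live || r->used == r->cap) {
        return false;
    }
    r->cells[r->used] = value;
    out->owner = r;
    out->index = r->used++;
    return true;
}

/* Dereference only if the owning region has not been freed. */
bool checked_read(int_ptr p, int *out) {
    if (!p.owner->live) {
        return false;  /* dangling pointer: holding it is fine, reading is not */
    }
    *out = p.owner->cells[p.index];
    return true;
}
```

A static region analysis would reject the second read at compile time, so neither the live flag nor the test in checked_read would exist at run time.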

Pros and Cons of Regions
Pros:
  –O(1) simple operations for allocation/deallocation
  –programmer control over placement and freeing
  –supports dangling pointers
Cons:
  –requires lots of type information or heavy-duty inference/analysis (or both).
  –requires a very careful & tedious coding style
      often have to do your own “little” copying collection
      recursive types (e.g., lists, trees) must live in same region
  –shared regions among threads problematic

Summary and Future
Today’s type systems are good, but not enough.
  –Limit control over performance.
  –Don’t provide everything (e.g., resource bounds).
Recent advances in type systems can overcome many of these shortcomings.
Critical issues for the future:
  –Addressing additional integrity/security issues
      adherence to protocols, liveness properties, etc.
  –Coherent integration of various systems.
  –Advanced analysis, inference, constraint-solving techniques.
