SAFECode: Enforcing Alias Analysis for Weakly Typed Languages. Dinakar Dhurjati, University of Illinois at Urbana-Champaign. Joint work with Sumant Kowshik, Vikram Adve.

Presentation transcript:

SAFECode: Enforcing Alias Analysis for Weakly Typed Languages
Dinakar Dhurjati, University of Illinois at Urbana-Champaign
Joint work with Sumant Kowshik, Vikram Adve

Weakly Typed Languages (C/C++)
Weak semantic guarantees
– Undetected array bounds errors, dangling pointer errors, type cast errors, uninitialized pointers, etc.
=> Memory safety violations
=> Any static analysis is suspect (widely ignored)

Static Analysis Tools
Memory errors invalidate core analyses
[Diagram labels: C program; Normal Compiler; Core Analyses (alias analysis, call graph, type information); Software Tools (e.g. ESP, BLAST); property; Yes or No]

Why not use safe languages?
Java, C#, safe dialects of C (e.g. CCured, Cyclone)
Large body of legacy applications in C/C++
Porting is not easy
– Automatic memory management or GC
– Wrappers for library calls because of metadata on pointers

Our Solution: SAFECode
Not a safe language: tolerates errors
Completely automatic, no wrappers, no GC
Works for nearly all C programs
Low overhead (less than 30% in our experiments)
Provides sound analysis platform
– Sound operational semantics for C based on core analyses
Masks dangling pointer, array bounds errors
Ensures memory safety (defined later)

SAFECode as Analysis Platform
SAFECode enforces core analyses, memory safety
[Diagram labels: C program; Normal Compiler; Core Analyses (alias analysis, call graph, type information); SAFECode; C program with checks; Software Verification (e.g. ESP, BLAST); property; Yes or No]

Outline
Motivation & Overview
Background
Approach
Formalization
Evaluation
Summary

Background - Alias Analysis
A static summary of memory objects and their connectivity
Restriction: flow-insensitive, unification based
TK: Type Known, TU: Type Unknown

Type-known example:
    struct List* head = makeList(20);
[Graph labels: head; node struct List (TK), H, with fields next, val]

Type-unknown example:
    P = malloc(2 * sizeof(int));
    P[i] = …;
    struct BigT *Q = (struct BigT *)P;
    Q->field8 = …;
[Graph labels: P, Q; node TU, S,A, with field]
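The two fragments on this slide can be made concrete with a small C sketch. It is only an illustration under assumptions: the slide names `struct BigT` and `field8` but not the layout, so the nine `int` fields below are hypothetical, and `freeList` and `field8_write_escapes` are helper names invented here. Instead of performing the out-of-bounds write, the sketch computes whether `Q->field8` would fall outside the two-`int` allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Type-known (TK) object: every access treats the memory as struct List. */
struct List {
    struct List *next;
    int val;
};

/* Stand-in for the slide's makeList: builds n heap nodes, all of one type. */
struct List *makeList(int n) {
    struct List *head = NULL;
    for (int i = 0; i < n; i++) {
        struct List *node = malloc(sizeof *node);
        assert(node != NULL);
        node->val = i;
        node->next = head;
        head = node;
    }
    return head;
}

/* Hypothetical helper: releases every node of the list. */
void freeList(struct List *head) {
    while (head != NULL) {
        struct List *next = head->next;
        free(head);
        head = next;
    }
}

/* Hypothetical layout: the slide names only struct BigT and field8. */
struct BigT {
    int field0, field1, field2, field3, field4, field5, field6, field7;
    int field8;
};

/* Returns 1 if a write to Q->field8 would end past a buffer of alloc_size
   bytes. The write itself is never performed, so the sketch stays defined. */
int field8_write_escapes(size_t alloc_size) {
    size_t end_of_field8 = offsetof(struct BigT, field8) + sizeof(int);
    return end_of_field8 > alloc_size;
}
```

Calling `field8_write_escapes(2 * sizeof(int))` models the slide's allocation: the buffer is used both as an `int` array and as a `struct BigT`, which is why the analysis can only label it type-unknown, and the cast also makes an out-of-bounds write possible.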

Background - Automatic Pool Allocation (APA) [LattnerAdve:PLDI05]
Partitions heap into pools based on alias analysis
Each node instance uses separate pool
Pool is destroyed if not accessible
[Diagram labels: Pool 1; Pool 2; List (H) nodes with fields next, val; pointers head, x, y]
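A rough sketch of the idea, not the actual APA runtime: two lists that the alias analysis keeps apart are allocated from separate pools, and destroying a pool releases every node of its list at once. The `Pool` and `Block` types and the functions `pool_init`, `pool_alloc`, `pool_destroy`, `make_list`, and `two_lists_demo` are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pool: a chain of malloc'd blocks owned by one pool. */
typedef struct Block { struct Block *next; } Block;
typedef struct Pool { Block *blocks; } Pool;

void pool_init(Pool *pool) { pool->blocks = NULL; }

/* Allocates size bytes and records the block in the pool.
   The payload is only guaranteed pointer alignment in this sketch. */
void *pool_alloc(Pool *pool, size_t size) {
    Block *b = malloc(sizeof(Block) + size);
    if (b == NULL) return NULL;
    b->next = pool->blocks;
    pool->blocks = b;
    return b + 1;  /* payload follows the header */
}

/* Frees every block in the pool and returns how many were freed. */
size_t pool_destroy(Pool *pool) {
    size_t freed = 0;
    while (pool->blocks != NULL) {
        Block *next = pool->blocks->next;
        free(pool->blocks);
        pool->blocks = next;
        freed++;
    }
    return freed;
}

struct List { struct List *next; int val; };

/* Builds an n-node list whose nodes all live in the given pool. */
struct List *make_list(Pool *pool, int n) {
    struct List *head = NULL;
    for (int i = 0; i < n; i++) {
        struct List *node = pool_alloc(pool, sizeof *node);
        assert(node != NULL);
        node->val = i;
        node->next = head;
        head = node;
    }
    return head;
}

/* x and y never alias, so each list gets its own pool. */
size_t two_lists_demo(void) {
    Pool pool1, pool2;
    pool_init(&pool1);
    pool_init(&pool2);
    struct List *x = make_list(&pool1, 10);
    struct List *y = make_list(&pool2, 20);
    assert(x != NULL && y != NULL);
    size_t freed = pool_destroy(&pool1);  /* list x is gone in one step */
    freed += pool_destroy(&pool2);        /* likewise for list y */
    return freed;
}
```

In the real transformation the compiler inserts the pool creation and destruction calls; here they are written by hand to show the resulting heap partitioning.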

Outline
Motivation & Overview
Background
Approach
Formalization
Evaluation
Summary

SAFECode Approach: Enforce Core Analyses
Alias analysis
Call graph
– Run-time checks on indirect calls
Type information
– Subset of alias analysis
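One way to picture a run-time check on an indirect call is sketched below. This is an assumption-laden illustration, not SAFECode's generated code: the target table `allowed_targets`, the functions `inc`, `dbl`, `neg`, and the helpers `call_target_ok` and `checked_call` are all hypothetical. The check simply refuses to call through a function pointer that the static call graph does not list for that call site.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*handler_fn)(int);

int inc(int x) { return x + 1; }
int dbl(int x) { return x * 2; }
int neg(int x) { return -x; }  /* assumed NOT in the call graph for this site */

/* Targets the static call graph allows for one indirect call site (assumed). */
static const handler_fn allowed_targets[] = { inc, dbl };

/* Returns 1 if f is one of the statically predicted targets. */
int call_target_ok(handler_fn f) {
    size_t n = sizeof allowed_targets / sizeof allowed_targets[0];
    for (size_t i = 0; i < n; i++) {
        if (allowed_targets[i] == f) return 1;
    }
    return 0;
}

/* Performs the call only if the check passes.
   Returns 0 and stores the result on success, -1 if the check fails. */
int checked_call(handler_fn f, int arg, int *out) {
    assert(out != NULL);
    if (!call_target_ok(f)) return -1;
    *out = f(arg);
    return 0;
}
```

With this shape, a corrupted function pointer can at worst reach another function the call graph already predicted, so the call graph stays valid for later analyses.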

Enforcing Alias Analysis
Check if tmp points to corresponding node
Normal allocators
– Memory objects are scattered in the heap
– Each check at run-time is extremely expensive
[Diagram labels: tmp; node struct List (TK), H, with fields next, val]

Insight 1: Use Automatic Pool Allocation (APA)
Partitions heap into pools based on alias analysis
Each node instance uses separate pool
Pool is destroyed if not accessible
[Diagram labels: Pool 1; Pool 2; List (H) nodes with fields next, val; pointers head, x, y]

The Pool Bounds Check
Pool is a list of pages (each of size 2^k)
Pool maintains a hash table of the start addresses of the pages
Poolcheck on a pointer p
– Mask lower k bits of p, see if it is in the hash table
– Alignment check for TK pools
Poolcheck: involves hash lookups
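The check described on this slide can be sketched as follows. Everything beyond the slide's description is an assumption made for illustration: k = 12, a fixed 64-slot open-addressing table, elements packed from the start of each page for the alignment test, and the names `Pool`, `pool_add_page`, `pool_check`, and `pool_check_demo`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BITS 12u                          /* k: pages are 2^k bytes (assumed) */
#define PAGE_SIZE ((uintptr_t)1 << PAGE_BITS)
#define TABLE_SLOTS 64u                        /* power of two, assumed large enough */

typedef struct Pool {
    uintptr_t page_starts[TABLE_SLOTS];        /* hash set of page starts; 0 = empty */
    size_t elem_size;                          /* element size for TK pools, 0 for TU */
} Pool;

static size_t slot_for(uintptr_t page) {
    return (size_t)((page >> PAGE_BITS) * 2654435761u) & (TABLE_SLOTS - 1);
}

/* Registers a page start address in the pool; returns 0 if the table is full. */
int pool_add_page(Pool *pool, uintptr_t page) {
    assert(page != 0 && (page & (PAGE_SIZE - 1)) == 0);
    size_t i = slot_for(page);
    for (size_t probes = 0; probes < TABLE_SLOTS; probes++) {
        if (pool->page_starts[i] == 0 || pool->page_starts[i] == page) {
            pool->page_starts[i] = page;
            return 1;
        }
        i = (i + 1) & (TABLE_SLOTS - 1);
    }
    return 0;
}

/* Returns 1 if p lies in a page of the pool (and, for TK pools, is aligned). */
int pool_check(const Pool *pool, const void *p) {
    uintptr_t addr = (uintptr_t)p;
    uintptr_t page = addr & ~(PAGE_SIZE - 1);  /* mask the lower k bits */
    if (page == 0) return 0;                   /* 0 marks empty slots */
    size_t i = slot_for(page);
    int found = 0;
    for (size_t probes = 0; probes < TABLE_SLOTS && !found; probes++) {
        if (pool->page_starts[i] == page) found = 1;
        else if (pool->page_starts[i] == 0) return 0;
        else i = (i + 1) & (TABLE_SLOTS - 1);
    }
    if (!found) return 0;
    /* Alignment check for TK pools: p must start an element. */
    if (pool->elem_size != 0 && (addr - page) % pool->elem_size != 0) return 0;
    return 1;
}

/* Registers one page of a static arena and probes three pointers. */
int pool_check_demo(void) {
    static unsigned char arena[3 * PAGE_SIZE];
    uintptr_t base = ((uintptr_t)arena + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
    Pool pool = { {0}, sizeof(int) };
    pool_add_page(&pool, base);
    int in_pool = pool_check(&pool, (void *)(base + 8 * sizeof(int)));
    int misaligned = pool_check(&pool, (void *)(base + 1));
    int other_page = pool_check(&pool, (void *)(base + PAGE_SIZE));
    return in_pool == 1 && misaligned == 0 && other_page == 0;
}
```

The cost per check is one mask plus a hash probe, which is the point of the slide: the check does not depend on how many objects the pool holds.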

Insight 2: Mostly static checking for TK pools
Type Known Pools: 3 sufficient properties
– Typed accesses: free
– Correct alignment: free
– No pool bounds violations: pool bounds checks
Type Unknown Pools
– Pool bounds checks on all operations
Solution
– Type homogeneity, do not release memory from pool (Insight 3)
– Release memory from pool when pool is inaccessible (Insight 4)

Formalization as a Type System
Soundness theorem ensures core analyses are never invalidated

With malloc/free:
    int*ρ x, y; int*ρ' z;
    x = malloc(4);
    y = x;
    free(x);
    y = malloc(4);

With pool allocation:
    poolinit(ρ', int) PP' {
      poolinit(ρ, int) PP {
        int*ρ x, y; int*ρ' z;
        x = poolalloc(PP, 1);   // allocate one element
        y = x;                  // type checks
        poolfree(PP, x);
        y = poolalloc(PP, 1);   // malloc semantics different
      }
    }
[Diagram labels: region ρ with an Int node pointed to by x, y; region ρ' with an Int node pointed to by z]
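The remark that malloc semantics differ can be illustrated with a type-homogeneous pool. This is a minimal sketch, assuming a fixed-capacity pool of `int` slots; `IntPool`, `int_pool_init`, `int_pool_alloc`, `int_pool_free`, and `dangling_pointer_demo` are hypothetical names, not the runtime's `poolinit`/`poolalloc`/`poolfree`. Freed slots are reused only inside the same pool, so a dangling pointer into the pool still refers to an `int`.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_CAPACITY 16  /* hypothetical fixed capacity */

/* A pool that only ever holds int elements (type homogeneity). */
typedef struct IntPool {
    int slots[POOL_CAPACITY];
    int *free_list[POOL_CAPACITY];  /* stack of freed slots */
    size_t free_count;
    size_t next_unused;
} IntPool;

void int_pool_init(IntPool *p) {
    p->free_count = 0;
    p->next_unused = 0;
}

/* Reuses a freed slot if one exists, otherwise takes a fresh slot. */
int *int_pool_alloc(IntPool *p) {
    if (p->free_count > 0) return p->free_list[--p->free_count];
    if (p->next_unused < POOL_CAPACITY) return &p->slots[p->next_unused++];
    return NULL;
}

/* Marks a slot reusable. Memory stays inside the pool and is never
   handed to another pool or another type. */
void int_pool_free(IntPool *p, int *x) {
    assert(x >= p->slots && x < p->slots + POOL_CAPACITY);
    assert(p->free_count < POOL_CAPACITY);
    p->free_list[p->free_count++] = x;
}

/* Follows the slide's sequence, using z for the second allocation so the
   stale alias y stays visible. Returns 1 if y aliases the reused slot. */
int dangling_pointer_demo(void) {
    IntPool pool;
    int_pool_init(&pool);
    int *x = int_pool_alloc(&pool);
    int *y = x;                      /* alias of x */
    *x = 1;
    int_pool_free(&pool, x);         /* y is now dangling */
    int *z = int_pool_alloc(&pool);  /* the same slot is handed out again */
    *z = 42;
    return (y == z) && (*y == 42);   /* stale y still reads an int */
}
```

The program is still buggy in the sense that `y` observes another object's value, but the access stays within the pool and within the type the analysis predicted, which is what the soundness argument needs.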

Static Analysis Using SAFECode
Flow-sensitive analysis
– Only change is in malloc semantics
Flow-insensitive analyses
– Don't require any changes
e.g., ESP, BLAST
Sound analyses for C are now possible

Evaluation (Run-time Overhead)
Olden, Ptrdist, 3 system daemons [Full list in the paper]
No source changes necessary
Compared with CCured on Olden [See paper]

Program    SAFECode ratio
bh         1.03
bisort     1.00
em3d       1.27
treeadd    0.99
tsp        0.99
Yacr2      1.30
Ks         1.12
anagram    1.23
ftpd       1.00
fingerd    1.03
ghttpd

Baseline ≡ no pool allocation + no SAFECode passes

Related Work

Solution          Performance    Error detection/prevention   Sound analysis   Memory management
Pure C
Purify, Valgrind  Several 100x   Some                         -                -
SafeC             5x             Some                         -                -
Jones-Kelly       5-6x           Some                         -                -
SFI               Over 2x        Few                          -                -
Yong              Over 2x        Some                         -                -
SAFECode          Up to 1.30     Some                         Yes              -
Modified C
CCured            Up to 1.87     All                          Yes              GC
Cyclone           1x-2x          All                          Yes              Regions + GC

Two errors we don't detect
Detecting array bounds overflow
– A low overhead backwards-compatible solution [ICSE 2006]
Detecting dangling pointer dereference
– Efficient detection for some kinds of programs [DSN 2006]

Conclusion
Sound operational semantics for C + core analyses
Guarantee alias analysis with low overhead
=> We guarantee memory safety without detecting some errors
- Control flow integrity
- Data access integrity (type information)
- Analysis integrity
