Run-Time Type Checking for Pointers and Arrays in C
Wes Weimer, George Necula, Scott McPeak, S.P. Rahul, Raymond To
OSQ Retreat, May 9, 2001

Presentation transcript:

Slide 2: What are we doing?
- Add run-time checks to C programs
- Catch pointer and array errors
- Minimal user effort
  - More effort yields more performance
- Make C "feel" as safe as Java

Slide 3: Motivation
- 50% of software errors are due to pointers
- 50% of security errors are due to buffer overruns
- Such errors are often hard to reproduce
- Difficult to locate the true source of errors

Slide 4: Overview
- Motivation and System Goals
- Checkable Errors
- Run-Time Representation
- Static Analysis
- Preliminary Results
- Future Work

Slide 5: Goals
- Support existing C code
  - Compatibility with external libraries
  - Handle GCC/MSVC source, Makefiles
- Efficiency: 50% overhead rather than 1000%
  - Research: 5x, Purify 10x, BoundsChecker 150x
- Default: many checks
  - Reduce by static analysis and/or user annotations

Slide 6: Checkable Errors
- Array and pointer bounds checks
  - Well-understood
- Dereferencing a non-pointer (or NULL)
  - Complicated by casts and unions
- Pointer arithmetic outside of object bounds
  - Not always caught by Purify, etc.
- Freeing non-pointers, using freed memory

Slide 7: Required Information
- Checks require information about pointers
  - Length, base, capabilities, etc.
- Can be stored in a global table
  - High table-lookup overhead: 500%
- Can be stored with each pointer
  - struct { Foo *p; Foo *base; Foo *end; } SafeFoo
  - Library compatibility is tricky
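The "stored with each pointer" option can be sketched as a fat pointer with a checked read. This is a minimal illustration under stated assumptions, not the authors' implementation: the payload of `Foo` and the helpers `safe_foo_make` and `safe_foo_read` are hypothetical names introduced here, and `end` is assumed to point one past the last element.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical payload type; the slide only names it Foo. */
typedef struct { int value; } Foo;

/* Fat pointer from the slide: the pointer plus the bounds of its object. */
typedef struct {
    Foo *p;     /* current pointer */
    Foo *base;  /* first element of the object */
    Foo *end;   /* assumed: one past the last element */
} SafeFoo;

/* Hypothetical constructor: wrap an array of `count` elements. */
SafeFoo safe_foo_make(Foo *array, size_t count) {
    SafeFoo s;
    s.p = array;
    s.base = array;
    s.end = array + count;
    return s;
}

/* Hypothetical checked read of s.p[i].
 * Returns 1 and stores the element in *out when the access is in bounds,
 * returns 0 for a NULL pointer or an out-of-bounds index. */
int safe_foo_read(SafeFoo s, ptrdiff_t i, Foo *out) {
    if (s.p == NULL) return 0;
    if (i < s.base - s.p || i >= s.end - s.p) return 0;
    *out = s.p[i];
    return 1;
}
```

The index is compared against the distances to `base` and `end` before any address is formed, so the check itself never computes an out-of-range pointer. Passing such a three-word struct to an unmodified library that expects a plain `Foo *` is where the slide's compatibility concern arises.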

Slide 8: More is Needed: Tags
- Must keep track of which locations are valid pointers
- Use per-object tags (like in GC)

    int **X;
    int *Y;
    *Y = 55;             // OK
    X = Y;
    printf("%d", **X);   // CRASH!

Slide 9: Run-Time Representation
- Associate with each object in memory:
  - Base (lower bound), End (upper bound)
  - Tags (bitfield, 1 bit per word: is it a valid pointer?)
  - Check bounds on every access, check tags on pointer reads, set tags on every write
- Example: struct { int x; int *y; } *p;

[Figure: memory layout of the object that p points to, showing base, fields x and y, end, and the tag bits 0 (for x) and 1 (for y)]
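The tag discipline on this slide can be sketched as follows. This is an illustrative model only: `TaggedObject`, `OBJ_WORDS`, and the three accessors are hypothetical names, the object has a fixed size so its bounds are implicit, and each slot is assumed to be one pointer-sized word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OBJ_WORDS 4  /* hypothetical fixed object size, in words */

/* One object in memory: its words plus one tag bit per word. */
typedef struct {
    uintptr_t words[OBJ_WORDS];
    unsigned tags;  /* bit i set => words[i] currently holds a valid pointer */
} TaggedObject;

/* Writing a scalar clears the tag of the word (bounds-checked). */
int write_int(TaggedObject *o, size_t i, int v) {
    if (i >= OBJ_WORDS) return 0;
    o->words[i] = (uintptr_t)v;
    o->tags &= ~(1u << i);
    return 1;
}

/* Writing a pointer sets the tag of the word (bounds-checked). */
int write_ptr(TaggedObject *o, size_t i, void *p) {
    if (i >= OBJ_WORDS) return 0;
    o->words[i] = (uintptr_t)p;
    o->tags |= 1u << i;
    return 1;
}

/* Reading a pointer succeeds only if the word is in bounds and tagged. */
int read_ptr(const TaggedObject *o, size_t i, void **out) {
    if (i >= OBJ_WORDS) return 0;
    if (!(o->tags & (1u << i))) return 0;  /* word holds a scalar: refuse */
    *out = (void *)o->words[i];
    return 1;
}
```

With this model, storing 55 into a word and then reading that word back as a pointer is rejected by `read_ptr`, which is the failure the tags are meant to turn into a checked error instead of a crash.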

Slide 10: Kinds of Pointers
- Many pointers only move forward (no casts)
  - Notably C strings: for (; *p; p++) if (*p == 'c') …
  - Such "forward" pointers need only an end bound
- Many pointers are not involved in evil casts
  - But may use pointer arithmetic: arrays
  - Such "index" pointers need not carry tags
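A "forward" pointer that carries only an end bound might look like the sketch below, applied to the string scan from the slide. The names `FwdChar` and `count_char` are hypothetical, and returning -1 on a bounds failure is an assumption made for the example; the slides do not say how a failed check is reported.

```c
#include <assert.h>
#include <stddef.h>

/* Forward pointer: only moves upward, so only an end bound is kept. */
typedef struct {
    const char *p;    /* current position */
    const char *end;  /* one past the last byte of the buffer */
} FwdChar;

/* Count occurrences of c up to the terminating NUL.
 * Returns -1 if the scan would run past `end` before finding a NUL. */
long count_char(FwdChar s, char c) {
    long n = 0;
    for (;; s.p++) {
        if (s.p >= s.end) return -1;  /* bounds check before every read */
        if (*s.p == '\0') return n;
        if (*s.p == c) n++;
    }
}
```

Because the pointer never moves backward, no lower-bound comparison is needed, which is the saving the slide describes.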

Slide 11: Kinds of Pointers
- Many pointers are completely "safe"
  - No evil casts, no arithmetic, etc.
  - e.g., FILE *fin = fopen("input", "r");
  - These can be represented without any extra information (just a NULL check when used)
- These cases yield better performance!

Slide 12: Physical Subtyping
- Define a formal notion of representation equality and subtyping for casts
  - Keep pointers and scalars separate!
- Intuition:
  - struct {char a[4];} $=$ struct {int x;}
  - struct {char a[4];} $\neq$ struct {int *x;}
  - struct {int a; int b;} $\le$ struct {int a;}
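One way to make the intuition on this slide concrete is to flatten each type into a sequence of words, each marked scalar or pointer, and compare the sequences. This is a simplified sketch, not the formal definition from the talk: `WordKind`, `layout_equal`, and `layout_subtype` are hypothetical names, and a 4-byte `int` occupying exactly one word is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Each word of a flattened type is either a scalar or a pointer. */
typedef enum { W_SCALAR, W_POINTER } WordKind;

/* Width subtyping: a <= b when b's layout is a prefix of a's layout. */
int layout_subtype(const WordKind *a, size_t na,
                   const WordKind *b, size_t nb) {
    size_t i;
    if (nb > na) return 0;
    for (i = 0; i < nb; i++) {
        if (a[i] != b[i]) return 0;  /* pointers and scalars never match */
    }
    return 1;
}

/* Physical equality: same length and the same kind in every word. */
int layout_equal(const WordKind *a, size_t na,
                 const WordKind *b, size_t nb) {
    return na == nb && layout_subtype(a, na, b, nb);
}
```

Under these assumptions `struct {char a[4];}` and `struct {int x;}` both flatten to a single scalar word and compare equal, `struct {int *x;}` flattens to a pointer word and does not, and a two-`int` struct is a subtype of a one-`int` struct because the shorter layout is a prefix of the longer one.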

Slide 13: Extended Type System
- Simplified C types:
  - $\tau ::= \mathrm{int} \mid \tau\ \mathrm{ref}\ q \mid \tau_1 \times \tau_2 \mid \tau_1 + \tau_2$ (types)
  - $q ::= \mathrm{safe} \mid \mathrm{string} \mid \mathrm{seq} \mid \mathrm{wild}$ (qualifiers)
- safe = one word: standard C pointer
- seq = three words: pointer, base, end
- wild = two words: pointer, base (end and tags are kept with the object)

Slide 14: Type System (continued)
- In $\tau\ \mathrm{ref}\ \mathrm{wild}$, $\tau$ must contain only wild pointers
- May cast between safe and seq
- wild may only be cast to or from wild
- Physical equivalence:
  - $\mathrm{short} \times \mathrm{short} = \mathrm{int}$
  - $a \times (b + c) = (a \times b) + (a \times c)$
- Width subtyping: $\tau_1 \times \tau_2 \le \tau_1$

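Slide 15: Some Typing Rules

Address of field:

$$\frac{O \vdash \&e : (\tau_1 \times \tau_2)\ \mathrm{ref}\ q_1 \qquad q_2 = (\text{if } q_1 = \mathrm{wild} \text{ then } \mathrm{wild} \text{ else } \mathrm{safe})}{O \vdash \&e.L : \tau_1\ \mathrm{ref}\ q_2}$$

Pointer arithmetic:

$$\frac{O \vdash e_1 : \tau\ \mathrm{ref}\ q \qquad q \neq \mathrm{safe} \qquad O \vdash e_2 : \mathrm{int}}{O \vdash e_1 + e_2 : \tau\ \mathrm{ref}\ q}$$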
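The qualifier side of these rules, together with the cast restrictions stated on slide 14, can be sketched as three small predicates. This is a hedged illustration: the enum and function names are hypothetical, the `string` qualifier is left out because the slides give no rule for it, and only qualifiers are checked here, not the constraints on the underlying types.

```c
#include <assert.h>

/* Pointer qualifiers from the slides (string omitted in this sketch). */
typedef enum { Q_SAFE, Q_SEQ, Q_WILD } Qualifier;

/* Address of field: the result is wild if the struct pointer is wild,
 * otherwise it is safe. */
Qualifier field_address_qualifier(Qualifier q1) {
    return q1 == Q_WILD ? Q_WILD : Q_SAFE;
}

/* Pointer arithmetic is only typed for pointers that are not safe. */
int arithmetic_allowed(Qualifier q) {
    return q != Q_SAFE;
}

/* Casts: safe and seq may be cast to each other; wild may only be cast
 * to or from wild. */
int cast_qualifiers_ok(Qualifier from, Qualifier to) {
    if (from == Q_WILD || to == Q_WILD) {
        return from == Q_WILD && to == Q_WILD;
    }
    return 1;
}
```

Read this way, a pointer that the program increments can never be given the one-word safe representation, which is why arithmetic pushes a pointer toward seq.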

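Slide 16: Handling Casts

Cast between pointers:

$$\frac{O \vdash e : \tau_1\ \mathrm{ref}\ q_1 \qquad \tau_1\ \mathrm{ref}\ q_1 \le \tau_2\ \mathrm{ref}\ q_2}{O \vdash (\tau_2\ \mathrm{ref}\ q_2)\,e : \tau_2\ \mathrm{ref}\ q_2}$$

When is $\tau_1\ \mathrm{ref}\ q_1 \le \tau_2\ \mathrm{ref}\ q_2$?

  Initial q1   Final q2   Constraint
  safe                    $\tau_1 = \tau_2$
  seq          safe       $\exists k.\ \tau_1[k] \le \tau_2$
  seq                     $\exists j,k.\ \tau_1[j] = \tau_2[k]$
  wild                    None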

Slide 17: Static Analysis & Inference
- For every pointer in the program
  - Try to infer the fastest safe representation
  - This is like eliminating classes of run-time checks we know will never fail
- Can be formulated as constraint-solving
  - Apply subtyping rules to casts to get constraints
  - O(E) where E is the number of casts/assignments (flow insensitive)
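A heavily simplified sketch of the constraint-solving idea is shown below. It models only one constraint taken from the slides (wild may only be cast to or from wild), so wild-ness must spread across every cast or assignment edge until nothing changes. The names `Edge` and `infer_qualifiers` are hypothetical, the initial qualifiers are assumed to come from an earlier pass over pointer uses, and the repeated sweep used here is chosen for brevity; it does not achieve the O(E) bound quoted on the slide.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { Q_SAFE, Q_SEQ, Q_WILD } Qualifier;

/* A cast or assignment between two pointer variables, by index. */
typedef struct { size_t from; size_t to; } Edge;

/* Flow-insensitive propagation: if either end of an edge is wild, both
 * ends must be wild. Repeat sweeps over all edges until a fixed point. */
void infer_qualifiers(Qualifier *q, const Edge *edges, size_t nedges) {
    int changed = 1;
    while (changed) {
        size_t i;
        changed = 0;
        for (i = 0; i < nedges; i++) {
            Qualifier *a = &q[edges[i].from];
            Qualifier *b = &q[edges[i].to];
            if ((*a == Q_WILD) != (*b == Q_WILD)) {
                *a = Q_WILD;
                *b = Q_WILD;
                changed = 1;
            }
        }
    }
}
```

Pointers that are never connected to a wild pointer keep their cheaper safe or seq representation, which corresponds to the "infer the fastest safe representation" goal above.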

Slide 18: Preliminary Results

  Benchmark   Default Overhead   Reduced Overhead   Check Overhead   Overhead (GC)   Reduced Overhead (GC)
  hashtest    218%               100%               3%               222%            221%
  rbtest      128%               2%                 1%               138%            4%

Rows with fewer values, in slide order:
- compress*: 36%, 0%, 37%
- barnes_hut: 108%, 37%, 109%
- mod_layout: 0%, N/A

Slide 19: Future Work
- Encode type information at run-time
  - More expressive casts with low overhead
  - More complete handling of function pointers
- Handle C polymorphism
  - Uses void*, requires vast overhead
- Efficient memory management
  - GC (or something else) takes free() as a hint

Slide 20: Conclusion
- Can add efficient run-time checks to C
  - Check bounds, valid pointers, frees, etc.
- Static analysis is fast and useful
- Can support existing C code
  - Whole programs are considered
  - Safe pointers and wrappers for libraries
- Default to many checks, infer them away