Run-Time Type Checking for Pointers and Arrays in C
Wes Weimer, George Necula, Scott McPeak, S.P. Rahul, Raymond To
OSQ Retreat, May 9, 2001


1 Run-Time Type Checking for Pointers and Arrays in C
Wes Weimer, George Necula, Scott McPeak, S.P. Rahul, Raymond To

2 What are we doing?
• Add run-time checks to C programs
• Catch pointer and array errors
• Minimal user effort
  - More effort yields more performance
• Make C "feel" as safe as Java

3 Motivation
• 50% of software errors are due to pointers
• 50% of security errors are due to buffer overruns
• Such errors are often hard to reproduce
• Difficult to locate the true source of errors

4 Overview
• Motivation and System Goals
• Checkable Errors
• Run-Time Representation
• Static Analysis
• Preliminary Results
• Future Work

5 Goals
• Support existing C code
  - Compatibility with external libraries
  - Handle GCC/MSVC source, Makefiles
• Efficiency: 50% overhead rather than 1000%
  - Research: 5x, Purify 10x, BoundsChecker 150x
• Default: many checks
  - Reduce by static analysis and/or user annotations

6 Checkable Errors
• Array and pointer bounds checks
  - Well-understood
• Dereferencing a non-pointer (or NULL)
  - Complicated by casts and unions
• Pointer arithmetic outside of object bounds
  - Not always caught by Purify, etc.
• Freeing non-pointers, using freed memory

7 Required Information
• Checks require information about pointers
  - Length, base, capabilities, etc.
• Can be stored in a global table
  - High table-lookup overhead: 500%
• Can be stored with each pointer
  - typedef struct { Foo *p; Foo *base; Foo *end; } SafeFoo;
  - Library compatibility is tricky
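The "stored with each pointer" option can be sketched as follows. This is only an illustration under assumptions: the `Foo` element type, the helper names `safefoo_from_array` and `safefoo_read`, and the choice to report a failed check by returning `false` are all hypothetical and not taken from the talk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical element type; the slide only names it Foo. */
typedef struct { int value; } Foo;

/* Pointer carried together with its bounds, as on the slide. */
typedef struct {
    Foo *p;     /* current position */
    Foo *base;  /* first valid element */
    Foo *end;   /* one past the last valid element */
} SafeFoo;

/* Build a fat pointer that covers an array of n elements. */
static SafeFoo safefoo_from_array(Foo *arr, size_t n) {
    assert(arr != NULL);
    SafeFoo s = { arr, arr, arr + n };
    return s;
}

/* Checked read of element p[i]. Returns false instead of touching
   memory when the pointer is NULL or the access is out of bounds. */
static bool safefoo_read(SafeFoo s, ptrdiff_t i, int *out) {
    if (s.p == NULL) return false;
    ptrdiff_t len = s.end - s.base;
    ptrdiff_t idx = (s.p - s.base) + i;
    if (idx < 0 || idx >= len) return false;
    *out = s.base[idx].value;
    return true;
}
```

Because `SafeFoo` is three words wide instead of one, it cannot be passed directly to a library that expects a plain `Foo *`, which is one way to read the compatibility concern on the slide.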

8 More is Needed: Tags
• Must keep track of which locations are valid pointers
• Use per-object tags (like in GC)
    int **X; int *Y;
    *Y = 55;             // OK
    X = Y;
    printf("%d", **X);   // CRASH!

9 Run-Time Representation
• Associate with each object in memory:
  - Base (lower bound), End (upper bound)
  - Tags (bitfield, 1 bit per word: is it a valid pointer?)
  - Check bounds on every access, check tags on pointer reads, set tags on every write
• Example: struct { int x; int *y; } *p;
  [Figure: memory layout of the object p points to, labeled base, x, y, end, with tags 0 and 1]
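A minimal sketch of per-object tags, assuming a fixed-size object of `AREA_WORDS` words and one tag bit per word. The type `TaggedArea` and the helpers `write_int`, `write_ptr`, and `read_ptr` are hypothetical names chosen for illustration; the talk does not show its actual layout in code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define AREA_WORDS 8  /* hypothetical fixed object size, in words */

typedef struct {
    uintptr_t words[AREA_WORDS];
    uint32_t  tags;   /* bit i set: words[i] currently holds a valid pointer */
} TaggedArea;

/* Writing an integer clears the tag for that word. */
static bool write_int(TaggedArea *a, size_t i, uintptr_t v) {
    assert(a != NULL);
    if (i >= AREA_WORDS) return false;          /* bounds check */
    a->words[i] = v;
    a->tags &= ~(UINT32_C(1) << i);
    return true;
}

/* Writing a pointer sets the tag for that word. */
static bool write_ptr(TaggedArea *a, size_t i, void *p) {
    assert(a != NULL);
    if (i >= AREA_WORDS) return false;          /* bounds check */
    a->words[i] = (uintptr_t)p;
    a->tags |= UINT32_C(1) << i;
    return true;
}

/* Reading a pointer succeeds only if the word is tagged as a pointer. */
static bool read_ptr(const TaggedArea *a, size_t i, void **out) {
    assert(a != NULL);
    if (i >= AREA_WORDS) return false;          /* bounds check */
    if (((a->tags >> i) & 1u) == 0) return false; /* tag check */
    *out = (void *)a->words[i];
    return true;
}
```

With this scheme, overwriting a pointer slot with the integer 55 and then reading it back as a pointer fails the tag check instead of crashing on the dereference.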

10 Kinds of Pointers
• Many pointers only move forward (no casts)
  - Notably C strings: for (; *p; p++) if (*p == 'c') ...
  - Such "forward" pointers need only an end bound
• Many pointers are not involved in evil casts
  - But may use pointer arithmetic: arrays
  - Such "index" pointers need not carry tags
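The "forward" case can be illustrated with a string scan that carries only an end bound. This is a sketch under assumptions: the `FwdStr` type and `count_char` function are hypothetical, and a failed bounds check is reported by returning `false`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A forward-only pointer: it never moves backward, so no base is kept. */
typedef struct {
    const char *p;    /* current position */
    const char *end;  /* one past the last valid byte */
} FwdStr;

/* Count occurrences of c up to the NUL terminator. Returns false if the
   scan would run past the end bound before finding a terminator. */
static bool count_char(FwdStr s, char c, size_t *count) {
    assert(s.p != NULL && count != NULL);
    size_t n = 0;
    for (;;) {
        if (s.p >= s.end) return false;  /* end-bound check before each read */
        if (*s.p == '\0') break;
        if (*s.p == c) n++;
        s.p++;
    }
    *count = n;
    return true;
}
```

An unterminated buffer makes the scan fail at the end bound rather than read adjacent memory.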

11 Kinds of Pointers
• Many pointers are completely "safe"
  - No evil casts, no arithmetic, etc.
  - e.g., FILE *fin = fopen("input", "r");
  - These can be represented without any extra information (just a NULL check when used)
• These cases yield better performance!

12 Physical Subtyping
• Define a formal notion of representation equality and subtyping for casts
  - Keep pointers and scalars separate!
• Intuition:
  - struct {char a[4];} $=$ struct {int x;}
  - struct {char a[4];} $\neq$ struct {int *x;}
  - struct {int a; int b;} $\le$ struct {int a;}
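One way to make the intuition concrete is to flatten each type into a sequence of words and record only whether each word is a scalar or a pointer. The sketch below assumes a model in which `int`, `char[4]`, and a pointer each occupy exactly one word; the names `WordKind`, `Layout`, `layout_subtype`, and `layout_equal` are hypothetical, and the layouts are written out by hand rather than derived from real C types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Each word of a flattened type is either scalar data or a pointer. */
typedef enum { W_SCALAR, W_POINTER } WordKind;

typedef struct {
    const WordKind *words;
    size_t n;
} Layout;

/* a <= b: b's words appear, with the same kinds, at the start of a.
   This models width subtyping: extra trailing fields may be forgotten. */
static bool layout_subtype(Layout a, Layout b) {
    assert(a.words != NULL && b.words != NULL);
    if (b.n > a.n) return false;
    for (size_t i = 0; i < b.n; i++)
        if (a.words[i] != b.words[i]) return false;
    return true;
}

/* Representation equality: subtyping in both directions. */
static bool layout_equal(Layout a, Layout b) {
    return layout_subtype(a, b) && layout_subtype(b, a);
}
```

Under this model the three lines of the slide become: one scalar word equals one scalar word, one scalar word differs from one pointer word, and two scalar words are a subtype of one scalar word.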

13 Extended Type System
• Simplified C types:
  - $\tau ::= \mathtt{int} \mid \tau\ \mathtt{ref}\ q \mid \tau_1 \times \tau_2 \mid \tau_1 + \tau_2$ (types)
  - $q ::= \mathtt{safe} \mid \mathtt{string} \mid \mathtt{seq} \mid \mathtt{wild}$ (qualifiers)
• safe = one word: standard C pointer
• seq = three words: pointer, base, end
• wild = two words: pointer, base (end and tags are stored with the object)

14 Type System (continued)
• $\tau\ \mathtt{ref}\ \mathtt{wild}$: $\tau$ must contain only wild pointers
• May cast between safe and seq
• wild may only be cast to or from wild
• Physical equivalence:
  - $\mathtt{short} \times \mathtt{short} = \mathtt{int}$
  - $a \times (b + c) = (a \times b) + (a \times c)$
• Width subtyping: $\tau_1 \times \tau_2 \le \tau_1$
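The qualifier rules on this slide, together with the word counts from the previous one, can be written down as a small check. This is a sketch only: the names `Qual`, `qual_words`, and `qual_cast_ok` are hypothetical, the `string` qualifier is left out because the slides do not state its cast rules or size, and the check covers qualifiers alone, not the constraints on the pointed-to types.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { Q_SAFE, Q_SEQ, Q_WILD } Qual;

/* Size of each pointer representation in words, as listed on the slides. */
static int qual_words(Qual q) {
    switch (q) {
    case Q_SAFE: return 1;  /* standard C pointer */
    case Q_SEQ:  return 3;  /* pointer, base, end */
    case Q_WILD: return 2;  /* pointer, base */
    }
    assert(!"unknown qualifier");
    return 0;
}

/* Qualifier side of a cast: wild only to or from wild;
   safe and seq may be cast to each other. */
static bool qual_cast_ok(Qual from, Qual to) {
    if (from == Q_WILD || to == Q_WILD)
        return from == to;
    return true;
}
```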

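15 Some Typing Rules

Address of field:
$$\frac{O \vdash \&e : \tau_1 \times \tau_2\ \mathtt{ref}\ q_1 \qquad q_2 = (\text{if } q_1 = \mathtt{wild} \text{ then } \mathtt{wild} \text{ else } \mathtt{safe})}{O \vdash \&e.L : \tau_1\ \mathtt{ref}\ q_2}$$

Pointer arithmetic:
$$\frac{O \vdash e_1 : \tau\ \mathtt{ref}\ q \qquad q \neq \mathtt{safe} \qquad O \vdash e_2 : \mathtt{int}}{O \vdash e_1 + e_2 : \tau\ \mathtt{ref}\ q}$$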

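16 Handling Casts

Cast between pointers:
$$\frac{O \vdash e : \tau_1\ \mathtt{ref}\ q_1 \qquad \tau_1\ \mathtt{ref}\ q_1 \le \tau_2\ \mathtt{ref}\ q_2}{O \vdash (\tau_2\ \mathtt{ref}\ q_2)\,e : \tau_2\ \mathtt{ref}\ q_2}$$

When is $\tau_1\ \mathtt{ref}\ q_1 \le \tau_2\ \mathtt{ref}\ q_2$?

Initial $q_1$, Final $q_2$    Constraint
safe                          $\tau_1 \le \tau_2$
seq, safe                     $\exists k.\ \tau_1[k] \le \tau_2$
seq                           $\exists j,k.\ \tau_1[j] = \tau_2[k]$
wild                          None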

17 Static Analysis & Inference
• For every pointer in the program
  - Try to infer the fastest safe representation
  - This is like eliminating classes of run-time checks we know will never fail
• Can be formulated as constraint-solving
  - Apply subtyping rules to casts to get constraints
  - O(E) where E is the number of casts/assignments (flow insensitive)
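A much-simplified sketch of flow-insensitive qualifier inference follows. It is not the authors' algorithm: it assumes only three qualifiers ordered safe < seq < wild, treats pointer arithmetic as forcing at least seq, treats an "evil" cast as forcing both ends to wild, and spreads wild across every cast or assignment because wild may only be cast to or from wild. The names `Edge` and `infer_quals` are hypothetical, and the repeated sweep used here is a plain fixpoint loop, not the linear-time solver the slide refers to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { Q_SAFE = 0, Q_SEQ = 1, Q_WILD = 2 } Qual;

/* One cast or assignment between two pointers, identified by index. */
typedef struct {
    size_t from, to;
    bool evil;   /* true if the cast breaks physical subtyping */
} Edge;

static void infer_quals(Qual *q, const bool *has_arith, size_t nptrs,
                        const Edge *edges, size_t nedges) {
    /* Start from the cheapest representation the local uses allow. */
    for (size_t i = 0; i < nptrs; i++)
        q[i] = has_arith[i] ? Q_SEQ : Q_SAFE;

    /* Evil casts make both sides wild. */
    for (size_t e = 0; e < nedges; e++) {
        assert(edges[e].from < nptrs && edges[e].to < nptrs);
        if (edges[e].evil) {
            q[edges[e].from] = Q_WILD;
            q[edges[e].to] = Q_WILD;
        }
    }

    /* Propagate wild across all edges until nothing changes. */
    bool changed = true;
    while (changed) {
        changed = false;
        for (size_t e = 0; e < nedges; e++) {
            Qual *a = &q[edges[e].from];
            Qual *b = &q[edges[e].to];
            if ((*a == Q_WILD) != (*b == Q_WILD)) {
                *a = Q_WILD;
                *b = Q_WILD;
                changed = true;
            }
        }
    }
}
```

Pointers that never meet arithmetic or an evil cast, directly or through assignments, stay safe and keep the one-word representation.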

18 Preliminary Results

Benchmark   Default Overhead   Reduced Overhead   Check Overhead   Overhead (GC)   Reduced Overhead (GC)
hashtest    218%               100%               3%               222%            221%
rbtest      128%               2%                 1%               138%            4%

Rows with fewer values, listed in slide order (column positions are not preserved in the transcript):
compress*    36%, 0%, 37%
barnes_hut   108%, 37%, 109%
mod_layout   0%, N/A

19 Future Work
• Encode type information at run-time
  - More expressive casts with low overhead
  - More complete handling of function pointers
• Handle C polymorphism
  - Uses void*, requires vast overhead
• Efficient memory management
  - GC (or something else) takes free() as a hint

20 Conclusion
• Can add efficient run-time checks to C
  - Check bounds, valid pointers, frees, etc.
• Static analysis is fast and useful
• Can support existing C code
  - Whole programs are considered
  - Safe pointers and wrappers for libraries
• Default to many checks, infer them away

