
1 Efficient Debugging using Dynamic Instrumentation (EDDI)
  Qin Zhao (Singapore-MIT Alliance)
  Rodric Rabbah (IBM TJ Watson Center)
  Saman Amarasinghe (CSAIL, MIT)
  Larry Rudolph (VMware)
  Weng-Fai Wong (National Univ of Singapore)

ETAPS-CC 2008, EDDI – Zhao et al.

2 Debugging is hard
  - Today’s applications are huge
  - Many files and components
  - Run on complex systems

  Application          LoC    Files
  Paint.NET            130K
  Gimp                 650K
  Blender              1M
  OpenOffice           10M
  Apache HTTP 2.0      90K    275
  PHP 5.1.6            479K   1K
  MySQL 5.0.25         900K   2K
  Firefox 1.5.0.2      2M     11K
  Linux Kernel 2.6.17  4M     16K

  Source: Wikipedia and M Squared Technologies

3 Debugging today is very myopic
  - Inspect relatively simple predicates at individual program points

4 Example using GDB
  (gdb) break dist_spu.c:19
  (gdb) run
  (gdb) print cb
  $1 = {a_addr = 25286272, b_addr = 25269248, res_addr = 25269888, padding = 0}
  (gdb) cond 1 (cb.padding != 0)
  (gdb) run
  $2 = {a_addr = 25282312, b_addr = 2483423, res_addr = 25269888, padding = 10}

5 The state of the art
  - An exaggeration of course…
  - … but how many of you use printf() for debugging?

6 A case of misplaced priorities?
  - Program instructions located in memory
  - Instructions are read from memory
  - Instructions manipulate memory
  - But debugging practices are not optimized for watching memory
  - Instruction breakpoints are quite fast
  - Watching memory is quite slow

7 Breakpoint vs. Watchpoint
  Breakpoint
  - break when instruction at specific address executes
  Watchpoint
  - break when data at specific address mutates

8 Typical support for watchpoints
  - Hardware support for small number of watchpoints
    - GDB uses one of four x86 debug breakpoint registers
  - Software fallback for large number of watchpoints
    - Single step execution and check linked list of watchpoints
    - More than 1000x slowdown observed

9 Main insight underlying EDDI
  - Dynamic binary instrumentation can dramatically improve support for watchpoints
  - Watch orders of magnitude more locations than is feasible today
  - Better watchpoint support enables many new debugging features

10 Examples of new debugging capabilities facilitated by EDDI

  Capability                                                      Currently feasible?   EDDI
  Break on invariant violations to complex data structures        maybe                 yes
  Break if objects and variables are used before initialization   maybe                 yes
  Break if function return addresses mutate                       no                    yes
  Break on buffer overflow                                                              yes

  EDDI can provide all of these and other debugging features in a single unified framework

11 Efficient Watchpoints using EDDI
  - Carefully crafted strategy featuring and combining
    - Fast-access shadow memory
      - Optimized watchpoint tracking data structure
    - Full instrumentation
      - Slow and detailed instrumentation of every memory access
    - Partial instrumentation
      - Focused heuristics for fast instrumentation
    - Compiler optimizations
      - Dynamic binary rewriting

12 Outline
  - EDDI framework
  - Fast-access shadow memory
  - Full instrumentation
  - Partial instrumentation
  - Case studies
  - Future work

13 EDDI Overview
  - Accelerate and extend debugger functionality by dynamic co-optimization of debugger and application code
  [Figure: architecture diagram with labels User, Front-End, Command interpreter, Debugger (e.g., GDB), Translate and dispatch command, DBI, User Application, Signals, IPC]

14 EDDI and Watchpoints
  - Associate guarding predicates with watched memory locations
    - Individual or aggregate addresses
  - Instrument potentially all memory operations
    - Check if operation modifies watched location
    - Update location if guarding predicate allows it
    - Otherwise interrupt execution
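
The check-then-update flow on slide 14 can be modelled in a few lines. This is a minimal sketch under stated assumptions, not EDDI's implementation: `WatchedMemory`, its `watch` and `store` methods, and the `WatchpointHit` exception are hypothetical names, and a Python dict stands in for instrumented application memory.

```python
class WatchpointHit(Exception):
    """Raised where a debugger would interrupt execution (hypothetical name)."""


class WatchedMemory:
    """Toy memory model: every store is checked against guarding predicates."""

    def __init__(self):
        self.cells = {}    # address -> value
        self.guards = {}   # watched address -> predicate(old, new) -> bool

    def watch(self, start, length, predicate):
        # An aggregate watchpoint is modelled as a guard on each address in the range.
        for addr in range(start, start + length):
            self.guards[addr] = predicate

    def store(self, addr, value):
        guard = self.guards.get(addr)
        if guard is not None and not guard(self.cells.get(addr, 0), value):
            raise WatchpointHit(f"store of {value} to {addr:#x} rejected")
        self.cells[addr] = value   # unwatched, or the predicate allowed the update


mem = WatchedMemory()
# Guard modelled on the GDB example from slide 4: the field must stay zero.
mem.watch(0x1000, 4, lambda old, new: new == 0)
mem.store(0x2000, 7)       # unwatched location: plain update
mem.store(0x1000, 0)       # watched, predicate holds: update goes through
try:
    mem.store(0x1001, 10)  # watched, predicate fails: execution is interrupted
    interrupted = False
except WatchpointHit:
    interrupted = True
```

The rejected store leaves memory untouched, which mirrors the "otherwise interrupt execution" bullet: the predicate is evaluated before the location is updated.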

15 Outline
  - EDDI framework
  - Fast-access shadow memory
  - Full instrumentation
  - Partial instrumentation
  - Case studies
  - Future work

16 Shadow memory
  - On-demand shadow page tracks watchpoints (set of watched locations)
  - Shadow memory optimized for constant overhead
  - Lookup table stores displacement between application and shadow pages
  - Trade-off space for time
  [Figure: Lookup Table, Application Pages, Shadow Pages]
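
A displacement-based lookup table like the one on slide 16 might look as follows. This is a sketch only: the `ShadowMemory` class and its methods are hypothetical, the 4 KiB page size and the `0xAA` tag are borrowed from the slide 19 listing, and a single `bytearray` stands in for the process address space.

```python
PAGE_SHIFT = 12                 # 4 KiB pages, matching the shift in the slide 19 listing
PAGE_SIZE = 1 << PAGE_SHIFT
WATCHED = 0xAA                  # tag value taken from the slide 19 listing


class ShadowMemory:
    def __init__(self, app_pages):
        # One flat "address space": application pages first,
        # shadow pages appended on demand.
        self.space = bytearray(app_pages * PAGE_SIZE)
        # table[page] holds the displacement from an application page to its
        # shadow page; 0 means the page has no shadow page (nothing watched).
        self.table = [0] * app_pages

    def _shadow_addr(self, addr, create=False):
        page = addr >> PAGE_SHIFT
        if self.table[page] == 0:
            if not create:
                return None
            shadow_base = len(self.space)
            self.space.extend(bytes(PAGE_SIZE))   # allocate the shadow page on demand
            self.table[page] = shadow_base - (page << PAGE_SHIFT)
        return addr + self.table[page]            # constant time: one table read, one add

    def watch(self, addr):
        self.space[self._shadow_addr(addr, create=True)] = WATCHED

    def unwatch(self, addr):
        shadow = self._shadow_addr(addr)
        if shadow is not None:
            self.space[shadow] = 0

    def is_watched(self, addr):
        shadow = self._shadow_addr(addr)
        return shadow is not None and self.space[shadow] == WATCHED


shadow = ShadowMemory(app_pages=4)
shadow.watch(0x1234)
```

In this model the space-for-time trade is visible directly: each application page with a watchpoint costs one full shadow page, and in exchange every lookup is a shift, a table read, and an add regardless of how many locations are watched.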

17 Outline
  - EDDI framework
  - Fast-access shadow memory
  - Full instrumentation
  - Partial instrumentation
  - Case studies
  - Future work

18 Instrumentation
  - DBI instruments application code to monitor reads and writes from/to memory
    1. Save context
    2. Lookup address in shadow memory
    3. Handle watched address according to user commands
    4. Restore context and resume execution

19 Example of full instrumentation
  - Context Save: Lines 1-6
  - Address Calculation: Line 7
  - Tag Checks: Lines 8-15
  - Context Restore: Lines 16-20

  01: mov %ecx -> [ECX_slot]                       ! Save register
  02: mov %eax -> [EAX_slot]
  03: seto [OF_slot + 3]                           ! Save oflag
  04: lahf                                         ! Save eflags
  05: mov %eax -> [AF_slot]
  06: mov [EAX_slot] -> %eax                       ! Restore eax
  07: lea [%eax, %ebx] -> %ecx                     ! Get address
                                                   ! Compute table index
  08: shr %ecx, $12 -> %ecx                        ! Shift right
  09: cmp table[%ecx, 4], $0                       ! Check entry
  10: je 16:
                                                   ! Check if tag is set to ‘watched’
  11: add %eax, table[%ecx, 4] -> %eax
  12: testb $0xAA, [%eax, %ebx]
  13: jz 15:
  14: trap                                         ! Trap
  15: sub %eax, table[%ecx, 4] -> %eax
  16: mov [AF_slot] -> %eax                        ! Restore all
                                                   ! Restore oflag by triggering overflow
                                                   ! if necessary
  17: add [OF_slot], $0x7f000000 -> [OF_slot]
  18: sahf                                         ! Restore eflags
  19: mov [EAX_slot] -> %eax
  20: mov [ECX_slot] -> %ecx
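
Lines 03 and 17 of the listing treat the overflow flag separately because `lahf` and `sahf` do not cover it. The arithmetic behind "restore oflag by triggering overflow" can be checked with a small model. This is a sketch: the helper names are hypothetical, and it assumes the low three bytes of `OF_slot` are zero when `seto` writes the top byte.

```python
def signed_add_overflows(a, b):
    """True if a + b overflows as a 32-bit two's-complement add (x86 would set OF)."""
    def to_signed(x):
        x &= 0xFFFFFFFF
        return x - (1 << 32) if x & 0x80000000 else x

    total = to_signed(a) + to_signed(b)
    return not (-(1 << 31) <= total < (1 << 31))


def saved_slot(overflow_flag):
    # Line 03: seto writes 0 or 1 into byte 3 of OF_slot, the most significant
    # byte on little-endian x86 (assumes the other three bytes are zero).
    return (1 if overflow_flag else 0) << 24


def restored_flag(slot):
    # Line 17: adding 0x7f000000 to the slot overflows only if the top byte was 1,
    # so the add instruction itself recreates the saved overflow flag.
    return signed_add_overflows(slot, 0x7F000000)


flag_when_set = restored_flag(saved_slot(True))      # 0x01000000 + 0x7f000000 = 0x80000000
flag_when_clear = restored_flag(saved_slot(False))   # 0x00000000 + 0x7f000000 = 0x7f000000
```

The `add` on line 17 clobbers the remaining arithmetic flags, which is consistent with the listing running `sahf` on line 18, after the add, to put them back.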

20 Experimental Results
  - SPEC 2000 (GCC 4.0 -O3)
  - 2.66 GHz Intel Core 2 with 2GB RAM
  - Linux FC4

21 Full instrumentation overhead: Slowdown compared to native

22 Lowering instrumentation overhead
  - Classic optimizations
    - Context switch reduction
    - Group checks
    - Local variables check elimination
  - Watchpoint specific optimizations
    - Merge checks
    - Stack displacement
      - Reduce overhead for stack variables via shadow stack

23 Optimized instrumentation: Slowdown compared to native

24 Performance overhead as a function of watchpoints

25 Outline
  - EDDI framework
  - Fast-access shadow memory
  - Full instrumentation
  - Partial instrumentation
  - Case studies
  - Future work

26 Partial instrumentation
  - Key idea: two-stage instrumentation
    - Coarse grained fast checks to entire pages
    - Fine grained instrumentation within a page when necessary
    1. Protect pages containing watched data locations
    2. Catch SIGSEGV signals when access to protected page occurs
    3. Instrument code for fine-grained watchpoint checks
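
The two-stage scheme on slide 26 can be simulated without real page protection. The sketch below is an illustrative model, not EDDI's mechanism: `PartialInstrumentation` and its counters are hypothetical, a set of page numbers stands in for protected pages, and a "fault" is just a counter increment. The `redirects` counter is an assumption about what the redirected reference on slide 27 corresponds to.

```python
PAGE_SHIFT = 12


class PartialInstrumentation:
    def __init__(self):
        self.watched = set()           # watched addresses
        self.protected_pages = set()   # pages holding at least one watched address
        self.instrumented = set()      # static instructions already rewritten
        self.sigsegv = 0               # faults taken on protected pages
        self.redirects = 0             # accesses sent through the fine-grained path
        self.hits = []                 # (pc, addr) pairs that touched a watchpoint

    def watch(self, addr):
        self.watched.add(addr)
        self.protected_pages.add(addr >> PAGE_SHIFT)   # step 1: protect the page

    def access(self, pc, addr):
        on_protected = (addr >> PAGE_SHIFT) in self.protected_pages
        if pc not in self.instrumented:
            if not on_protected:
                return                 # uninstrumented code, unprotected page: no check
            self.sigsegv += 1          # step 2: fault on the protected page
            self.instrumented.add(pc)  # step 3: rewrite this instruction
        # Fine-grained path, taken by rewritten instructions from now on.
        if on_protected:
            self.redirects += 1
            if addr in self.watched:
                self.hits.append((pc, addr))


pi = PartialInstrumentation()
pi.watch(0x2008)
for i in range(100):
    pi.access(pc=0x400, addr=0x2000 + (i % 16))   # loop over the protected page
pi.access(pc=0x500, addr=0x9000)                  # never touches a protected page
```

In this model the instruction at `0x400` faults once and is then checked inline on every later execution, while the instruction at `0x500` is never instrumented at all, which is the intended saving over full instrumentation.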

27 PI: rewrite after SIGSEGV hit
      mov %ecx -> [ECX_SLOT]          ! steal ecx
      lea [%eax+0x10] -> %ecx         ! calculate address
      ...                             ! save eflags
      shr %ecx, 20 -> %ecx            ! right shift
      cmp table[%ecx], $0             ! check table entry
      je LABEL_ORIG
      ...                             ! check tag status
      ...                             ! restore eflags and ecx
      mov 0 -> [%eax + 0x030010]      ! redirected reference
      jmp LABEL_NEXT
  LABEL_ORIG:
      ...                             ! restore eflags and ecx
      mov 0 -> [%eax+0x10]            ! access original location
  LABEL_NEXT:
      ...                             ! continue execution

28 Performance evaluation
  - Randomly select heap objects to watch
    - Intercept malloc
    - Randomly allocate each object from a protected page or a non-protected page
    - Object sizes vary

29 Runtime overhead using partial instrumentation

  Benchmark      Overhead vs. native   No. of Redirects   Watched Objects and No. of SIGSEGV
                                                          (digits run together in the transcript)
  164.gzip       1.074                 1.45×10^8          2042345
  175.vpr        1.047                 1.04×10^6          1000076
  176.gcc        1.256                 1.51×10^6          122
  181.mcf        1.515                 1.08×10^10         1468
  186.crafty     4.250                 2.77×10^8          37443
  252.eon        1.153                 3.50×10^1          18
  253.perlbmk    2.691                 7.69×10^7          1249
  255.vortex     1.589                 1.65×10^9          10219
  256.bzip       1.842                 9.01×10^9          7541
  300.twolf      1.037                 2.78×10^5          1100470

30 Outline
  - EDDI framework
  - Fast-access shadow memory
  - Full instrumentation
  - Partial instrumentation
  - Case studies
  - Future work

31 Example case studies: demonstrate value of many watchpoints
  - Security
    - Catch mutations to function return addresses
    - Buffer overflow
      - Successfully caught all overflows in Wilander suite
  - Program analysis
    - Catch all accesses to uninitialized variables
      - 83% - 250% slower than native for 181.mcf
    - Dynamic pointer analysis
      - Identified all static instructions that accessed heap objects of specific type

32 The value of having many watchpoints: Case Study 1
  - Watch for Return Address Access
    - some functions try to obtain current pc
  - a watchpoint is automatically
    - Set on the return address of a function when it is called.
    - Cleared on return
      - Ret, setjmp

33 The value of having many watchpoints: Case Study 2
  - Dynamic Pointer Analysis
    - Using 181.mcf
    - Watch all 33,112 instances of node data-type
    - Identified 468 (static) instructions that accessed objects of this type 1.08 × 10^10 times during execution

34 The value of having many watchpoints: Case Study 3
  - Read Un-initialized Variable
    - Again using 181.mcf
    - Changed calloc() to malloc()
    - Watch all malloc’ed memory
    - When a location is initialized, watchpoint is cleared
    - The first uninitialized read occurs in 0.001 secs from the start of execution
    - EDDI reports the error in 0.037 secs
    - Overall, the instrumented execution is 83% slower using PI and 250% slower using FI
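
The watch-on-malloc, clear-on-write policy from Case Study 3 can be expressed as a toy heap. This is a hedged sketch: the `Heap` class, its bump allocator, and the `UninitializedRead` exception are hypothetical stand-ins for intercepted `malloc` calls and instrumented loads and stores.

```python
class UninitializedRead(Exception):
    """Raised where the debugger would report the error (hypothetical name)."""


class Heap:
    def __init__(self):
        self.cells = {}          # address -> value
        self.uninit = set()      # watched addresses: allocated but never written
        self.next_free = 0x1000  # trivial bump allocator, for illustration only

    def malloc(self, size):
        base = self.next_free
        self.next_free += size
        self.uninit.update(range(base, base + size))   # watch all malloc'ed memory
        return base

    def write(self, addr, value):
        self.uninit.discard(addr)   # location is now initialized: clear the watchpoint
        self.cells[addr] = value

    def read(self, addr):
        if addr in self.uninit:
            raise UninitializedRead(hex(addr))
        return self.cells.get(addr, 0)


heap = Heap()
p = heap.malloc(4)
heap.write(p, 42)
first = heap.read(p)          # initialized: the read succeeds
try:
    heap.read(p + 1)          # never written: reported as an uninitialized read
    reported = False
except UninitializedRead:
    reported = True
```

Because each write removes its own watchpoint, the watched set shrinks as the program initializes its data, so only reads of memory that was never written are reported.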

35 The value of having many watchpoints: Case Study 4
  - Software Security
    - Using the 20 Wilander Buffer Overflow Benchmarks
    - Watched the end of all buffers
    - Successfully identified all violations
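
Watching the end of a buffer, as in Case Study 4, amounts to placing a watchpoint just past the last valid byte. The sketch below assumes one guard byte per buffer; the exact placement used in the study is not stated on the slide, and `GuardedAllocator`, `BufferOverflow`, and `copy_bytes` are hypothetical names.

```python
class BufferOverflow(Exception):
    """Raised where the debugger would stop on the overflowing store."""


class GuardedAllocator:
    def __init__(self):
        self.memory = {}     # address -> byte value
        self.guards = set()  # watched addresses, one just past the end of each buffer
        self.next_free = 0

    def alloc(self, size):
        base = self.next_free
        self.guards.add(base + size)       # watch the byte just past the end
        self.next_free = base + size + 1   # leave room for the guard byte
        return base

    def write(self, addr, value):
        if addr in self.guards:
            raise BufferOverflow(f"write past end of buffer at {addr:#x}")
        self.memory[addr] = value


def copy_bytes(allocator, dst, data):
    # strcpy-like copy with no bounds check, as in a typical overflow bug.
    for offset, byte in enumerate(data):
        allocator.write(dst + offset, byte)


allocator = GuardedAllocator()
buf = allocator.alloc(8)
copy_bytes(allocator, buf, b"12345678")        # exactly fits: no violation
try:
    copy_bytes(allocator, buf, b"123456789")   # ninth byte lands on the guard
    caught = False
except BufferOverflow:
    caught = True
```

A single guard byte only catches contiguous overruns; a store that skips over the guard would go unnoticed in this model.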

36 Summary
  - Efficient debugging using dynamic instrumentation enables new opportunities that increase the feature set available for debugging
  - Paper demonstrates using EDDI to significantly improve support for debugging using watchpoints
    - Practical to watch millions of memory locations with 3x average slowdown
    - A large number of watchpoints makes it possible to explore new debugging scenarios
  - Holistic debugging methodology

37 Main thrust for future work
  - EDDI for multicores and parallel programs
  - Main idea: rather than watch execution and interleaving to catch data races and deadlocks…
  - … watch memory, record accesses, and on a data race or deadlock, inspect records to determine source of bug

