Presentation is loading. Please wait.

Presentation is loading. Please wait.

JIT Instrumentation – A Novel Approach To Dynamically Instrument Operating Systems Marek Olszewski Keir Mierle Adam Czajkowski Angela Demke Brown University.

Similar presentations


Presentation on theme: "JIT Instrumentation – A Novel Approach To Dynamically Instrument Operating Systems Marek Olszewski Keir Mierle Adam Czajkowski Angela Demke Brown University."— Presentation transcript:

1 JIT Instrumentation – A Novel Approach To Dynamically Instrument Operating Systems Marek Olszewski Keir Mierle Adam Czajkowski Angela Demke Brown University of Toronto

2 1 Instrumenting Operating Systems  Operating systems are growing in complexity  Kernel instrumentation can help Used for: debugging, profiling, monitoring, and security auditing...  Dynamic instrumentation No recompilation & no reboot Good for debugging systemic problems Feasible in production settings

3 2 Current Approach: Probe-Based  Dynamic instrumentation tools for OSs are probe based Overwrite existing code with jump/trap  Efficient on fixed length architectures  Slow on variable length architectures Not safe to overwrite multiple instructions with jump:  Branch to between instructions might exist  Thread might be sleeping in between the instructions Must use trap instruction

4 3 Current Approach: Trap-based Area of interestInstrumentation Code Trap Handler 1.Save processor state 2.Lookup which instrumentation to call 3.Call instrumentation 4.Emulate overwritten instruction 5.Restore processor state add $1,count_l adc $0,count_h inc 14(edx)int3 mov $ffffe000,edx and esp,edx mov 28(edi),eax add $1,eax or $c, eax and $3,eax mov eax,(ebx) sub $6c,esp or $f, ebp mov ebp,4(ebx) mov 2c(edi),ebx mov 30(edi),ebp add $2,ebp Very Expensive!

5 4 Alternative: JIT Instrumentation  Propose to use just-in-time dynamic instrumentation  Rewrite code to insert new instructions in between existing ones More Efficient. More Powerful. Supports:  Instrumenting branch directions  Basic block-level instrumentation  Per execution-path instrumentation  Proven itself in user space (Pin, Valgrind)

6 5 JIT Instrumentation Instrumentation Code Area of Interest mov $ffffe000,edx and esp,edx mov 28(edi),eax add $1,eax or $c, eax and $3,eax mov eax,(ebx) sub $6c,esp or $f, ebp mov ebp,4(ebx) mov 2c(edi),ebx mov 30(edi),ebp add $2,ebp inc 14(edx) mov $ffffe000,edx and esp,edx mov 28(edi),eax add $1,eax or $c, eax and $3,eax mov eax,(ebx) sub $6c,esp or $f, ebp mov ebp,4(ebx) mov 2c(edi),ebx mov 30(edi),ebp add $2,ebp inc 14(edx) adc $0,count_h add $1,count_l Code Cache

7 6 popf JIT Instrumentation Instrumentation Code Area of Interest mov $ffffe000,edx and esp,edx mov 28(edi),eax add $1,eax or $c, eax and $3,eax mov eax,(ebx) sub $6c,esp or $f, ebp mov ebp,4(ebx) mov 2c(edi),ebx mov 30(edi),ebp add $2,ebp inc 14(edx) mov $ffffe000,edx and esp,edx sub $6c,esp mov 28(edi),eax add $1,eax or $c, eax and $3,eax mov eax,(ebx) or $f, ebp mov ebp,4(ebx) mov 2c(edi),ebx mov 30(edi),ebp add $2,ebp inc 14(edx) adc $0,count_h add $1,count_l call instrmtn pushf Code Cache

8 7 JIT Instrumentation Instrumentation Code Area of Interest mov $ffffe000,edx and esp,edx mov 28(edi),eax add $1,eax or $c, eax and $3,eax mov eax,(ebx) sub $6c,esp or $f, ebp mov ebp,4(ebx) mov 2c(edi),ebx mov 30(edi),ebp add $2,ebp inc 14(edx) mov $ffffe000,edx and esp,edx sub $6c,esp adc $0,count_h add $1,count_lpopf mov 28(edi),eax add $1,eax or $c, eax and $3,eax mov eax,(ebx) or $f, ebp mov ebp,4(ebx) mov 2c(edi),ebx mov 30(edi),ebp add $2,ebp inc 14(edx) call instrmtn pushf Code Cache

9 8 Dynamic Binary Rewriting  Use dynamic binary rewriting to insert the new instructions.  Interleaves binary rewriting with execution Performed by a runtime system Typically at basic block granularity  Code is rewritten into a code cache  Rewritten code must be: Efficient Unaware of its new location

10 9 Dynamic Binary Rewriting Original Code:Code Cache: bb1 Runtime System bb3bb2 bb4 bb1

11 10 Dynamic Binary Rewriting Original Code:Code Cache: bb1 Runtime System bb3bb2 bb4 bb1 bb2

12 11 Dynamic Binary Rewriting Original Code:Code Cache: bb1 Runtime System bb3bb2 bb4 bb1 bb2 bb4 bb1 bb2 No longer need to enter runtime system

13 12 Dynamic Binary Rewriting  Used for rewriting operating systems Virtualization (VMware) Emulation (QEMU)  Never used for instrumentation of OSs  Never used to rewrite host OS in a general manner Allows instrumentation of live system

14 13 Outline  Prototype (JIFL) Design OS Issues  Performance comparison Kprobes vs JIFL  Example Plugin Checking branch hint directions

15 14 Prototype Design  JIFL - JIT Instrumentation Framework for Linux Instruments code reachable from system calls

16 15 JIFL Software Architecture JIFL Plugin (Loadable Kernel Module) Linux Kernel (All code reachable from system calls) JIFL Plugin Starter User Space Kernel Space Code Cache Runtime System JIFL (Loadable Kernel Module) Heap Memory Allocator Dispatcher JIT Compiler JIFL Plugin Starter

17 16 Gaining Control  Runtime System must gain control before it can start rewriting/instrumenting OS  Update system call table entry to point to dynamically emitted entry stub Calls per-system call instrumentation Calls dispatcher  Passing original system call pointer

18 17 Dispatcher  Saves registers and condition code states  Dispatcher checks if target basic block is in code cache If so it jumps to this basic block Otherwise it invokes the JIT to compile and instrument the new basic block

19 18 JIT Compiler  Like conventional JIT compiler, except its input/output is x86 machine code  Compiles at a dynamic basic block granularity All but the last control flow instruction are copied directly into the code cache Control flow instructions are modified to account for the new location of the code  Communicates with the JIFL plugin to determine what instrumentation to insert

20 19 JIT: Inserting Instrumentation  Instrumentation is added by inserting a call instruction into the basic block  Additional instructions are also needed to: Push/Pop instrumentation parameters Save/Restore volatile registers (eax, edx, ecx) Save/Restore condition code register  Several optimizations can be performed to reduce instrumentation cost

21 20 Eliminating Redundant State Saving  Eliminate dead register and condition code saving code Perform liveness analysis Reduce state saving overhead  Per-basic block Instrumentation Search for the cheapest place to insert it

22 21 Inlining Instrumentation  Small instrumentation can be inlined into the basic block Removes the call and ret instructions  Constant parameters are propagated to remove stack accesses  Copy propagation and dead-code elimination is applied to specialize the instrumentation routine for context  All done on native x86 code. No IR!

23 22 Effect of Optimizations Normalized Execution Time  Average system call latencies with per-basic block instrumentation

24 23 Prototype  Operating System Issues

25 24 Memory Allocator  While JITing JIFL often needs to allocate dynamic memory  Cannot rely on Linux kmalloc and vmalloc routines as they are not reentrant  Instead, we created our own memory allocator  Pre-allocate a heap when JIFL starts up

26 25 Releasing Control  Calls to schedule() have to be redirected Otherwise, JIFL keeps control even after context switch Have to:  Save return address in hash table  Call schedule()  Look up and call dispatcher

27 26 Performance Comparison  JIFL vs. Kprobes

28 27 Performance Evaluation  Instrument every system call with three types of instrumentation: System Call Monitoring (Coarse Grained) Call Tracing (Medium Grained) Basic Block Counting (Fine Grained)  LMbench and ApacheBench2 benchmarks  Test Setup 4-way Intel Pentium 4 Xeon SMP - 2.8GHz Linux 2.6.17.13  With SMP support and no preemption

29 28 System Call Monitoring Normalized Execution Time

30 29 Call Tracing Normalized Execution Time Log Scale

31 30 Basic Block Counting Normalized Execution Time Log Scale

32 31 Apache Throughput Normalized Requests / Second

33 32 Example Plugin  Checking Correctness of Branch Hints

34 33 Example Plugin: Checking Branch Hints Int correct_count  0 Int incorrect_count  0 // Called for every newly discovered basic block. Procedure Basic_Block_Callback if last instruction is not a hinted branch return if hinted in the branch not taken direction call Insert_Branch_Not_Taken_Instrumentation( Increment_Counter, &correct_count) call Insert_Branch_Taken_Instrumentation( Increment_Counter, &incorrect_count) else // Insert same instrumentation but for reverse // branch directions // Executed for every instrumented branch. Procedure Increment_Counter(Int *Counter) *Counter  *Counter + 1

35 34 Example Plugin: Checking Branch Hints  5 system calls with bad branch hint performance Misprediction rates > 75% Contained > 30% of hinted branch executed  Examined using a second plugin Monitored individual branches Found 4 greatest contributors Mapped back to source code  Can’t fix: Not hinted by programmer!

36 35 Conclusions  JIT instrumentation viable for operating systems Developed a prototype for the Linux kernel (JIFL)  Results are very competitive JIFL outperforms Kprobes by orders of magnitude  Enables more powerful instrumentation e.g. Branch Hints


Download ppt "JIT Instrumentation – A Novel Approach To Dynamically Instrument Operating Systems Marek Olszewski Keir Mierle Adam Czajkowski Angela Demke Brown University."

Similar presentations


Ads by Google