University of Maryland Smarter Code Generation for Dyninst Nick Rutar.

1 University of Maryland Smarter Code Generation for Dyninst Nick Rutar

2 Why do we need better code generation?
Dyninst has evolved through its releases
  – Originally designed with Paradyn in mind
      Frequent changes to instrumentation
Current code generation requirements
  – Must not have adverse effects on the pre-existing program
  – Tuned to handle future changes to instrumentation
Certain optimizations currently in compilers can be used for Dyninst
  – Dataflow analysis
  – Register allocation
Because it is a dynamic environment, certain modifications need to be performed

3 Methods to Improve Code Generation
Decrease Register Spills
  – No Function Call
      Save only registers generated by a mini-tramp
  – Function Call
      Do analysis to see which registers need saving
Merge Base Tramp & Mini-Tramp
  – Need to create flag after first instrumentation
  – Only one mini-tramp created per site
Dataflow analysis for Dead Registers
  – Useful for arbitrary instrumentation points

4 Current Register Implementation
Base tramp
  – Saves/restores all volatile (caller-save) registers
Mini tramp
  – Uses volatile registers as needed
Problems
  – Many small code snippets will have minimal register usage
  – POWER platform
      11 volatile GPR
      14 volatile FPR

5 New Register Implementation
Base Tramp Generation
  – Only registers explicitly used within base tramp are saved
  – Series of placeholder noops are generated for those registers not saved
  – Jump created at last save/restore to end of noop group
Mini Tramp Generation
  – Keeps track of all volatile registers used
After Mini Tramp Generation
  – Noops are replaced within base tramp with save(s)/restore(s)
  – Jump is updated
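
The two-phase scheme on this slide can be sketched as a fixed-size save area that is patched once the mini tramp's register usage is known. This is only an illustrative model: the function names, the `save rN` / `b +N` / `nop` pseudo-instructions, and the choice of which registers the base tramp itself uses are assumptions, not Dyninst's actual code generator.

```python
# Hypothetical model of the placeholder scheme; not Dyninst's real data structures.
VOLATILE_GPRS = ["r0"] + [f"r{n}" for n in range(3, 13)]  # 11 volatile GPRs on POWER

def generate_base_tramp(volatile, used_by_base):
    """Emit saves for registers the base tramp uses, a jump, then no-op placeholders."""
    saves = [f"save {r}" for r in volatile if r in used_by_base]
    pad = len(volatile) - len(saves)
    # The jump sits right after the last save and skips the placeholder group.
    return saves + [f"b +{pad}"] + ["nop"] * pad

def patch_base_tramp(tramp, volatile, used_by_mini):
    """After mini-tramp generation: turn placeholders into saves and update the jump."""
    saved = {ins.split()[1] for ins in tramp if ins.startswith("save ")}
    extra = [r for r in volatile if r in used_by_mini and r not in saved]
    jump_at = next(i for i, ins in enumerate(tramp) if ins.startswith("b "))
    pad = len(tramp) - jump_at - 1 - len(extra)
    patched = tramp[:jump_at] + [f"save {r}" for r in extra] + [f"b +{pad}"] + ["nop"] * pad
    assert len(patched) == len(tramp)  # the tramp never changes size
    return patched

# Assumed usage: base tramp touches r0, r5, r6; mini tramp touches r10, r11, r12.
tramp = generate_base_tramp(VOLATILE_GPRS, {"r0", "r5", "r6"})
patched = patch_base_tramp(tramp, VOLATILE_GPRS, {"r10", "r11", "r12"})
```

Because the save area keeps a constant length, patching never moves the code that follows it; only the jump target and the slots in front of it change. The restore sequence would be handled the same way.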

6 Example (from POWER)
Old Base Tramp (saves)
    stu r1,-540(r1)
  GPR
    st r12,312(r1)
    st r11,308(r1)
    st r10,304(r1)
    st r9,300(r1)
    st r8,296(r1)
    st r7,292(r1)
    st r6,288(r1)
    st r5,284(r1)
    st r4,280(r1)
    st r3,276(r1)
    st r0,264(r1)
  FPR
    stfd f0,152(r1)
    stfd f1,160(r1)
    stfd f2,168(r1)
    stfd f3,176(r1)
    stfd f4,184(r1)
    stfd f5,192(r1)
    stfd f6,200(r1)
    stfd f7,208(r1)
    stfd f8,216(r1)
    stfd f9,224(r1)
    stfd f10,232(r1)
    stfd f11,240(r1)
    stfd f12,248(r1)
    stfd f13,256(r1)
Mini Tramp
    liu r12,8192
    l r12,1416(r12)
    cal r11,1(r12)
    liu r10,8192
    st r11,1416(r10)
    br
Old Base Tramp (restores)
  GPR
    l r12,312(r1)
    l r11,308(r1)
    l r10,304(r1)
    l r9,300(r1)
    l r8,296(r1)
    l r7,292(r1)
    l r6,288(r1)
    l r5,284(r1)
    l r4,280(r1)
    l r3,276(r1)
    l r0,264(r1)
  FPR
    lfd f0,152(r1)
    lfd f1,160(r1)
    lfd f2,168(r1)
    lfd f3,176(r1)
    lfd f4,184(r1)
    lfd f5,192(r1)
    lfd f6,200(r1)
    lfd f7,208(r1)
    lfd f8,216(r1)
    lfd f9,224(r1)
    lfd f10,232(r1)
    lfd f11,240(r1)
    lfd f12,248(r1)
    lfd f13,256(r1)
    cal r1,540(r1)

7 Example (continued)
New Base Tramp
    stu r1,-540(r1)
    st r12,312(r1)
    st r11,308(r1)
    st r10,304(r1)
    st r6,288(r1)
    st r5,284(r1)
    st r0,264(r1)
    stfd f10,232(r1)
    b
    nop
    ...
    nop
    brl
Reduces Base Tramp by 34 instructions
  – Eliminate 18 Saves, 18 Restores
  – Add two jumps
Mini-Tramp
    liu r12,8192
    l r12,1416(r12)
    cal r11,1(r12)
    liu r10,8192
    st r11,1416(r10)
    br
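
The figure of 34 can be checked directly from the two listings. The sketch below counts only the register saves shown on slides 6 and 7 (the `stu` frame setup appears in both versions and is left out); the variable names are illustrative.

```python
# Registers saved by the old and new base tramps, copied from the slide listings.
old_saves = ["r12", "r11", "r10", "r9", "r8", "r7", "r6", "r5", "r4", "r3", "r0"]
old_saves += [f"f{n}" for n in range(14)]            # f0 .. f13
new_saves = ["r12", "r11", "r10", "r6", "r5", "r0", "f10"]

eliminated_saves = len(old_saves) - len(new_saves)   # each save has a matching restore
added_jumps = 2                                      # one jump per placeholder group
net_reduction = 2 * eliminated_saves - added_jumps
```

With 25 saves before and 7 after, 18 saves and 18 restores disappear, and the two added jumps bring the net reduction to 34 instructions.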

8 Experiments (POWER)
Simple mutatee
    for (a = 0; a < 0xfffff; a++) {
        x = x + a;
        x += 5 * a;
        if (x > 6000) x = 2;
        else x *= 4;
    }
Instrumentation
  – Increments global variable by one
      Mini-tramp is six instructions
  – Inserted at every node on CFG for program
      Four base tramps for every iteration of loop

9 Results (POWER)
Instructions Completed
  – Version – 30,393,823
  – New Code Generation – 21,485,346
FPU produced a result
  – Version – 4,716,269
  – New Code Generation – 1,310,013
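
As a quick reading of these counters, the relative reductions can be computed as below. The helper name is illustrative, and "baseline" refers to the row the slide labels "Version".

```python
def relative_reduction(baseline, improved):
    """Fraction of the baseline count that the new code generation removes."""
    return (baseline - improved) / baseline

instructions = relative_reduction(30_393_823, 21_485_346)
fpu_results = relative_reduction(4_716_269, 1_310_013)

print(f"instructions completed: {instructions:.1%} fewer")
print(f"FPU results produced:   {fpu_results:.1%} fewer")
```

That is roughly 29% fewer completed instructions and roughly 72% fewer FPU results; the larger FPU drop is consistent with the example, where all but one of the floating-point saves are eliminated.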

10 Dealing with Function Calls
Linear scan on instructions for function that is called from mini-tramp
  – Record all modified registers within function
  – Make recursive calls when needed
      At certain cut-off point assume all registers were clobbered
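
A minimal sketch of the scan, assuming each function is available as a flat list of `("write", register)` and `("call", callee)` records. That representation, the function name, and the default depth limit are all hypothetical; the slide does not say how Dyninst represents instructions or where the cut-off lies.

```python
ALL_VOLATILE = frozenset(["r0"] + [f"r{n}" for n in range(3, 13)])

def clobbered_registers(name, program, max_depth=3, _depth=0):
    """Linear scan of `name`, following calls until the cut-off depth."""
    if _depth > max_depth or name not in program:
        # Cut-off reached or callee unknown: assume everything was clobbered.
        return set(ALL_VOLATILE)
    modified = set()
    for kind, operand in program[name]:
        if kind == "write":
            modified.add(operand)
        elif kind == "call":
            modified |= clobbered_registers(operand, program, max_depth, _depth + 1)
    return modified

# Hypothetical program: helper calls leaf, looper calls itself.
program = {
    "leaf": [("write", "r3"), ("write", "r4")],
    "helper": [("write", "r5"), ("call", "leaf")],
    "looper": [("write", "r6"), ("call", "looper")],
}
helper_regs = clobbered_registers("helper", program)
looper_regs = clobbered_registers("looper", program)
```

Here `helper` only requires saving r3, r4 and r5, while the self-recursive `looper` runs into the cut-off and falls back to saving every volatile register.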

11 Merging Base & Mini Tramp
Original design decisions for Dyninst were made to suit Paradyn’s instrumentation usage pattern
  – Large amount of instrumentation changed frequently
Can generate better code for various reasons
  – Eliminates noops for registers in base tramp
  – Eliminates link register modifications and branches
  – Makes assembly more streamlined
      And readable … if you’re into that kind of thing
One instrumentation point installed … that’s it
  – Functionality somewhat limited
      Tradeoff of speed for ease of further instrumentation
  – Delete then reinsert (Replace)

12 How will it work?
Create flag for BPatch class in API
  – Once flag is set, merging is enabled
  – When flag gets reset, system reverts to old style
  – void setMergeTramp(bool x)
      Similar to recursion flag currently in Dyninst
No effect on current Dyninst use
  – Default flag set to no merging
Most users will probably leave it at one setting based on instrumentation needs

13 Mini-tramp operation comparison
No Merging
  – Insert
      Same as before, unlimited per instrumentation site
  – Delete
      Deletes instance of mini-tramp
Merging
  – Insert
      Only one mini-tramp allowed to be inserted, instrumentation point locked after first mini tramp generated
  – Delete
      Deletes instance of mini-tramp and base tramp
  – Replace
      Delete, then Insert new
      Possible to save AST information at the old mini tramp to be used for new instrumentation
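
The difference between the two modes can be modelled with a small class. Everything here (the `InstPoint` name, its fields, the use of strings as snippets) is a hypothetical stand-in meant to show the semantics in the comparison, not the Dyninst API.

```python
class InstPoint:
    """Toy model of one instrumentation point, with or without tramp merging."""

    def __init__(self, merge=False):
        self.merge = merge
        self.mini_tramps = []
        self.has_base_tramp = False

    def insert(self, snippet):
        if self.merge and self.mini_tramps:
            raise RuntimeError("point is locked: a merged tramp is already installed")
        self.mini_tramps.append(snippet)
        self.has_base_tramp = True

    def delete(self, snippet):
        self.mini_tramps.remove(snippet)
        if self.merge:
            self.has_base_tramp = False  # merged code goes away as one unit

    def replace(self, old, new):
        self.delete(old)
        self.insert(new)

plain = InstPoint(merge=False)
plain.insert("count_entries")
plain.insert("count_exits")        # unlimited inserts without merging

merged = InstPoint(merge=True)
merged.insert("count_entries")
try:
    merged.insert("count_exits")   # second insert is refused
    locked = False
except RuntimeError:
    locked = True
merged.replace("count_entries", "count_exits")
```

In the merged case the only way to change the instrumentation at a point is `replace`, which mirrors the slide's "Delete, then Insert new".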

14 Dataflow analysis for Dead Registers
Register use after instrumentation
  – Overwritten before accessed
      We are free to use them in tramp without having to spill them
  – Not overwritten
      Spill to stay on cautious side
Do analysis before tramp generation
  – Dead registers have highest priority
  – Currently same registers used regardless

15 Dataflow Analysis Example
Uninstrumented Program
    cal r11,1(r11)
    cal r10,3(r12)
    st r11,1416(r10)
    l r4,280(r1)
    cal r3,2(r11)
    st r10,304(r1)
    **Potential Inst Point**
    cal r3,2(r11)
    cal r4,5(r10)
    l r10,304(r1)
    l r11,308(r1)
Analyze code before and after an arbitrary instrumentation point
Dead registers are given priority for register use in tramps
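
For the straight-line code in this example, the "overwritten before accessed" test can be sketched as a single forward pass over the instructions that follow the instrumentation point. The sketch reads `cal` and `l` as writing their first operand and reading the base register, and `st` as reading both; it handles only those three mnemonics and ignores branches, so it is a simplification of a real dataflow analysis over the CFG. All function names are illustrative.

```python
import re

OPERANDS = re.compile(r"(\w+)\s+(\w+),\s*(-?\d+)\((\w+)\)")

def defs_and_uses(instruction):
    """Return (registers written, registers read) for cal, l and st."""
    op, reg, _offset, base = OPERANDS.fullmatch(instruction.strip()).groups()
    if op in ("cal", "l"):
        return {reg}, {base}
    if op == "st":
        return set(), {reg, base}
    raise ValueError(f"unhandled mnemonic: {op}")

def dead_at_point(following):
    """Registers that are overwritten before being read after the point."""
    dead, live = set(), set()
    for ins in following:
        defs, uses = defs_and_uses(ins)
        live |= uses - dead   # read first: the old value is still needed
        dead |= defs - live   # written first: the old value is dead
    return dead

after_point = ["cal r3,2(r11)", "cal r4,5(r10)", "l r10,304(r1)", "l r11,308(r1)"]
dead = dead_at_point(after_point)
```

On the slide's code this reports r3 and r4 as dead at the potential instrumentation point, so a tramp could use them without spilling. r10 and r11 are read before they are reloaded, so they stay live.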

16 Other Speed-ups for Dyninst
Partial Parsing of functions
  – Grab symbol table and create function objects
  – Delay analysis until function is actually accessed
  – User can’t see non-symbol-table functions
      Therefore, we don’t have to worry about them
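
Partial parsing amounts to lazy initialization: function objects are created cheaply from the symbol table, and the expensive analysis runs only on first access. The class, the `analyze` stub, and the symbol-table contents below are hypothetical and only illustrate the pattern.

```python
class LazyFunction:
    """Function object created from a symbol-table entry; analysis is deferred."""

    def __init__(self, name, address, analyze):
        self.name = name
        self.address = address
        self._analyze = analyze
        self._cfg = None

    @property
    def cfg(self):
        if self._cfg is None:          # first access triggers the analysis
            self._cfg = self._analyze(self.address)
        return self._cfg

analyzed = []                          # records which addresses were analyzed

def analyze(address):
    analyzed.append(address)
    return f"cfg@{address:#x}"         # stand-in for a real control flow graph

symbol_table = {"main": 0x1000, "foo": 0x2000, "bar": 0x3000}
functions = {n: LazyFunction(n, a, analyze) for n, a in symbol_table.items()}

first = functions["foo"].cfg           # analysis runs here
second = functions["foo"].cfg          # cached result, no second analysis
```

Only `foo` is ever analyzed in this run; `main` and `bar` cost nothing beyond their symbol-table entries.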

17 Status
Completed and in New Release (POWER)
  – New Register Spilling for Basic Snippets
      Registers Spilled for Function Calls from a Mini Tramp
  – Partial Parsing (All platforms)
Currently Being Implemented (POWER)
  – Linear Code Scan for function calls
  – Base Tramp, Mini Tramp Merging
  – Data Flow Analysis for Dead Registers
Will eventually be on all platforms

18 Questions ???
