
1 PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2
Part 2: Diablo Data Structures
● Goals
● Linker data structures
● Internal representation
● Construction of graphs
● Concrete Data Structures
● Manipulation
● Dynamic Members

2 Data Structures: Goal
● Retargetable
➔ abstract architecture-specific details
● Reliable
►model control flow conservatively
◄precise = not too conservative
● Extensible
➔ easy to augment basic data structures with extra data
● Easy to manipulate
✔ CFG
✗ SSA

3 Transition in small steps
● Input Data Structures
● Link Graph: coarse-grained graph, enables linking and unused section removal
● Direct Control Flow and Relocatable Address Graph (DCFRAG): fine-grained graph, enables unreachable code and data removal
● Augmented Whole Program Control Flow Graph (AWPCFG): = DCFRAG + special edges, enables (flow) analysis of the program
● Interprocedural Control Flow Graph (ICFG): makes analysis of the program easier

5 Linker Data Structures: our input
● Most are simply an abstract representation of data used by a linker:
– archives (containers of relocatable objects)
– relocatable objects (containers of sections)
– sections (containers of data)
● Interesting structures for Diablo:
– relocs (relocation information)
– symbols

6 Relocatable objects
● Represent every entity that may need relocating
– have an address and a size
– can be used in symbols and relocs
– can be changed by relocs
● e.g. sections are relocatable objects

7 Symbols
● labeled address expression
– use the value of relocatable objects to calculate an address
– example:
● symbol “offset_between_section_a_and_b_plus_10”
● code = “R01 R00 – A00 +$”
● used during symbol resolution
– order
● > means it overwrites other symbols
– can create data (e.g. bss)
[Diagram labels: section a, section b, 10, offset in a, offset in b]
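The symbol code on this slide reads like a small postfix (stack) program. The sketch below shows how such an expression could be evaluated. It is an illustration only, not Diablo's evaluator: the names `toy_eval_ctx` and `toy_eval_address_code`, the reading of the two digits after `R` and `A` as a hexadecimal index, and the restriction to the operators `+`, `-` (ASCII) and the terminator `$` are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_STACK_MAX 16

/* Hypothetical evaluation context: addresses of the relocatable
 * objects (R00, R01, ...) and the addends (A00, A01, ...). */
typedef struct {
    const uint32_t *relocatable_addr;
    size_t n_relocatables;
    const uint32_t *addend;
    size_t n_addends;
} toy_eval_ctx;

/* Parse the two index digits that follow an 'R' or 'A' opcode. */
static int parse_index(const char *p, unsigned *out)
{
    unsigned v = 0;
    for (int i = 0; i < 2; i++) {
        char c = p[i];
        if (c >= '0' && c <= '9') v = v * 16 + (unsigned)(c - '0');
        else if (c >= 'a' && c <= 'f') v = v * 16 + (unsigned)(c - 'a' + 10);
        else return -1;
    }
    *out = v;
    return 0;
}

static int push(uint32_t *stack, size_t *sp, uint32_t v)
{
    if (*sp == TOY_STACK_MAX) return -1;
    stack[(*sp)++] = v;
    return 0;
}

/* Evaluate a postfix expression such as "R01 R00 - A00 +$".
 * Returns 0 and stores the address in *result, or -1 on error. */
int toy_eval_address_code(const char *code, const toy_eval_ctx *ctx,
                          uint32_t *result)
{
    uint32_t stack[TOY_STACK_MAX];
    size_t sp = 0;

    for (const char *p = code; *p != '\0'; p++) {
        switch (*p) {
        case ' ':
            break;
        case 'R':
        case 'A': {
            unsigned idx;
            if (parse_index(p + 1, &idx) != 0) return -1;
            if (*p == 'R') {
                if (idx >= ctx->n_relocatables) return -1;
                if (push(stack, &sp, ctx->relocatable_addr[idx]) != 0) return -1;
            } else {
                if (idx >= ctx->n_addends) return -1;
                if (push(stack, &sp, ctx->addend[idx]) != 0) return -1;
            }
            p += 2; /* skip the two index digits */
            break;
        }
        case '+':
        case '-':
            if (sp < 2) return -1;
            sp--;
            stack[sp - 1] = (*p == '+') ? stack[sp - 1] + stack[sp]
                                        : stack[sp - 1] - stack[sp];
            break;
        case '$': /* end marker: exactly one value must remain */
            if (sp != 1) return -1;
            *result = stack[0];
            return 0;
        default:
            return -1;
        }
    }
    return -1; /* missing '$' terminator */
}
```

With R00 at 0x1000, R01 at 0x3000 and A00 equal to 10, the expression "R01 R00 - A00 +$" evaluates to 0x200a, i.e. the distance between the two relocatables plus 10.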

8 Relocations
● use the address of relocatable objects and symbols
● to compute an address
● put it in the desired encoding
● write it somewhere in a relocatable object
● check if relocation was successful (no overflow)
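The steps listed on this slide (compute, encode, write, check) can be sketched for one concrete case: a pc-relative value stored in the low bits of a 32-bit word. This is a hypothetical helper, not a Diablo function. The name `toy_apply_pcrel_reloc`, the field layout (low `field_bits` bits, signed) and the use of host byte order are assumptions. The sketch performs the overflow check before writing, so a failing relocation leaves the section untouched.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: applies a pc-relative relocation to the
 * 32-bit word at section[offset].
 * Returns 0 on success, -1 if the value overflows the field. */
int toy_apply_pcrel_reloc(uint8_t *section, uint32_t offset,
                          uint32_t place_addr, uint32_t target_addr,
                          int32_t addend, unsigned field_bits)
{
    if (field_bits == 0 || field_bits > 31)
        return -1;

    /* 1. Compute: target + addend - address of the relocated word. */
    int64_t value = (int64_t)target_addr + addend - (int64_t)place_addr;

    /* 2. Check: the value must fit in a signed field of field_bits bits. */
    int64_t max = ((int64_t)1 << (field_bits - 1)) - 1;
    int64_t min = -max - 1;
    if (value < min || value > max)
        return -1;

    /* 3. Encode: keep the upper bits, replace the low field_bits bits. */
    uint32_t word;
    memcpy(&word, section + offset, sizeof word);
    uint32_t mask = ((uint32_t)1 << field_bits) - 1;
    word = (word & ~mask) | ((uint32_t)value & mask);

    /* 4. Write the patched word back into the relocatable object. */
    memcpy(section + offset, &word, sizeof word);
    return 0;
}
```

For example, a word 0xEA000000 located at address 0x1004 that must refer to address 0x1000 through a 24-bit field becomes 0xEAFFFFFC (the field holds -4).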

9 Relocations
● Example
● for an instruction “load immediate pc relative” that
– loads the address of (an offset in) a relocatable object
– minus the address of a symbol
– plus some value (addend)
– and stores this address pc relative (instruction encoding)
– but automatically increases with the pc when executed
– “R00 S00 – A00 +” “\\” “P- l*w” “\\” “s0000” “$”
[Annotations on the code: COMPUTE, ENCODE, CHECK (not used)]

15 Relocations
[Same example as slide 9; annotation added on the code: MARKERS]

16 Relocations
[Same example as slide 9; annotations added: TO, FROM. Diagram labels: section a, symbol, 42, offset in a, section c, offset in c, section b, offset in b]

17 Relocation subtleties
#include <stdio.h>
void v1() { printf("v1\n"); }
void v2() { printf("v2\n"); }
void (*pointer)() = v1;
/* points one int past the start of v2, not at a function entry */
int *bad_pointer = ((int *) v2) + 1;
int main(int argc, char ** argv)
{
  pointer();
  /* undo the offset before calling through the pointer */
  ((void (*)()) (bad_pointer - 1))();
  return 0;
}
[Diagram labels: v1: code = R00, .text, 0; v2: code = R00, .text, 20; pointer: code = S00 A00+..., 0; bad_pointer: code = S00A00+..., 1]

18 Relocation subtleties
[Same program as slide 17. Diagram labels: .text: code = R00, .text, 0; pointer: code = S00 A00+..., 0; bad_pointer: code = S00A00+..., 21]

19 Relocation subtleties
[Same program and diagram as slide 18, with .text at 0x0]
RELOCATIONS & SYMBOLS SHOULD NOT BE RELAXED

21 Object files
[Diagram: object file b with a Code and a Data section]
● Relocation: from code to sym aptr, S00\l*w\s000$
● Relocation: from data to sym a, S00\l*w\s000$
● Symbol b: order 10 to code, R00$
● Symbol _start: order 10 to code, R00$
● Symbol aptr: order 10 to data, R00$
● Symbol a: order 0 to undefined, R00$

22 Graph representation
[Same object file b, relocations and symbols as slide 21]

24 Reading information
[Diagram labels: EXE, Entry, MAP]

25 Reading information
[Diagram labels: EXE, Entry, MAP, with an excerpt of the linker map:]
.text 0x8048120 0x5e4d40
.text 0x8048120 0x24 /usr/lib/crt1.o
      0x8048120 _start
.text 0x8048144 0x22 /usr/lib/crti.o
*fill* 0x8048166
.text 0x8048170 0xc4 /usr/lib/crtbeginT.o
*fill* 0x8048234
.text 0x8048240 0x56f app_procs.o
      0x8048300 app_run
      0x80482a0 app_abort
      0x80482e0 app_exit
      0x8048240 app_libs_init
*fill* 0x80487af
.text 0x80487b0 0xf33 main.o
      0x80487b0 main

26-28 Reading information
[Diagram: EXE, MAP, object file a and object file b, with Entry, Code and Data sections; slide 28 shows the same relocations and symbols without the EXE, MAP and object file labels]
● Relocation: from code to sym aptr, S00\l*w\s000$
● Relocation: from data to sym a, S00\l*w\s000$
● Relocation: from code to sym array, S00\l*w\s000$
● Symbol a: order 10 to code, R00$
● Symbol b: order 10 to code, R00$
● Symbol _start: order 10 to code, R00$
● Symbol aptr: order 10 to data, R00$
● Symbol array: order 10 to data, R00$
● Symbol a: order 0 to undefined, R00$

29 Symbol Resolution
[Same diagram, relocations and symbols as slide 28]

30 Symbol Resolution
[Same diagram; “Symbol a: order 0 to undefined R00$” no longer appears]

31 Linkgraph
[Diagram: Entry, Code and Data sections]
● Relocation: from code to data, R00\l*w\s000$
● Relocation: from data to code, R00\l*w\s000$
● Relocation: from code to data, R00\l*w\s000$
● Symbols a, b, _start: order 10 to code, R00$
● Symbols aptr, array: order 10 to data, R00$

32 Linkgraph
[Same diagram, without the symbols aptr and array]

33 Uses of the linkgraph
✔ Linking (placing sections, relocating)
✔ Apply linker optimizations (remove unused sections)
✗ Fine-grained transformations
➔ Need for a more fine-grained graph: DCFRAG (Direct Control Flow and Relocatable Addresses)
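Unused section removal on such a graph amounts to a reachability computation: start at the entry section and follow relocations from referring section to referred section. A minimal sketch follows, assuming sections are identified by small integer indices and each relocation is reduced to a (from, to) pair. The type `toy_reloc` and the function `toy_mark_reachable_sections` are hypothetical names, not part of Diablo.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical link-graph edge: a relocation located in section
 * `from` that refers to section `to`. */
typedef struct {
    size_t from;
    size_t to;
} toy_reloc;

/* Marks every section reachable from `entry` by following
 * relocations. `reachable` must have room for n_sections entries.
 * Sections left unmarked are candidates for removal. */
void toy_mark_reachable_sections(size_t n_sections,
                                 const toy_reloc *relocs, size_t n_relocs,
                                 size_t entry, bool *reachable)
{
    for (size_t i = 0; i < n_sections; i++)
        reachable[i] = false;
    if (entry >= n_sections)
        return;
    reachable[entry] = true;

    /* Simple fixpoint iteration: repeat until no new section is marked. */
    bool changed = true;
    while (changed) {
        changed = false;
        for (size_t i = 0; i < n_relocs; i++) {
            size_t from = relocs[i].from, to = relocs[i].to;
            if (from < n_sections && to < n_sections &&
                reachable[from] && !reachable[to]) {
                reachable[to] = true;
                changed = true;
            }
        }
    }
}
```

A worklist would scale better than the repeated scan used here; the fixpoint loop is chosen only to keep the sketch short.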

34 Linkgraph
[Same diagram as slide 32]

35 Disassembler
[Same relocations and symbols as slide 32; the diagram now shows Entry, basic blocks (Bbl) and Data]

36 Disassembler
● Disassemble
– Architecture-independent analyses and optimizations work on the architecture-independent part or use callbacks
Instruction
● Architecture independent
– instruction type (jump, cmp,...)
– conditional?
– register lists (used, defined)
● Architecture dependent
– backend decides
– opcode
– regs
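One common way to realize this split in C is to place the architecture-independent part first in the architecture-specific structure, so that generic analyses only ever see the generic part. The sketch below illustrates that idea. Whether Diablo lays out t_ins and t_arm_ins exactly like this is not stated on the slide, and every name here (`toy_ins`, `toy_arm_ins`, `toy_ins_defines_reg`) is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { TOY_INS_JUMP, TOY_INS_CMP, TOY_INS_OTHER } toy_ins_type;

/* Architecture-independent part: all that generic analyses look at. */
typedef struct {
    toy_ins_type type;
    bool conditional;
    uint32_t regs_used;    /* bit i set: register i is read */
    uint32_t regs_defined; /* bit i set: register i is written */
} toy_ins;

/* Architecture-dependent part, filled in by the backend. The generic
 * part comes first so a toy_arm_ins * can be viewed as a toy_ins *. */
typedef struct {
    toy_ins generic;
    uint32_t opcode;
    uint8_t reg_a;
    uint8_t reg_b;
} toy_arm_ins;

/* An architecture-independent query: uses only the generic part. */
bool toy_ins_defines_reg(const toy_ins *ins, unsigned reg)
{
    return reg < 32 && ((ins->regs_defined >> reg) & 1u) != 0;
}
```

A backend would fill in `regs_used` and `regs_defined` while disassembling; a liveness analysis could then run over `toy_ins` values without knowing the opcode encoding.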

37 Detect basic blocks
● Disassemble
● Split sections into basic blocks
Basic Block
– Linked list of instructions
– Type (for special block)

38-43 Detect basic blocks
[Diagram, built up over six slides: Entry, basic blocks (Bbl) and Data, with the relocations (code to data, data to code, code to data) and the symbols a, b and _start of slide 32]

44 Add edges
● Disassemble
● Split sections into basic blocks
– Targets of direct jumps + successors of conditional jumps
– To's of relocs
– analyze switches (computed) to find switch targets
● Add direct control flow edges and switch edges
Edge
– Connector for two basic blocks
– Type (jump, ft, call, return,...)
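The splitting rule on this slide is the classic "leader" computation: a new basic block starts at every jump target, at every instruction following a jump, and at every address a relocation points to. A minimal sketch over a toy instruction array follows. The types `toy_kind` and `toy_insn` and the function `toy_find_leaders` are hypothetical, instructions are addressed by array index rather than by address, and switch analysis is left out.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { TOY_PLAIN, TOY_JUMP, TOY_COND_JUMP } toy_kind;

/* Toy instruction: `target` is only meaningful for jumps. */
typedef struct {
    toy_kind kind;
    size_t target; /* index of the jump target */
} toy_insn;

/* Marks leader[i] = true for every instruction that starts a basic
 * block and returns the number of basic blocks found. */
size_t toy_find_leaders(const toy_insn *code, size_t n,
                        const size_t *reloc_targets, size_t n_reloc_targets,
                        bool *leader)
{
    for (size_t i = 0; i < n; i++)
        leader[i] = false;
    if (n == 0)
        return 0;

    leader[0] = true; /* the first instruction always starts a block */

    for (size_t i = 0; i < n; i++) {
        if (code[i].kind == TOY_PLAIN)
            continue;
        if (code[i].target < n)
            leader[code[i].target] = true; /* target of a direct jump */
        if (i + 1 < n)
            leader[i + 1] = true; /* instruction after a jump */
    }

    /* Anything a relocation points to may be reached indirectly. */
    for (size_t r = 0; r < n_reloc_targets; r++)
        if (reloc_targets[r] < n)
            leader[reloc_targets[r]] = true;

    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (leader[i])
            count++;
    return count;
}
```

Once the leaders are known, each block runs from one leader up to the instruction before the next leader, and the jump and fall-through edges can be added between the resulting blocks.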

45 Add edges
[Same diagram as slide 43]

46 Add edges: direct jumps
[Same diagram; build step for direct jump edges]

47 Add edges: fall-through paths
[Same diagram; build step for fall-through edges]

48 Add edges: system calls
[Same diagram, with a Sys node added]

49 Partition the code into functions
Function
– Name
– list of basic blocks
– register lists (used, defined,...)

50 Partition the code into functions
[Same diagram as slide 48]

51 Partition the code into functions
[Diagram: the basic blocks are grouped into Function a, Function b, Function _start, Function program_entry and Function syscall hell; Return nodes are added; relocations and symbols as on slide 48]

52 Function Calls
[Diagram: Function caller and Function callee, each with Bbl and Return nodes]

53 Interprocedural Goto's
[Diagram: Function caller and Function callee, each with Bbl and Return nodes]

54 DCFRAG
[Diagram: Function a, Function b, Function _start, Function program_entry and Function syscall hell, with Bbl, Return, Sys, Entry and Data nodes; relocations from code to data, from data to code and from code to data (R00\l*w\s000$); symbols a, b and _start (order 10 to code, R00$)]

55 DCFRAG
[Same diagram, without the symbols]

56 Uses of the DCFRAG
✔ Fine-grained removal of unused code and data
✗ Dataflow
➔ We need a graph for data flow analyses (ICFG)

57 DCFRAG
[Same diagram as slide 55]

58 Augmented Whole-Program CFG
[Same diagram, with Function hell and a Hell node added]

59 ICFG
[Diagram: Function a, Function b, Function _start, Function program_entry, Function syscall hell and Function hell, with Bbl, Return, Hell, Sys and Entry nodes; no relocations or data]

60 Uses of the ICFG
✔ Good for analysis and transformations
✗ Bad for writing out the program
➔ Use the combined ICFG + DCFRAG = AWPCFG

61 AWPCFG
[Same diagram as slide 58: the ICFG of slide 59 together with the Data node and the relocations]

63 Concrete Data Structures
[Diagram of types: t_relocatable, t_symbol, t_reloc_ref, t_symbol_table, t_reloc, t_reloc_table, t_ins, t_bbl, t_section, t_cfg, t_object, t_arm_ins, t_i386_ins, t_symbol_ref, t_cfg_edge, t_ppc_ins, t_function]

64 Accessing fields
● With getters and setters
● t_bbl * head = CFG_EDGE_HEAD(cfg_edge)
● ARM_INS_SET_REGA(arm_ins, reg)
● Reason: Dynamic Members

65 Iterating the ICFG
● t_cfg * cfg; t_function * fun; t_bbl * bbl; t_ins * ins; t_cfg_edge * edge;
● CFG_FOREACH_FUNCTION(cfg, fun)
– FUNCTION_FOREACH_BBL(fun, bbl)
● CFG_FOREACH_BBL(cfg, bbl)
● BBL_FOREACH_INS(bbl, ins)
● BBL_FOREACH_SUCC_EDGE(bbl, edge)
● BBL_FOREACH_PRED_EDGE(bbl, edge)
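Iterator macros of this shape are typically thin wrappers around a `for` loop over a linked list. The sketch below shows how two of them could be defined and nested to count all instructions in a graph. It uses toy types and `TOY_`-prefixed macros so that it compiles on its own; the real Diablo definitions are not shown on the slide and may differ.

```c
#include <stddef.h>

/* Toy stand-ins for t_ins, t_bbl and t_cfg: singly linked lists. */
typedef struct toy_ins {
    struct toy_ins *next;
} toy_ins;

typedef struct toy_bbl {
    struct toy_bbl *next;
    toy_ins *first_ins;
} toy_bbl;

typedef struct {
    toy_bbl *first_bbl;
} toy_cfg;

/* Each macro expands to a for-loop header, so it can be followed by
 * a single statement or a braced block, and loops can be nested. */
#define TOY_CFG_FOREACH_BBL(cfg, bbl) \
    for ((bbl) = (cfg)->first_bbl; (bbl) != NULL; (bbl) = (bbl)->next)

#define TOY_BBL_FOREACH_INS(bbl, ins) \
    for ((ins) = (bbl)->first_ins; (ins) != NULL; (ins) = (ins)->next)

/* Counts the instructions of every basic block in the graph. */
size_t toy_cfg_count_ins(const toy_cfg *cfg)
{
    const toy_bbl *bbl;
    const toy_ins *ins;
    size_t n = 0;

    TOY_CFG_FOREACH_BBL(cfg, bbl)
        TOY_BBL_FOREACH_INS(bbl, ins)
            n++;
    return n;
}
```

Because the loop variables are declared by the caller, as in the declarations on the slide, the macros work in C89-style code as well.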

67 Primitive Transformations
● Relocations
– Add: RelocTableAddRelocToRelocatable
– Remove: RelocTableRemoveReloc
– Modify: RelocSetFrom, RelocSetToRelocatable
● Sections
– Create: SectionCreateForObject
● Functions
– Create: FunctionMake
– Remove: FunctionKill

68 Primitive Transformations
● Basic blocks
– Create: BblNew
– Remove: BblKill
– Duplicate: BblDup
– Split: BblSplitBlock
● ICFG edges
– Create: CfgEdgeCreate
– Remove: CfgEdgeKill
● Instructions
– Create: InsNewForBbl
– Remove: InsKill

69 On the DCFRAG
● SECTION_REFED_BY(sec)
● SECTION_REFERS_TO(sec)
● BBL_REFED_BY(bbl)
● BBL_REFED_BY_SYM(bbl)
● INS_REFERS_TO(ins)

70 Consistency of the AWPCFG
● Manipulate one view: what happens to the other?
● Diablo tries to keep things consistent
– Kill a reloc, and its ICFG edge is also killed
– Kill an ins, and its to-relocs are also killed
– Remove a section, and all its to-relocs are also killed
● Makes sure that you do the proper thing
– Try to kill an object with refers_to relocs, and Diablo aborts with a fatal error

72 Dynamic Members
● Diablo needs to be extensible
● Many analyses compute different kinds of information on very large data sets
● We cannot include all of them in the main libraries, or even store all information together
● Dynamic members augment basic data structures with members that can be allocated on the fly

73 Dynamic Members
● Example: member for reachability
– t_bool BBL_REACHABLE(t_bbl)
– BBL_SET_REACHABLE(t_bbl, t_bool)
– BblInitReachable(t_cfg *)
● Allocates space for this field and calls init callback for each bbl
– BblFiniReachable(t_cfg *)
● Calls fini callback and deallocates space

74 Dynamic Members
To instantiate the member:
DYNAMIC_MEMBER(
  bbl,                 /* data structure to extend */
  t_cfg *,             /* manager type */
  bbl_reachable_array, /* array to hold members */
  t_bool,              /* the type of the member */
  reachable,           /* lowercase name */
  REACHABLE,           /* UPPERCASE name */
  Reachable,           /* CamelCase name */
  CFG_FOREACH_BBL,     /* iterator */
  BblReachableInitCb,  /* init callback */
  BblReachableFiniCb,  /* fini callback */
  BblReachableDupCb    /* dup callback */
);
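What such an instantiation provides can be pictured as a side array owned by the manager (the graph) and indexed per basic block, plus accessors and an init/fini pair. The following sketch writes that out by hand for a toy graph. It is not the expansion of DYNAMIC_MEMBER: the toy types, the explicit `cfg` argument in the accessors, the use of `bool` instead of t_bool, and the omission of the callbacks are all simplifying assumptions.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-ins for t_bbl and t_cfg. */
typedef struct {
    size_t id; /* index of this block within its graph */
} toy_bbl;

typedef struct {
    toy_bbl *bbls;
    size_t n_bbls;
    bool *bbl_reachable_array; /* storage for the member; NULL when unused */
} toy_cfg;

/* Accessors: callers never touch the storage directly, so it can be
 * allocated, moved or released without changing the core structs. */
#define TOY_BBL_REACHABLE(cfg, bbl) \
    ((cfg)->bbl_reachable_array[(bbl)->id])
#define TOY_BBL_SET_REACHABLE(cfg, bbl, value) \
    ((cfg)->bbl_reachable_array[(bbl)->id] = (value))

/* Allocates the member for every block and initializes it to false
 * (this loop plays the role of the init callback). */
int ToyBblInitReachable(toy_cfg *cfg)
{
    size_t n = cfg->n_bbls ? cfg->n_bbls : 1;
    cfg->bbl_reachable_array = calloc(n, sizeof(bool));
    if (cfg->bbl_reachable_array == NULL)
        return -1;
    for (size_t i = 0; i < cfg->n_bbls; i++)
        TOY_BBL_SET_REACHABLE(cfg, &cfg->bbls[i], false);
    return 0;
}

/* Releases the member again once the analysis is done. */
void ToyBblFiniReachable(toy_cfg *cfg)
{
    free(cfg->bbl_reachable_array);
    cfg->bbl_reachable_array = NULL;
}
```

An analysis would call the init function, use the accessors while it runs, and call the fini function afterwards, so the memory is only held while the information is needed.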

