Download presentation
Presentation is loading. Please wait.
Published byJulian Cameron Modified over 8 years ago
1
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 1 Part 2: Diablo Data Structures ● Goals ● Linker data structures ● Internal representation ● Construction of graphs ● Concrete Data Structures ● Manipulation ● Dynamic Members
2
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 2 Data Structures: Goal Retargetable Reliable Extensible Easy to manipulate ➔ abstract architecture specific details ✔ CFG ✗ SSA ►model control flow conservatively ◄precise = not too conservative ➔ easy to augment basic data structures with extra data
3
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 3 Transition in small steps Link Graph Direct Control Flow and Relocatable Address Graph Augmented Whole Program Control Flow Graph Interprocedural Control Flow Graph Input Data Structures Coarse-grained graph,enables linking and unused section removal Fine-grained graph, enables unreachable code and data removal = DCFRAG + special edges Enables (flow)analysis of the program Makes analysis of the program more easy
4
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 4 Part 2: Diablo Data Structures ● Goals ● Linker data structures ● Internal representation ● Construction of graphs ● Concrete Data Structures ● Manipulation ● Dynamic Members
5
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 5 Linker Data Structures: our input ● Most are simply an abstract representation of data used by a linker: – archives (containers of relocatable objects) – relocatable objects (container of sections) – section (containers of data) ● Interesting structures for Diablo: – relocs (relocation information) – symbols
6
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 6 ● Represent every entity that can need relocating – have an address and a size – can be used in symbols and relocs – can be changed by relocs ● e.g. sections are relocatable objects Relocatable objects
7
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 7 ● labeled address expression – use the value of relocatable objects to calculate an address – example: ● symbol “offset_between_section_a_and_b_plus_10” ● code = “R01 R00 – A00 +$” ● used during symbol resolution – order ● > means it overwrites other symbols – can create data (e.g. bss) Symbols section a section b 10 offset in a offset in b
8
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 8 Relocations ● use the address of relocatable objects and symbols ● to compute an address ● put it in the desired encoding ● write it somewhere in a relocatable object ● checks if relocation was successful (no overflow)
9
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 9 CHECK (not used) CHECK (not used) COMPUTE ENCODE Relocations ● Example ● for an instruction “load immediate pc relative” that – loads the address of (an offset in) a relocatable object – minus the address of a symbol – plus some value (addend) – and stores this address pc relative (instruction encoding) – but automatically increases with the pc when executed – “R00 S00 – A00 +” “\\” “P- l*w” “\\” “s0000” “$”
10
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 10 ● Example ● for an instruction “load immediate pc relative” that – loads the address of (an offset in) a relocatable object – minus the address of a symbol – plus some value (addend) – and stores this address pc relative (instruction encoding) – but automatically increases with the pc when executed – “R00 S00 – A00 +” “\\” “P- l*w” “\\” “s0000” “$” Relocations
11
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 11 ● Example ● for an instruction “load immediate pc relative” that – loads the address of (an offset in) a relocatable object – minus the address of a symbol – plus some value (addend) – and stores this address pc relative (instruction encoding) – but automatically increases with the pc when executed – “R00 S00 – A00 +” “\\” “P- l*w” “\\” “s0000” “$” Relocations
12
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 12 ● Example ● for an instruction “load immediate pc relative” that – loads the address of (an offset in) a relocatable object – minus the address of a symbol – plus some value (addend) – and stores this address pc relative (instruction encoding) – but automatically increases with the pc when executed – “R00 S00 – A00 +” “\\” “P- l*w” “\\” “s0000” “$” Relocations
13
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 13 ● Example ● for an instruction “load immediate pc relative” that – loads the address of (an offset in) a relocatable object – minus the address of a symbol – plus some value (addend) – and stores this address pc relative (instruction encoding) – but automatically increases with the pc when executed – “R00 S00 – A00 +” “\\” “P- l*w” “\\” “s0000” “$” Relocations
14
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 14 ● Example ● for an instruction “load immediate pc relative” that – loads the address of (an offset in) a relocatable object – minus the address of a symbol – plus some value (addend) – and stores this address pc relative (instruction encoding) – but automatically increases with the pc when executed – “R00 S00 – A00 +” “\\” “P- l*w” “\\” “s0000” “$” Relocations
15
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 15 ● Example ● for an instruction “load immediate pc relative” that – loads the address of (an offset in) a relocatable object – minus the address of a symbol – plus some value (addend) – and stores this address pc relative (instruction encoding) – but automatically increases with the pc when executed – “R00 S00 – A00 +” “\\” “P- l*w” “\\” “s0000” “$” Relocations MARKERS
16
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 16 TO FROM Relocations ● Example ● for an instruction “load immediate pc relative” that – loads the address of (an offset in) a relocatable object – minus the address of a symbol – plus some value (addend) – and stores this address pc relative (instruction encoding) – but automatically increases with the pc when executed – “R00 S00 – A00 +” “\\” “P- l*w” “\\” “s0000” “$” section a symbol 42 offset in a section c offset in c section b offset in b
17
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 17 bad_pointer: code = S00A00+... Relocation subtleties #include void v1() { printf("v1\n"); } void v2() { printf("v2\n"); } void (*pointer)() = v1; int *bad_pointer = ((int *) v2) + 1; int main(int argc, char ** argv) { pointer(); ((void (*)()) (bad_pointer - 1))(); return 0; } v1 code = R00.text 0 0 pointer: code = S00 A00+... 0 0 1 1 v2 code = R00.text 20
18
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 18 Relocation subtleties.text code = R00.text 0 0 bad_pointer: code = S00A00+... pointer: code = S00 A00+... 0 0 21 #include void v1() { printf("v1\n"); } void v2() { printf("v2\n"); } void (*pointer)() = v1; int *bad_pointer = ((int *) v2) + 1; int main(int argc, char ** argv) { pointer(); ((void (*)()) (bad_pointer - 1))(); return 0; }
19
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 19 Relocation subtleties.text code = R00.text 0x0 bad_pointer: code = S00A00+... pointer: code = S00 A00+... 0 0 21 #include void v1() { printf("v1\n"); } void v2() { printf("v2\n"); } void (*pointer)() = v1; int *bad_pointer = ((int *) v2) + 1; int main(int argc, char ** argv) { pointer(); ((void (*)()) (bad_pointer - 1))(); return 0; } RELOCATIONS & SYMBOLS SHOULD NOT BE RELAXED
20
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 20 Part 2: Diablo Data Structures ● Goals ● Linker data structures ● Internal representation ● Construction of graphs ● Concrete Data Structures ● Manipulation ● Dynamic Members
21
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 21 Object file b Object files Relocation: from code to sym aptr S00\l*w\s000$ Relocation: from code to sym aptr S00\l*w\s000$ Relocation: from data to sym a S00\l*w\s000$ Relocation: from data to sym a S00\l*w\s000$ Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data Symbol aptr: order 10 to data R00$ Symbol aptr: order 10 to data R00$ Code Symbol a: order 0 to undefined R00$ Symbol a: order 0 to undefined R00$
22
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 22 Object file b Graph representation Relocation: from code to sym aptr S00\l*w\s000$ Relocation: from code to sym aptr S00\l*w\s000$ Relocation: from data to sym a S00\l*w\s000$ Relocation: from data to sym a S00\l*w\s000$ Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data Symbol aptr: order 10 to data R00$ Symbol aptr: order 10 to data R00$ Code Symbol a: order 0 to undefined R00$ Symbol a: order 0 to undefined R00$
23
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 23 Part 2: Diablo Data Structures ● Goals ● Linker data structures ● Internal representation ● Construction of graphs ● Concrete Data Structures ● Manipulation ● Dynamic Members
24
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 24 EXE Reading information Entry MAP
25
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 25 EXE Reading information Entry MAP.text 0x8048120 0x5e4d40.text 0x8048120 0x24 /usr/lib/crt1.o 0x8048120 _start.text 0x8048144 0x22 /usr/lib/crti.o *fill* 0x8048166.text 0x8048170 0xc4 /usr/lib/crtbeginT.o *fill* 0x8048234.text 0x8048240 0x56f app_procs.o 0x8048300 app_run 0x80482a0 app_abort 0x80482e0 app_exit 0x8048240 app_libs_init *fill* 0x80487af.text 0x80487b0 0xf33 main.o 0x80487b0 main
26
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 26 Object file b EXE Object file a Reading information Relocation: from code to sym aptr S00\l*w\s000$ Relocation: from code to sym aptr S00\l*w\s000$ Relocation: from data to sym a S00\l*w\s000$ Relocation: from data to sym a S00\l*w\s000$ Relocation: from code to sym array S00\l*w\s000$ Relocation: from code to sym array S00\l*w\s000$ Entry Data Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data Symbol aptr: order 10 to data R00$ Symbol aptr: order 10 to data R00$ Symbol array: order 10 to data R00$ Symbol array: order 10 to data R00$ Code Symbol a: order 0 to undefined R00$ Symbol a: order 0 to undefined R00$ MAP
27
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 27 Object file b EXE Object file a Reading information Relocation: from code to sym aptr S00\l*w\s000$ Relocation: from code to sym aptr S00\l*w\s000$ Relocation: from data to sym a S00\l*w\s000$ Relocation: from data to sym a S00\l*w\s000$ Relocation: from code to sym array S00\l*w\s000$ Relocation: from code to sym array S00\l*w\s000$ Entry Data Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data Symbol aptr: order 10 to data R00$ Symbol aptr: order 10 to data R00$ Symbol array: order 10 to data R00$ Symbol array: order 10 to data R00$ Code Symbol a: order 0 to undefined R00$ Symbol a: order 0 to undefined R00$ MAP
28
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 28 Reading information Relocation: from code to sym aptr S00\l*w\s000$ Relocation: from code to sym aptr S00\l*w\s000$ Relocation: from data to sym a S00\l*w\s000$ Relocation: from data to sym a S00\l*w\s000$ Relocation: from code to sym array S00\l*w\s000$ Relocation: from code to sym array S00\l*w\s000$ Entry Data Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data Symbol aptr: order 10 to data R00$ Symbol aptr: order 10 to data R00$ Symbol array: order 10 to data R00$ Symbol array: order 10 to data R00$ Code Symbol a: order 0 to undefined R00$ Symbol a: order 0 to undefined R00$
29
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 29 Symbol Resolution Relocation: from code to sym aptr S00\l*w\s000$ Relocation: from code to sym aptr S00\l*w\s000$ Relocation: from data to sym a S00\l*w\s000$ Relocation: from data to sym a S00\l*w\s000$ Relocation: from code to sym array S00\l*w\s000$ Relocation: from code to sym array S00\l*w\s000$ Entry Data Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data Symbol aptr: order 10 to data R00$ Symbol aptr: order 10 to data R00$ Symbol array: order 10 to data R00$ Symbol array: order 10 to data R00$ Code Symbol a: order 0 to undefined R00$ Symbol a: order 0 to undefined R00$
30
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 30 Symbol Resolution Relocation: from code to sym aptr S00\l*w\s000$ Relocation: from code to sym aptr S00\l*w\s000$ Relocation: from data to sym a S00\l*w\s000$ Relocation: from data to sym a S00\l*w\s000$ Relocation: from code to sym array S00\l*w\s000$ Relocation: from code to sym array S00\l*w\s000$ Entry Data Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data Symbol aptr: order 10 to data R00$ Symbol aptr: order 10 to data R00$ Symbol array: order 10 to data R00$ Symbol array: order 10 to data R00$ Code
31
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 31 Linkgraph Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Entry Data Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data Symbol aptr: order 10 to data R00$ Symbol aptr: order 10 to data R00$ Symbol array: order 10 to data R00$ Symbol array: order 10 to data R00$ Code
32
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 32 Code Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Linkgraph Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Entry Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data
33
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 33 Uses of the linkgraph ✔ Linking (placing sections, relocating) ✔ Apply linker optimizations (remove unused sections) ✗ fine-grained transformations ➔ Need for a more fine-grained graph: DCFRAG Direct Control Flow and Relocatable Addresses
34
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 34 Code Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Linkgraph Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Entry Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data
35
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 35 Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Disassembler Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Bbl Entry Bbl Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data
36
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 36 Disassembler ● Disassemble – Architecture independent analyses and optimizations work on architecture independent part or use callbacks Architecture independent - instruction type (jump, cmp,...) - conditional? - register lists (used, defined) Architecture independent - instruction type (jump, cmp,...) - conditional? - register lists (used, defined) Instruction Architecture dependent - backend decides - opcode - regs Architecture dependent - backend decides - opcode - regs
37
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 37 Detect basic blocks ● Disassemble ● Split sections into basic blocks Linked list of instructions Type (for special block) Linked list of instructions Type (for special block) Basic Block
38
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 38 Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Detect basic blocks Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Bbl Entry Bbl Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data
39
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 39 Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Detect basic blocks Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Bbl Entry Bbl Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data
40
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 40 Detect basic blocks Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Bbl Entry Bbl Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data
41
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 41 Detect basic blocks Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Bbl Entry Bbl Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data
42
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 42 Detect basic blocks Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Bbl Entry Bbl Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data
43
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 43 Detect basic blocks Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Bbl Entry Bbl Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data
44
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 44 Add edges ● Disassemble ● Split sections into basic blocks – Targets direct jumps + successors conditional jumps – To's of relocs – analyze switches (computed) to find switch targets ● Add direct control flow edges and switch edges Connector for two basic blocks Type (jump, ft, call, return,...) Connector for two basic blocks Type (jump, ft, call, return,...) Edge
45
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 45 Add edges Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Bbl Entry Bbl Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data
46
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 46 Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Add edges: direct jumps Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Bbl Entry Bbl Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data
47
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 47 Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Add edges: fall-through paths Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Bbl Entry Bbl Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data
48
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 48 Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Add edges: system calls Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Bbl Sys Bbl Entry Bbl Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data
49
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 49 Partition the code into functions - Name - list of basic blocks - register lists (used, defined,...) - Name - list of basic blocks - register lists (used, defined,...) Function
50
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 50 Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$ Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Partition the code into functions Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Bbl Sys Bbl Entry Bbl Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data
51
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 51 Function a Partition the code into functions Function _start Function syscall hell Function program_entry Function b Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Bbl Return Bbl Sys Bbl Entry Data Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Bbl Return Bbl Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$
52
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 52 Function callee Function caller Function Calls Bbl Return Bbl Return
53
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 53 Function callee Function caller Interprocedural Goto's Bbl Return Bbl Return
54
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 54 Function a DCFRAG Function _start Function syscall hell Function program_entry Function b Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Bbl Return Bbl Sys Bbl Entry Data Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Bbl Return Bbl Symbol b: order 10 to code R00$ Symbol b: order 10 to code R00$ Symbol _start: order 10 to code R00$ Symbol _start: order 10 to code R00$ Data Symbol a: order 10 to code R00$ Symbol a: order 10 to code R00$
55
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 55 Function a DCFRAG Function _start Function syscall hell Function program_entry Function b Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Bbl Return Bbl Sys Bbl Entry Data Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Bbl Return Bbl Data
56
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 56 Uses of the DCFRAG ✔ Fine grained removal of unused code and data ✗ Dataflow ➔ We need a graph for data flow analyses (ICFG)
57
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 57 Function a DCFRAG Function _start Function syscall hell Function program_entry Function b Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Bbl Return Bbl Sys Bbl Entry Data Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Bbl Return Bbl Data
58
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 58 Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Function a Augmented Whole-Program CFG Function _start Function syscall hell Function program_entry Function hell Function b Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Bbl Return Hell Bbl Sys Bbl Entry Bbl Return Bbl Return Data
59
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 59 Function a ICFG Function _start Function syscall hell Function program_entry Function hell Function b Bbl Return Hell Bbl Sys Bbl Entry Bbl Return Bbl Return
60
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 60 Uses of the ICFG ✔ Good for analysis and transformations ✗ Bad for writing out the program ➔ Use the combined ICFG + DCFRAG = AWPCFG
61
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 61 Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Function a AWPCFG Function _start Function syscall hell Function program_entry Function hell Function b Relocation: from code to data R00\l*w\s000$ Relocation: from code to data R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Relocation: from data to code R00\l*w\s000$ Bbl Return Hell Bbl Sys Bbl Entry Bbl Return Bbl Return Data
62
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 62 Part 2: Diablo Data Structures ● Goals ● Linker data structures ● Internal representation ● Construction of graphs ● Concrete Data Structures ● Manipulation ● Dynamic Members
63
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 63 Concrete Data Structures t_relocatable t_symbol t_reloc_ref t_symbol_table t_reloc t_reloc_table t_ins t_bbl t_section t_cfg t_object t_arm_ins t_i386_ins t_symbol_ref t_cfg_edge t_ppc_ins t_function
64
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 64 Accessing fields ● With getters and setters ● t_bbl * head = CFG_EDGE_HEAD(cfg_edge) ● ARM_INS_SET_REGA(arm_ins, reg) ● Reason: Dynamic Members
65
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 65 Iterating the ICFG ● t_cfg * cfg; t_function * fun; t_bbl * bbl; t_ins * ins; t_cfg_edge * edge; ● CFG_FOREACH_FUNCTION(cfg, fun) – FUNCTION_FOREACH_BBL(fun, bbl) ● CFG_FOREACH_BBL(cfg, bbl) ● BBL_FOREACH_INS(bbl, ins) ● BBL_FOREACH_SUCC_EDGE(bbl, edge) ● BBL_FOREACH_PRED_EDGE(bbl, edge)
66
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 66 Part 2: Diablo Data Structures ● Goals ● Linker data structures ● Internal representation ● Construction of graphs ● Concrete Data Structures ● Manipulation ● Dynamic Members
67
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 67 Primitive Transformations ● Relocations – Add: RelocTableAddRelocToRelocatable – Remove: RelocTableRemoveReloc – Modify: RelocSetFrom, RelocSetToRelocatable ● Sections – Create: SectionCreateForObject ● Functions – Create: FunctionMake – Remove: FunctionKill
68
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 68 Primitive Transformations ● Basic blocks – Create: BblNew – Remove: BblKill – Duplicate: BblDup – Split: BblSplitBlock ● ICFG edges – Create: CfgEdgeCreate – Remove: CfgEdgeKill ● Instructions – Create: InsNewForBbl – Remove: InsKill
69
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 69 On the DCFRAG ● SECTION_REFED_BY(sec) ● SECTION_REFERS_TO(sec) ● BBL_REFED_BY(bbl) ● BBL_REFED_BY_SYM(bbl) ● INS_REFERS_TO(ins)
70
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 70 Consistency of the AWPCFG ● Manipulate one view, what happens on other? ● Diablo tries to keep things consistent – Kill reloc, and its ICFG-edge is also killed – Kill ins, and to-relocs are also killed – Remove a section, and all to-relocs are also killed ● Makes sure that you do the proper thing – Try to kill an object with refers_to relocs, and it fatals
71
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 71 Part 2: Diablo Data Structures ● Goals ● Linker data structures ● Internal representation ● Construction of graphs ● Concrete Data Structures ● Manipulation ● Dynamic Members
72
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 72 Dynamic Members ● Diablo needs to be extensible ● Many analyses compute different kinds of information on very large data sets ● We cannot include all of them in the main libraries, or even store all information together ● Dynamic members augment basic data structures with members that can be allocated on the fly
73
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 73 Dynamic Members ● Example: member for reachability – t_bool BBL_REACHABLE(t_bbl) – BBL_SET_REACHABLE(t_bbl, t_bool) – BblInitReachable(t_cfg *) ● Allocates space for this field and calls init callback for each bbl – BblFiniReachable(t_cfg *) ● Calls fini callback and deallocates space
74
PLDI 06 Tutorial - Binary Rewriting with Diablo - part 2 74 Dynamic Members To instantiate the member: DYNAMIC_MEMBER( bbl,/* data structure to extend */ t_cfg *,/* manager type */ bbl_reachable_array, /* array to hold members */ t_bool, /* the type of the member */ reachable, /* lowercase name */ REACHABLE, /* UPPERCASE name */ Reachable, /* CamelCase name*/ CFG_FOREACH_BBL,/* iterator */ BblReachableInitCb, /* init callback*/ BblReachableFiniCb, /* fini callback */ BblReachableDupCb, /* dup callback */ );
Similar presentations
© 2025 SlidePlayer.com Inc.
All rights reserved.