ISAMAP: Instruction Mapping Driven by Dynamic Binary Translation Maxwell Souza, Daniel Nicácio and Guido Araujo.

1 ISAMAP: Instruction Mapping Driven by Dynamic Binary Translation Maxwell Souza, Daniel Nicácio and Guido Araujo

2 Motivation
Architecture diversity is increasing
There is a need for legacy code to use new architecture features
Code portability between architectures is also desirable
Dynamic Binary Translation (DBT) enables it

3 ArchC
Processor description language
  – SystemC compatible
  – 8 researchers for the last 5 years
Features:
  – Fast interpreted/compiled simulation
  – Linux OS syscall emulation
  – Runs code directly from GCC (allows gdb support)
Processors:
  – MIPS, SPARC, PPC, 8051, ARM, OR10K, etc.
  – Runs Mediabench, Mibench and SPEC CInt
  – Simulation speed: from 100 KIPS to 570 MIPS

4 Instruction Set Architecture (AC_ISA)
    AC_ISA(mips1) {
      ac_format Type_R = "%op:6 %rs:5 %rt:5 %rd:5 0x00:5 %func:6";
      ac_format Type_I = "%op:6 %rs:5 %rt:5 %imm:16";
      ac_instr add;
      ac_instr load;
      ISA_CTOR(mips1) {
        add.set_asm("add %reg, %reg, %reg", rd, rs, rt);
        add.set_decoder(op=0x00, func=0x20);
        load.set_asm("lw %reg, %imm(%reg)", rt, imm, rs);
        load.set_decoder(op=0x23);
      };
    };
Callouts: binary field, instruction declaration, decoding order
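To make the role of the format string and the decoder fields concrete, here is a minimal Python sketch of how a `Type_R` word could be sliced and matched. It is not ArchC's generated decoder; the helper names `parse_format`, `decode` and `is_add` are hypothetical, and the sketch assumes fields are listed from the most significant bit down.

```python
def parse_format(fmt):
    """Turn an ac_format-style string into (name, width) pairs, MSB first."""
    fields = []
    for token in fmt.split():
        name, width = token.rsplit(":", 1)
        fields.append((name.lstrip("%"), int(width)))
    return fields


def decode(word, fields):
    """Slice an instruction word into named bit fields."""
    shift = sum(width for _, width in fields)
    values = {}
    for name, width in fields:
        shift -= width
        values[name] = (word >> shift) & ((1 << width) - 1)
    return values


TYPE_R = parse_format("%op:6 %rs:5 %rt:5 %rd:5 0x00:5 %func:6")


def is_add(word):
    """Mirror of add.set_decoder(op=0x00, func=0x20)."""
    f = decode(word, TYPE_R)
    return f["op"] == 0x00 and f["func"] == 0x20


# 0x00221820 is the MIPS encoding of "add $3, $1, $2".
fields = decode(0x00221820, TYPE_R)
```

The constant field `0x00:5` is kept under its literal name here; a real decoder would more likely check it against the expected value.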

5 Architecture Resources (AC_ARCH)
    AC_ARCH(mips1) {
      ac_mem MEM:256K;
      ac_regbank RB:32;
      ac_reg lo,hi;
      ac_pipe PIPE = {IF,ID,EX,MEM,WB};
      ac_format Fmt_EX_MEM = "%alures:32 %wdata:32 %rdest:5 %regwrite:1 %memread:1 %memwrite:1";
      ac_reg EX_MEM;
      ac_wordsize 32;
      ARCH_CTOR(mips1) {
        ac_isa("mips1_isa.ac");
        ..
      };

6 Instruction Behavior (ac_behavior)
    void ac_behavior( Type_R, int stage ){
      switch(stage){
        case IF:
        case ID:
          /* Checking forwarding for the rs register */
          if ( (EX_MEM.regwrite == 1) && (EX_MEM.rdest != 0) &&
               (EX_MEM.rdest == ID_EX.rs) )
            operand1 = EX_MEM.alures.read();
          else if ( (MEM_WB.regwrite == 1) && (MEM_WB.rdest != 0) &&
                    (MEM_WB.rdest == ID_EX.rs) )
            operand1 = MEM_WB.wbdata.read();
          else
            operand1 = RB.read(rs);
          ...
        default:
          break;
      }
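The forwarding check in this behavior can be read as a three-way priority: the EX/MEM result first, then the MEM/WB result, then the register bank. A small Python sketch of that priority follows; the `PipeReg` class and `read_operand` function are illustrative stand-ins, not part of ArchC.

```python
class PipeReg:
    """Hypothetical stand-in for a pipeline register such as EX_MEM or MEM_WB."""

    def __init__(self, regwrite=0, rdest=0, value=0):
        self.regwrite = regwrite  # 1 if the instruction writes a register
        self.rdest = rdest        # destination register number
        self.value = value        # alures (EX/MEM) or wbdata (MEM/WB)


def read_operand(rs, ex_mem, mem_wb, regbank):
    """Pick the freshest value for register rs, as in the slide's if/else chain."""
    if ex_mem.regwrite == 1 and ex_mem.rdest != 0 and ex_mem.rdest == rs:
        return ex_mem.value       # forward from EX/MEM
    if mem_wb.regwrite == 1 and mem_wb.rdest != 0 and mem_wb.rdest == rs:
        return mem_wb.value       # forward from MEM/WB
    return regbank[rs]            # no hazard: read the register bank


regbank = [0] * 32
regbank[5] = 10
newest = read_operand(5, PipeReg(1, 5, 99), PipeReg(1, 5, 77), regbank)
older = read_operand(5, PipeReg(0, 5, 99), PipeReg(1, 5, 77), regbank)
plain = read_operand(5, PipeReg(1, 6, 99), PipeReg(0, 5, 77), regbank)
```

The `rdest != 0` test reflects that register 0 is hard-wired to zero on MIPS, so a write to it must never be forwarded.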

7 Jump and Branch Semantics
Additional information
  – jump(): target computation
  – delay(): conditional
    call.set_decoder(op=0x01);
    call.jump(ac_pc+(disp30<<2));
    call.delay(1, true);
    call.behavior(writeReg(15, ac_pc));
    be.set_decoder(op=0x00, cond=0x01, op2=0x02);
    be.branch(ac_pc+(disp22<<2));
    be.cond(PSR_icc_z);
    be.delay(1, PSR_icc_z || !an);
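As a worked illustration of the two target expressions and the delay-slot condition, the following Python sketch evaluates them on plain integers. It assumes the displacement fields are signed and that addresses wrap at 32 bits; the function names are hypothetical and only mirror the expressions on the slide.

```python
MASK32 = 0xFFFFFFFF


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def call_target(pc, disp30):
    """call.jump(ac_pc + (disp30 << 2))"""
    return (pc + (sign_extend(disp30, 30) << 2)) & MASK32


def be_target(pc, disp22):
    """be.branch(ac_pc + (disp22 << 2))"""
    return (pc + (sign_extend(disp22, 22) << 2)) & MASK32


def be_runs_delay_slot(icc_z, annul):
    """be.delay(1, PSR_icc_z || !an): slot runs if taken or not annulled."""
    return bool(icc_z) or not annul


forward = call_target(0x1000, 4)
backward = be_target(0x2000, 0x3FFFFE)  # disp22 encodes -2 words
```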

8 ArchC Overview
[Diagram. Labels: ArchC Description; ArchC Pre-processor (acpp); ArchC IR; Simulator Generator; Back-end Generator; Linker Generator; Assembler Generator; ISAMAP]

9 Instruction Mapping Description Driven by DBT
Descriptions use the ArchC language
ISA models
  – Source architecture ISA
  – Target architecture ISA
  – Mapping between source and target
Low-level ISA mapping

10 Instruction Set Architecture (AC_ISA)
    ISA(powerpc) {
      isa_format XO1 = "%opcd:6 %rt:5 %ra:5 %rb:5 %oe:1 %xos:9 %rc:1";
      isa_instr add, subf;
      isa_regbank r:32 = [0..31];
      ISA_CTOR(powerpc) {
        add.set_asm("add %reg %reg %reg", rt, ra, rb);
        add.set_decoder(opcd=31, oe=0, xos=266, rc=0);
        subf.set_asm("subf %reg %reg %reg", rt, ra, rb);
        subf.set_decoder(opcd=31, oe=0, xos=40, rc=0);
      }

11 Instruction Set Architecture (AC_ISA)
    ISA(x86) {
      isa_format op1b_r32 = "%op1b:8 %mod:2 %regop:3 %rm:3";
      isa_instr add_r32_r32, mov_r32_r32;
      isa_reg eax = 0;
      isa_reg ecx = 1;
      ...
      isa_reg edi = 7;
      ISA_CTOR(x86) {
        add_r32_r32.set_operands("add %reg %reg", rm, regop);
        add_r32_r32.set_encoder(op1b=0x01, mod=0x3);
        mov_r32_r32.set_operands("mov %reg %reg", rm, regop);
        mov_r32_r32.set_encoder(op1b=0x89, mod=0x3);
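On the target side the description drives an encoder rather than a decoder. The sketch below shows, in Python, how the `op1b_r32` format with `mod=0x3` could be packed into the two bytes of a register-to-register x86 instruction. The helper `encode_r32_r32` is hypothetical; the register numbers not shown on the slide are the standard IA-32 ones.

```python
# Standard IA-32 register numbers (the slide lists eax, ecx and edi explicitly).
X86_REGS = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
            "esp": 4, "ebp": 5, "esi": 6, "edi": 7}


def encode_r32_r32(op1b, rm, regop, mod=0x3):
    """Pack "%op1b:8 %mod:2 %regop:3 %rm:3" into opcode + ModRM bytes."""
    modrm = (mod << 6) | (X86_REGS[regop] << 3) | X86_REGS[rm]
    return bytes([op1b, modrm])


add_edi_eax = encode_r32_r32(0x01, "edi", "eax")  # add edi, eax
mov_ecx_edi = encode_r32_r32(0x89, "ecx", "edi")  # mov ecx, edi
```

For opcodes `0x01` and `0x89` the `rm` field names the destination and `regop` the source, which matches the operand order `rm, regop` used in `set_operands`.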

12 ISA Mapping
Source operands are numbered $0 $1 $2, left to right.
(add)
    isamap_instrs {
      add %reg %reg %reg;
    } = {
      mov_r32_r32 edi $1;
      add_r32_r32 edi $2;
      mov_r32_r32 $0 edi;
    };
(subf)
    isamap_instrs {
      subf %reg %reg %reg;
    } = {
      mov_r32_r32 edi $2;
      sub_r32_r32 edi $1;
      mov_r32_r32 $0 edi;
    };
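One way to read these mapping rules is as templates whose `$n` placeholders are replaced by the operands of the decoded source instruction. The Python sketch below expands the two rules from this slide; the `MAPPINGS` table, the `expand` function and the operand names `r3`, `r4`, `r5` are illustrative assumptions, not ISAMAP's internal representation.

```python
# Target-instruction templates copied from the add and subf mappings.
MAPPINGS = {
    "add":  ["mov_r32_r32 edi $1", "add_r32_r32 edi $2", "mov_r32_r32 $0 edi"],
    "subf": ["mov_r32_r32 edi $2", "sub_r32_r32 edi $1", "mov_r32_r32 $0 edi"],
}


def expand(mnemonic, operands):
    """Substitute $0, $1, ... in each template with the source operands."""
    out = []
    for template in MAPPINGS[mnemonic]:
        parts = [operands[int(p[1:])] if p.startswith("$") else p
                 for p in template.split()]
        out.append(" ".join(parts))
    return out


add_code = expand("add", ["r3", "r4", "r5"])    # add  r3, r4, r5
subf_code = expand("subf", ["r3", "r4", "r5"])  # subf r3, r4, r5
```

The `subf` rule loads `$2` first because PowerPC `subf rt, ra, rb` computes `rb - ra`, so the subtrahend is `$1`.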

13 ISAMAP Flow
[Diagram. Labels as extracted: acpp Source ISA Target ISA ISA Mapping DBT Source Compiler ArchC Host Code DBT Libraries ISAMAP]

14 Overall ISAMAP Structure
Standard DBT implementation
  – 16 MB code cache
  – Block linkage (at first touch)
  – No traces
  – Syscall mapping
In addition, it provides mapping support
  – Instruction semantics (load, store, branch, fp)
  – Register read/write status
  – Conditional mapping
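The code cache and first-touch block linkage can be pictured with a toy dispatcher. The Python sketch below is a generic DBT loop under several assumptions that are not from the slides: guest code is modeled as a dictionary from a block address to its successor address, every translated block is charged a fixed 64 bytes, and a full cache is simply flushed.

```python
class Block:
    def __init__(self, pc, next_pc):
        self.pc = pc
        self.next_pc = next_pc  # guest address of the successor, None = exit
        self.link = None        # direct pointer to the translated successor


class CodeCache:
    def __init__(self, capacity=16 * 1024 * 1024):
        self.capacity, self.used, self.blocks = capacity, 0, {}

    def lookup(self, pc):
        return self.blocks.get(pc)

    def insert(self, pc, block, size):
        if self.used + size > self.capacity:
            # Assumed policy: flush everything. A real translator must also
            # unlink blocks that are still being executed.
            self.blocks.clear()
            self.used = 0
        self.blocks[pc] = block
        self.used += size


def run(guest, entry, cache, max_blocks=100):
    stats = {"translations": 0, "lookups": 0, "executed": 0}

    def fetch(pc):
        stats["lookups"] += 1
        block = cache.lookup(pc)
        if block is None:                      # miss: translate and cache
            block = Block(pc, guest[pc])
            stats["translations"] += 1
            cache.insert(pc, block, size=64)
        return block

    block = fetch(entry)
    while stats["executed"] < max_blocks:
        stats["executed"] += 1                 # "execute" the translated block
        if block.next_pc is None:
            break
        if block.link is None:                 # first touch: link the blocks
            block.link = fetch(block.next_pc)
        block = block.link                     # later visits skip the lookup
    return stats


loop_stats = run({0x100: 0x200, 0x200: 0x100}, 0x100, CodeCache(), max_blocks=10)
```

In the two-block loop, only the first pass goes through the dispatcher; every later transition follows a link, which is the point of linking blocks at first touch.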

15 Register Read Semantics
Avoids unnecessary register reads/writes
    add_r32_r32.set_asm("add %reg, %reg", rm, regop);
    add_r32_r32.set_encoder(op1b=0x01, mod=0x3);
    add_r32_r32.set_read(regop);
    mov_r32_r32.set_asm("mov %reg %reg", rm, regop);
    mov_r32_r32.set_encoder(op1b=0x89, mod=0x3);
    mov_r32_r32.set_write(rm);

16 Conditional Mappings
    isamap_instrs {
      or %reg %reg %reg;
    } = {
      if ($1 = $2) {
        mov_r32_m32disp edi $1;
        mov_m32disp_r32 $0 edi;
      } else {
        mov_r32_m32disp edi $1;
        or_r32_m32disp edi $2;
        mov_m32disp_r32 $0 edi;
      }
    };

17 Conditional Mapping (cont.)
    isamap_instrs {
      rlwinm %reg %reg %imm %imm %imm;
    } = {
      if ($2 = 0) {
        mov_r32_m32disp edi $1;
        and_r32_imm32 edi mask32($3, $4);
        mov_m32disp_r32 $0 edi;
      } else {
        mov_r32_m32disp edi $1;
        rol_r32_imm8 edi $2;
        and_r32_imm32 edi mask32($3, $4);
        mov_m32disp_r32 $0 edi;
      }
    };
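The `rlwinm` rule relies on a `mask32` helper and drops the rotate when the shift amount is zero. The Python sketch below shows one plausible reading: `mask32` follows the PowerPC MASK(MB, ME) definition (bit 0 is the most significant bit, with wrap-around when MB > ME), and `map_rlwinm` picks the shorter sequence at translation time. The function names and the operand strings are assumptions made for illustration.

```python
MASK = 0xFFFFFFFF


def mask32(mb, me):
    """PowerPC-style mask: ones from bit mb through bit me (bit 0 = MSB)."""
    head = MASK >> mb                 # ones from bit mb down to bit 31
    tail = (MASK << (31 - me)) & MASK  # ones from bit 0 down to bit me
    return head & tail if mb <= me else head | tail


def rol32(x, n):
    """Rotate a 32-bit value left by n (0 <= n <= 31)."""
    return ((x << n) | (x >> (32 - n))) & MASK


def rlwinm(rs, sh, mb, me):
    """Reference semantics: rotate left, then AND with the mask."""
    return rol32(rs, sh) & mask32(mb, me)


def map_rlwinm(ra, rs, sh, mb, me):
    """Emit the target sequence, skipping the rotate when sh is 0."""
    seq = [f"mov_r32_m32disp edi {rs}"]
    if sh != 0:
        seq.append(f"rol_r32_imm8 edi {sh}")
    seq.append(f"and_r32_imm32 edi {mask32(mb, me):#010x}")
    seq.append(f"mov_m32disp_r32 {ra} edi")
    return seq


top_byte = rlwinm(0x12345678, 8, 24, 31)  # rotate left 8, keep the low byte
```

Because the shift and mask bounds are immediates, both the condition and the mask value are known when the instruction is translated, so the test costs nothing at run time.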

18 Mapping PPC Instruction cmp
[Figure: the condition register (CR) is divided into 8 groups of 4 bits, numbered 1 to 8; the mapping has to determine which of the 8 groups cmp updates.]
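To show why `cmp` is awkward to map, here is a Python sketch of its effect on the condition register, following the PowerPC definition: a signed compare produces the 4 bits LT, GT, EQ, SO, which replace one of the eight CR fields. Fields are numbered 0 to 7 here, as in the PowerPC manuals, rather than 1 to 8 as on the slide; the function names are hypothetical.

```python
def to_signed32(x):
    """Interpret a 32-bit value as a two's-complement integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def cmp_bits(a, b, xer_so=0):
    """4-bit result of a signed compare: LT, GT, EQ, SO (MSB to LSB)."""
    a, b = to_signed32(a), to_signed32(b)
    if a < b:
        bits = 0b1000
    elif a > b:
        bits = 0b0100
    else:
        bits = 0b0010
    return bits | (xer_so & 1)


def update_cr(cr, crfd, a, b, xer_so=0):
    """Replace CR field crfd (0 = most significant 4 bits) with the result."""
    shift = 4 * (7 - crfd)
    cleared = cr & ~(0xF << shift) & 0xFFFFFFFF
    return cleared | (cmp_bits(a, b, xer_so) << shift)


cr_after = update_cr(0, 0, 1, 2)  # 1 < 2 sets LT in field 0
```

Since the field number is an immediate in the instruction, the shift is a translation-time constant, which is the kind of detail a careful mapping can exploit.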

19 Mapping PPC Instruction cmp (cont.) Careful analysis pays off….

20 At the end: Optimization Steps
Local register allocation
Copy propagation
Dead-code elimination
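Two of these steps can be sketched on a single basic block. The Python example below runs a local copy propagation followed by a backward dead-code elimination over a hypothetical three-address IR of `(op, dst, src...)` tuples. This is a textbook formulation, not ISAMAP's implementation, and it assumes instructions have no side effects beyond writing `dst` (no flags, no memory).

```python
def copy_propagate(block):
    """Replace uses of copied registers by their source, within one block."""
    copies, out = {}, []
    for op, dst, *srcs in block:
        srcs = [copies.get(s, s) for s in srcs]
        if op == "mov" and srcs[0] == dst:
            continue  # became a self-move: nothing is written, drop it
        # Writing dst invalidates every recorded copy that involves it.
        copies = {d: s for d, s in copies.items() if d != dst and s != dst}
        if op == "mov":
            copies[dst] = srcs[0]
        out.append((op, dst, *srcs))
    return out


def eliminate_dead(block, live_out):
    """Drop instructions whose result is never read (backward liveness)."""
    live, out = set(live_out), []
    for op, dst, *srcs in reversed(block):
        if dst not in live:
            continue
        live.discard(dst)
        live.update(srcs)
        out.append((op, dst, *srcs))
    out.reverse()
    return out


# Two mapped adds in a row: r3 = r4 + r5, then r6 = r3 + r4, both via edi.
block = [
    ("mov", "edi", "r4"), ("add", "edi", "edi", "r5"), ("mov", "r3", "edi"),
    ("mov", "edi", "r3"), ("add", "edi", "edi", "r4"), ("mov", "r6", "edi"),
]
optimized = eliminate_dead(copy_propagate(block), live_out={"r3", "r6"})
```

In this example the reload of `r3` into `edi` disappears because `edi` already holds that value, and the initial copy of `r4` becomes dead once the first add reads `r4` directly.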

21 Optimization Results

22 ISAMAP vs. QEMU (Int) Speed-ups ranging from 1.12 to 3.01

23 ISAMAP vs. QEMU (FP) Not fair, as QEMU was not using SSE

24 ISAMAP Good Side
Allows for a fast implementation
Isolates the translator issues from the mapping
Lets the focus be on the mapping
Can reuse simulator descriptions

25 ISAMAP Bad Side
Does not allow high-level C descriptions
Still needs to go through asm details
But on the other hand...
  – 1 PhD in one year for the tool
  – 4-6 months for both descriptions and the mapping (no previous experience)

26 Related Work
Dynamo, ADORE, Aries, Digital FX!32, UQDBT, Yirr-Ma, DAISY, QEMU, IA-32 EL

27 Future Work
Additional issues
  – Self-modifying code
  – Cover more SPEC programs
Measure mapping vs. tool speedup contribution
Evaluate the translation overhead
  – From C to x86
  – From C to PPC to x86
Mappings to embedded engines

28 The End Work supported by FAPESP and CNPq Thanks for the feedback !!

