Published by Kevin Harrison. Modified over 8 years ago.
1
ISAMAP: Instruction Mapping Driven by Dynamic Binary Translation
Maxwell Souza, Daniel Nicácio and Guido Araujo
2
Motivation
Architecture diversity is increasing
Legacy code needs to use new architecture features
Code portability between architectures is also desirable
Dynamic Binary Translation (DBT) enables both
3
ArchC
Processor description language
–SystemC compatible
–8 researchers for the last 5 years
Features:
–Fast interpreted/compiled simulation
–Linux OS syscall emulation
–Runs code directly from GCC (allows gdb support)
Processors:
–MIPS, SPARC, PPC, 8051, ARM, OR10K, etc.
–Runs MediaBench, MiBench and SPEC CINT
–Simulation speed: from 100 KIPS to 570 MIPS
4
Instruction Set Architecture (AC_ISA)
AC_ISA(mips1) {
  ac_format Type_R = "%op:6 %rs:5 %rt:5 %rd:5 0x00:5 %func:6";
  ac_format Type_I = "%op:6 %rs:5 %rt:5 %imm:16";
  ac_instr add;
  ac_instr load;
  ISA_CTOR(mips1) {
    add.set_asm("add %reg, %reg, %reg", rd, rs, rt);
    add.set_decoder(op=0x00, func=0x20);
    load.set_asm("lw %reg, %imm(%reg)", rt, imm, rs);
    load.set_decoder(op=0x23);
  };
};
Slide callouts: binary field, instruction declaration, decoding order
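The format strings above name each binary field and its width in bits. A minimal sketch of how such a string can drive field extraction, assuming fields are packed from the most significant bit down; `parse_format` and `decode` are hypothetical helper names for illustration, not part of ArchC:

```python
# Hypothetical helpers (not ArchC APIs): decode a word using an
# ac_format-style string such as "%op:6 %rs:5 ...".

TYPE_R = "%op:6 %rs:5 %rt:5 %rd:5 0x00:5 %func:6"


def parse_format(fmt):
    """Return a list of (name, width) pairs, in the order they appear."""
    fields = []
    for token in fmt.split():
        name, width = token.rsplit(":", 1)
        # Named fields start with '%'; constants (e.g. 0x00) keep their text.
        fields.append((name.lstrip("%"), int(width)))
    return fields


def decode(word, fmt):
    """Extract every field from `word`, assuming MSB-first packing."""
    fields = parse_format(fmt)
    shift = sum(width for _, width in fields)
    values = {}
    for name, width in fields:
        shift -= width
        values[name] = (word >> shift) & ((1 << width) - 1)
    return values


# MIPS "add $3, $1, $2" assembles to 0x00221820.
decoded = decode(0x00221820, TYPE_R)
```

With this word, `decoded["op"]` is 0 and `decoded["func"]` is 0x20, which matches the `add.set_decoder(op=0x00, func=0x20)` line on the slide.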
5
Architecture Resources (AC_ARCH)
AC_ARCH(mips1) {
  ac_mem MEM:256K;
  ac_regbank RB:32;
  ac_reg lo, hi;
  ac_pipe PIPE = {IF, ID, EX, MEM, WB};
  ac_format Fmt_EX_MEM = "%alures:32 %wdata:32 %rdest:5 %regwrite:1 %memread:1 %memwrite:1";
  ac_reg EX_MEM;
  ac_wordsize 32;
  ARCH_CTOR(mips1) {
    ac_isa("mips1_isa.ac");
    ..
  };
};
6
Instruction Behavior (ac_behavior)
void ac_behavior( Type_R, int stage ){
  switch(stage){
    case IF:
    case ID:
      /* Checking forwarding for the rs register */
      if ( (EX_MEM.regwrite == 1) && (EX_MEM.rdest != 0) && (EX_MEM.rdest == ID_EX.rs) )
        operand1 = EX_MEM.alures.read();
      else if ( (MEM_WB.regwrite == 1) && (MEM_WB.rdest != 0) && (MEM_WB.rdest == ID_EX.rs) )
        operand1 = MEM_WB.wbdata.read();
      else
        operand1 = RB.read(rs);
      ...
    default:
      break;
  }
}
7
Jump and Branches Semantics
Additional information
–jump() : target computation
–delay() : conditional
call.set_decoder(op=0x01);
call.jump(ac_pc+(disp30<<2));
call.delay(1, true);
call.behavior(writeReg(15, ac_pc));
be.set_decoder(op=0x00, cond=0x01, op2=0x02);
be.branch(ac_pc+(disp22<<2));
be.cond(PSR_icc_z);
be.delay(1, PSR_icc_z || !an);
8
ArchC Overview
[Diagram boxes: ArchC Description; ArchC Pre-processor (acpp); ArchC IR; Simulator Generator; Back-end Generator; Linker Generator; Assembler Generator; ISAMAP]
9
Instruction Mapping Description
Driven by DBT
Descriptions use ArchC language ISA models
Source architecture ISA
Target architecture ISA
Mapping between source and target
Low-level ISA mapping
10
Instruction Set Architecture (AC_ISA)
ISA(powerpc) {
  isa_format XO1 = "%opcd:6 %rt:5 %ra:5 %rb:5 %oe:1 %xos:9 %rc:1";
  isa_instr add, subf;
  isa_regbank r:32 = [0..31];
  ISA_CTOR(powerpc) {
    add.set_asm("add %reg %reg %reg", rt, ra, rb);
    add.set_decoder(opcd=31, oe=0, xos=266, rc=0);
    subf.set_asm("subf %reg %reg %reg", rt, ra, rb);
    subf.set_decoder(opcd=31, oe=0, xos=40, rc=0);
  }
}
11
Instruction Set Architecture (AC_ISA)
ISA(x86) {
  isa_format op1b_r32 = "%op1b:8 %mod:2 %regop:3 %rm:3";
  isa_instr add_r32_r32, mov_r32_r32;
  isa_reg eax = 0;
  isa_reg ecx = 1;
  ...
  isa_reg edi = 7;
  ISA_CTOR(x86) {
    add_r32_r32.set_operands("add %reg %reg", rm, regop);
    add_r32_r32.set_encoder(op1b=0x01, mod=0x3);
    mov_r32_r32.set_operands("mov %reg %reg", rm, regop);
    mov_r32_r32.set_encoder(op1b=0x89, mod=0x3);
  }
}
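The `op1b_r32` format above is an opcode byte followed by a ModRM byte (`mod`, `regop`, `rm`). A minimal sketch of how the `set_encoder` constants could be combined with operand registers to emit bytes. The `encode` function and the two tables are hypothetical names for illustration; the register numbers the slide elides are filled in with the standard x86 encoding:

```python
# Hypothetical encoder sketch; not ISAMAP's actual implementation.

# Standard x86 32-bit register numbers (the slide shows eax, ecx and edi).
REG = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
       "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

# Fixed fields taken from the set_encoder lines on the slide.
ENCODERS = {
    "add_r32_r32": {"op1b": 0x01, "mod": 0x3},
    "mov_r32_r32": {"op1b": 0x89, "mod": 0x3},
}


def encode(instr, rm, regop):
    """Pack "%op1b:8 %mod:2 %regop:3 %rm:3" into two bytes."""
    fixed = ENCODERS[instr]
    modrm = (fixed["mod"] << 6) | (REG[regop] << 3) | REG[rm]
    return bytes([fixed["op1b"], modrm])


add_bytes = encode("add_r32_r32", "eax", "ecx")  # add eax, ecx
mov_bytes = encode("mov_r32_r32", "edi", "eax")  # mov edi, eax
```

Here `add_bytes` is `01 C8` and `mov_bytes` is `89 C7`, the usual encodings of `add eax, ecx` and `mov edi, eax`.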
12
ISA Mapping
Source operands are numbered in order of appearance: $0 $1 $2
(add)
isamap_instrs {
  add %reg %reg %reg;
} = {
  mov_r32_r32 edi $1;
  add_r32_r32 edi $2;
  mov_r32_r32 $0 edi;
};
(subf)
isamap_instrs {
  subf %reg %reg %reg;
} = {
  mov_r32_r32 edi $2;
  sub_r32_r32 edi $1;
  mov_r32_r32 $0 edi;
};
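A mapping rule is a template: each `$n` in the target sequence is replaced by the n-th operand of the decoded source instruction. A minimal sketch of that substitution, using the two rules from this slide; `MAPPINGS` and `expand` are hypothetical names, and operands are treated as plain strings:

```python
import re

# The add and subf rules from the slide, one target instruction per entry.
MAPPINGS = {
    "add": ["mov_r32_r32 edi $1", "add_r32_r32 edi $2", "mov_r32_r32 $0 edi"],
    "subf": ["mov_r32_r32 edi $2", "sub_r32_r32 edi $1", "mov_r32_r32 $0 edi"],
}


def expand(mnemonic, operands):
    """Replace every $n in the rule with operands[n]."""
    def substitute(match):
        return operands[int(match.group(1))]

    return [re.sub(r"\$(\d+)", substitute, line) for line in MAPPINGS[mnemonic]]


# PowerPC "subf r3, r4, r5" computes r3 = r5 - r4.
subf_code = expand("subf", ["r3", "r4", "r5"])
```

The expansion loads `$2` first and subtracts `$1`, which preserves the PowerPC operand order for `subf` (destination = second source minus first source).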
13
ISAMAP Flow
[Diagram labels: acpp; Source ISA; Target ISA; ISA Mapping; DBT Source; Compiler; ArchC; Host Code; DBT Libraries; ISAMAP]
14
Overall ISAMAP Structure
Standard DBT implementation
–16 MB code cache
–Block linkage (at first touch)
–No traces
–Syscall mapping
In addition, it provides mapping support
–Instruction semantics (load, store, branch, fp)
–Register read/write status
–Conditional mapping
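The code cache and first-touch block linkage listed above can be illustrated with a toy dispatch loop. This is a sketch under stated assumptions, not ISAMAP's code: translated blocks are stood in for by Python callables that return the next source PC, the cache is an unbounded dictionary rather than a 16 MB buffer, and `Block`, `run_dbt` and the sample program are hypothetical:

```python
class Block:
    """A translated block plus links to successors already seen."""

    def __init__(self, run):
        self.run = run    # callable(state) -> next source PC, or None to stop
        self.links = {}   # next source PC -> Block, filled at first touch


def run_dbt(program, state, entry_pc):
    """Execute `program`, translating each block once; return the count."""
    cache = {}
    translations = 0

    def lookup(pc):
        nonlocal translations
        if pc not in cache:
            cache[pc] = Block(program[pc])  # stands in for real translation
            translations += 1
        return cache[pc]

    block = lookup(entry_pc)
    while True:
        next_pc = block.run(state)
        if next_pc is None:
            return translations
        successor = block.links.get(next_pc)
        if successor is None:
            # First touch: go through the cache once, then link directly.
            successor = lookup(next_pc)
            block.links[next_pc] = successor
        block = successor


# Toy source program: sum 3 + 2 + 1 in a loop, then exit.
def block_init(s):
    s["i"], s["sum"] = 3, 0
    return 4


def block_loop(s):
    s["sum"] += s["i"]
    s["i"] -= 1
    return 4 if s["i"] > 0 else 8


def block_exit(s):
    return None


state = {}
translated = run_dbt({0: block_init, 4: block_loop, 8: block_exit}, state, 0)
```

The loop block runs three times but is translated once; after its first iteration it reaches itself through `links` without consulting the cache.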
15
Register Read Semantics
Avoids unnecessary register reads/writes
add_r32_r32.set_asm("add %reg, %reg", rm, regop);
add_r32_r32.set_encoder(op1b=0x01, mod=0x3);
add_r32_r32.set_read(regop);
mov_r32_r32.set_asm("mov %reg %reg", rm, regop);
mov_r32_r32.set_encoder(op1b=0x89, mod=0x3);
mov_r32_r32.set_write(rm);
16
Conditional Mappings
isamap_instrs {
  or %reg %reg %reg;
} = {
  if ($1 = $2) {
    mov_r32_m32disp edi $1;
    mov_m32disp_r32 $0 edi;
  }
  else {
    mov_r32_m32disp edi $1;
    or_r32_m32disp edi $2;
    mov_m32disp_r32 $0 edi;
  }
};
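The condition in this rule compares operand fields of the source instruction, so it can be resolved when the instruction is translated rather than when it runs. A minimal sketch of that selection for the `or` rule, assuming operands are plain strings; `map_or` is a hypothetical name:

```python
def map_or(op0, op1, op2):
    """Pick the target sequence for "or $0, $1, $2" at translation time."""
    if op1 == op2:
        # or with identical sources is a register move: no or is emitted.
        return [
            f"mov_r32_m32disp edi {op1}",
            f"mov_m32disp_r32 {op0} edi",
        ]
    return [
        f"mov_r32_m32disp edi {op1}",
        f"or_r32_m32disp edi {op2}",
        f"mov_m32disp_r32 {op0} edi",
    ]


move_case = map_or("r3", "r4", "r4")
general_case = map_or("r3", "r4", "r5")
```

The move case emits two target instructions instead of three, and the choice costs nothing at run time because only the selected sequence is emitted.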
17
Conditional Mapping (cont.)
isamap_instrs {
  rlwinm %reg %reg %imm %imm %imm;
} = {
  if ($2 = 0) {
    mov_r32_m32disp edi $1;
    and_r32_imm32 edi mask32($3, $4);
    mov_m32disp_r32 $0 edi;
  }
  else {
    mov_r32_m32disp edi $1;
    rol_r32_imm8 edi $2;
    and_r32_imm32 edi mask32($3, $4);
    mov_m32disp_r32 $0 edi;
  }
};
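The `mask32($3, $4)` term can be folded to a constant at translation time because both arguments are immediates. A sketch of what it plausibly computes, assuming it follows the PowerPC `MASK(MB, ME)` definition (bit 0 is the most significant bit, and the mask wraps around when MB > ME); the Python functions below are illustrative, not ISAMAP's implementation:

```python
WORD = 0xFFFFFFFF


def mask32(mb, me):
    """Ones from bit mb through bit me, PowerPC numbering (bit 0 = MSB)."""
    from_mb = WORD >> mb                  # ones in bits mb..31
    up_to_me = (WORD << (31 - me)) & WORD  # ones in bits 0..me
    if mb <= me:
        return from_mb & up_to_me
    return from_mb | up_to_me              # wrap-around mask


def rotl32(value, shift):
    """Rotate a 32-bit value left by shift (0..31)."""
    value &= WORD
    return ((value << shift) | (value >> (32 - shift))) & WORD


def rlwinm(rs, sh, mb, me):
    """Rotate left word immediate, then AND with mask."""
    return rotl32(rs, sh) & mask32(mb, me)


# Extract the top byte of a word: rotate it into the low byte, keep 8 bits.
top_byte = rlwinm(0x12345678, 8, 24, 31)
```

When the shift amount is 0 the rotate is the identity, which is why the first branch of the rule can drop the `rol` instruction.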
18
Mapping PPC Instruction cmp
[Figure: the condition register (CR) drawn as 8 groups, numbered 1 to 8, of 4 bits each; callout: "Which group out of 8?"]
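The difficulty the figure points at is that `cmp` writes one 4-bit group of the 32-bit CR, selected by a field of the instruction. A sketch of that update following the PowerPC definition (field bits LT, GT, EQ, SO; field 0 is the most significant nibble). Fields are numbered 0 to 7 here, whereas the slide counts 1 to 8; `cmp_update_cr` is a hypothetical name and the SO bit is passed in rather than read from XER:

```python
def to_signed32(value):
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def cmp_update_cr(cr, crfd, ra, rb, so=0):
    """Return CR after a signed compare of ra and rb into field crfd."""
    a, b = to_signed32(ra), to_signed32(rb)
    if a < b:
        bits = 0b1000  # LT
    elif a > b:
        bits = 0b0100  # GT
    else:
        bits = 0b0010  # EQ
    bits |= so & 1     # SO is copied from the summary-overflow flag
    shift = 4 * (7 - crfd)
    cleared = cr & ~(0xF << shift) & 0xFFFFFFFF
    return cleared | (bits << shift)


cr_after = cmp_update_cr(0, 0, 1, 2)  # 1 < 2, result goes to field 0
```

Only the selected group changes; the other seven groups of CR keep their previous contents.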
19
Mapping PPC Instruction cmp (cont.)
Careful analysis pays off...
20
At the end: Optimization Steps
Local register allocation
Copy propagation
Dead-code elimination
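Two of these steps can be sketched on a straight-line block. The example below uses a toy three-address form `(op, dst, srcs)` rather than x86, assumes instructions have no side effects beyond writing `dst`, and uses hypothetical function names; it illustrates the classic passes, not ISAMAP's implementation:

```python
def propagate_copies(block):
    """Forward pass: replace uses of a copied register by its source."""
    copies = {}  # register -> register it currently duplicates
    result = []
    for op, dst, srcs in block:
        srcs = tuple(copies.get(s, s) for s in srcs)
        # dst is overwritten, so any copy relation involving it is stale.
        copies = {d: s for d, s in copies.items() if d != dst and s != dst}
        if op == "mov" and srcs[0] != dst:
            copies[dst] = srcs[0]
        result.append((op, dst, srcs))
    return result


def eliminate_dead_code(block, live_out):
    """Backward pass: drop instructions whose result is never read."""
    live = set(live_out)
    kept = []
    for op, dst, srcs in reversed(block):
        if dst in live:
            kept.append((op, dst, srcs))
            live.discard(dst)
            live.update(srcs)
    kept.reverse()
    return kept


block = [
    ("mov", "t0", ("r1",)),
    ("mov", "t1", ("t0",)),
    ("add", "r3", ("t1", "r2")),
    ("mov", "t2", ("r3",)),
]
optimized = eliminate_dead_code(propagate_copies(block), live_out={"r3"})
```

Copy propagation rewrites the `add` to read `r1` directly, which leaves all three `mov` instructions without readers, so dead-code elimination removes them and only the `add` remains.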
21
Optimization Results
22
ISAMAP vs. QEMU (Int)
Speed-ups ranging from 1.12 to 3.01
23
ISAMAP vs. QEMU (FP)
Not a fair comparison, as QEMU was not using SSE
24
ISAMAP Good Side
Allows for a fast implementation
Isolates the translator issues from the mapping
Lets the focus be on the mapping
Can reuse simulator descriptions
25
ISAMAP Bad Side
Does not allow high-level C descriptions
Still needs to go through asm details
But on the other hand...
–1 PhD in one year for the tool
–4-6 months for both descriptions and the mapping (no previous experience)
26
Related Work
Dynamo
ADORE
Aries
Digital FX!32
UQDBT
Yirr-Ma
DAISY
QEMU
IA-32 EL
27
Future Work
Additional issues
–Self-modifying code
–Cover more SPEC programs
Measure mapping vs. tool speedup contribution
Evaluate the translation overhead
–From C to x86
–From C to PPC to x86
Mappings to embedded engines
28
The End
Work supported by FAPESP and CNPq
Thanks for the feedback!