
A Reconfigurable Functional Unit for an Adaptive Dynamic Extensible Processor
Hamid Noori*, Farhad Mehdipour†, Norifumi Yoshimastu‡, Kazuaki Murakami*, Koji Inoue* and Morteza Saheb Zamani†
*Department of Informatics, Kyushu Univ., Japan
‡Fukuoka Laboratory for Emerging & Enabling Technology of SoC, Japan
†Computer Engineering and Information Technology Department, Amirkabir Univ. of Technology, Iran

Operation Modes

Normal mode
- Profiling (optional)
- Executing Custom Instructions on the RFU and other parts of the code on the base processor

Training mode
- Profiling
- Detecting start address of Hot Basic Blocks (HBBs)
- Generating Custom Instructions
- Generating Configuration Data for the RFU
- Binary rewriting
- Initializing the Sequencer Table
♦ Online: needs simple hardware for profiling. All tasks are run on the base processor.
♦ Offline: needs a PC trace after taken branches/jumps.

General Overview of the architecture

Adaptive Dynamic Extensible Processor
- Base Processor: N-way in-order general RISC (Fetch, Decode, Execute, Memory, Write; Reg File)
- Augmented Hardware:
  - Profiler: detects start addresses of Hot Basic Blocks (HBBs)
  - RFU: executes Custom Instructions
  - Sequencer: switches between main processor and RFU

[Figure: three panels labeled Training Mode, Training Mode and Normal Mode, each showing Applications running on a Processor with Profiler, RFU and Sequencer. Panel labels: Binary-Level Profiling; Detecting Start Address of HBBs; Running Tools for Generating Custom Instructions, Generating Configuration Data for ACC and Initializing Sequencer Table; Binary Rewriting; Monitors PC and Switches between main processor and ACC; Executing CIs.]

Tool Chain

Custom instructions
1- Exclude floating point, multiply, divide and load instructions
2- Include at most one STORE, at most one BRANCH/JUMP and all other fixed point instructions

Generating Custom instructions
- Finding the biggest sequence of instructions in the HBB that can be executed on the ACC
- Moving the instructions and appending supportable instructions to the head of the detected instruction sequence after checking flow-dependency and anti-dependency
- Moving the instructions and appending supportable instructions to the tail of the detected instruction sequence after checking flow-dependency and anti-dependency
- Rewriting object code if instructions have been moved
- Moving instructions should not modify the logic of the application
- Custom instruction generation is done without considering any other constraints

    4052c0  addiu $29,$29,-32
    4052c8  mov.d $f0,$f12
    4052d0  sw $18,24($29)
    4052d8  addu $18,$0,$6
    4052e0  sw $31,28($29)
    4052e8  sw $16,16($29)
    4052f0  mfc1 $16,$f0
    4052f8  mfc1 $17,$f1
            srl $6,$17,0x14
            andi $6,$6,2047
            sltiu $2,$6,2047
            addu $6,$6,$18
            sltiu $2,$6,2047
            lui $2,32783
            and $17,$17,$2
            andi $2,$6,2047
            sll $2,$2,0x14
            or $17,$17,$2
            mtc1 $16,$f0
            mtc1 $17,$f1
            lw $31,28($29)
            lw $16,16($29)
            addiu $29,$29,32
            jr $31

Speedup

[Figure: speedup chart; no data survives in the transcript.]

RFU Architecture

[Figure: RFU architecture. Legend: Functional Unit; Base connection; Optimized connection. Labels: Input from register file; Output to register file.]

Integrating RFU with the Base Processor

[Figure: integration of the RFU with the base processor. Labels: DEC/EXE Pipeline Registers; FU1, FU2, FU3, FU4; ACC; Reg0 ... Reg31; Sequencer; EXE/MEM Pipeline Registers; Config Mem; Decoder; RFU.]
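
The first step of custom instruction generation (finding the biggest sequence of instructions in the HBB that satisfies the two selection rules) can be sketched as a sliding window over the block's mnemonics. This is a minimal illustration, not the authors' tool: the function name `longest_candidate` and the opcode tables are hypothetical, and treating `mfc1`/`mtc1` as floating point instructions is an assumption. The later steps (moving instructions to the head or tail after dependency checks, and binary rewriting) are not covered.

```python
# Hypothetical opcode tables; the poster names only the categories.
EXCLUDED = {
    # floating point, including coprocessor-1 moves (assumption)
    "mov.d", "add.d", "sub.d", "mul.d", "div.d", "mfc1", "mtc1",
    # multiply and divide
    "mult", "multu", "div", "divu",
    # loads
    "lw", "lh", "lhu", "lb", "lbu",
}
STORES = {"sw", "sh", "sb"}
BRANCHES = {"beq", "bne", "blez", "bgtz", "bltz", "bgez",
            "j", "jal", "jr", "jalr"}


def longest_candidate(mnemonics):
    """Return (start, end), end exclusive, of the longest contiguous run
    that has no excluded instruction, at most one store and at most one
    branch/jump. Ties keep the earliest run."""
    best = (0, 0)
    start = 0
    stores = branches = 0
    for end, op in enumerate(mnemonics):
        if op in EXCLUDED:
            # An unsupported instruction ends the current window.
            start = end + 1
            stores = branches = 0
            continue
        stores += op in STORES
        branches += op in BRANCHES
        # Shrink from the left until the store/branch limits hold again.
        while stores > 1 or branches > 1:
            dropped = mnemonics[start]
            stores -= dropped in STORES
            branches -= dropped in BRANCHES
            start += 1
        if end + 1 - start > best[1] - best[0]:
            best = (start, end + 1)
    return best


# Mnemonics of the hot basic block listed above, in program order.
hbb = ["addiu", "mov.d", "sw", "addu", "sw", "sw", "mfc1", "mfc1",
       "srl", "andi", "sltiu", "addu", "sltiu", "lui", "and", "andi",
       "sll", "or", "mtc1", "mtc1", "lw", "lw", "addiu", "jr"]

start, end = longest_candidate(hbb)
print(start, end, hbb[start:end])  # 8 18, the ten instructions srl ... or
```

Under these assumptions the selected run is the ten fixed point instructions from `srl` through `or`, bounded by the `mfc1` and `mtc1` moves on either side.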

