Bypass Aware Instruction Scheduling for Register File Power Reduction. Sanghyun Park, Aviral Shrivastava, Nikil Dutt, Alex Nicolau, Yunheung Paek, Eugene Earlie.


1 Bypass Aware Instruction Scheduling for Register File Power Reduction
Sanghyun Park, Aviral Shrivastava, Nikil Dutt, Alex Nicolau, Yunheung Paek, Eugene Earlie
Published: Proceedings of the 2006 LCTES Conference
Session: Low power issues
Presented by Saleel Kudchadker

2 Processor Power
- Power is now a primary architectural concern
  - Processor power consumption doubles with Pentium generations
- High power consumption
  - Increases packaging/cooling cost
  - Limits achievable performance
- Important for handheld embedded devices
  - Battery life
  - Weight
[Figure: "Managing the Impact of Increasing…": cost of removing heat from a microprocessor against increasing power consumption. Source: Intel website]

3 Power Density
- Power density = power / area
- Silicon is a bad heat conductor
- Areas with high power density become hot
- Leakage current in transistors increases as temperature rises
- Important to distribute power over the die
- Heat Stroke: the processor has to stop if any part of the die exceeds the critical temperature

4 Register File Power
- The register file (RF) is a significant source of power dissipation
  - Motorola M.CORE: approx. 16% of processor power
  - RF may consume up to 25% of processor power
- High register file power density
  - Small size, causes hotspots
  - e.g., Alpha 21264, Intel Pentium
- Trend: increasing RF power due to
  - Microarchitectural enhancements to improve IPC
  - Compiler techniques to improve IPC
  - Large register files (esp. VLIW processors)

5 Reducing RF Power: Related Work
Three ways to reduce RF power:
1. Reduce energy per access to RF
2. Reduce number of registers in RF
3. Reduce number of accesses to RF

6 “On-Demand RF Read”
- Existing processors anticipatorily read the RF
  - e.g., Pentium 4, Alpha 21264
- SpecInt95 running on MIPS II
  - 36% of operands come from bypasses
- 8-issue SimpleScalar running SpecInt2K
  - 50-70% of operands come from bypasses
- Read from the RF only if necessary
  - First find out if the value is present in the bypasses
  - If not, then read the value from the RF
  - We’ll call this “On-Demand RF Read”
- When applied to an Intel XScale model
  - 58% energy reduction
  - < 3% performance loss
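The difference between anticipatory and on-demand reads can be sketched with a toy model. This is only an illustration, not the XScale model used in the paper: it assumes an in-order, single-issue pipeline where a result stays on the bypass for the next `bypass_window` instructions, and the names `count_rf_reads` and `bypass_window` are hypothetical.

```python
from typing import List, Tuple

# (opcode, destination register, source registers) -- hypothetical encoding
Instr = Tuple[str, str, Tuple[str, ...]]

def count_rf_reads(schedule: List[Instr], bypass_window: int = 1,
                   on_demand: bool = True) -> int:
    """Count source operands that are read from the register file.

    Assumption: a result is available on the bypass only to the next
    `bypass_window` instructions after its producer.
    """
    rf_reads = 0
    for i, (_, _, sources) in enumerate(schedule):
        # Destinations of the instructions still visible on the bypass.
        on_bypass = {dest for _, dest, _ in schedule[max(0, i - bypass_window):i]}
        for src in sources:
            if on_demand and src in on_bypass:
                continue  # operand comes from the bypass, RF read skipped
            rf_reads += 1
    return rf_reads

program = [
    ("ADD", "R1", ("R2", "R3")),
    ("SUB", "R4", ("R5", "R1")),   # R1 was produced by the previous ADD
    ("ADD", "R10", ("R11", "R12")),
]

print(count_rf_reads(program, on_demand=False))  # 6: every operand is read
print(count_rf_reads(program, on_demand=True))   # 5: R1 comes from the bypass
```

In this toy model the anticipatory design reads all six operands, while the on-demand design skips the one operand that is already on the bypass.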

7 Processor Model
- Pipeline bypasses
  - Improve performance
- Full bypassing
  - Best performance, but high power, area & wiring complexity
- Partial bypassing
  - Keep only some bypasses
  - Popular in embedded processors, e.g., Intel XScale

8 Bypass-sensitive RF Power-Aware Scheduling
- Schedule instructions so that
  - Dependent instructions transfer operands using bypasses
  - RF usage is reduced
- Compiler needs to know
  - When does an instruction bypass its result?
  - Which operands can read the result?
  - When is the result written into the register file?
- A BYPASS AWARE COMPILER IS NEEDED!!
Example order 1 (NO BYPASS!!):
  ADD R1 R2 R3
  ADD R10 R11 R12
  SUB R4 R5 R1
Example order 2 (BYPASS POSSIBLE!!):
  ADD R1 R2 R3
  SUB R4 R5 R1
  ADD R10 R11 R12
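The two instruction orders on this slide can be checked with a small sketch. It assumes, as the slide example suggests, that a result is on the bypass only for the instruction issued immediately after its producer, and that the first register of each instruction is the destination. The helper `bypassed_operands` is a hypothetical name.

```python
from typing import List, Tuple

# (opcode, destination register, (source1, source2)) -- hypothetical encoding
Instr = Tuple[str, str, Tuple[str, str]]

def bypassed_operands(schedule: List[Instr]) -> List[Tuple[int, str]]:
    """Return (instruction index, register) for each operand served by the bypass.

    Assumption: only the result of the immediately preceding instruction
    is available on the bypass.
    """
    hits = []
    for i in range(1, len(schedule)):
        prev_dest = schedule[i - 1][1]
        for src in schedule[i][2]:
            if src == prev_dest:
                hits.append((i, src))
    return hits

add1 = ("ADD", "R1", ("R2", "R3"))
add2 = ("ADD", "R10", ("R11", "R12"))
sub = ("SUB", "R4", ("R5", "R1"))

no_bypass = [add1, add2, sub]     # independent ADD separates producer and consumer
with_bypass = [add1, sub, add2]   # consumer issues right after the producer

print(bypassed_operands(no_bypass))    # []
print(bypassed_operands(with_bypass))  # [(1, 'R1')]
```

Moving the independent `ADD R10 R11 R12` after the `SUB` is legal because it neither reads nor writes any register used by the other two instructions.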

9 OT-based RF Power-Aware Scheduling
- Operation Tables (OTs) provide a mechanism
  - To accurately estimate the number of operands read from RF
- Exploit OTs for scheduling to reduce RF usage
  - Various scheduling strategies can be employed
  - Choose scheduling heuristic with the least RF usage
- 3 basic block (BB) scheduling techniques
  1. RFPEX: Exhaustive
  2. RFPN: Greedy
  3. RFPN2: Greedy with one level of backtracking
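A greedy strategy in the spirit of RFPN might look like the sketch below. It is not the authors' implementation: instead of Operation Tables it uses a simplified stand-in model in which only the result of the previously scheduled instruction is on the bypass, and the names `depends_on` and `greedy_schedule` are hypothetical.

```python
from typing import List, Set, Tuple

# (opcode, destination register, source registers) -- hypothetical encoding
Instr = Tuple[str, str, Tuple[str, ...]]

def depends_on(later: Instr, earlier: Instr) -> bool:
    """True if `later` must stay after `earlier` (RAW, WAR or WAW hazard)."""
    _, l_dest, l_srcs = later
    _, e_dest, e_srcs = earlier
    return e_dest in l_srcs or l_dest in e_srcs or l_dest == e_dest

def greedy_schedule(block: List[Instr]) -> List[int]:
    """Return a legal order of the block as a list of original indices.

    At each step, pick the ready instruction that gets the most operands
    from the bypass (assumed: the previous instruction's result only).
    """
    n = len(block)
    preds = [{j for j in range(i) if depends_on(block[i], block[j])}
             for i in range(n)]
    done: Set[int] = set()
    order: List[int] = []
    while len(order) < n:
        ready = [i for i in range(n) if i not in done and preds[i] <= done]
        last_dest = block[order[-1]][1] if order else None
        # Most bypassed operands first; ties keep original program order.
        best = max(ready,
                   key=lambda i: (sum(s == last_dest for s in block[i][2]), -i))
        order.append(best)
        done.add(best)
    return order

block = [
    ("ADD", "R1", ("R2", "R3")),
    ("ADD", "R10", ("R11", "R12")),
    ("SUB", "R4", ("R5", "R1")),
]
print(greedy_schedule(block))  # [0, 2, 1]
```

The sketch only maximizes bypass hits and ignores pipeline stalls, so it says nothing about the performance side of the trade-off.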

10 Experimental Setup
- Intel XScale
  - 7-stage, partially bypassed
  - On-Demand RF Read architecture
- RF power model
  - RF power is modeled as the number of register file accesses
- MiBench benchmarks
- Scheduler
  - Operation Table-based RF power-aware scheduling
  - Within basic block
  - Tried 3 strategies
- RF power results
  - Compare with On-Demand RF Read
[Figure: tool-flow diagram with the labels Application, GCC -O3, Assembly, OT-based Scheduler, GCC linker, Executable, Runtime RF Reads]

11 1. RFPEX Scheduling
- Exhaustive
  - Try all legal permutations of instructions
- Compilation time
  - Hours
  - Could not schedule susan, rijndael (2 days)
- RF power reduction
  - Average 12%
- Performance improvement
  - Average 1.4%
2. RFPN Scheduling
- Greedy
  - Pick instructions one by one
  - Pick the instruction which gets the most operands from the bypass
- Compilation time
  - Seconds
- RF power reduction
  - Average 6%
- Performance improvement
  - Average: -3.5%
3. RFPN2 Scheduling
- Greedy with OP table comparison
- Compilation time
  - Minutes
- RF power reduction
  - Average 10.5%
- Performance improvement
  - Average: -2%
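To see why the exhaustive strategy is so expensive, here is a minimal sketch of the idea: enumerate every permutation of a basic block, keep the legal ones, and choose the one with the fewest RF reads. It is not the authors' RFPEX; the bypass model (only the previous instruction's result is bypassed) and the names `is_legal`, `rf_reads` and `exhaustive_schedule` are assumptions made for illustration.

```python
from itertools import permutations
from typing import List, Sequence, Tuple

# (opcode, destination register, source registers) -- hypothetical encoding
Instr = Tuple[str, str, Tuple[str, ...]]

def depends_on(later: Instr, earlier: Instr) -> bool:
    """True if `later` must stay after `earlier` (RAW, WAR or WAW hazard)."""
    _, l_dest, l_srcs = later
    _, e_dest, e_srcs = earlier
    return e_dest in l_srcs or l_dest in e_srcs or l_dest == e_dest

def is_legal(block: List[Instr], order: Sequence[int]) -> bool:
    """An order is legal if every dependence of the original block is kept."""
    pos = {idx: p for p, idx in enumerate(order)}
    return all(pos[j] < pos[i]
               for i in range(len(block)) for j in range(i)
               if depends_on(block[i], block[j]))

def rf_reads(block: List[Instr], order: Sequence[int]) -> int:
    """RF reads under the assumed one-instruction bypass window."""
    reads = 0
    for p, idx in enumerate(order):
        prev_dest = block[order[p - 1]][1] if p > 0 else None
        reads += sum(src != prev_dest for src in block[idx][2])
    return reads

def exhaustive_schedule(block: List[Instr]) -> Tuple[int, ...]:
    """Try all permutations and return a legal one with the fewest RF reads."""
    legal = (o for o in permutations(range(len(block))) if is_legal(block, o))
    return min(legal, key=lambda o: rf_reads(block, o))

block = [
    ("ADD", "R1", ("R2", "R3")),
    ("ADD", "R10", ("R11", "R12")),
    ("SUB", "R4", ("R5", "R1")),
]
best = exhaustive_schedule(block)
print(best, rf_reads(block, best))  # (0, 2, 1) 5
```

Because the search visits every permutation, its cost grows factorially with basic-block size, which is consistent with the hours of compilation time reported for RFPEX.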

12 Summary
- Register file is one of the main hotspots in processors
  - Very important to reduce RF power
  - Repeated accesses cause “Heat Stroke”
  - Up to 90% performance degradation
- On-Demand RF Read is an effective technique
  - 58% RF power reduction
  - Scope for further RF power reduction via instruction scheduling
- Contribution: instruction scheduling technique for further RF power reduction
  - Up to 26%, average 12% RF power reduction
  - 2% performance degradation
  - Over and above the On-Demand RF Read architecture
- RFPN2 is an effective heuristic for RF power reduction
- Future work
  - Beyond basic block scheduling

13 Our Project
- Our class project focuses on reducing power consumption using Power Aware Instruction Scheduling or the value lifetime characteristics of registers
- The paper on value lifetime characteristics will be presented by Pradyanesh.

