1 Combining Software and Hardware Monitoring for Improved Power and Performance Tuning
Eric Chi, A. Michael Salem, and R. Iris Bahar
Brown University, Division of Engineering
Richard Weiss
Hampshire College, School of Cognitive Science

2 Motivation
BARC, January 30, 2003
- Performance drives high-end processor design
  - Include many complex architectural features
  - Resources may not always be optimally utilized
- Resources dissipate some power regardless of utilization
  - Dynamic schemes allow processor to reconfigure resources according to program's needs
  - Some means of monitoring program is needed to drive reconfiguration

3 Monitoring Options
Hardware monitoring
- Relatively easy to implement
- Can easily adjust to changing patterns
- Must first recognize pattern before reacting
- Restricted to fixed-size sampling windows
Software profiling
- Reconfiguration occurs in anticipation of changing needs
- Sampling ranges are adaptable
- Requires instruction annotation and initial sampling overhead
- Only applicable to instructions with very deterministic behavior

4 Why Not Combine?
- Each approach has its particular benefits
- If hardware and software techniques can be combined, can we improve the control policies driving processor reconfiguration?
- Combining them could potentially lead to better energy savings and higher overall performance

5 Our Goal
Have HW and SW profiling work together to better identify program behavior
- Allow processor to react more quickly to strongly deterministic behavior
- Allow HW monitoring to assist with hard-to-predict cases with hints from software profiling

6 Low Power Configurations
We consider 2 different configurations separately:
- Reducing issue width and ALUs
  - Save power in issue queue arbitration logic
  - Save power from underutilized ALUs
- Fetch halting
  - Triggered by a critical load missing to main memory
  - Fetching is disabled for the duration of the miss
  - Reduces occupancy rates in fetch and issue queues
  - Reduces number of wrong-path instructions fetched

7 Pipeline Organization
[Figure: pipeline block diagram. Blocks shown: Annotation Decoder, Branch Predictor, Fetch Unit, Instruction Cache, Instruction Decoder, Instruction Scheduler, Register File, Integer ALU Cluster 1, Integer ALU Cluster 2, Floating Point ALU Cluster 1, Floating Point ALU Cluster 2, Load/Store Unit, Data Cache, Low-Power State Logic. Control labels: "Disable Fetch Unit" and "Disable auxiliary ALU cluster and reduce issue width".]

8 Adjusting Issue Width
Adjust issue width between 8 and 4 and disable second integer ALU cluster
- SW approach profiles IPC from train dataset
  - Annotates blocks with low IPC
  - Decoding start of block triggers entry to LP mode
- HW approach uses built-in counters to monitor IPC
  - Use fixed 256-cycle window
  - If integer IPC < threshold, enter LP mode
- Combined approach
  - SW steers blocks with consistent behavior
  - HW handles remaining blocks
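
The combined policy on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' simulator code: the class name, the `ipc_threshold` value, and the rule for leaving LP mode (IPC at or above the threshold at the end of a window) are assumptions, since the slides state only the 256-cycle window and the entry condition.

```python
from dataclasses import dataclass

WINDOW_CYCLES = 256  # fixed HW sampling window, from the slide


@dataclass
class IssueWidthController:
    """Toy model of combined SW + HW control of issue width (hypothetical)."""
    ipc_threshold: float           # assumed parameter; the slides give no value
    low_power: bool = False        # True: issue width 4, second integer cluster off
    sw_steered: bool = False       # True while a SW-annotated block is running
    cycles: int = 0
    int_instructions: int = 0

    def start_block(self, annotated_low_ipc: bool) -> None:
        """SW path: decoding the start of an annotated block enters LP mode at once."""
        self.sw_steered = annotated_low_ipc
        if annotated_low_ipc:
            self.low_power = True

    def tick(self, int_issued: int) -> None:
        """HW path: count one cycle and decide at the end of each window."""
        self.cycles += 1
        self.int_instructions += int_issued
        if self.cycles == WINDOW_CYCLES:
            ipc = self.int_instructions / WINDOW_CYCLES
            # HW only decides for blocks that SW did not annotate.
            if not self.sw_steered:
                # Assumption: leave LP mode when IPC recovers.
                self.low_power = ipc < self.ipc_threshold
            self.cycles = 0
            self.int_instructions = 0

    @property
    def issue_width(self) -> int:
        return 4 if self.low_power else 8
```

In this sketch the SW annotation acts immediately, while the HW counter needs a full window before it can react, which mirrors the trade-off listed under Monitoring Options.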

9 Results for Reduced Issue Width
- SW and HW results are comparable
- Combined (COMB) results show that the SW and HW methods identify different opportunities for saving power

10 Results for Reduced Issue Width
- SW performance is more consistent because thresholds can be tuned on a per-application basis

11 Fetch Halting
Requires a combination of SW and HW monitoring:
- SW profiling:
  - Identify critical loads that miss to main memory
  - IPC, occupancy rates, dead cycles, "miss stride"
- HW monitoring:
  - Using annotations from SW profiling, HW tracks miss behavior only for "promising" load instructions
  - Miss stride from annotations is compared to miss counter in HW to capture dynamic miss behavior
For now we simulate a perfect miss predictor
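
One possible reading of the miss-stride comparison is sketched below. The slides do not define "miss stride", so this assumes it means an annotated load misses to main memory once every `miss_stride` executions; the names `LoadMissTracker` and `halted_cycles` and the `miss_latency` parameter are hypothetical. It is not the perfect miss predictor used in the reported results.

```python
class LoadMissTracker:
    """Per-load HW counter compared against a SW-annotated miss stride (assumed meaning)."""

    def __init__(self, miss_stride: int):
        self.miss_stride = miss_stride  # from the SW annotation
        self.since_last_miss = 0        # executions that hit since the last miss

    def predict_miss(self) -> bool:
        """Predict a miss when the next execution would complete one stride."""
        return self.since_last_miss + 1 >= self.miss_stride

    def update(self, missed: bool) -> None:
        """Resynchronize the counter with the observed outcome."""
        if missed:
            self.since_last_miss = 0
        else:
            self.since_last_miss += 1


def halted_cycles(outcomes, miss_stride, miss_latency):
    """Cycles with fetch disabled, counting only correctly predicted misses."""
    tracker = LoadMissTracker(miss_stride)
    halted = 0
    for missed in outcomes:
        if tracker.predict_miss() and missed:
            halted += miss_latency  # fetch stays off for the duration of the miss
        tracker.update(missed)
    return halted
```

Resetting the counter on every observed miss is what lets the HW side follow dynamic behavior when the real miss pattern drifts from the profiled stride.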

12 Fetch Halting Potential
Memory access rates show that the fetch halting potential for each benchmark varies

Benchmark   % DL1 miss   % L2 miss   % mem access
mgrid        3.9%        22.8%        0.9%
vpr          4.5%        24.7%        1.1%
gcc          0.5%        12.8%        0.1%
mcf         23.8%        48.0%       11.4%
twolf        6.4%        20.1%        1.3%
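
The slide does not say how the last column is derived, but the numbers are consistent with the main-memory access rate being the product of the DL1 and L2 miss rates. The short check below illustrates that observation; the function name `mem_access_pct` is made up for this sketch.

```python
# (DL1 miss %, L2 miss %, reported mem access %) copied from the table
rates = {
    "mgrid": (3.9, 22.8, 0.9),
    "vpr": (4.5, 24.7, 1.1),
    "gcc": (0.5, 12.8, 0.1),
    "mcf": (23.8, 48.0, 11.4),
    "twolf": (6.4, 20.1, 1.3),
}


def mem_access_pct(dl1_miss_pct: float, l2_miss_pct: float) -> float:
    """Share of accesses reaching main memory, assuming it is the product of both miss rates."""
    return dl1_miss_pct * l2_miss_pct / 100.0


for name, (dl1, l2, reported) in rates.items():
    estimate = mem_access_pct(dl1, l2)
    # Reported values carry one decimal place, so allow for rounding.
    matches = abs(estimate - reported) < 0.051
    print(f"{name:6s} estimate={estimate:6.2f}%  reported={reported:5.1f}%  match={matches}")
```

Under this reading, mcf sends roughly ten times more of its accesses to main memory than the other benchmarks, which is why it offers the most opportunity for fetch halting.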

13 Results for Fetch Halting
- Restricting fetch halting based on criticality information benefits performance

14 Fetch Halting and RUU Occupancy
- Perfect + crit results in an average 10% drop in RUU occupancy

15 Conclusions and Future Work
HW and SW predict different low-power events and can be combined, offering greater power-saving potential.
Future work:
- Improve HW/SW combination scheme
- Improve criticality predictor
- Currently working on HW miss predictor
- Adjust the halt period

