Canturk Isci Gilberto Contreras Margaret Martonosi


1 Canturk Isci Gilberto Contreras Margaret Martonosi
Live, Runtime Phase Monitoring and Prediction on Real Systems with Application to Dynamic Power Management
Canturk Isci, Gilberto Contreras, Margaret Martonosi
Princeton University
MICRO-39, Dec, Orlando, FL
October 17, 2019

2 Workload Variability Enables Dynamic Power Management
Power is critical across the computing spectrum
Dynamic Power Management (DPM): tune the system to varying application demand!
[Figure, inter-workload variability: Power [W], IPC and Mem Refs per benchmark (swim, equake, mgrid, lucas, applu, fma3d, wupwise, mcf, apsi, art, facerec, gap, gcc, bzip2, vortex, vpr, galgel, parser, eon, ammp, perlbmk, mesa, twolf, gzip, sixtrack, crafty)]
[Figure, intra-workload variability]
ENABLER: inter-workload variability and often repetitive intra-workload variability -> phase behavior
Our work: How can we project varying (repetitive) application behavior to better guide DPM techniques?

3 Research Overview
Live, Runtime Phase Monitoring and Prediction with Application to Dynamic Power Management on Real Systems
[Diagram blocks: Application, Performance Counters, Runtime Monitoring, Phase Classification, Phase Prediction, Dynamic Management, Real Measurements]

4 Research Overview
[Diagram blocks: Application, Performance Counters, Runtime Monitoring, Phase Classification, Phase Prediction, Dynamic Management, Real Measurements]
1) Track memory accesses per instruction (MPI) via performance counters
2) Classify execution into phase patterns based on MPI rates
3) Predict future behavior with the Global Phase History Table (GPHT) predictor
4) Use phase predictions to guide dynamic voltage and frequency scaling (DVFS)
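Step 1 comes down to a ratio of two counter deltas per sampling interval. A minimal Python sketch, assuming the two counter values have already been read; the function name and the sample numbers are hypothetical and not taken from the slides:

```python
def memory_accesses_per_instruction(mem_accesses: int, instructions: int) -> float:
    """Return MPI for one sampling interval from two counter deltas."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return mem_accesses / instructions


# Hypothetical counter deltas for a single sampling interval.
mpi = memory_accesses_per_instruction(mem_accesses=1_200_000,
                                      instructions=100_000_000)
print(f"MPI = {mpi:.4f}")  # MPI = 0.0120
```

Dividing by instructions rather than by cycles keeps the metric independent of how long the interval took in wall-clock time.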

5 Dynamic Power Management with Live, Runtime Phase Prediction
Current (reactive) dynamic adaptation approach: assume last/recent observed behavior will persist
[Figure: tracked characteristic over time t]
Great for stable execution!
Inaccurate response for highly variable behavior!
Key questions:
How can we accurately predict future application phase behavior on all types of execution?
Can we use phase predictions to improve workload-adaptive power management?
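To make the weakness of the reactive approach concrete, here is a minimal Python sketch of a last-value predictor scored on two made-up phase sequences. The function names and both sequences are illustrative assumptions, not data from the slides:

```python
from typing import List, Sequence


def last_value_predictions(phases: Sequence[int]) -> List[int]:
    """Predict each next phase as a copy of the most recently observed one."""
    return list(phases[:-1])


def last_value_accuracy(phases: Sequence[int]) -> float:
    """Fraction of intervals where the next phase equals the previous one."""
    predicted = last_value_predictions(phases)
    actual = phases[1:]
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


stable = [2] * 20        # one phase throughout
variable = [1, 4] * 10   # alternates every interval

print(last_value_accuracy(stable))    # 1.0
print(last_value_accuracy(variable))  # 0.0
```

The alternating sequence is perfectly repetitive, yet the reactive predictor never gets it right, which is the gap a pattern-based predictor is meant to close.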

6 Research Questions
How can we accurately predict future application phase behavior on all types of execution?
  Specific phase definitions
  Phase prediction with the GPHT
  Prediction results
Can we use phase predictions to improve workload-adaptive power management?

7 Guiding Dynamic Power Management
Example technique: dynamic voltage and frequency scaling (DVFS)
[Figure: CPU and MEM time t at frequency f and at ½ f, for low and high memory accesses and for low and high overlapping CPU execution]
[Speaker note: Here we shift gears from our general-purpose phase analysis for a specific target]
Track memory accesses per instruction (MPI)
Different MPI rates -> different DVFS settings
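Why memory-bound execution tolerates a lower clock can be seen with a deliberately simple timing model. The Python sketch below assumes that runtime splits into a CPU part that scales inversely with frequency and a memory part that does not change, with no overlap between the two; this model, the function name and the example fractions are assumptions for illustration, not the authors' model:

```python
def relative_runtime(cpu_fraction: float, freq_ratio: float) -> float:
    """Runtime at a reduced clock, relative to runtime at full frequency.

    cpu_fraction: share of full-speed runtime that scales with the clock.
    freq_ratio:   f / f_max, in (0, 1].
    """
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be in [0, 1]")
    if not 0.0 < freq_ratio <= 1.0:
        raise ValueError("freq_ratio must be in (0, 1]")
    # CPU time stretches by 1 / freq_ratio; memory time is assumed unchanged.
    return cpu_fraction / freq_ratio + (1.0 - cpu_fraction)


cpu_bound = relative_runtime(cpu_fraction=0.9, freq_ratio=0.5)
mem_bound = relative_runtime(cpu_fraction=0.2, freq_ratio=0.5)
print(f"CPU-bound slowdown at half frequency:    {cpu_bound:.2f}x")
print(f"Memory-bound slowdown at half frequency: {mem_bound:.2f}x")
```

Under these assumptions, halving the frequency slows the CPU-bound case by roughly 1.9x but the memory-bound case by only about 1.2x, so high-MPI intervals are the natural candidates for lower DVFS settings.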

8 Phase Definitions
Assign different MPI ranges to different phases
Higher phase number -> more memory-bound phase
MPI              Phase #   DVFS Setting
< 0.005          1         (1500 MHz, 1484 mV)
[0.005, 0.010)   2         (1400 MHz, 1452 mV)
[0.010, 0.015)   3         (1200 MHz, 1356 mV)
[0.015, 0.020)   4         (1000 MHz, 1228 mV)
[0.020, 0.030)   5         ( 800 MHz, 1116 mV)
> 0.030          6         ( 600 MHz, mV)
[Based on Wu et al. Micro’05]
Important phase properties:
  Resilient to system variations
  Invariant to dynamic power management actions
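The MPI-to-phase table on this slide maps directly onto a sorted-boundary lookup. A minimal Python sketch; the names are hypothetical, an MPI of exactly 0.030 is assumed to fall in phase 6 (the table leaves that point open), and the 600 MHz voltage is missing in the transcript, so it is stored as `None` rather than guessed:

```python
from bisect import bisect_right
from typing import Dict, Optional, Tuple

# Upper MPI bounds of phases 1..5; anything at or above the last bound is phase 6.
MPI_BOUNDS = [0.005, 0.010, 0.015, 0.020, 0.030]

# Phase -> (frequency in MHz, voltage in mV), copied from the table.
DVFS_SETTINGS: Dict[int, Tuple[int, Optional[int]]] = {
    1: (1500, 1484),
    2: (1400, 1452),
    3: (1200, 1356),
    4: (1000, 1228),
    5: (800, 1116),
    6: (600, None),  # voltage not recoverable from the transcript
}


def classify_phase(mpi: float) -> int:
    """Map an MPI rate to a phase number between 1 and 6."""
    if mpi < 0:
        raise ValueError("MPI cannot be negative")
    # bisect_right counts the bounds that are <= mpi, so lower bounds are inclusive.
    return bisect_right(MPI_BOUNDS, mpi) + 1


for sample in (0.001, 0.005, 0.012, 0.025, 0.050):
    phase = classify_phase(sample)
    print(sample, phase, DVFS_SETTINGS[phase])
```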

9 Execution Phase Patterns
Applu execution snapshot:
[Figure: MPI rate (0.000 to 0.020) and phases (1 to 5) versus cycles (2.80E+10 to 3.30E+10)]
[Speaker note: Now going back to our first question, let's look at a real ex]
Significant variations exist!
Phase patterns expose repetition!

10 Predicting Phases with the Global Phase History Table (GPHT) Predictor
[Diagram: the GPHR holds the most recent phases Pt, Pt-1, Pt-2, ..., Pt-N (GPHR depth); the last observed phase from the performance counters is shifted in and Pt-N-1 drops out. Each PHT entry holds a tag (Pt’, Pt’-1, Pt’-2, ..., Pt’-N), a prediction (Pt’+1) and an Age / Invalid field (example values 20, 15, -1); unused slots are shown as P0.]
Predicted phase:
  From GPHR(0) if no matching pattern
  From the corresponding PHT prediction entry if matching pattern in PHT
Inspired by a global history branch predictor
Software! Implemented in the OS for on-the-fly phase prediction
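The lookup rule on this slide can be sketched in a few lines of Python. This is a simplified illustration, not the authors' kernel module: the class and method names, the use of phase 0 as an empty-history placeholder, the linear tag search and the replacement policy (fill invalid entries first, then evict the least recently used one) are all assumptions.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


class GPHT:
    """Sketch of a Global Phase History Table predictor (details assumed)."""

    def __init__(self, pht_entries: int = 128, gphr_depth: int = 8) -> None:
        # GPHR: shift register of recent phases, newest at index 0.
        # Phase 0 is an assumed placeholder for "no history yet".
        self.gphr: Deque[int] = deque([0] * gphr_depth, maxlen=gphr_depth)
        # PHT: tag (a stored GPHR pattern), prediction and age per entry.
        self.tags: List[Optional[Tuple[int, ...]]] = [None] * pht_entries
        self.preds: List[int] = [0] * pht_entries
        self.ages: List[int] = [-1] * pht_entries  # -1 marks an invalid entry
        self.clock = 0
        # Entry whose stored prediction is corrected by the next observation.
        self.pending: Optional[int] = None

    def update_and_predict(self, observed: int) -> int:
        """Record the phase just observed and return the predicted next phase."""
        # Train: the entry consulted last time learns what actually followed.
        if self.pending is not None:
            self.preds[self.pending] = observed
        # Shift the new phase into the GPHR (oldest phase falls off the end).
        self.gphr.appendleft(observed)
        pattern = tuple(self.gphr)
        self.clock += 1
        if pattern in self.tags:
            # Match: use the prediction stored with the matching tag.
            idx = self.tags.index(pattern)
            prediction = self.preds[idx]
        else:
            # No match: fall back to GPHR(0), the last value, and store the
            # pattern in an invalid entry or in the least recently used one.
            idx = min(range(len(self.tags)), key=lambda i: self.ages[i])
            self.tags[idx] = pattern
            self.preds[idx] = observed
            prediction = observed
        self.ages[idx] = self.clock
        self.pending = idx
        return prediction


# Made-up repetitive phase pattern; predictions[i] is the guess for sequence[i + 1].
sequence = [1, 1, 3, 3, 5] * 20
gpht = GPHT(pht_entries=128, gphr_depth=8)
predictions = [gpht.update_and_predict(p) for p in sequence]
hits = sum(p == a for p, a in zip(predictions[:-1], sequence[1:]))
print(f"GPHT accuracy: {hits / (len(sequence) - 1):.0%}")
```

After a short warm-up in which each distinct history pattern is seen once, every later prediction on this periodic sequence is correct, whereas a last-value predictor is right only when a phase repeats. With a single PHT entry the sketch always returns the last observed phase.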

11 GPHT Prediction Accuracies
[Figure: prediction accuracy (%), 40 to 100, per benchmark for LastValue, FixWindow_8, VarWindow_128_0.005 and GPHT (PHT:1024, GPHR:8); benchmarks: gzip_log, mcf_inp, gcc_200, gap_ref, gcc_scilab, gcc_expr, ammp_in, gcc_166, apsi_ref, mgrid_in, applu_in, parser_ref, equake_in, wupwise_ref, gcc_integrate, bzip2_program, bzip2_source, bzip2_graphic]
[Speaker note: On the x-axis some of spec ordered]
Compare to reactive approaches: Last Value / Fixed Window History / Variable Window History
GPHT performs significantly better for highly varying applications
Up to 6X and on average 2.4X misprediction improvement
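A misprediction improvement of "6X" is read here as the ratio of misprediction rates between two predictors; that reading is an assumption, and the accuracies in the example are hypothetical rather than values from the plot. A small Python sketch of the arithmetic:

```python
def misprediction_improvement(baseline_accuracy_pct: float,
                              improved_accuracy_pct: float) -> float:
    """Ratio of baseline to improved misprediction rates (both in percent)."""
    baseline_miss = 100.0 - baseline_accuracy_pct
    improved_miss = 100.0 - improved_accuracy_pct
    if improved_miss <= 0:
        raise ValueError("improved predictor has no mispredictions; ratio undefined")
    return baseline_miss / improved_miss


# Hypothetical: last value at 60% accuracy, a better predictor at 90%.
ratio = misprediction_improvement(60.0, 90.0)
print(f"{ratio:.1f}X fewer mispredictions")  # 4.0X fewer mispredictions
```

A ratio of misprediction rates highlights gains at the high-accuracy end: moving from 90% to 95% accuracy is a 2X improvement even though accuracy rises by only five points.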

12 Impact of PHT Size
[Figure: prediction accuracy (%), 40 to 100, per benchmark for LastValue and for GPHT with PHT:1024, PHT:128, PHT:64 and PHT:1 (all GPHR:8); benchmarks: gzip_log, mcf_inp, gcc_200, gap_ref, gcc_scilab, gcc_expr, ammp_in, gcc_166, apsi_ref, parser_ref, mgrid_in, applu_in, equake_in, wupwise_ref, gcc_integrate, bzip2_program, bzip2_source, bzip2_graphic]
128-entry PHT is plenty
Converges to last value as PHT entries -> 1

13 Impact of Phase Granularities
Average accuracy over experimented applications:
N=1 -> Both 100%
NO(10,000) -> Both -> 0%

14 Research Questions
How can we accurately predict future application phase behavior on all types of execution?
Can we use phase predictions to improve workload-adaptive power management?
  Real-System implementation
  Dynamic power management results

15 Real-System Implementation
[Diagram: Application (application binary); OS (PMI interrupt handler: stop/read counters, predict next phase, check/set DVFS state, restart counters; predictor state, phase history); Hardware (Pentium-M processor: performance counters, DVFS registers); measurement (parallel port, CPU (V, I), multimeter (DAQ))]

16 Phase-Driven Dynamic Adaptation: Complete Example
[Figure: MPI (0.000 to 0.024), ACTUAL_PHASE and PRED_PHASE (GPHT) (phases 1 to 5), Power [W] (2 to 14) and BIPS (0.3 to 2.1) for Baseline and GPHT, versus instructions (1.5E+09 to 5.0E+09)]
GPHT can accurately predict varying application behavior!
Significant power savings compared to baseline!
Small performance degradation!

17 Improvement over Reactive Methods
[Figure: EDP improvement (axis 0% to 50%; off-scale bars labeled 63%, 66%, 70%) and performance degradation (0% to 20%) for Last Value and GPHT, with respect to baseline execution; benchmarks: bzip2_program, bzip2_source, bzip2_graphic, mgrid_in, applu_in, equake_in, swim_in, mcf_inp, average]
7% EDP improvement over reactive methods!
Comparable or less performance degradation!
[Speaker note: Power, perf first -> then EDP. Plots show EDP improvement and performance degradation for GPHT and last value, wrt baseline execution]
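EDP (energy-delay product) is energy multiplied by execution time, which for constant average power is power times time squared. The Python sketch below shows how an EDP improvement over baseline can be computed; the power and runtime figures are hypothetical and not measurements from the slides:

```python
def edp(avg_power_w: float, runtime_s: float) -> float:
    """Energy-delay product: (power * time) * time, in joule-seconds."""
    return avg_power_w * runtime_s * runtime_s


def edp_improvement(base_power_w: float, base_runtime_s: float,
                    new_power_w: float, new_runtime_s: float) -> float:
    """Fractional EDP reduction of the managed run relative to baseline."""
    return 1.0 - edp(new_power_w, new_runtime_s) / edp(base_power_w, base_runtime_s)


# Hypothetical: DVFS cuts average power by 30% and costs 5% in runtime.
improvement = edp_improvement(base_power_w=10.0, base_runtime_s=100.0,
                              new_power_w=7.0, new_runtime_s=105.0)
print(f"EDP improvement: {improvement:.1%}")  # EDP improvement: 22.8%
```

Because runtime enters squared, a scheme that saves power but degrades performance too much can end up with a worse EDP than the baseline.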

18 Summary
Phase characterizations help identify repetitive application behavior under real-system variability and dynamic management actions
Runtime phase predictions with the Global Phase History Table can accurately predict future application behavior
  Up to 6X and on average 2.4X fewer mispredictions than reactive approaches
Dynamic power management guided by these phase predictions helps improve system power-performance efficiency
  27% EDP improvement over baseline and 7% over reactive approaches

19 THANKS! Download: www.princeton.edu/~canturk/platypus/ GPHT LKM
Used kernel

20 Phase-Driven Management Vision
[Diagram: processor events (I$ misses, D$ misses, instructions completed) feed the PMCs; a classifier produces the phase state; a history & state table and a phase state machine (states such as S1, S3, S6, S8) produce the next phase; an action (DVS, cache reconfiguration) is sent to the controller]

21 Design Constraints and Decisions
Target management technique: dynamic voltage and frequency scaling (DVFS)
Experimental platform: Pentium-M (Banias) -> 2 PMCs
Instruction-based monitoring: eliminate timing variations
  First PMC -> instructions retired
DVFS potential: ∝ memory boundedness of application, ∝ (available concurrent execution)^-1
  Second PMC -> memory accesses per instruction (MPI)
DVFS invariance: tracked features should not change with dynamic adaptations

22 Mispredicted Distance vs. Prediction Accuracy
Average distance between actual and predicted phase numbers over the whole execution
NOTE: Phases are not uniformly spaced, though!
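The distance metric on this slide can be sketched as a mean absolute difference between phase numbers. The Python below uses hypothetical function names and made-up phase sequences:

```python
from typing import Sequence


def mean_phase_distance(actual: Sequence[int], predicted: Sequence[int]) -> float:
    """Average |actual - predicted| phase number over all intervals."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def prediction_accuracy(actual: Sequence[int], predicted: Sequence[int]) -> float:
    """Fraction of intervals predicted exactly."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


actual = [1, 2, 5, 5]
predicted = [1, 3, 5, 2]
print(prediction_accuracy(actual, predicted))  # 0.5
print(mean_phase_distance(actual, predicted))  # 1.0
```

Accuracy treats both misses the same, while the distance shows that predicting phase 2 for an actual phase 5 is a much larger error than predicting 3 for 2.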

23 Application Execution
Operation Flowchart
Application execution
Dynamic adaptation control, every 100 million instructions:
1) Stop/read performance counters
2) Translate to phases
3) Update phase predictor states
4) Predict next phase
5) Translate to DVFS setting
6) Same as current setting? If no, apply new DVFS setting
7) Clear interrupt
8) Restart counters
9) Exit to program execution
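The flowchart can be mirrored as a small handler in user-space Python. This is only a sketch of the control flow: the real system runs inside the OS interrupt handler, and the class name, the callables passed in and the two-phase demo mapping are hypothetical stand-ins.

```python
from typing import Callable, List

SAMPLE_INSTRUCTIONS = 100_000_000  # sampling interval from the flowchart


class AdaptationController:
    """Per-interval control flow; hardware access is injected as callables."""

    def __init__(self,
                 classify: Callable[[float], int],
                 predict: Callable[[int], int],
                 phase_to_mhz: Callable[[int], int],
                 apply_mhz: Callable[[int], None],
                 initial_mhz: int) -> None:
        self.classify = classify
        self.predict = predict
        self.phase_to_mhz = phase_to_mhz
        self.apply_mhz = apply_mhz
        self.current_mhz = initial_mhz

    def on_interrupt(self, mem_accesses: int) -> None:
        # Counters are assumed to have been stopped and read by the caller.
        mpi = mem_accesses / SAMPLE_INSTRUCTIONS
        phase = self.classify(mpi)            # translate to phases
        next_phase = self.predict(phase)      # update predictor, predict next
        target = self.phase_to_mhz(next_phase)
        if target != self.current_mhz:        # skip redundant transitions
            self.apply_mhz(target)
            self.current_mhz = target
        # On real hardware: clear the interrupt and restart the counters here.


applied: List[int] = []
controller = AdaptationController(
    classify=lambda mpi: 1 if mpi < 0.005 else 2,   # two-phase demo mapping
    predict=lambda phase: phase,                    # last-value stand-in
    phase_to_mhz={1: 1500, 2: 1400}.__getitem__,
    apply_mhz=applied.append,
    initial_mhz=1500,
)
for mem_accesses in (100_000, 700_000, 800_000):
    controller.on_interrupt(mem_accesses)
print(applied)  # [1400]
```

Only the second interval triggers a transition; the check against the current setting avoids paying the switching cost when the predicted phase maps to the frequency already in use.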

24 Improvement over Reactive Methods
[Figure: power savings (axis 0% to 60%; off-scale bars labeled 66%, 76%) and performance degradation (0% to 20%) for Last Value and GPHT; benchmarks: bzip2_program, bzip2_source, bzip2_graphic, mgrid_in, applu_in, equake_in, swim_in, mcf_inp, average]
Improved power savings!
Comparable or less performance degradation!

25 Improvement over Reactive Methods
[Figure: EDP improvement (axis 0% to 50%; off-scale bars labeled 63%, 66%, 70%) for Last Value and GPHT; benchmarks: bzip2_program, bzip2_source, bzip2_graphic, mgrid_in, applu_in, equake_in, swim_in, mcf_inp, average]
27% EDP improvement over baseline!
7% EDP improvement over reactive methods!

26 Bounding Performance Degradation
Phase mappings are dynamically configurable
Can limit performance degradation by sacrificing power efficiency
Backup EO Micro’06

27 GPHT Overhead Insignificant ~0.02%

28 DVFS Invariance Important constraint when talking “actions”
If actions change phase classifications: Obsolete past history & unreliable predictions

29 DVFS Invariance Need better explanation!!

30 Program Phases
Distinct and often-recurring regions of program behavior
How can we detect recurrent execution under real-system variability?
How can we predict future phase patterns?
How can we leverage predicted phase behavior for workload-adaptive power management?
Can we do better than simple, reactive methods?
Useful for:
  Characterizing execution regions
  Use current phase/behavior to predict future behavior
  Managing dynamic adaptation

31 Predicting Phases with the Global Phase History Table (GPHT) Predictor
[Diagram: the GPHR holds the most recent phases Pt, Pt-1, Pt-2, ..., Pt-N (GPHR depth); the last observed phase from the performance counters is shifted in and Pt-N-1 drops out. Each PHT entry holds a tag (Pt’, Pt’-1, Pt’-2, ..., Pt’-N), a prediction (Pt’+1) and an Age / Invalid field (example values 15, 20, -1); unused slots are shown as P0.]
Predicted phase:
  From GPHR(0) if no matching pattern
  From the corresponding PHT prediction entry if matching pattern in PHT
Inspired by a global history branch predictor
Software! Implemented in the OS for on-the-fly phase prediction

32 Prediction Accuracies
[Figure: prediction accuracy (%), 40 to 100, per benchmark for LastValue and for GPHT with PHT:1024, PHT:128, PHT:64 and PHT:1 (all GPHR:8); benchmarks: gzip_log, mcf_inp, gcc_200, gap_ref, gcc_scilab, gcc_expr, ammp_in, gcc_166, apsi_ref, parser_ref, mgrid_in, applu_in, equake_in, wupwise_ref, gcc_integrate, bzip2_program, bzip2_source, bzip2_graphic]
Compare to reactive approaches (Last Value prediction)
GPHT performs significantly better for highly varying applications
Up to 6X and on average 2.4X misprediction improvement
Good performance down to 128 PHT entries
Converges to last value as PHT entries -> 1

33 Full-System Implementation
[Diagram: Application (application binary); OS (PMC and phase log, predictor state, performance monitoring interrupt, PMI interrupt handler: stop/read counters, predict next phase, check/set DVFS state, restart counters); Hardware (Pentium-M processor: performance counters, DVFS registers); power measurement (power supply, voltage regulator, R1,2 = 2 mΩ, I1, I2, V1, V2, VCPU, data acquisition system, parallel port)]

34 PHASES & Phases & phases

35 Cantürk İşçi Gilberto Contreras Margaret Martonosi
Live, Runtime Phase Monitoring and Prediction on Real Systems with Application to Dynamic Power Management
Cantürk İşçi, Gilberto Contreras, Margaret Martonosi
Princeton University
MICRO-39, Dec, Orlando, FL
October 17, 2019

