
1 Parapet Research Group, Princeton University EE Vice-Versa Talk #2 Apr 29, 2005 Phase Analysis on Real Systems Canturk ISCI Margaret MARTONOSI

2 Phase Analysis – Challenges on Real-Systems Canturk Isci - Margaret Martonosi
Previously…
- Runtime processor power monitoring and estimation
- Power phase behavior of programs (Power Vectors)
[Figures: power client and power server; measured and modeled power for Gcc, Gzip, Vpr, Vortex, Gap, Crafty]

3 Previously…
- Runtime processor power monitoring and estimation
- Power phase behavior of programs (Power Vectors)

4 Today!
- Phase detection on real systems:
  - Variability effects and potentials for repeatability
- Virtual memory behavior – Tuning
  - Initial results
- What's going on?
  - BBVs – PMCs – PVs… and POWER
- Simple metric prediction studies
  - Short term vs. long term
[Slide labels: MAJOR, MINOR, MAYBE]

5 Phase Detection with Power Vectors
- The initial idea was to look at the phase distributions of applications and use signature analysis to detect and predict phases
- HOWEVER: multiple runs inevitably exhibit different real-system behavior
  - The quantities and durations vary
  - The phase distributions vary
  - Metric variability
  - Time variability

6 Variability Effects in Real System Behavior
- A direct apples-to-apples comparison of phase signatures is not very meaningful in the real world!

7 Ammp and Apples
- Although the recurrence is obvious to the eye, comparing phase sequences directly does not reveal it clearly!

8 How do Phase Distributions Compare? Ex: 2 runs of gcc

9 How do Phase Distributions Compare? Ex: 2 runs of gcc

10 We Got Ourselves a Problem:
- How do we extract this recurrent behavior information?
- Speech/humming recognition:
  - Stored libraries, signal stats
  - Pitch tracking
- Image/biomedical:
  - Image warping
  - Registration/mutual information
- Architects:
  - Simple to apply online
  - Implementable without massive state and combinational logic

11 Interesting Observation with Transitions
- Trying to detect the application from its behavior
- Upper case: hit!
- Lower case: false alarm?
- Tracking phase transitions rather than phase sequences proves to be more useful in detecting recurrent behavior*
[Plots: Gcc1-Gcc2, Gcc-Equake]

12 Our Transition-Guided Detection Framework
1. Sample PMCs during benchmark run #1 and benchmark run #2 to form 12D vectors (vector stream #1, vector stream #2)
2. Identify transitions (T init #1, T init #2)
3. Apply glitch/gradient filtering (T gg #1, T gg #2)
4. Apply near-neighbor blurring (T ggN #1)
5. Apply cross correlation
   - Match ⇒ peak at best alignment
   - Mismatch ⇒ no observable peak

13 Sampling Effects: Glitches & Gradients
- Nothing happens without disturbances → glitches
  - Glitch: instability where before and after are the same
  - Spurious transitions
- Nothing happens instantaneously → gradients
  - Gradient: instability where before and after are different
  - A single true transition

14 Glitch/Gradient Filtering
- Very simple: no consecutive transitions
- Leads to large reductions in transition count
- We call these “Refined Transitions” (T gg)
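The filtering rule can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the Manhattan distance, the `threshold` parameter, and the function names are assumptions. A run of consecutive transition flags is dropped when the samples before and after it match (a glitch) and collapsed to one transition when they differ (a gradient).

```python
def distance(a, b):
    """Manhattan distance between two sample vectors (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def initial_transitions(samples, threshold):
    """Flag sample i when it differs from sample i-1 by more than threshold."""
    flags = [0]
    for prev, cur in zip(samples, samples[1:]):
        flags.append(1 if distance(prev, cur) > threshold else 0)
    return flags


def refine_transitions(samples, flags, threshold):
    """Allow no consecutive transitions: resolve each run of flags to 0 or 1 flag."""
    refined = [0] * len(flags)
    i, n = 0, len(flags)
    while i < n:
        if not flags[i]:
            i += 1
            continue
        j = i
        while j + 1 < n and flags[j + 1]:
            j += 1  # extend the run of consecutive flags i..j
        before, after = samples[i - 1], samples[j]
        # Isolated flag, or a gradient (before != after): keep one transition.
        # Glitch (before == after within threshold): keep none.
        if j == i or distance(before, after) > threshold:
            refined[i] = 1
        i = j + 1
    return refined
```

For example, with 1-D samples 1, 1, 5, 1, 1, 3, 5, 5 and a threshold of 1, the blip to 5 is discarded as a glitch and the ramp from 1 to 5 is kept as a single transition.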

15 (Also Helps with Threshold Choices)

16 Time Shifts
- We have binary information
  - We can do this more cheaply than with shifted correlation coefficients
  - Cross-correlations show equally useful results
  - Easily implementable
- Ex: matching and mismatch cases, and “The Peak”
[Plots: Gcc1-Gcc2, Gcc-Equake]
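A sliding cross-correlation over two binary transition sequences can be sketched as follows. The function names and the `max_lag` search window are assumptions for illustration. With 0/1 inputs, each score is simply the number of transitions that coincide at that shift, so a matching pair shows a clear peak at the best alignment.

```python
def cross_correlation(a, b, max_lag):
    """Return {lag: score}, where score counts coinciding transitions
    when sequence b is read lag samples later than sequence a."""
    scores = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0
        for i, x in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                total += x * b[j]
        scores[lag] = total
    return scores


def best_alignment(scores):
    """Lag with the highest score, i.e. the location of the peak."""
    return max(scores, key=scores.get)
```

If run 2 is run 1 delayed by two samples, `best_alignment(cross_correlation(run1, run2, 4))` returns 2, and the score at that lag equals the number of transitions in the run.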

17 Time Dilations
- Observation: dilations exist as small jitters (a few samples)
- Proposed solution: “Near-Neighbor Blurring”
  - Blur edges slightly
  - Consider transitions as distributions around their actual locations
  - Tolerance: spread of this distribution, [t-x, t+x] samples
- Ex: matching improvement with tolerance=4:
  run1: 00100000010010000000000
  run2: 01000000010000100001000
[Figure: comparing run1 and run2 directly gives “Mismatch!”; comparing run2 against run1 blurred into fractional weights around each transition gives “Match!”]
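Near-neighbor blurring might look like the sketch below. The linear falloff of the weights and the use of `max` where two blurred transitions overlap are assumptions; the slide only states that each transition becomes a distribution over [t-x, t+x].

```python
def blur(flags, tolerance):
    """Spread each transition over [t - tolerance, t + tolerance]
    with linearly decaying weights (assumed kernel shape)."""
    out = [0.0] * len(flags)
    for t, flag in enumerate(flags):
        if not flag:
            continue
        for d in range(-tolerance, tolerance + 1):
            k = t + d
            if 0 <= k < len(out):
                weight = 1.0 - abs(d) / (tolerance + 1)
                out[k] = max(out[k], weight)
    return out


def overlap(weights, flags):
    """Zero-lag match between a (possibly blurred) sequence and a binary one."""
    return sum(w * f for w, f in zip(weights, flags))


# The two sequences shown on the slide.
run1 = [int(c) for c in "00100000010010000000000"]
run2 = [int(c) for c in "01000000010000100001000"]
exact = overlap(run1, run2)              # only identically placed transitions count
blurred = overlap(blur(run1, 4), run2)   # nearby transitions now contribute partially
```

Here `blurred` comes out larger than `exact`, because transitions in run2 that sit one or two samples away from a run1 transition earn partial credit.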

18 Our Transition-Guided Detection Framework
1. Sample PMCs during benchmark run #1 and benchmark run #2 to form 12D vectors (vector stream #1, vector stream #2)
2. Identify transitions (T init #1, T init #2)
3. Apply glitch/gradient filtering (T gg #1, T gg #2)
4. Apply near-neighbor blurring (T ggN #1)
5. Apply cross correlation
   - Match ⇒ peak at best alignment
   - Mismatch ⇒ no observable peak

19 Results
- How do we quantify the strength of the peak?
- Matching score: [equation in figure]
- Detection results: (green: highest match; red: highest mismatch) [table in figure]

20 Receiver Operating Characteristics
- Our best detection scheme (tolerance=1) achieves 100% hit detection with <5% false alarms
- (For a uniform threshold!)
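One point of a receiver operating characteristic for a uniform threshold can be computed as sketched below. The function name is hypothetical, and the scores in the usage note are made-up placeholders rather than results from the talk.

```python
def roc_point(match_scores, mismatch_scores, threshold):
    """Return (hit_rate, false_alarm_rate) for one uniform threshold.

    match_scores:    matching scores for pairs of runs of the same benchmark
    mismatch_scores: matching scores for pairs of different benchmarks
    """
    hits = sum(s >= threshold for s in match_scores) / len(match_scores)
    false_alarms = sum(s >= threshold for s in mismatch_scores) / len(mismatch_scores)
    return hits, false_alarms
```

Sweeping `threshold` over the observed score range and plotting the resulting pairs traces out the curve. For instance, `roc_point([0.9, 0.8, 0.7], [0.1, 0.75, 0.2, 0.3], 0.7)` detects every matching pair while raising a false alarm on one of the four mismatching pairs.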

21 Comparison of Methods
- Comparing 3 cases: original (value-based) phases vs. refined transitions vs. near-neighbor blurred transitions
- In all cases transitions perform better
- In almost all cases near-neighbor blurring improves detection

22 Conclusions
- Detecting phase-recurrent behavior on real systems poses interesting problems that result from system-induced variability
- Looking at phase transition information partly improves detection capabilities
- Supporting methods such as glitch/gradient filtering and near-neighbor blurring improve the detectability of transition signatures

23 Today!
- Phase detection on real systems:
  - Variability effects and potentials for repeatability
- Virtual memory behavior – Tuning
  - Initial results
- What's going on?
  - BBVs – PMCs – PVs… and POWER
- Simple metric prediction studies
  - Short term vs. long term

24 Workload Phases → Memory Behavior?
- A few of the inspirations:
  - Redhat Magazine Issue #1 [Dec 2004]
  - Dynamically Tracking Page Miss Ratio Curve [ASPLOS 2005]
  - Gokul Kandiraju [PhD Thesis 2004]
- Can we track phase behavior from PMCs and VM-related stats to dynamically manage memory behavior?
  - Less page locality → fetch fewer contiguous pages at once
  - Recurring reference with high reuse distance → launder less aggressively
- Targets → exec time & energy
[Diagram: Indicator → Action → Effect]
James Donald -

25 Platform
- P4, no SMT, 256 MB memory, Linux 2.4.7-10
- SPEC2K is designed to fit in 256 MB
- Choose high-memory benchmarks + multiprogramming
- Multiprogramming combinations of these lead to lots of thrashing

26 Effect of Thrashing with Multiprogramming
- For most cases, it leads to a 5-10% power/performance penalty
- Applu+Apsi!
  - 6X time
  - 2.5X energy
- Non-thrashing combinations achieve 5-10% improvement

27 Action → Effect
- Non-intrusive tuning possibilities:
  - Kswapd:tries_base: max # of pages the swapout daemon tries to free at once
  - Kswapd:swap_cluster: # of pages the swapout daemon writes at once
  - Page-cluster: log2(# of contiguous pages) the kernel reads at once at a page fault
- Intrusive tuning possibilities:
  - Page scanning period (overhead if tasks fit in memory)
  - Page age counters (reuse vs. pollution)
  - Inactive-clean percentage (balance I/O and memory demand)
  - Task memory allocation (workload-dependent memory demand)
[Diagram: Indicator → Action → Effect]
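Because page-cluster is a logarithm, a small change in the setting changes the read size a lot. The sketch below shows the arithmetic; the helper names are hypothetical, and the path is the usual Linux sysctl location, which may be absent on other systems.

```python
import os

# Usual location of the tunable on Linux (assumed present only on Linux hosts).
PAGE_CLUSTER_PATH = "/proc/sys/vm/page-cluster"


def pages_read_per_fault(page_cluster):
    """page-cluster holds log2 of the page count, so the count is 2**value."""
    return 1 << page_cluster


def current_page_cluster(path=PAGE_CLUSTER_PATH):
    """Read the current setting, or return None if the file does not exist."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return int(f.read().strip())
```

A setting of 3 means 8 contiguous pages per fault and a setting of 0 means a single page, so lowering the value is the "fetch fewer contiguous pages" action for workloads with little page locality.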

28 Non-intrusive Results
- Gzip: gzip + gzip + gzip
- Gap: gap + gzip
- Bzip2: bzip2 + bzip2
- Tries_base and swap_cluster have no visible effect
- Page-cluster shows ~7% improvement with respect to the default

29 Conclusions and Todos
- Multiprogramming that involves thrashing has a lot of potential for performance/power improvement
- The cases we experimented with don't show promising actions
- Intrusive actions may be more useful, leading to effective actions as well as better (per-task) tracking
- NEXT STEPS:
  - Looking into mm for potential dynamic tunings
  - Defining indicators that track relevant behavior: page miss ratio / swap rates / bus utilization
- Q: Is there any potential?

30 Tomorrow!
- Phase detection on real systems:
  - Variability effects and potentials for repeatability
- Virtual memory behavior – Tuning
  - Initial results
- What's going on?
  - BBVs – PMCs – PVs… and POWER
- Simple metric prediction studies
  - Short term vs. long term

31 Comparing Phase Methods for Power
- All lead to different interesting characterizations
- How do these compare in terms of power representation?
- Is there a dominant method, or does a (hierarchical) combination work better?
- We specifically look at BBVs & PMC-Power Vectors
[Diagram: similarity based on metrics (IPC, EPI, etc.), hardware performance vectors, BBVs, working sets, procedures, branches; sampling quanta: code/time/energy intervals; sources: from performance monitoring counters, from sampled PC traces]

32 Different Phases Ex: Dcache Microkernel
- Specify L1 hit rate, generate ~desired hits via random linked list traversal
[Figure: linked list nodes A, C, M, P, Z; cache size]

33 Dcache Performance Traces
- Each hit rate range is obvious
- Trends NOT identical across metrics:
  - Linear L1 misses vs. nonlinear IPC
- FOR A SINGLE METRIC: how you capture phases depends on the metric and the chosen threshold

34 Dcache PC Traces
- No visible phases from PC samples
- Address space sampling alone is NOT sufficient!!

35 Experiment Setup
- PIN kit 1795
- 3-level trace instrumentation
  - ~Every user trace: conditional inlined trace count
  - Every 50-200K trace calls: sample EIP
  - Every 5-20M trace calls: generate BBV & collect PMCs & read PWR history
  - Constraint: instrumentation should not overwhelm power variations!!
- BBV generation:
  - Sample BBL heads → hash into 32 dimensions (based on Jenkins)
- PMC reading:
  - Single rotation subset
  - Sample via ‘popen’s due to platform conflicts
- Power reading:
  - Read from serial device buffer
  - No polling possible → disable device at major instrumentation & exhaust buffer
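Hashing sampled basic-block heads into a 32-dimensional BBV could be sketched as below. The slide only says the hash is "based on Jenkins"; using the Jenkins one-at-a-time variant, the 8-byte little-endian address encoding, and the normalization step are all assumptions made for this illustration.

```python
MASK32 = 0xFFFFFFFF


def jenkins_one_at_a_time(data):
    """Bob Jenkins' one-at-a-time hash over a bytes object (32-bit result)."""
    h = 0
    for byte in data:
        h = (h + byte) & MASK32
        h = (h + (h << 10)) & MASK32
        h ^= h >> 6
    h = (h + (h << 3)) & MASK32
    h ^= h >> 11
    h = (h + (h << 15)) & MASK32
    return h


def build_bbv(bbl_heads, dims=32):
    """Fold sampled basic-block head addresses into a dims-wide vector
    and normalize it so that intervals of different length are comparable."""
    counts = [0] * dims
    for addr in bbl_heads:
        bucket = jenkins_one_at_a_time(addr.to_bytes(8, "little")) % dims
        counts[bucket] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * dims
```

Every occurrence of the same block head lands in the same bucket, so the vector still reflects how execution is distributed across code even though distinct blocks may collide.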

36 BBV Results
- Is sampling good enough? Are they meaningful?
- B. Calder's full-blown BBV SimMatrices
- Our sampled & hashed BBV SimMatrices

37 Power Results
- Do we still have the hook on power variability?
[Plots: Native vs. From PIN, two pairs]

38 Currently…
- Still need to verify benchmarks for power and validity
- Constructing power vectors with the reduced set
- Applying symmetric phase analyses to BBVs and PMCs
- Power representation of phases with respect to measurements
- 90-10 prediction with regression trees

39 Today!
- Phase detection on real systems:
  - Variability effects and potentials for repeatability
- Virtual memory behavior – Tuning
  - Initial results
- What's going on?
  - BBVs – PMCs – PVs… and POWER
- Simple metric prediction studies
  - Short term vs. long term

40 Metric (IPC) Value Prediction
- No big challenge to get good results, but improving for edges is interesting
- Statistical predictor: transition-guided, history-based (EWMA) IPC prediction
  - Instead of a fixed history window, use the stable regions between transitions as the history in a circular buffer
  - Transitions based on a threshold
  - Threshold = 0 → “Last Value Predictor”
- Our experience:
  - Variabilities are bursty transitions
  - There are stable regions with probable gradients between transitions
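A transition-guided EWMA predictor along these lines might look like the sketch below. The class name, the relative-change transition test, the smoothing factor `alpha`, and the buffer `capacity` are assumptions for illustration. History is kept in a bounded buffer and discarded at every transition, so only the current stable region feeds the average.

```python
from collections import deque


class TransitionGuidedPredictor:
    """Predict the next IPC from an EWMA over the current stable region."""

    def __init__(self, threshold, alpha=0.5, capacity=16):
        self.threshold = threshold             # relative change that counts as a transition
        self.alpha = alpha                     # EWMA weight given to newer samples
        self.history = deque(maxlen=capacity)  # circular buffer of stable samples

    def update(self, ipc):
        """Record a new sample, clearing the history on a transition."""
        if self.history:
            last = self.history[-1]
            if abs(ipc - last) > self.threshold * abs(last):
                self.history.clear()
        self.history.append(ipc)

    def predict(self):
        """EWMA of the samples seen since the last transition."""
        if not self.history:
            return None
        estimate = self.history[0]
        for value in list(self.history)[1:]:
            estimate = self.alpha * value + (1 - self.alpha) * estimate
        return estimate
```

With `threshold=0` every change clears the history, so `predict()` always returns the most recent sample, which is the last value predictor named on the slide.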

41 Ammp, thr=0% (Last Value)

42 Ammp, thr=10%

43 Using Stability Considerations (8) in IPC Predictions

44 Predicting Durations
- X=f(x) approach:
  - F(x) = x, x/2, x/8, …
  - Initial stability requirement: 2, 8, …
- Table based?
  - Idea was: at each transition, predict once for the duration based on history:
  - Log(prev_duration) = key values [0,1,2,3,4,5]
  - History: |5|3|5|3|5| → 3; |1|3|5|1|3| → 5
  - Need to filter bursts somehow
  - Partial matchings??
  - NOT EXPLORED!!

45 Ammp Duration Prediction
- Predict based on F(x)=x/8
- Stability criterion = 8 samples
- Extend duration: stability continues
- IPC based on last value
- Predictions only at checkpoints
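One reading of the checkpoint scheme is sketched below; it is an interpretation, not a confirmed description of the experiment. Once a region has been stable for x samples (at least the stability criterion), the predictor assumes x/8 more stable samples and re-evaluates at that checkpoint. The function name, the integer division, and the minimum extension of one sample are assumptions.

```python
def duration_checkpoints(run_length, stability=8, divisor=8):
    """List (checkpoint, predicted_extra_samples) pairs for a stable run
    of run_length samples, predicting F(x) = x / divisor at each checkpoint."""
    checkpoints = []
    x = stability  # first prediction once the stability criterion is met
    while x <= run_length:
        extra = max(1, x // divisor)
        checkpoints.append((x, extra))
        x += extra  # next checkpoint, reached only if stability continues
    return checkpoints
```

For a run of 20 stable samples this yields one-sample extensions from sample 8 through 15 and two-sample extensions from sample 16 on, so predictions become less frequent as the region proves more stable.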

46 Long Term IPC Prediction with Gradients
- Last value is not very useful in the long term
- Instead of 0th order, consider a 1st-order prediction:
  - Needs additional ΔIPC information
  - Next IPC = Current IPC + ΔIPC
- Ex: F(x)=x/8
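The difference between the two predictor orders can be sketched as follows. Extending the one-step rule to several steps by repeatedly adding the same ΔIPC is an assumption; the slide states only the single-step form.

```python
def predict_zero_order(current_ipc, steps):
    """Last-value prediction: the IPC is assumed to stay where it is."""
    return [current_ipc] * steps


def predict_first_order(previous_ipc, current_ipc, steps):
    """Next IPC = Current IPC + delta, applied repeatedly for longer horizons."""
    delta = current_ipc - previous_ipc
    return [current_ipc + delta * (k + 1) for k in range(steps)]
```

On a steady gradient such as 1.0, 1.5, the first-order form keeps climbing (2.0, 2.5, 3.0), while the zero-order form stays at 1.5 and falls further behind at every step.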

47 Improvements?
- Using prediction probability tables:
  - P{N more | 20 stable @ IPC}
  - Ex: Vortex

    N          P(N|20)
    0-9        0.111111111
    10-79      0.577777778
    79-99      0.022222222
    100-1000   0.288888889
    1000+      0

- Using adaptive functions based on history
- Table-based function approaches
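A table of this kind can be built from observed stable-region lengths, as sketched below. The function name is hypothetical, and the bin edges are a non-overlapping approximation of the ranges on the slide, whose printed bins overlap at 79 and 1000.

```python
from bisect import bisect_right

# Assumed non-overlapping bins for the number of additional stable samples.
EDGES = [10, 80, 100, 1000]
LABELS = ["0-9", "10-79", "80-99", "100-999", "1000+"]


def remaining_stability_table(durations, stable=20):
    """Estimate P{N more | `stable` samples already stable} from run lengths."""
    remaining = [d - stable for d in durations if d >= stable]
    counts = [0] * len(LABELS)
    for r in remaining:
        counts[bisect_right(EDGES, r)] += 1
    total = len(remaining)
    return {label: (count / total if total else 0.0)
            for label, count in zip(LABELS, counts)}
```

Only regions that actually reached the 20-sample mark contribute, which is what makes the result a conditional distribution rather than a plain histogram of durations.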

