Slide 1: Variability in Architectural Simulations of Multi-threaded Workloads
Alaa R. Alameldeen and David A. Wood
University of Wisconsin-Madison
{alaa,david}@cs.wisc.edu
http://www.cs.wisc.edu/multifacet/
HPCA 2003
Slide 2: Motivation
- Experimental scientists use statistics; computer architects running simulation experiments don't!
- Why ignore statistics? Simulations are deterministic.
- This can lead to wrong conclusions!
Slide 3: Workload Variability
[Figure: OLTP results]
Slide 4: Workload Variability
- Slower memory is better!
[Figure: OLTP results]
Slide 5: What Went Wrong?
- Many possible executions for each configuration
- Why? Different timing effects:
  - OS scheduling decisions
  - Different orders of lock acquisition
  - Different transaction mixes
- This is magnified by short simulations
- Variability can lead to wrong conclusions
Slide 6: Overview
- Variability is a real phenomenon for multi-threaded workloads
  - Runs from the same initial state can be different
- Variability is a challenge for simulations
  - Simulations are short
- Our solution accounts for variability
  - Multiple runs, statistical techniques
Slide 7: Outline
- Motivation and Overview
- Variability in Real Systems
  - Time and Space Variability
- Variability in Simulations
- Accounting for Variability
- Conclusions
Slide 8: What is Variability?
- Differences between multiple estimates of a workload's performance
- Time variability: performance changes during different phases of a single run
- Space variability: runs starting from the same state follow different execution paths
Slide 9: Time Variability in Real Systems
[Figure: OLTP, one-second intervals]
Slide 10: Time Variability Example (Cont'd)
- How is this handled in real experiments?
- Solution: run your experiment long enough!
[Figure: OLTP, one-minute intervals]
Slide 11: Space Variability in Real Systems
[Figure: OLTP, one-second averages, 5 runs]
Slide 12: Space Variability Example (Cont'd)
- How is this handled in real experiments?
- Same solution: run your experiment long enough!
- But running this long would correspond to a 16-day simulation
[Figure: OLTP, one-minute averages, 5 runs]
Slide 13: Outline
- Motivation and Overview
- Variability in Real Systems
- Variability in Simulations
  - Simulation Infrastructure
  - Injecting Randomness
  - The Wrong Conclusion Ratio
- Accounting for Variability
- Conclusions
Slide 14: Simulation Infrastructure
- Workloads: two scientific and five commercial benchmarks
- Target system: E10000-like 16-node system
- Full-system simulation:
  - Virtutech Simics running Solaris 8 on SPARC V9
  - A blocking processor model (Simics)
  - An OoO processor model (TFSim, Mauer et al., SIGMETRICS '02)
- Memory system simulator:
  - MOSI invalidation-based broadcast coherence protocol (Martin et al., HPCA-02)
Slide 15: Simulating Space Variability?
- Simulations are deterministic
- Variability cannot be ignored for multi-threaded applications
  - One execution may not be representative
  - Execution paths affect simulation conclusions
- We need to obtain a space of results
Slide 16: Injecting Randomness
- We introduce artificial random perturbations in each simulation run
- For each memory access, the latency in nanoseconds becomes Latency + r (r = -2, -1, 0, 1, or 2 ns, uniformly distributed)
- Roughly models contention due to DMA traffic
- Other methods are possible (see the sketch below)
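The perturbation scheme is simple enough to sketch. The Python fragment below illustrates the idea only and is not the simulator's code; the names perturbed_latency and nominal_ns are invented for the example.

```python
import random

# Illustrative sketch of the latency-perturbation idea (not the simulator's
# actual code): each memory access gets a uniformly distributed nudge of
# -2..+2 ns added to its nominal latency, so otherwise-deterministic runs
# diverge into different execution paths.
def perturbed_latency(nominal_ns: int, rng: random.Random) -> int:
    r = rng.randint(-2, 2)          # uniform over {-2, -1, 0, 1, 2} ns
    return max(0, nominal_ns + r)   # keep the latency non-negative

# Seeding each simulation run differently yields a space of results.
rng = random.Random(7)
print([perturbed_latency(80, rng) for _ in range(5)])
```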
Slide 17: Simulated Space Variability
- Space variability exists in our benchmarks
[Figure: 20 runs, ~10 hours of simulation]
Slide 18: Quantifying Variability: The Wrong Conclusion Ratio (WCR)
- WCR(16, 32) = 18%
- WCR(16, 64) = 7.5%
- WCR(32, 64) = 26%
[Figure: OLTP, 20 runs, 50 transactions]
Slide 19: Outline
- Motivation and Overview
- Variability in Real Systems
- Variability in Simulations
- Accounting for Variability
- Conclusions
Slide 20: Confidence Intervals
- Definition: a range of values expected to include a population parameter (e.g., the mean)
- Confidence probability: the probability that the true mean lies inside the confidence interval
- For the same confidence probability: sample size ↑ → confidence interval ↓
Slide 21: Accounting for Space Variability
[Figure: OLTP]
Slide 22: Accounting for Space Variability
- Simple solution: estimate the number of runs such that confidence intervals do not overlap (see the sketch below)
- Tests of hypotheses can also be used (see paper)
[Figure: OLTP]
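A minimal sketch of the overlap check, assuming each configuration is summarized by per-run cycles-per-transaction samples. The run data and the 95% Student's t critical value for five runs are illustrative, not numbers from the paper.

```python
import math
from statistics import mean, stdev

# Confidence interval for the mean of a small sample of runs.
# t_crit = 2.776 is Student's t for 95% confidence with 4 degrees of
# freedom (5 runs); adjust it to the actual number of runs.
def conf_interval(samples, t_crit):
    m = mean(samples)
    half = t_crit * stdev(samples) / math.sqrt(len(samples))
    return m - half, m + half

runs_config_a = [1.02e6, 0.98e6, 1.05e6, 0.99e6, 1.01e6]   # cycles/transaction (made up)
runs_config_b = [0.91e6, 0.95e6, 0.93e6, 0.90e6, 0.94e6]

lo_a, hi_a = conf_interval(runs_config_a, t_crit=2.776)
lo_b, hi_b = conf_interval(runs_config_b, t_crit=2.776)

# If the intervals overlap, add runs until they separate (or fall back to a
# formal hypothesis test, as in the paper).
overlap = not (hi_a < lo_b or hi_b < lo_a)
print("overlap: add more runs" if overlap else "intervals separate: conclusion is clear")
```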
Slide 23: Conclusions
- Short runs of multi-threaded workloads exhibit variability
- Variability can lead to wrong simulation conclusions
- Our solution:
  - Inject randomness
  - Perform multiple runs
  - Apply statistical techniques
Slide 24: Backup Slides
Slide 25: Effects of OS Scheduling
Slide 26: WCR Definition
- The percentage of comparison simulation experiments that reach a wrong conclusion
- The correct conclusion is the relationship between the averages of the two populations
- WCR can be used to estimate the wrong-conclusion probability for single experiments (see the sketch below)
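One plausible reading of this definition, sketched below, treats a "comparison experiment" as pairing a single run of one configuration with a single run of the other and concluding whichever run was faster; the run data is invented for illustration.

```python
from itertools import product
from statistics import mean

# Hedged sketch of a WCR computation: the "correct" conclusion comes from the
# run averages, and every single-run-vs-single-run pairing that disagrees with
# it counts as a wrong conclusion.
def wrong_conclusion_ratio(runs_a, runs_b):
    correct_a_faster = mean(runs_a) < mean(runs_b)
    pairs = list(product(runs_a, runs_b))
    wrong = sum((a < b) != correct_a_faster for a, b in pairs)
    return 100.0 * wrong / len(pairs)

runs_config_16 = [1.10, 1.04, 1.12, 1.07, 1.09]   # normalized runtime, made-up numbers
runs_config_32 = [1.00, 1.08, 0.97, 1.11, 1.02]
print(f"WCR(16,32) = {wrong_conclusion_ratio(runs_config_16, runs_config_32):.1f}%")
```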
Slide 27: Confidence Intervals - Equations
- The confidence interval for the mean of a normally distributed infinite population
- The sample size needed to limit the mean's relative error to r
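The formulas themselves were images on the original slide and did not survive extraction. The standard textbook forms, which the slide presumably shows, are reproduced below; here $\bar{x}$ is the sample mean, $s$ the sample standard deviation, $n$ the number of runs, $t_{\alpha/2;\,n-1}$ and $z_{\alpha/2}$ the Student's t and normal critical values, and $r$ the allowed relative error.

```latex
% 100(1 - alpha)% confidence interval for the mean:
\bar{x} \;\pm\; t_{\alpha/2;\,n-1}\,\frac{s}{\sqrt{n}}

% Sample size needed to keep the half-width within a relative error r of the
% mean (large-sample approximation using the normal critical value):
n \;\ge\; \left( \frac{z_{\alpha/2}\, s}{r\,\bar{x}} \right)^{2}
```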
Slide 28: Hypothesis Testing
- Tests whether there is no difference between two population means
- Hypothesis: μ32 = μ64 tests whether the means of the 32-entry and 64-entry ROB configurations are different
- The hypothesis is tested using sample means and variances
- If the hypothesis is rejected, our conclusion is significant (see the sketch below)
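A minimal sketch of this test, assuming SciPy is available and per-run cycles-per-transaction samples for each ROB size. Welch's unequal-variance t-test is one common choice here; the data and the 0.05 significance level are illustrative assumptions, not the paper's.

```python
from scipy import stats

# Null hypothesis: mu_32 == mu_64 (the two ROB sizes perform the same).
# equal_var=False selects Welch's t-test, which does not assume equal variances.
runs_32 = [1.02e6, 0.98e6, 1.05e6, 0.99e6, 1.01e6, 1.03e6]   # made-up cycles/transaction
runs_64 = [0.91e6, 0.95e6, 0.93e6, 0.90e6, 0.94e6, 0.92e6]

t_stat, p_value = stats.ttest_ind(runs_32, runs_64, equal_var=False)
if p_value < 0.05:
    print(f"reject mu_32 == mu_64 (p = {p_value:.3g}): the difference is significant")
else:
    print(f"cannot reject mu_32 == mu_64 (p = {p_value:.3g}): more runs needed")
```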
Slide 29: Accounting for Time Variability
- Is time variability caused by the same effects that cause space variability?
  - Use analysis of variance (ANOVA), as sketched below
- If time variability is caused by different effects, we need to obtain a time sample
  - Observations obtained from different starting points
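A hedged sketch of how such an ANOVA might be set up, assuming runs are grouped by the checkpoint they start from; the checkpoint names and numbers are invented.

```python
from scipy import stats

# One-way ANOVA: is between-checkpoint (time) variation larger than
# within-checkpoint (space) variation?
checkpoint_a = [1.00, 1.03, 0.98, 1.02]   # runs started from checkpoint A
checkpoint_b = [1.10, 1.07, 1.12, 1.09]   # runs started from checkpoint B
checkpoint_c = [0.99, 1.01, 1.00, 1.04]   # runs started from checkpoint C

f_stat, p_value = stats.f_oneway(checkpoint_a, checkpoint_b, checkpoint_c)
print(f"F = {f_stat:.2f}, p = {p_value:.3g}")
# A small p-value suggests time variability contributes effects beyond space
# variability, so observations should be drawn from different starting points.
```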
Slide 30: Multi-threaded Workloads and Simulation
- Multi-threaded workloads are important
  - Workloads for commercial servers
  - New architectures support multi-threading
- Performance metrics differ from traditional benchmarks
  - Throughput-oriented (transactions)
  - IPC is not appropriate (idle time!)
- Simulation challenge: comparing systems running multi-threaded applications
Slide 31: Simulation of Multi-threaded Workloads
- Simulation is slow! We cannot simulate the whole workload
- Solution:
  - Run for a fixed number of transactions
  - Measure the per-transaction runtime (cycles per transaction)
  - Use it to compare different systems (see the sketch below)
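A small illustrative sketch of this methodology with made-up run data; the helper name cycles_per_transaction is not from the paper.

```python
from statistics import mean

# Each run stops after the same number of completed transactions and is
# summarized as cycles per transaction (lower is better).
N_TRANSACTIONS = 50

def cycles_per_transaction(total_cycles: int, transactions: int = N_TRANSACTIONS) -> float:
    return total_cycles / transactions

system_a_runs = [cycles_per_transaction(c) for c in [51_200_000, 49_800_000, 52_400_000]]
system_b_runs = [cycles_per_transaction(c) for c in [47_900_000, 48_600_000, 47_200_000]]

# Multiple runs and the statistical checks from the earlier slides guard
# against drawing the wrong conclusion from a single run.
print(f"A: {mean(system_a_runs):,.0f} cycles/txn, B: {mean(system_b_runs):,.0f} cycles/txn")
```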