1 Enterprise Platforms Group Pinpointing Representative Portions of Large Intel Itanium Programs with Dynamic Instrumentation Harish Patil, Robert Cohn,


1 Enterprise Platforms Group
Pinpointing Representative Portions of Large Intel Itanium Programs with Dynamic Instrumentation
Harish Patil, Robert Cohn, Mark Charney, Rajiv Kapoor, Andrew Sun, Anand Karunanidhi
Enterprise Platform Group, Intel Corporation
Presented at MICRO-37: Portland, OR, Dec. 6th, 2004
IA32/EM64T/IPF

2 Goal: Accurate Performance Prediction
Target: LARGE applications
With little/no manual intervention
Within reasonable time

3 Instruction Counts: Some Itanium Applications
[Chart; labels: SPECINT (average), SPECFP (average), RenderMan magic, Fluent L2, Amber rt, Ls-Dyna 3cars]

4 Whole-Program Simulation is Slow
[Chart; labels: SPECINT (average), SPECFP (average), RenderMan magic, Fluent L2, Amber rt, Ls-Dyna 3cars]

5 Solution: Select Simulation Points
Manually
Randomly
– Anywhere
– From uniform regions
Fine-grain sampling (SMARTS: CMU)
By program-phase analysis (SimPoint: UCSD, iPart: Intel/MRL)

6 Running Commercial Applications on Simulators is Hard
Resource requirements: disks etc.
– Need to modify/re-configure the simulator
OS dependencies
– Need support for specific kernel and device drivers
License checking
– Need special action

7 Solution: Native Execution with Instrumentation
Use PIN to select simulation points (PinPoints) and generate traces
PIN: A dynamic-instrumentation system
+ A tool for writing tools
+ No special compiler/linker flags required

8 PIN-Tools: Profiling, Trace Generation and more…
[Diagram; labels: PIN-based profiler, Simulation Point Selection, Profile, PinPoints, PIN-based Trace Generator, PIN-based Branch Predictor, Your Simulator Here]

9 Simulation Point Selection with SimPoint [UCSD]
Why SimPoint?
Instrumentation based
Microarchitecture independent
Works well (results later)
Applied to multi-threaded programs
[Diagram; labels: PIN-based profiler, SimPoint Tools, Basic Block Vectors, PinPoints]
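
The slide names Basic Block Vectors as the profile handed to the SimPoint tools but does not show how they are formed. The following is a minimal sketch, assuming the usual SimPoint formulation: execution is cut into fixed-size slices, and each slice is summarized by the share of its instructions spent in each basic block. The function names, the toy trace, and the tiny slice size are hypothetical and chosen only for illustration. The real profiler is a PIN tool and is not reproduced here.

```python
from collections import Counter

def basic_block_vectors(trace, slice_size):
    """Build one normalized basic block vector (BBV) per execution slice.

    trace: iterable of (block_id, n_instructions) in execution order.
    Returns a list of dicts mapping block_id -> fraction of the slice's
    instructions executed in that block.
    """
    vectors = []
    counts = Counter()
    executed = 0
    for block_id, n_insts in trace:
        counts[block_id] += n_insts
        executed += n_insts
        if executed >= slice_size:
            vectors.append({b: c / executed for b, c in counts.items()})
            counts = Counter()
            executed = 0
    if executed:  # final partial slice
        vectors.append({b: c / executed for b, c in counts.items()})
    return vectors

def manhattan(u, v):
    """Manhattan distance between two sparse BBVs; small means similar behavior."""
    return sum(abs(u.get(k, 0.0) - v.get(k, 0.0)) for k in set(u) | set(v))

# Toy trace (made up): three slices dominated by blocks A/B, then three by block C.
toy_trace = [("A", 4), ("B", 6)] * 3 + [("C", 10)] * 3
bbvs = basic_block_vectors(toy_trace, slice_size=10)
same_phase = manhattan(bbvs[0], bbvs[1])   # two slices from the A/B phase
other_phase = manhattan(bbvs[0], bbvs[3])  # an A/B slice versus a C slice
```

Slices whose vectors are close together are candidates for the same phase. Clustering the vectors and picking one representative slice per cluster is the job of the SimPoint tools and is left out of this sketch.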

10 Multiple Sources of Error
Goal: Accurate Performance Prediction
Error source: Phase detection
Error source: Non-repeatability
Error source: Warm-up, modeling
[Diagram; labels: PinPoints, Traces, Simulation, Stats (CPI)]
Phase detection is not enough! Need trace generation and simulation

11 Main Contributions
A toolkit that automatically:
– Profiles, finds phases/simulation regions (PinPoints)
– Validates that PinPoints are representative
– Generates traces for simulators
Available for Itanium/IA32/EM64T
Evaluations in a production environment

12 The PinPoints Toolkit
Phase Detection + PinPoint Selection
PinPoints file
H/W counters-based validation (pfmon: Itanium, PAPI: IA32)
Compute CPI: whole program vs. weighted sum for PinPoints. Match?
Trace Generation/Simulation
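
The validation step compares the measured whole-program CPI against a weighted sum over the PinPoints. A minimal sketch of that comparison follows, assuming each PinPoint's weight is the share of execution its phase represents (the usual SimPoint convention). The function names and all CPI values below are made up for illustration and are not taken from the presentation.

```python
def predicted_cpi(pinpoints):
    """Weighted-sum CPI estimate.

    pinpoints: list of (weight, cpi) pairs, one per PinPoint.
    Weights are normalized here in case they do not sum exactly to 1.
    """
    total_weight = sum(w for w, _ in pinpoints)
    return sum(w * cpi for w, cpi in pinpoints) / total_weight

def prediction_error(predicted, measured):
    """Relative error of the prediction against the measured whole-program CPI."""
    return abs(predicted - measured) / measured

# Hypothetical PinPoints: (weight, CPI measured with hardware counters).
points = [(0.5, 1.0), (0.3, 2.0), (0.2, 1.5)]
whole_program_cpi = 1.45  # hypothetical counter-based measurement of the full run

pred = predicted_cpi(points)
err = prediction_error(pred, whole_program_cpi)
print(f"predicted CPI = {pred:.3f}, relative error = {err:.1%}")
```

If the error is small, the PinPoints are accepted as representative. This check covers only the phase-detection error and says nothing about trace generation or simulator modeling.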

13 Evaluations
Applications: Built w/ Intel’s compilers (high opt)
HPC: Fluent, AMBER, LS-Dyna, RenderMan
SPEC2000: Processed 8-9 times
Test configurations: Linux (RedHat)
Merced      Itanium (1)   800 MHz   L3: 2MB
McKinley    Itanium-2     900 MHz   L3: 1.5MB
Madison     Itanium-2     1.3 GHz   L3: 3-6 MB

14 PinPoints Generated
Program               # Retired instructions (billions)   # PinPoints (250 million insts. each)
AMBER-rt              3,994                               6
Fluent-m3             2,625                               8
LS-DYNA               4,932                               6
SPECINT2000 (avg.)    142                                 4
SPECFP2000 (avg.)     373                                 5
PinPoints << 1% of program execution
Turnaround time (traces): few days
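
The "<< 1% of program execution" claim can be checked with simple arithmetic. The sketch below assumes the flattened table reads as instruction count in billions followed by a one-digit PinPoint count (for example, AMBER-rt: 3,994 billion instructions and 6 PinPoints). That reading is an interpretation of the transcript, not something the slide text states unambiguously.

```python
PINPOINT_INSTS = 250e6  # each PinPoint covers 250 million instructions (from the slide)

# name -> (retired instructions, number of PinPoints).
# Values follow the assumed reading of the flattened table on this slide.
programs = {
    "AMBER-rt": (3994e9, 6),
    "Fluent-m3": (2625e9, 8),
    "LS-DYNA": (4932e9, 6),
    "SPECINT2000 (avg.)": (142e9, 4),
    "SPECFP2000 (avg.)": (373e9, 5),
}

def simulated_fraction(total_insts, n_pinpoints):
    """Fraction of the program's instructions that fall inside PinPoints."""
    return n_pinpoints * PINPOINT_INSTS / total_insts

fractions = {
    name: simulated_fraction(total, n) for name, (total, n) in programs.items()
}
for name, frac in fractions.items():
    print(f"{name:20s} {frac:.4%}")
```

Under this reading every entry stays below 1%, and the HPC applications are far below it.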

15 Results: Overview
PinPoints: Whole-program CPI prediction (SPEC2000 and HPC applications):
– Average CPI prediction error ~5%
– PinPoints better than random selection
Predicting speedup between microarchitectures
– PinPoints can be used to evaluate microarchitecture variations
PinPoints traces: Prediction of native SPEC2000 ratios
– INT within 8%, FP within 3%
More results in the paper

16 CPI: Actual vs. Predicted
SPEC2000: Itanium-Madison

17 SPEC2000 CPI Prediction
Average error: Madison: 2.8%, Merced: 3.2%, McKinley: 2.7%

18 HPC Applications CPI Prediction
Average error: Madison: 5.0%

19 Comparison With Random Selection
[48 unique program runs]

20 Comparison With Random Selection
[18 unique program runs]

21 Speedup: Merced → McKinley
SPEC2000

22 PinPoints Speedup Prediction
SPEC2000: Merced → McKinley

23 PinPoints: Speedup Prediction Across Multiple Microarchitectures
Same binaries/PinPoints
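
The slides do not define how speedup is computed. A minimal sketch follows, assuming speedup is the ratio of execution times and that the same binary retires the same number of instructions on both machines, so that time is proportional to CPI divided by clock frequency. The function name and the CPI values are hypothetical. Only the 800 MHz and 900 MHz clock rates come from the test configurations listed earlier.

```python
def speedup(cpi_base, freq_base_hz, cpi_new, freq_new_hz):
    """Execution-time speedup of the new machine over the base machine.

    With an identical instruction count N on both machines,
    time = N * CPI / frequency, so N cancels in the ratio.
    """
    time_per_inst_base = cpi_base / freq_base_hz
    time_per_inst_new = cpi_new / freq_new_hz
    return time_per_inst_base / time_per_inst_new

# Hypothetical CPIs for one benchmark. The predicted pair would come from the
# weighted PinPoints CPI, the actual pair from whole-program measurement.
actual = speedup(cpi_base=2.0, freq_base_hz=800e6, cpi_new=1.2, freq_new_hz=900e6)
predicted = speedup(cpi_base=2.1, freq_base_hz=800e6, cpi_new=1.25, freq_new_hz=900e6)
print(f"actual speedup = {actual:.3f}, predicted speedup = {predicted:.3f}")
```

Because the same PinPoints are reused on every machine, errors that bias CPI in the same direction on both sides tend to partially cancel in the ratio.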

24 Putting it All Together: From PinPoints to Projections
Does simulation of traces for PinPoints predict native performance?
[Diagram; labels: PinPoints, Traces, Simulation, Stats (CPI)]
Error source: Phase detection
Error source: Non-repeatability
Error source: Warm-up, modeling
Error: Cumulative

25 CPI Prediction with Simulation
SPEC2000: Itanium Madison

26 Native SPEC2000 Ratios [Spring 2004]
Itanium: Madison 1.5GHz/6MB L3

27 Performance Prediction from PinPoints Traces
Itanium: Madison 1.5GHz/6MB L3

28 Summary
PinPoints toolkit: Automatic simulation region selection, tracing, and validation
Dynamic instrumentation (PIN) → LARGE programs
PinPoints: << 1% of execution
Capture whole-program CPI
– Average error < 5% for SPEC2000, HPC apps.
– Better than random selection
PinPoints traces: Predict SPEC2000 ratios
– INT within 8%, FP within 3%

29 Try it out! (PIN + PinPoints) toolkit : New

30 Backup: Simulator Warm-up
Strategy 1: Large slice-size (250 million instructions)
– Too coarse-grain for phase detection
– Too much simulation time
Strategy 2: 7 warm-up traces per simulation trace (30 million instructions)
Art (SPECFP2000): First pinpoint touches most of the working set
– Simulate all pinpoint traces in succession

