Profiling S3D on Cray XT3 using TAU Sameer Shende


1 Profiling S3D on Cray XT3 using TAU Sameer Shende tau-team@cs.uoregon.edu

2 Acknowledgements [footer: TAU Performance System / Profiling S3D Harness]
- Alan Morris [UO]
- Kevin Huck [UO]
- Allen D. Malony [UO]
- Kenneth Roche [ORNL]
- Bronis R. de Supinski [LLNL]

3 TAU Parallel Performance System
- http://www.cs.uoregon.edu/research/tau/
- Multi-level performance instrumentation
  - Multi-language automatic source instrumentation
- Flexible and configurable performance measurement
- Widely-ported parallel performance profiling system
  - Computer system architectures and operating systems
  - Different programming languages and compilers
- Support for multiple parallel programming paradigms
  - Multi-threading, message passing, mixed-mode, hybrid

4 TAU Performance System Architecture (diagram; surviving label: event selection)

5 TAU Performance System Architecture (diagram, continued)

6 Program Database Toolkit (PDT)
(Diagram; recoverable labels listed in flow order.)
- Input: Application / Library
- Front ends: C / C++ parser; Fortran parser (F77/90/95)
- Intermediate language (IL): C / C++ IL analyzer; Fortran IL analyzer
- Output: Program Database Files, accessed through DUCTAPE
- Tools built on DUCTAPE:
  - PDBhtml: Program documentation
  - SILOON: Application component glue
  - CHASM: C++ / F90/95 interoperability
  - TAU_instr: Automatic source instrumentation

7 PAPI
- Performance Application Programming Interface
- The purpose of the PAPI project is to design, standardize and implement a portable and efficient API to access the hardware performance monitor counters found on most modern microprocessors.
- Parallel Tools Consortium project
- Developed by University of Tennessee, Knoxville
- http://icl.cs.utk.edu/papi/

8 S3D - Building with TAU
- Change name of compiler in build/make.XT3
  - ftn => tau_f90.sh
  - cc => tau_cc.sh
- Set compile time environment variables
  - setenv TAU_MAKEFILE /spin/proj/perc/TOOLS/tau_latest/xt3/lib/Makefile.tau-callpath-multiplecounters-mpi-papi-pdt-pgi
    - Choose callpath, PAPI counters, MPI profiling, PDT for source instrumentation
  - setenv TAU_OPTIONS '-optTauSelectFile=select.tau -optPreProcess'
    - Selective instrumentation file eliminates instrumentation in lightweight routines
    - Pre-process Fortran source code using cpp before compiling
- Set runtime environment variables for instrumentation control and PAPI event counter selection in job submission script:
  - export TAU_THROTTLE=1
  - export COUNTER1=GET_TIME_OF_DAY
  - export COUNTER2=PAPI_FP_INS
  - export COUNTER3=PAPI_L1_DCM
  - export COUNTER4=PAPI_RES_STL
  - export COUNTER5=PAPI_L2_DCM
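The runtime settings on this slide can be generated rather than typed by hand. The following is a minimal Python sketch, not part of TAU: the helper names `tau_runtime_env` and `render_exports` are hypothetical, and the only values used are the variable and counter names listed on the slide. It emits POSIX-shell `export NAME=value` lines suitable for pasting into a job submission script.

```python
# Hypothetical helpers (not part of TAU): build the runtime environment
# named on the slide and render it as POSIX-shell export lines.
COUNTERS = [
    "GET_TIME_OF_DAY",  # wallclock time
    "PAPI_FP_INS",      # floating point instructions
    "PAPI_L1_DCM",      # level 1 data cache misses
    "PAPI_RES_STL",     # resource stalls
    "PAPI_L2_DCM",      # level 2 data cache misses
]


def tau_runtime_env(counters, throttle=True):
    """Return the variables as a dict, in the order they should be exported."""
    env = {}
    if throttle:
        env["TAU_THROTTLE"] = "1"
    for index, name in enumerate(counters, start=1):
        env[f"COUNTER{index}"] = name
    return env


def render_exports(env):
    """Format each variable as `export NAME=value`."""
    return "\n".join(f"export {key}={value}" for key, value in env.items())


print(render_exports(tau_runtime_env(COUNTERS)))
```

Keeping the counter list in one place makes it harder to skip a `COUNTERn` index when a metric is added or removed.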

9 Selective Instrumentation in TAU
% cat select.tau
    BEGIN_EXCLUDE_LIST
    MCADIF
    GETRATES
    TRANSPORT_M::MCAVIS_NEW
    MCEDIF
    MCACON
    CKYTCP
    THERMCHEM_M::MIXCP
    THERMCHEM_M::MIXENTH
    THERMCHEM_M::GIBBSENRG_ALL_DIMT
    CKRHOY
    MCEVAL4
    THERMCHEM_M::HIS
    THERMCHEM_M::CPS
    THERMCHEM_M::ENTROPY
    END_EXCLUDE_LIST
    BEGIN_INSTRUMENT_SECTION
    loops routine="#"
    END_INSTRUMENT_SECTION
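A file with this layout can be assembled from a list of lightweight routines. The sketch below is illustrative only: `make_select_file` is a hypothetical helper, and it reproduces just the two sections shown on the slide (an exclude list, plus a `loops` directive whose `#` pattern is assumed here to match every routine).

```python
def make_select_file(excluded_routines, loop_routine_pattern="#"):
    """Return the text of a selective instrumentation file.

    excluded_routines: routine names to leave uninstrumented.
    loop_routine_pattern: pattern for routines whose loops are instrumented.
    """
    lines = ["BEGIN_EXCLUDE_LIST"]
    lines.extend(excluded_routines)
    lines.append("END_EXCLUDE_LIST")
    lines.append("BEGIN_INSTRUMENT_SECTION")
    lines.append(f'loops routine="{loop_routine_pattern}"')
    lines.append("END_INSTRUMENT_SECTION")
    return "\n".join(lines) + "\n"


# Two of the routines excluded on the slide, used here as sample input.
print(make_select_file(["MCADIF", "THERMCHEM_M::MIXCP"]), end="")
```

Writing the returned text to `select.tau` gives a file in the format that `-optTauSelectFile=select.tau` points at.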

10 TAU’s ParaProf Profile Browser - Manager
Derived Metrics: Flops = PAPI_FP_INS / wallclock time
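The derived metric is a plain ratio of two measured counters. As a worked sketch with made-up numbers, and assuming the wallclock metric is reported in microseconds (in which case instructions per microsecond equals MFLOPS), the calculation looks like this. The function name `mflops` is hypothetical.

```python
def mflops(fp_instructions, wallclock_usec):
    """Floating point instructions per microsecond, i.e. MFLOPS.

    Assumes wallclock_usec is in microseconds; adjust if the time
    metric uses another unit.
    """
    if wallclock_usec <= 0:
        raise ValueError("wallclock time must be positive")
    return fp_instructions / wallclock_usec


# Made-up example: 2.5e9 FP instructions in 5 seconds (5e6 microseconds).
print(mflops(2.5e9, 5.0e6))  # 500.0
```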

11 Main Window - 8 cpus (MPI Ranks 0-7)
Some routines execute on different sets of processors

12 Mean Profile Over 8 cpus -- Exclusive Time

13 Mean Percentage -- Exclusive Time
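The mean and mean-percentage views summarize per-rank exclusive times. The sketch below shows one way such a summary could be computed; it is not ParaProf's implementation. The routine names and timings are invented, and treating a routine that is absent on a rank as zero time is an assumption of this example.

```python
from statistics import mean


def mean_exclusive(per_rank):
    """Average exclusive time per routine across ranks.

    per_rank: list with one dict per rank, mapping routine -> exclusive time.
    A routine missing on a rank counts as 0.0 (an assumption of this sketch).
    """
    routines = set().union(*per_rank)
    return {r: mean(p.get(r, 0.0) for p in per_rank) for r in sorted(routines)}


def percent_of_total(means):
    """Share of the summed mean exclusive time, in percent."""
    total = sum(means.values())
    if total == 0:
        return {}
    return {r: 100.0 * t / total for r, t in means.items()}


# Invented timings for two ranks.
ranks = [
    {"routine_a": 60.0, "routine_b": 20.0},
    {"routine_a": 90.0, "routine_b": 30.0},
]
means = mean_exclusive(ranks)
print(means)
print(percent_of_total(means))
```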

14 Loop Level Profile With PAPI Counter Data

15 ParaProf’s Source Browser

16 Exclusive MFLOPS

17 FP Instructions per L1 Data Cache Miss (rank 0)
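This metric divides one PAPI counter by another for each instrumented region. A minimal sketch follows, using invented loop names and counts; `fp_per_l1_miss` is a hypothetical helper. Regions are listed from lowest to highest ratio, on the reasoning that few floating point instructions per miss may point to poor cache reuse.

```python
def fp_per_l1_miss(counters):
    """Return (region, ratio) pairs sorted by ascending ratio.

    counters: dict mapping region name -> dict with the keys
    "PAPI_FP_INS" and "PAPI_L1_DCM". Regions with zero misses are
    skipped because the ratio is undefined for them.
    """
    ratios = {}
    for region, values in counters.items():
        misses = values["PAPI_L1_DCM"]
        if misses > 0:
            ratios[region] = values["PAPI_FP_INS"] / misses
    return sorted(ratios.items(), key=lambda item: item[1])


# Invented counts for three loops on one rank.
sample = {
    "loop_a": {"PAPI_FP_INS": 1000, "PAPI_L1_DCM": 100},
    "loop_b": {"PAPI_FP_INS": 1000, "PAPI_L1_DCM": 10},
    "loop_c": {"PAPI_FP_INS": 5, "PAPI_L1_DCM": 0},
}
for region, ratio in fp_per_l1_miss(sample):
    print(region, ratio)
```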

18 Level 1 Data Cache Misses

19 Callpath Profiles

20 Callpath Profiles: Flops, Resource Stalls

21 Callpath Thread Relations Window (figure labels: parent, routine, children)
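The parent/children relation shown in this window can be derived from callpath event names. The sketch below assumes callpath events are named with `=>` between frames (for example `main => solve => rhs`); the routine names are invented and `relations` is a hypothetical helper, not a ParaProf function.

```python
def relations(callpath_names, routine):
    """Return (parents, children) of `routine`, each as a sorted list.

    callpath_names: iterable of event names such as "a => b => c".
    """
    parents, children = set(), set()
    for name in callpath_names:
        frames = [frame.strip() for frame in name.split("=>")]
        # Walk adjacent caller/callee pairs along the path.
        for caller, callee in zip(frames, frames[1:]):
            if callee == routine:
                parents.add(caller)
            if caller == routine:
                children.add(callee)
    return sorted(parents), sorted(children)


# Invented callpaths.
paths = [
    "main => solve => rhs",
    "main => solve => MPI_Wait",
    "main => init",
]
print(relations(paths, "solve"))
```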

22 Flat Profile

23 TAU’s ParaProf Profile Browser - Manager
Different sections of code within the same routine execute on odd and even processors!
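An odd/even split like this can be spotted programmatically by checking which ranks recorded each event. The following sketch uses invented event names and a four-rank layout; `ranks_by_event` and `partial_events` are hypothetical helpers that operate on data already extracted from the profiles.

```python
def ranks_by_event(profiles):
    """Map each event name to the sorted list of ranks that recorded it.

    profiles: dict mapping rank -> iterable of event names seen on that rank.
    """
    seen = {}
    for rank, events in profiles.items():
        for event in set(events):
            seen.setdefault(event, []).append(rank)
    return {event: sorted(ranks) for event, ranks in seen.items()}


def partial_events(profiles):
    """Events that were recorded on only a subset of the ranks."""
    all_ranks = sorted(profiles)
    return {
        event: ranks
        for event, ranks in ranks_by_event(profiles).items()
        if ranks != all_ranks
    }


# Invented data: two loops in one routine, taken on alternating ranks.
profiles = {
    0: ["common", "loop_even"],
    1: ["common", "loop_odd"],
    2: ["common", "loop_even"],
    3: ["common", "loop_odd"],
}
print(partial_events(profiles))
```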

24 3D Window: Rank, Routine, Time, Instructions

25 3D Window: Variations in FP/L1 DCM ratios

26 Getting Access to TAU on Jaguar
- set path=(/spin/proj/perc/TOOLS/tau_latest/x86_64/bin $path)
- Choose Stub Makefiles (TAU_MAKEFILE env. var.) from /spin/proj/perc/TOOLS/tau_latest/xt3/lib/Makefile.*
  - Makefile.tau-mpi-pdt-pgi (flat profile)
  - Makefile.tau-mpi-pdt-pgi-trace (event trace, for use with Vampir)
  - Makefile.tau-callpath-mpi-pdt-pgi (single metric, callpath profile)
- Binaries of S3D can be found in ~sameer/scratch/S3D-BINARIES
  - withtau
    - papi, multiplecounters, mpi, pdt, pgi options
  - without_tau
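Each stub makefile name encodes the options it was configured with, so a suitable one can be picked by filtering on those options. The sketch below is a hypothetical helper working on a plain list of file names (the four names that appear in these slides); it assumes only that options follow `Makefile.tau-` and are separated by hyphens.

```python
PREFIX = "Makefile.tau-"


def options_of(makefile_name):
    """Return the set of option tokens encoded in a stub makefile name."""
    if not makefile_name.startswith(PREFIX):
        return frozenset()
    return frozenset(makefile_name[len(PREFIX):].split("-"))


def matching_makefiles(names, required, forbidden=()):
    """Names whose options include all of `required` and none of `forbidden`."""
    required, forbidden = set(required), set(forbidden)
    return [
        name
        for name in names
        if required <= options_of(name) and not (forbidden & options_of(name))
    ]


# The stub makefile names mentioned in the slides.
names = [
    "Makefile.tau-mpi-pdt-pgi",
    "Makefile.tau-mpi-pdt-pgi-trace",
    "Makefile.tau-callpath-mpi-pdt-pgi",
    "Makefile.tau-callpath-multiplecounters-mpi-papi-pdt-pgi",
]
print(matching_makefiles(names, {"callpath", "mpi"}, forbidden={"papi"}))
```

On a real system the `names` list would come from listing the `lib` directory named on the slide; the chosen name is then appended to that directory path and assigned to `TAU_MAKEFILE`.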

