
Allen D. Malony, Sameer Shende Department of Computer and Information Science Computational Science Institute University.


1 Performance Engineering Technology for Complex Scientific Component Software
Allen D. Malony, Sameer Shende ({malony,sameer}@cs.uoregon.edu)
Department of Computer and Information Science, Computational Science Institute, University of Oregon

2 Outline (Pasadena CCA Meeting, Jan. 16, 2003)
- Overview of the TAU project
- Performance Engineered Component Software
- CCA Performance Observation Component
  - CCAFFEINE (Classic C++)
  - SIDL
- Applications (SC’02 Demos):
  - Optimizer Component [Craig Rasmussen, Matt Sottile]
  - Combustion Component [Jaideep Ray]
- Concluding remarks

3 TAU Performance System Framework
- Tuning and Analysis Utilities
- Performance system framework for scalable parallel and distributed high-performance computing
- Targets a general complex system computation model
  - nodes / contexts / threads
  - Multi-level: system / software / parallelism
  - Measurement and analysis abstraction
- Integrated toolkit for performance instrumentation, measurement, analysis, and visualization
  - Portable, configurable performance profiling/tracing facility
  - Open software approach
- University of Oregon, LANL, FZJ Germany
- http://www.cs.uoregon.edu/research/paracomp/tau

4 General Complex System Computation Model
- Node: physically distinct shared memory machine
  - Message passing node interconnection network
- Context: distinct virtual memory space within node
- Thread: execution threads (user/system) in context
[Figure: physical view (SMP nodes, node memory, interconnection network, inter-node message communication) beside model view (node, context with VM space, threads)]
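
The node / context / thread model above can be pictured as a three-part key for performance data. The following is only a sketch of that idea; `ExecutionId` and `ProfileTable` are hypothetical names invented for illustration and are not part of the TAU API.

```cpp
#include <map>
#include <string>
#include <tuple>

// Hypothetical identifier for one thread of execution in the
// node / context / thread model (illustrative, not TAU's own type).
struct ExecutionId {
    int node;     // physically distinct shared-memory machine
    int context;  // distinct virtual memory space within the node
    int thread;   // thread of execution within the context

    bool operator<(const ExecutionId& other) const {
        return std::tie(node, context, thread) <
               std::tie(other.node, other.context, other.thread);
    }
};

// Accumulated time in seconds, stored per thread and per timer name.
class ProfileTable {
public:
    void add(const ExecutionId& id, const std::string& timer, double seconds) {
        table_[id][timer] += seconds;
    }

    // Sum one timer over every context and thread on a given node.
    double nodeTotal(int node, const std::string& timer) const {
        double total = 0.0;
        for (const auto& entry : table_) {
            if (entry.first.node != node) continue;
            auto it = entry.second.find(timer);
            if (it != entry.second.end()) total += it->second;
        }
        return total;
    }

private:
    std::map<ExecutionId, std::map<std::string, double>> table_;
};
```

Keeping the three levels in the key lets the same data be aggregated per thread, per context, or per node without changing how it is recorded.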

5 TAU Performance System Architecture
[Figure: architecture diagram; surviving labels: EPILOG, Paraver]

6 TAU Status
- Instrumentation supported:
  - Source, preprocessor, compiler, MPI, runtime, virtual machine
- Languages supported:
  - C++, C, F90, Java, Python
  - HPF, ZPL, HPC++, pC++...
- Packages supported:
  - PAPI [UTK], PCL [FZJ] (hardware performance counter access)
  - Opari, PDT [UO, LANL, FZJ], DyninstAPI [U. Maryland] (instrumentation)
  - EXPERT, EPILOG [FZJ], Vampir [Pallas], Paraver [CEPBA] (visualization)
- Platforms supported:
  - IBM SP, SGI Origin, Sun, HP Superdome, HP-Compaq ES
  - Linux clusters (IA-32, IA-64, PowerPC, Alpha), Apple OS X, Windows
  - Hitachi SR8000, NEC SX, Cray T3E...
- Compiler suites supported:
  - GNU, Intel KAI (KCC, KAP/Pro), Intel, SGI, IBM, Compaq, HP, Fujitsu, Hitachi, Sun, Apple, Microsoft, NEC, Cray, PGI, Absoft, …
- Thread libraries supported:
  - Pthreads, SGI sproc, OpenMP, Windows, Java, SMARTS

7 Program Database Toolkit
[Figure: PDT tool flow. Application / Library; C / C++ parser and Fortran 77/90 parser; IL; C / C++ IL analyzer and Fortran 77/90 IL analyzer; Program Database Files; DUCTAPE. Tools built on DUCTAPE:]
- PDBhtml: program documentation
- SILOON: application component glue
- CHASM: C++ / F90 interoperability
- TAU_instr: automatic source instrumentation

8 Program Database Toolkit (PDT)
- Program code analysis framework for developing source-based tools for C99, C++ and F90
- High-level interface to source code information
- Widely portable:
  - IBM, SGI, Compaq, HP, Sun, Linux clusters, Windows, Apple, Hitachi, Cray T3E...
- Integrated toolkit for source code parsing, database creation, and database query
  - commercial grade front end parsers (EDG for C99/C++, Mutek for F90)
  - Intel/KAI C++ headers for std. C++ library distributed with PDT
  - portable IL analyzer, database format, and access API
  - open software approach for tool development
- Target and integrate multiple source languages
- Used in CCA for automated generation of SIDL
- Used in TAU to build automated performance instrumentation tools (tau_instrumentor)
- Used in CHASM, XMLGEN, Component method signature extraction, …

9 Performance Database Framework
[Figure: framework diagram; labels: raw performance data, PerfDML translators, PerfDML data description (XML profile data representation), ORDB (PostgreSQL) holding a multiple experiment performance database, performance analysis and query toolkit, performance analysis programs]

10 TAU’s Runtime Monitor
- TAU uses SCIRun [U. Utah] for visualization of performance data (online/offline)

11 Performance-Engineered Component Software
- Intra- and inter-component performance engineering
- Four general parts:
  - Performance observation
    - integrated measurement and analysis
  - Performance query and monitoring
    - runtime access to performance information
  - Performance control
    - mechanisms to alter performance observation
  - Performance knowledge
    - characterization and modeling
- Consistent with component architecture / implementation

12 Main Idea: Extend Component Design
- Extend the programming and execution environment to be performance observable and performance aware
[Figure: extended component design; labels: Component Core (variants), component ports, Performance Observation (measurement, analysis), performance observation ports, Performance Knowledge (empirical, analytical), performance knowledge ports, Component Performance Repository, repository service ports]

13 Performance Observation and Component
- Performance measurement integration in component form
  - Functional extension of original component design
  - Include new component methods and ports for other components to access measured performance data
  - Allow original component to access performance data
    - Encapsulate as a tightly coupled, co-resident performance observation object
    - POC “provides” port allows use of optimized interfaces to access “internal” performance observations
[Figure: Component Core (variants) with component ports, extended by Performance Observation (measurement, analysis) with performance observation ports]

14 Performance Knowledge
- Describe and store “known” component performance
  - Benchmark characterizations in performance database
  - Empirical or analytical performance models
- Saved information about component performance
  - Use for performance-guided selection and deployment
  - Use for runtime adaptation
- Representation must be in common forms with standard means for accessing the performance information
  - Compatible with component architecture

15 Component Performance Repository
- Performance knowledge storage
  - Implement in component architecture framework
  - Similar to CCA component repository
  - Access by component infrastructure
- View performance knowledge as component (PKC)
  - PKC ports give access to performance knowledge
    - to other components, back to original component
  - Static/dynamic component control and composition
  - Component composition performance knowledge
[Figure: Performance Knowledge (empirical, analytical) with performance knowledge ports; Component Performance Repository with repository service ports]

16 Component Composition Performance
- Performance of component-based scientific applications depends on interplay of component functions and the computational resources available
- Management of component compositions throughout execution is critical to successful deployment and use
- Identify key technological capabilities needed to support the performance engineering of component compositions
- Two model concepts
  - performance awareness
  - performance attention

17 Performance Awareness of Component Ensembles
- Composition performance knowledge and observation
- Composition performance knowledge
  - Can come from empirical and analytical evaluation
  - Can utilize information provided at the component level
  - Can be stored in repositories for future review
- Extends the notion of component observation to ensemble-level performance monitoring
  - Associate monitoring components to component grouping
  - Build upon component-level observation support
  - Performance integrators and routers
  - Use component framework mechanisms

18 Performance Engineering Support in CCA
- Define a standard observation component interface for:
  - Performance measurement
  - Performance data query
  - Performance control (enable/disable)
- Implement performance interfaces for use in CCA
  - TAU performance system
  - CCA component frameworks (CCAFFEINE, SIDL/Babel)
- Demonstrations
  - Optimizing component
    - picks from a set of equivalent CCA port implementations
  - Flame reaction-diffusion application

19 CCA Performance Observation Component
- Design measurement port and measurement interfaces
  - Timer
    - start/stop
    - set name/type/group
  - Control
    - enable/disable groups
  - Query
    - get timer names
    - metrics, counters, dump to disk
  - Event
    - user-defined events
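
The slides give declarations only for the Measurement port and the Timer interface, so the shape of Control and Event is not shown. As a rough sketch of what "enable/disable groups" and "user-defined events" could look like, the following uses hypothetical method names (`enableGroup`, `disableGroup`, `isEnabled`, `trigger`) that are assumptions for illustration, not the actual CCA or TAU declarations.

```cpp
#include <algorithm>
#include <limits>
#include <set>
#include <string>

namespace sketch {

// Hypothetical Control interface: switch instrumentation on or off by group.
class Control {
public:
    virtual ~Control() {}
    virtual void enableGroup(const std::string& group) = 0;
    virtual void disableGroup(const std::string& group) = 0;
    virtual bool isEnabled(const std::string& group) const = 0;
};

// In this sketch every group is enabled unless it was explicitly disabled.
class SetControl : public Control {
public:
    void enableGroup(const std::string& group) override { disabled_.erase(group); }
    void disableGroup(const std::string& group) override { disabled_.insert(group); }
    bool isEnabled(const std::string& group) const override {
        return disabled_.count(group) == 0;
    }

private:
    std::set<std::string> disabled_;
};

// Hypothetical user-defined event: the application reports values
// (for example message sizes) and the event keeps summary statistics.
class Event {
public:
    explicit Event(const std::string& name) : name_(name) {}

    void trigger(double value) {
        ++count_;
        sum_ += value;
        min_ = std::min(min_, value);
        max_ = std::max(max_, value);
    }

    const std::string& name() const { return name_; }
    unsigned long count() const { return count_; }
    double mean() const { return count_ ? sum_ / count_ : 0.0; }
    double minValue() const { return min_; }
    double maxValue() const { return max_; }

private:
    std::string name_;
    unsigned long count_ = 0;
    double sum_ = 0.0;
    double min_ = std::numeric_limits<double>::infinity();
    double max_ = -std::numeric_limits<double>::infinity();
};

}  // namespace sketch
```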

20 CCA C++ (CCAFFEINE) Performance Interface
    namespace performance {
    namespace ccaports {
      class Measurement : public virtual classic::gov::cca::Port {
      public:
        virtual ~Measurement() {}

        /* Create a Timer interface */
        virtual performance::Timer* createTimer(void) = 0;
        virtual performance::Timer* createTimer(string name) = 0;
        virtual performance::Timer* createTimer(string name, string type) = 0;
        virtual performance::Timer* createTimer(string name, string type, string group) = 0;

        /* Create a Query interface */
        virtual performance::Query* createQuery(void) = 0;

        /* Create a user-defined Event interface */
        virtual performance::Event* createEvent(void) = 0;
        virtual performance::Event* createEvent(string name) = 0;

        /* Create a Control interface for selectively enabling and disabling
         * the instrumentation based on groups */
        virtual performance::Control* createControl(void) = 0;
      };
    }  // namespace ccaports
    }  // namespace performance
[Slide callouts: Measurement port, Measurement interfaces]

21 CCA Timer Interface Declaration
    namespace performance {
      class Timer {
      public:
        virtual ~Timer() {}

        /* Implement methods in a derived class to provide functionality */

        /* Start and stop the Timer */
        virtual void start(void) = 0;
        virtual void stop(void) = 0;

        /* Set name and type for Timer */
        virtual void setName(string name) = 0;
        virtual string getName(void) = 0;
        virtual void setType(string name) = 0;
        virtual string getType(void) = 0;

        /* Set the group name and group type associated with the Timer */
        virtual void setGroupName(string name) = 0;
        virtual string getGroupName(void) = 0;
        virtual void setGroupId(unsigned long group) = 0;
        virtual unsigned long getGroupId(void) = 0;
      };
    }
[Slide callout: Timer interface methods]
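
The interface leaves all functionality to a derived class. As a sketch of what such a class might look like without TAU, the following backs an abbreviated copy of the interface with `std::chrono`. `ChronoTimer`, `elapsedSeconds()`, and `calls()` are hypothetical names; in the design presented here, measured data would be read through the Query interface rather than from the timer itself.

```cpp
#include <chrono>
#include <string>

namespace performance {

// Abbreviated copy of the Timer interface (type and group methods omitted).
class Timer {
public:
    virtual ~Timer() {}
    virtual void start(void) = 0;
    virtual void stop(void) = 0;
    virtual void setName(std::string name) = 0;
    virtual std::string getName(void) = 0;
};

}  // namespace performance

// Hypothetical implementation backed by std::chrono instead of TAU.
class ChronoTimer : public performance::Timer {
public:
    void start(void) override {
        if (running_) return;  // this simple sketch ignores nested starts
        running_ = true;
        begin_ = std::chrono::steady_clock::now();
    }

    void stop(void) override {
        if (!running_) return;  // stop without a matching start is a no-op
        total_ += std::chrono::steady_clock::now() - begin_;
        running_ = false;
        ++calls_;
    }

    void setName(std::string name) override { name_ = name; }
    std::string getName(void) override { return name_; }

    // The accessors below are not part of the interface above.
    double elapsedSeconds() const {
        return std::chrono::duration<double>(total_).count();
    }
    unsigned long calls() const { return calls_; }

private:
    std::string name_;
    bool running_ = false;
    unsigned long calls_ = 0;
    std::chrono::steady_clock::time_point begin_;
    std::chrono::steady_clock::duration total_ =
        std::chrono::steady_clock::duration::zero();
};
```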

22 Use of Observation Component in CCA Example
    #include "ports/Measurement_CCA.h"
    ...
    double MonteCarloIntegrator::integrate(double lowBound, double upBound, int count) {
      classic::gov::cca::Port * port;
      double sum = 0.0;

      // Get Measurement port
      port = frameworkServices->getPort ("MeasurementPort");
      if (port)
        measurement_m = dynamic_cast<performance::ccaports::Measurement *>(port);
      if (measurement_m == 0) {
        cerr << "Connected to something other than a Measurement port";
        return -1;
      }

      static performance::Timer* t = measurement_m->createTimer(string("IntegrateTimer"));
      t->start();
      // The loop header and the start of the next statement were lost in extraction.
      // The bound `count` and the member name `random_m` are assumed reconstructions.
      for (int i = 0; i < count; i++) {
        double x = random_m->getRandomNumber ();
        sum = sum + function_m->evaluate (x);
      }
      t->stop();
    }
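
The example pairs `t->start()` and `t->stop()` by hand. One possible way to keep the pair balanced when a routine has early returns is a small scope guard. This is a sketch, not something shown in the presentation: `ScopedTimer`, `RecordingTimer`, and `guardedDivide` are hypothetical, and `Timer` here is a minimal stand-in for the start/stop part of `performance::Timer`.

```cpp
#include <string>
#include <vector>

// Minimal stand-in for the start/stop part of performance::Timer.
class Timer {
public:
    virtual ~Timer() {}
    virtual void start() = 0;
    virtual void stop() = 0;
};

// Starts the timer on construction and stops it when the scope exits,
// which covers early returns and exceptions as well as normal exit.
class ScopedTimer {
public:
    explicit ScopedTimer(Timer* timer) : timer_(timer) {
        if (timer_) timer_->start();
    }
    ~ScopedTimer() {
        if (timer_) timer_->stop();
    }
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    Timer* timer_;  // may be null when no Measurement port is connected
};

// Test double that logs calls so the start/stop pairing can be inspected.
class RecordingTimer : public Timer {
public:
    void start() override { log.push_back("start"); }
    void stop() override { log.push_back("stop"); }
    std::vector<std::string> log;
};

// Returns early on bad input; the guard still stops the timer.
double guardedDivide(Timer* t, double a, double b) {
    ScopedTimer guard(t);
    if (b == 0.0) return -1;
    return a / b;
}
```

Accepting a null pointer lets a component run uninstrumented when the port is absent, rather than treating that as an error.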

23 Measurement Port Implementation
- Use of Measurement port (i.e., instrumentation)
  - independent of choice of measurement tool
  - independent of choice of measurement type
- TAU performance observability component
  - Implements the Measurement port
  - Implements Timer, Control, Query, Event
  - Port can be registered with the CCAFFEINE framework
- Components instrument to generic Measurement port
  - Runtime selection of TAU component during execution
  - TauMeasurement_CCA port implementation uses a specific TAU library for choice of measurement type
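
The point that instrumentation is independent of the measurement tool can be illustrated with two interchangeable implementations behind one abstract port. This is a sketch under stated assumptions: `NullMeasurement`, `CountingMeasurement`, `makeMeasurement`, and `calls()` are hypothetical, the interfaces are reduced to the minimum, and a real framework would connect the port through its own wiring rather than a factory function.

```cpp
#include <map>
#include <memory>
#include <string>

class Timer {
public:
    virtual ~Timer() {}
    virtual void start() = 0;
    virtual void stop() = 0;
};

// Reduced stand-in for the Measurement port.
class Measurement {
public:
    virtual ~Measurement() {}
    // The returned timer is owned by the Measurement object.
    virtual Timer* createTimer(const std::string& name) = 0;
    // Completed start/stop pairs for a timer (hypothetical query helper).
    virtual unsigned long calls(const std::string& name) const = 0;
};

// Implementation 1: measures nothing, for uninstrumented runs.
class NullMeasurement : public Measurement {
    struct NullTimer : Timer {
        void start() override {}
        void stop() override {}
    };
    NullTimer timer_;

public:
    Timer* createTimer(const std::string&) override { return &timer_; }
    unsigned long calls(const std::string&) const override { return 0; }
};

// Implementation 2: counts completed start/stop pairs per timer name.
class CountingMeasurement : public Measurement {
    struct CountingTimer : Timer {
        unsigned long count = 0;
        void start() override {}
        void stop() override { ++count; }
    };
    std::map<std::string, CountingTimer> timers_;

public:
    Timer* createTimer(const std::string& name) override { return &timers_[name]; }
    unsigned long calls(const std::string& name) const override {
        auto it = timers_.find(name);
        return it == timers_.end() ? 0 : it->second.count;
    }
};

// Runtime selection of the implementation by name.
std::unique_ptr<Measurement> makeMeasurement(const std::string& kind) {
    if (kind == "counting")
        return std::unique_ptr<Measurement>(new CountingMeasurement);
    return std::unique_ptr<Measurement>(new NullMeasurement);
}

// Component code sees only the abstract port.
double sumSquares(Measurement& port, int n) {
    Timer* t = port.createTimer("sumSquares");
    t->start();
    double sum = 0.0;
    for (int i = 1; i <= n; ++i) sum += static_cast<double>(i) * i;
    t->stop();
    return sum;
}
```

`sumSquares` is compiled once and produces the same result with either implementation; only the recorded measurements differ.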

24 What’s Going On Here?
[Figure: application components, performance component, TAU API, runtime TAU performance data, other API. Callouts:]
- Two instrumentation paths using TAU API
- Two query and control paths using TAU API
- Alternative implementations of performance component

25 SIDL Interface for Performance Component
    version performance 1.0;
    package performance {
      interface Timer {
        /* Start/stop the Timer */
        void start();
        void stop();

        /* Set/get the Timer name */
        void setName(in string name);
        string getName();

        /* Set/get Timer type information (e.g., signature of the routine) */
        void setType(in string name);
        string getType();

        /* Set/get the group name associated with the Timer */
        void setGroupName(in string name);
        string getGroupName();

        /* Set/get the group id associated with the Timer */
        void setGroupId(in long group);
        long getGroupId();
      }
      …

26 Simple Runtime Performance Optimization
- Components are “plug-and-play”
- One can choose from a set of equivalent port implementations based on performance measurements
- An outside agent can monitor and select an optimal working set of components
[Figure: component wiring diagram; components: Driver, MidpointIntegrator, MonteCarloIntegrator, RandomGenerator, NonlinearFunction, LinearFunction, PiFunction; ports: GoPort, IntegratorPort, FunctionPort, RandomGeneratorPort]
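
Choosing among equivalent implementations from measurements can be sketched as two steps: observe each candidate, then apply a selection policy. The policy below (fastest candidate whose error is within a tolerance) is an assumption made for illustration, as are the names `Observation`, `observe`, and `selectImplementation`; the presentation does not describe the Optimizer component's actual policy.

```cpp
#include <chrono>
#include <cmath>
#include <functional>
#include <limits>
#include <string>
#include <vector>

using Function = std::function<double(double)>;
using Integrator = std::function<double(const Function&, double, double, int)>;

// Midpoint rule with `count` equal subintervals: one candidate implementation.
double midpoint(const Function& f, double lo, double hi, int count) {
    const double h = (hi - lo) / count;
    double sum = 0.0;
    for (int i = 0; i < count; ++i) sum += f(lo + (i + 0.5) * h);
    return sum * h;
}

// What the "outside agent" learns about one candidate.
struct Observation {
    std::string name;
    double seconds;   // measured wall-clock time
    double absError;  // absolute error against a known reference value
};

// Run one candidate and record its time and error.
Observation observe(const std::string& name, const Integrator& integrate,
                    const Function& f, double lo, double hi, int count,
                    double reference) {
    auto begin = std::chrono::steady_clock::now();
    double value = integrate(f, lo, hi, count);
    std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - begin;
    return {name, elapsed.count(), std::fabs(value - reference)};
}

// Assumed policy: the fastest candidate that is accurate enough.
// Returns an empty string when no candidate meets the tolerance.
std::string selectImplementation(const std::vector<Observation>& observations,
                                 double tolerance) {
    std::string best;
    double bestSeconds = std::numeric_limits<double>::infinity();
    for (const auto& o : observations) {
        if (o.absError <= tolerance && o.seconds < bestSeconds) {
            best = o.name;
            bestSeconds = o.seconds;
        }
    }
    return best;
}
```

Separating observation from selection keeps the policy testable with recorded measurements, independent of timing noise on a particular machine.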

27 Component Optimizing Performance Results

28 Computational Facility for Reacting Flow Science
- Sandia National Laboratories
  - DOE SciDAC project (http://cfrfs.ca.sandia.gov)
  - Jaideep Ray
- Component-based simulation and analysis
  - Sandia’s CCAFFEINE framework
  - Toolkit components for assembling flame simulation
    - integrator, spatial discretizations, chemical/transport models
    - structured adaptive mesh, load-balancers, error-estimators
    - in-core, off-machine, data transfers for post-processing
  - Components are C++ and wrapped F77 and C code
- Kernel for 3D, adaptive mesh low Mach flame simulation

29 Simulation System Architecture
- Three partitions: 15-proc, 3-proc, 1-proc
- In-core, off-machine data transfer
  - MxN transfer component (CUMULVS, ORNL, Kohl)
[Figure: Simulation (15 proc): Driver, Combustion Components, MxN; 15x3 transfer; Post-processing (3 proc): Driver, Post-processing Subsystem, MxN; 3x1 transfer; Thumbnails, I/O: Disk I/O and thumbnail pictures]

30 Flame Reaction-Diffusion Demonstration CCAFFEINE

31 Meeting CCA Performance Engineering Goals?
- Language interoperability?
  - SIDL and Babel give access to all supported languages
  - TAU supports multi-language instrumentation
  - Component interface instrumentation automated with PDT
- Platform interoperability?
  - Implement observability component across platforms
  - TAU runs wherever CCA runs
- Execution model transparent?
  - TAU measurement support for multiple execution models
- Reuse with any CCA-compliant framework?
  - Demonstrated with SIDL/Babel, CCAFFEINE, SCIRun

32 Meeting CCA Performance Engineering Goals?
- Component performance knowledge?
  - Representation and performance repository work to do
  - Utilize effectively for deployment and steering
  - Build repository with TAU performance database
- Performance of component compositions?
  - Component-to-component performance
    - Per connection instrumentation and measurement
    - Utilize performance mapping support
  - Ensemble-wide performance monitoring
    - connect performance “producers” to “consumers”
    - component-style implementation

33 Concluding Remarks
- Complex component systems pose challenging performance analysis problems that require robust methodologies and tools
- New performance problems will arise
  - Instrumentation and measurement
  - Data analysis and presentation
  - Diagnosis and tuning
  - Performance modeling
- Performance engineered components
  - Performance knowledge, observation, query and control

34 Support Acknowledgement
- TAU and PDT support:
  - Department of Energy (DOE)
    - DOE 2000 ACTS contract
    - DOE MICS contract
    - DOE ASCI Level 3 (LANL, LLNL)
    - U. of Utah DOE ASCI Level 1 subcontract
  - DARPA
  - NSF National Young Investigator (NYI) award

