Initial Design of a Test Suite for Automatic Performance Analysis Tools Bernd Mohr Forschungszentrum Jülich John von Neumann - Institut für Computing Germany.


1 Initial Design of a Test Suite for (Automatic) Performance Analysis Tools
Bernd Mohr, Forschungszentrum Jülich, John von Neumann - Institut für Computing, Germany
Jesper Larsson Träff, NEC Europe Ltd., C&C Research Labs, Germany

2 © 2003 Forschungszentrum Jülich, NIC-ZAM [2]
IST Working Group APART (since 1999)
APART: Automatic Performance Analysis: Resources and Tools
Forum for scientists and vendors
About 20 partners in Europe and the U.S.
Current Automatic Performance Tools Projects
–Askalon
–Kappa-Pi
–KOJAK
–Paradyn
–Peridot

3 (Full, Associated, and Former) Members
European Research Centers and Universities
U.S. Research Centers and Universities
Vendors

4 APART Terminology
Performance Property
–Aspect of the performance behavior of an application
–E.g., communication dominated by waiting time
–Specified as a condition referring to performance data
–Quantified and normalized in terms of a behavior-independent metric (severity)
Performance Problem
–Performance property with “negative” implications
Performance Bottleneck
–Performance problem with the highest severity
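The relationship between the three terms can be sketched in C. This is only an illustration: the struct `property_t`, the function names, and the threshold used to decide when a property has "negative" implications are assumptions made for this sketch and are not taken from the APART definitions.

```c
#include <stddef.h>

/* Hypothetical record for one evaluated performance property. */
typedef struct {
    const char *name;     /* e.g. "late_sender" */
    int         holds;    /* nonzero if the condition on the performance data is true */
    double      severity; /* normalized, behavior-independent metric */
} property_t;

/* Assumption: a property counts as a problem when it holds and its
   severity exceeds a user-chosen threshold. */
int is_problem(const property_t *p, double threshold)
{
    return p->holds && p->severity > threshold;
}

/* The bottleneck is the problem with the highest severity.
   Returns NULL when no property qualifies as a problem. */
const property_t *find_bottleneck(const property_t *props, size_t n,
                                  double threshold)
{
    const property_t *best = NULL;
    size_t i;
    for (i = 0; i < n; i++) {
        if (!is_problem(&props[i], threshold))
            continue;
        if (best == NULL || props[i].severity > best->severity)
            best = &props[i];
    }
    return best;
}
```

A property that does not hold is never reported, however large its stored severity, so the bottleneck is always chosen among the properties that are actually problems.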

5 Example: Performance Property “Message in Wrong Order”
[Figure: time-line diagram with axes Location and Time; locations A, B, and C; SEND and RECV events; one interval marked “wait”]

6 The APART Test Suite (ATS)
Users rely on correct working of tools
⇒ Tools need to be especially well tested
⇒ Systematic approach needed
APART Test Suite
Common project inside APART group
–Every member needs this ⇒ minimize resources
–Ensures re-usability
–Will also allow evaluation / comparison of the different member projects
Main focus: automatic performance analysis tools
But also useful for “regular” performance tools
–http://www.fz-juelich.de/apart/ats/

7 Desired Functionality
Tests to verify that the semantics of the original program are not altered
Tests to see whether the recorded performance data is correct
Synthetic positive test cases for each known and defined performance property and combinations of them
Negative test cases which have no known performance problem
“Real world” size parallel applications and benchmarks
⇒ Can be partially based on existing validation suites ⇒ WWW
⇒ Probably needs to be tool specific
⇒ Collect available benchmarks and applications ⇒ WWW
⇒ Design and implementation of an ATS framework

8 Validation Suites and Kernel Benchmarks (I)
Validation
–MPI test / validation suites from Intel, IBM, ANL
MPI Benchmarks
–PARKBENCH (PARallel Kernels and BENCHmarks)
–PMB - Pallas MPI Benchmarks
–SKaMPI (Special Karlsruher MPI – Benchmark)

9 Kernel Benchmarks (II)
OpenMP Benchmarks
–EPCC OpenMP Microbenchmarks
–… research/openmpbench/openmp_index.html
Hybrid Benchmarks
–The Los Alamos MicroBenchmarks Suite (LAMB)
–MPI and multi-threading (Pthreads and OpenMP) programming models
–Based on SKaMPI and EPCC

10 “Real World” Applications and Benchmarks
The NAS Parallel Benchmarks (NPB)
The ASCI Purple and Blue Benchmark Codes
–… asci/purple/benchmarks/limited/code_list.html
–… asci_benchmarks/asci/asci_code_list.html
NCAR Benchmarks

11 Current Design of ATS Framework
DISTRIBUTION: df_same(), df_cyclic2(), df_block2(), df_linear(), df_peak(), df_cyclic3(), df_block3()
WORK: do_work()

12 The Distribution Module
Distribution specified by
–Distribution function
–Distribution parameters
All distribution functions have the same signature
double distr_func (int me, int size, double sf, distr_t* dd)
–me, size: member me of a group of size size
–sf: scaling factor
–dd: distribution parameter descriptor
Returns the value for me, calculated from me, size, and dd and scaled by sf
ATS provides a set of predefined distribution functions
Can easily be extended if needed
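The shared signature, and how a user-defined function could plug into it, can be sketched as follows. Only the signature itself and the name df_same come from the slides. The opaque definition of `distr_t`, the descriptor `val1_distr_t`, and the extension `df_ramp` with its `ramp_distr_t` are hypothetical and serve only to illustrate the extension mechanism.

```c
/* Assumption: distr_t is an opaque descriptor type; each distribution
   function casts it to its own parameter struct. */
typedef void distr_t;

/* Signature shared by all distribution functions (as given on the slide). */
typedef double (*distr_func_t)(int me, int size, double sf, distr_t *dd);

/* Hypothetical one-value descriptor for df_same. */
typedef struct { double val; } val1_distr_t;

/* Every member of the group gets the same value, scaled by sf. */
double df_same(int me, int size, double sf, distr_t *dd)
{
    const val1_distr_t *d = (const val1_distr_t *)dd;
    (void)me;
    (void)size;
    return d->val * sf;
}

/* Hypothetical user-defined extension: the value grows by a fixed
   step with each member number. */
typedef struct { double base; double step; } ramp_distr_t;

double df_ramp(int me, int size, double sf, distr_t *dd)
{
    const ramp_distr_t *d = (const ramp_distr_t *)dd;
    (void)size;
    return (d->base + me * d->step) * sf;
}
```

Because both functions match `distr_func_t`, a caller can store either one in a function pointer and pass it together with its descriptor, which is what makes adding a new distribution cheap.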

13 Predefined Distribution Functions
[Figure: one plot per predefined distribution function, labelled with its parameters]
same: val
block2, cyclic2, linear, peak: low, high
block3, cyclic3: low, med, high
(the figure also carries the label n)
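A possible reading of the two-value shapes is sketched below. The function names and the low/high fields appear in the slides, but the bodies are guesses from the names: block2 as "first half low, second half high", cyclic2 as "alternating", linear as "interpolated from low to high", and peak as "one member high, all others low". Treating the label n as the position of the peak, and the descriptor `val2_n_distr_t`, are assumptions of this sketch.

```c
typedef void distr_t;  /* assumption: opaque descriptor type */

typedef struct { double low; double high; } val2_distr_t;
/* Hypothetical descriptor for df_peak; n is the member that gets high. */
typedef struct { double low; double high; int n; } val2_n_distr_t;

/* First half of the group gets low, second half gets high. */
double df_block2(int me, int size, double sf, distr_t *dd)
{
    const val2_distr_t *d = (const val2_distr_t *)dd;
    return (me < size / 2 ? d->low : d->high) * sf;
}

/* Even members get low, odd members get high. */
double df_cyclic2(int me, int size, double sf, distr_t *dd)
{
    const val2_distr_t *d = (const val2_distr_t *)dd;
    (void)size;
    return (me % 2 == 0 ? d->low : d->high) * sf;
}

/* Values rise linearly from low (member 0) to high (member size-1). */
double df_linear(int me, int size, double sf, distr_t *dd)
{
    const val2_distr_t *d = (const val2_distr_t *)dd;
    if (size < 2)
        return d->low * sf;
    return (d->low + (d->high - d->low) * me / (size - 1)) * sf;
}

/* Member n gets high, every other member gets low. */
double df_peak(int me, int size, double sf, distr_t *dd)
{
    const val2_n_distr_t *d = (const val2_n_distr_t *)dd;
    (void)size;
    return (me == d->n ? d->high : d->low) * sf;
}
```

The three-value variants block3 and cyclic3 would follow the same pattern with an additional med field.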

14 Current Design of ATS Framework
DISTRIBUTION: df_same(), df_cyclic2(), df_block2(), df_linear(), df_peak(), df_cyclic3(), df_block3()
WORK: do_work()
MPI PROPERTIES
OpenMP PROPERTIES
OpenMP UTILS: par_do_omp_work()
MPI UTILS: par_do_mpi_work(), alloc_mpi_buf(), free_mpi_buf(), alloc_mpi_vbuf(), free_mpi_vbuf(), mpi_commpattern_sendrecv(), mpi_commpattern_shift()

15 Example: MPI Property Function late_sender
/* Each rank does the amount of work its distribution function assigns to it. */
void par_do_mpi_work(distr_func_t df, distr_t* dd, MPI_Comm c) {
  int me, sz;
  MPI_Comm_rank(c, &me);
  MPI_Comm_size(c, &sz);
  do_work(df(me, sz, 1.0, dd));
}

void late_sender(double bwork, double ework, int r, MPI_Comm c) {
  val2_distr_t dd;
  int i;
  mpi_buf_t* buf = alloc_mpi_buf(base_type, base_cnt);
  dd.low = bwork+ework;
  dd.high = bwork;
  for (i = 0; i
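The slide's code breaks off before the loop body, so the following is not a reconstruction of it. It is a small MPI-free sketch of why the assignments dd.low = bwork+ework and dd.high = bwork would produce a late sender, under two assumptions that the transcript does not confirm: work is distributed cyclically (even ranks get dd.low, odd ranks get dd.high), and each even rank sends to the next higher odd rank. The helper names `rank_work` and `expected_wait` are hypothetical.

```c
/* Work done by rank me before communicating.
   Assumption: cyclic assignment, even ranks get dd.low, odd ranks dd.high. */
double rank_work(int me, double bwork, double ework)
{
    double low  = bwork + ework;  /* from the slide: dd.low  = bwork+ework */
    double high = bwork;          /* from the slide: dd.high = bwork       */
    return (me % 2 == 0) ? low : high;
}

/* Idealized time an odd-numbered receiver spends blocked in its receive,
   ignoring message transfer cost.
   Assumption: the sender is the rank directly below the receiver. */
double expected_wait(int receiver, double bwork, double ework)
{
    int sender = receiver - 1;
    double w = rank_work(sender, bwork, ework)
             - rank_work(receiver, bwork, ework);
    return w > 0.0 ? w : 0.0;
}
```

Under these assumptions every receiver finishes its work ework earlier than its sender, so a tool analyzing the run should attribute roughly ework of waiting time per repetition to the receive operation.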

16 Currently Implemented Performance Property Functions
MPI Point-to-Point Communication Performance Properties
late_sender(basework, extrawork, rf, MPI_Comm);
late_receiver(basework, extrawork, rf, MPI_Comm);
MPI Collective Communication Performance Properties
imbalance_at_mpi_barrier(distr_func, distr_param, rf, MPI_Comm);
imbalance_at_mpi_alltoall(distr_func, distr_param, rf, MPI_Comm);
late_broadcast(basework, rootextrawork, root, rf, MPI_Comm);
late_scatter(basework, rootextrawork, root, rf, MPI_Comm);
late_scatterv(basework, rootextrawork, root, rf, MPI_Comm);
early_reduce(rootwork, baseextrawork, root, rf, MPI_Comm);
early_gather(rootwork, baseextrawork, root, rf, MPI_Comm);
early_gatherv(rootwork, baseextrawork, root, rf, MPI_Comm);
OpenMP Performance Properties
imbalance_in_parallel_region(distr_func, distr_param, rf);
imbalance_at_barrier(distr_func, distr_param, rf);
imbalance_in_loop(distr_func, distr_param, rf);

17 Current Design of ATS Framework
DISTRIBUTION: df_same(), df_cyclic2(), df_block2(), df_linear(), df_peak(), df_cyclic3(), df_block3()
WORK: do_work()
MPI PROPERTIES
OpenMP PROPERTIES
OpenMP UTILS: par_do_omp_work()
MPI UTILS: par_do_mpi_work(), alloc_mpi_buf(), free_mpi_buf(), alloc_mpi_vbuf(), free_mpi_vbuf(), mpi_commpattern_sendrecv(), mpi_commpattern_shift()
TEST PROGRAMS

18 Performance Property Test Programs
Single performance property testing
Programs can be generated automatically from the performance property function signature
–Generator based on Program Database Toolkit (PDT)
–http://www.cs.uoregon.edu/research/paracomp/pdtoolkit/
Property parameters become test program arguments
More extensive tests through scripting languages or an experiment management system (e.g., Zenturio)
–http://www.par.univie.ac.at/project/zenturio/
Composite performance property testing
Program containing multiple performance property functions
Complexity only limited by imagination
Currently: manually implemented

19 Example: Single Performance Property Test Program
#include <stdio.h>
#include <stdlib.h>
#include "mpi_pattern.h"

int main(int argc, char *argv[]) {
  /* Defaults: distribution "b2:0.5:1.0", one repetition. */
  distr_func_t df = atodf("b2:0.5:1.0");
  distr_t *dd = atodd("b2:0.5:1.0");
  int r = 1;

  MPI_Init(&argc, &argv);
  switch ( argc ) {
    case 3: r = atoi(argv[2]);    /* fall through */
    case 2: df = atodf(argv[1]);
            dd = atodd(argv[1]);  /* fall through */
    case 1: break;
    default: fprintf(stderr, "usage: %s \n", argv[0]);
             break;
  }
  imbalance_at_mpi_barrier(df, dd, r, MPI_COMM_WORLD);
  MPI_Finalize();
  return 0;
}
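The test program passes a specification string such as "b2:0.5:1.0" to atodf() and atodd(), whose implementations are not shown in the slides. The sketch below shows one way such a string could be split into a short name (b2 presumably selects the block2 distribution) and its numeric parameters. The type `distr_spec_t` and the function `parse_distr_spec` are hypothetical and are not part of ATS.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed form of a spec such as "b2:0.5:1.0". */
typedef struct {
    char   name[16];  /* distribution short name, e.g. "b2" */
    double val[3];    /* up to three numeric parameters */
    int    nval;      /* number of parameters parsed */
} distr_spec_t;

/* Parse "name:v1[:v2[:v3]]".
   Returns 0 on success and -1 on malformed input. */
int parse_distr_spec(const char *s, distr_spec_t *out)
{
    const char *colon = strchr(s, ':');
    size_t len;

    if (colon == NULL)
        return -1;                      /* no parameters at all */
    len = (size_t)(colon - s);
    if (len == 0 || len >= sizeof out->name)
        return -1;                      /* empty or overlong name */
    memcpy(out->name, s, len);
    out->name[len] = '\0';

    out->nval = 0;
    s = colon;
    while (*s == ':') {
        char *end;
        if (out->nval == 3)
            return -1;                  /* too many parameters */
        out->val[out->nval] = strtod(s + 1, &end);
        if (end == s + 1)
            return -1;                  /* nothing numeric after ':' */
        out->nval++;
        s = end;
    }
    return *s == '\0' ? 0 : -1;         /* reject trailing garbage */
}
```

A real implementation would then map the name to a distribution function (the job of atodf()) and copy the values into the matching descriptor (the job of atodd()).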

20 Example: Single Performance Property Test Program
[Figure: analysis results for two runs of the test program]
imbalance_at_mpi_barrier b2:0.5:1.0 2
imbalance_at_mpi_barrier b2:0.1:2.0 5
Problem: additional property “MPI Setup/Termination Overhead” also holds!

21 Example: Collection of MPI Performance Properties

22 Examples: Detail MPI Properties

23 Example: MPI Properties in 2 Communicators

24 EXPERT Analysis of MPI 2 Communicator Example

25 Example: OpenMP Performance Property

26 ATS: Status and Future Work
Initial prototype available from APART website
List of MPI, OpenMP, and hybrid validation and benchmark suites
1st version of ATS framework including
–C version of code
–Single property test program generator
Future Work
More complete collection of validation and benchmark suites
Real “real world” applications
ATS Framework
–Fortran version
–More complete list of property functions for MPI, OpenMP, hybrid, and sequential performance properties
–Documentation

