Advanced TAU Commander


Advanced TAU Commander ParaTools, Inc. 28 September 2017 Webex from Baltimore, MD

Build Systems and Launchers
Copyright © ParaTools, Inc.

Autotools
Initialize the project before running configure:
    $ tau initialize
If the project is already initialized, make sure you don't have an "expensive" experiment selected, e.g. tracing, or profiling with lots of options.
    $ ./configure CC="tau gcc"
Passing --disable-dependency-tracking is recommended to avoid problems with source-based instrumentation. It is not needed if you are only sampling.
    $ make && make install
If you change your experiment you do not have to reconfigure, just clean and recompile:
    $ make clean && make

CMake
This should work:
    $ cmake -DCMAKE_C_COMPILER="tau gcc"
If it doesn't, use the wrapper scripts:
    $ export PATH=$PWD/.tau/bin/<target_name>:$PATH
    $ cmake -DCMAKE_C_COMPILER="tau_gcc"
Wrapper scripts are generated automatically for all compilers supported by the target. The wrapper for <compiler> is "tau_<compiler>", e.g. tau_gcc, tau_mpicc, tau_oshcc, etc. Wrappers can be used with any build system that doesn't like spaces in the compiler name.
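The wrapper naming rule on this slide can be sketched as a pair of shell helpers. This is only an illustration: "my_target" is a placeholder for <target_name>, the helper names are made up for this sketch, and tau_g++ exists only if the target actually supports g++. The script prints the commands rather than running them.

```shell
#!/bin/sh
# Name of the wrapper script for a compiler, following the tau_<compiler> rule.
tau_wrapper_name() {
  printf 'tau_%s\n' "$1"
}

# Directory that holds the generated wrappers for a target.
# $1 = target name, $2 = project root (defaults to the current directory).
tau_wrapper_dir() {
  printf '%s/.tau/bin/%s\n' "${2:-$PWD}" "$1"
}

target=my_target   # placeholder, substitute your own target name

# Print (do not run) the commands; note that PATH is extended, not replaced.
echo "export PATH=$(tau_wrapper_dir "$target"):\$PATH"
echo "cmake -DCMAKE_C_COMPILER=$(tau_wrapper_name gcc) -DCMAKE_CXX_COMPILER=$(tau_wrapper_name g++)"
```

Because the wrapper names contain no spaces, they can be passed to CMAKE_C_COMPILER and CMAKE_CXX_COMPILER unquoted.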

Running with custom launchers
    $ tau trial create \
        --launcher mylauncher -np 4 -- \
        ./a.out bar baz
Use the --launcher flag to indicate the launcher command and its arguments. Use "--" to mark the beginning of the application command line.
    $ tau mpirun -np 4 ./a.out 20
is shorthand for:
    $ tau trial create --launcher mpirun -np 4 -- ./a.out 20
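To make the role of "--" concrete, here is a small shell sketch that splits an argument list at the first "--" into a launcher part and an application part. It is not TAU Commander's own parser, only an illustration of the convention; the function name split_at_dashdash is hypothetical.

```shell
#!/bin/sh
# Split arguments at the first "--": everything before it is the launcher
# command line, everything after it is the application command line.
split_at_dashdash() {
  launcher=""
  app=""
  seen=0
  for arg in "$@"; do
    if [ "$seen" -eq 0 ] && [ "$arg" = "--" ]; then
      seen=1
      continue
    fi
    if [ "$seen" -eq 0 ]; then
      launcher="${launcher:+$launcher }$arg"
    else
      app="${app:+$app }$arg"
    fi
  done
  printf 'launcher: %s\napplication: %s\n' "$launcher" "$app"
}

split_at_dashdash mylauncher -np 4 -- ./a.out bar baz
# prints:
#   launcher: mylauncher -np 4
#   application: ./a.out bar baz
```

Without the separator there would be no reliable way to tell where the arguments of an arbitrary launcher end and the application begins.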

Profiling Parallel Applications

Step 1: Initialize TAU Project
    $ cp -R /path/to/taucmdr-1.2.0/examples $HOME
    $ cd $HOME/examples/matmult_omp
    $ ls
    Makefile  matmult.f90
    $ tau initialize --mpi --openmp
This creates a new project configuration using defaults. Project files live in a directory named ".tau". Like git, all directories below the directory containing the ".tau" directory can access the project, e.g. `tau dashboard` works in miniapp1/baseline.
WARNING: Don't execute tau initialize in $HOME! (This bug is fixed in version 1.2.0.4.)
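The git-like lookup described on this slide can be pictured as a walk from the current directory up toward the filesystem root until a ".tau" directory is found. The sketch below illustrates that idea only; it is an assumption about the behavior, not TAU Commander's actual code, and find_tau_project is a hypothetical name.

```shell
#!/bin/sh
# Print the nearest ancestor directory (starting at $1, default ".") that
# contains a ".tau" directory. Returns non-zero if none is found.
find_tau_project() {
  dir=$(cd "${1:-.}" && pwd) || return 1
  while :; do
    if [ -d "$dir/.tau" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
    # Stop at the filesystem root.
    [ "$dir" = "/" ] && return 1
    dir=$(dirname "$dir")
  done
}

find_tau_project || echo "no TAU project found above $PWD"
```

A lookup of this kind would also explain the warning about $HOME: a ".tau" directory there becomes visible from every directory beneath it.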

matmult_omp Dashboard

Edit matmult_omp/Makefile
Before:
    F90 = mpifort
    FFLAGS = -O -g
    LIBS= -fopenmp
After:
    F90 = tau mpif90
    FFLAGS = -O -g
    LIBS= -fopenmp
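The same edit can be scripted. The sketch below prefixes whatever compiler F90 names with "tau " and prints the result to standard output instead of editing in place. It assumes the Makefile sets F90 on a single line as shown above; wrap_f90_with_tau is a hypothetical helper name. Unlike the slide, it keeps the original compiler name (mpifort) rather than switching to mpif90.

```shell
#!/bin/sh
# Print a copy of Makefile $1 with the F90 compiler prefixed by "tau ".
# Lines where F90 already starts with "tau " are left untouched.
wrap_f90_with_tau() {
  sed '/^F90[[:space:]]*=[[:space:]]*tau /!s/^F90[[:space:]]*=[[:space:]]*/F90 = tau /' "$1"
}

# Demonstration on a throwaway copy of the "Before" Makefile.
demo=$(mktemp)
printf 'F90 = mpifort\nFFLAGS = -O -g\nLIBS= -fopenmp\n' > "$demo"
wrap_f90_with_tau "$demo"
rm -f "$demo"
# prints:
#   F90 = tau mpifort
#   FFLAGS = -O -g
#   LIBS= -fopenmp
```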

Build matmult_omp

Run matmult_omp

Node for each MPI process

Open Windows | 3D Visualization

Event-based Sampling Data from Rank 0

Event-based Sampling Data from Ranks 1 and 2

Node for each MPI process
But hey, where are the threads?

Threads Not Instrumented by Default
To keep overhead low, OpenMP directives are not instrumented by default. Create a new measurement (or edit an existing one) to enable thread-level instrumentation.

From `tau measurement edit --help`

Rebuild to instrument OpenMP with OMPT
    $ tau measurement copy profile profile.ompt \
        --openmp=ompt
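The slide shows only the copy step. A plausible full sequence, pieced together from commands used elsewhere in this deck, is sketched below. Selecting the new measurement with `tau select profile.ompt` is an assumption by analogy with `tau select trace`; the binary name ./matmult and the rank count are taken from other slides. By default the script only prints each step, so it can be read without TAU Commander installed.

```shell
#!/bin/sh
# Print each step by default; set DRY_RUN=0 to execute (requires TAU Commander).
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

rebuild_with_ompt() {
  # Copy the "profile" measurement and turn on OMPT-based OpenMP measurement.
  run tau measurement copy profile profile.ompt --openmp=ompt
  # Assumed step: make the new measurement the selected one.
  run tau select profile.ompt
  # Recompile so the new measurement settings take effect.
  run make clean
  run make
  run tau mpirun -np 4 ./matmult
}

rebuild_with_ompt
```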

Tracing Parallel Applications

Measurement Approaches
Profiling: shows how much time was spent in each routine.
Tracing: shows when events take place on a timeline.
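The relationship between the two approaches can be shown on toy data: a trace is a time-ordered list of events, and a profile is what remains after summing the time per routine. The three-column event format below (timestamp, routine, enter/exit) is invented for this sketch and is not TAU's trace format; the sketch also assumes routines are not called recursively.

```shell
#!/bin/sh
# Collapse a toy trace (timestamp routine enter|exit) into a profile of
# inclusive time per routine. Reads the trace on standard input.
trace_to_profile() {
  awk '
    $3 == "enter" { start[$2] = $1 }                 # remember when the routine began
    $3 == "exit"  { total[$2] += $1 - start[$2] }    # add the elapsed time
    END { for (r in total) printf "%s %d\n", r, total[r] }
  ' | sort
}

printf '%s\n' \
  '0 main enter' \
  '2 compute enter' \
  '7 compute exit' \
  '8 send enter' \
  '9 send exit' \
  '10 main exit' | trace_to_profile
# prints:
#   compute 5
#   main 10
#   send 1
```

The profile is far smaller than the trace, but the ordering of events (for example, that send ran after compute) is lost, which is the information a timeline view relies on.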

Different Nodes, Different Timelines

View Time Lost Waiting for Send or Receive

Select the "trace" Measurement to Trace
    $ tau select trace
    $ tau mpirun -np 16 ./matmult

`tau show` Displays the Trace in Vampir

Profiling Heap Memory

Measure Heap Memory Usage
From `tau measurement edit --help`
    $ tau measurement edit sample --heap-usage
    $ tau select sample
    $ make clean
    $ make
    $ tau mpirun -np 4 ./matmult

Open the Context Event Window to See Heap Memory Usage
Right-click a node label to get this menu.

Heap Memory Usage on MPI Rank 1

Profiling CUDA / OpenCL

`tau init --cuda`

Run with `tau` as usual

GPUs are shown as "Threads"

Open the GPU "Thread" to see kernel time

Non-GPU threads show CUDA calls

Compiler-based Instrumentation

OpenCL
OpenCL is pretty much the same:
    $ tau init --opencl
    $ tau gcc *.c
    $ tau ./a.out