A Case Study of HPC Metrics Collection and Analysis. Philip Johnson and Michael Paulding, University of Hawaii, Honolulu, Hawaii.

Presentation transcript:

A Case Study of HPC Metrics Collection and Analysis
Philip Johnson and Michael Paulding, University of Hawaii, Honolulu, Hawaii

Goals of the case study
- Provide a complete implementation of one Purpose Based Benchmark problem definition, called Optimal Truss Design.
- Implement the Optimal Truss Design system in C++ using MPI on a 240-node Linux cluster at the University of Hawaii.
- Develop and evaluate automated support for HPC process and product measurement using Hackystat.
- Assess the utility of the metrics for understanding HPC development.

Metrics Collected
- Size: number of files; total SLOC; "parallel" SLOC (lines containing an MPI directive); "serial" SLOC (lines not containing an MPI directive); test code. (A sketch of one way to make the parallel/serial split operational appears at the end of this transcript.)
- Active Time: the amount of time spent editing Optimal Truss Design files.
- Performance: wall-clock time on 1, 2, 4, 8, 16, and 32 processors. (A timing sketch appears at the end of this transcript.)
- Milestone Tests: indicate functional completeness.
- Command Line Invocations.

Results: Basic Process and Product Measures
  Total Source Lines of Code:             3320 LOC
  Total Test Lines of Code:               901 LOC
  Total MPI Lines of Code:                1032 LOC
  Total Days (calendar time):             1 year
  Total Days (with development activity): 88 days
  Total Active Time:                      152 hours
  Total Distinct MPI Directives:          60 directives
  Total Files:                            56 files
  Total Sequential Files (no MPI):        51 files
  Total Parallel Files (containing MPI):  5 files
  Execution Time:                         126 sec (1 processor), 66 sec (2), 33 sec (4),
                                          27 sec (8), 39 sec (16), 43 sec (32)

Results: Derived Process and Product Measures
  Productivity Proxy (Total LOC / Total Active Time):                 22 LOC/hour
  Average Daily Active Time (Total Active Time / Total Days):         1.73 hours/day
  Test Code Density Percentage (Total Test LOC / Total LOC):          27%
  MPI Code Density Percentage (Total MPI LOC / Total LOC):            31%
  MPI File Density Percentage (Total MPI Files / Total Files):        9%
  MPI Directive Frequency Ratio (Total MPI Directives : Total MPI LOC): 1 directive : 17 LOC
  Speedup (Execution Time on 1 processor / Execution Time on n processors):
      1.0 (1 processor), 1.9 (2), 3.7 (4), 4.5 (8), 3.2 (16), 2.9 (32)
  (A worked check of these derived measures appears at the end of this transcript.)

Results: Process and Product Telemetry Charts
[Telemetry charts not reproduced in this transcript.]

Results: Daily Diary with CLI and Most Active File
[Chart not reproduced in this transcript.]

Insights and Lessons Learned
- Productivity (22 LOC/hour) and test code density (27%) appear in line with traditional software engineering metrics.
- The speedup data shows almost linear speedup up to 4 processors, then falls off sharply, indicating that the current solution is not scalable.
- Parallel and serial LOC were equal at the start of the project; most subsequent effort was devoted to serial code, with some final enhancements to parallel code at the end of the project.
- Performance data was not comparable over the course of the project (only final numbers were available; no telemetry).
- Hackystat provides an effective infrastructure for collecting process and product metrics.
- This case study provides useful baseline data for comparison with future studies.

Future research
- Compare to an OpenMP or JavaParty implementation.
- Gather metrics while improving the scalability of the system.
- Compare metrics against other application types.
- Analyze CLI data for patterns and bottlenecks.

For More Information
Philip M. Johnson and Michael G. Paulding, "Understanding HPCS development through automated process and product measurement with Hackystat," Proceedings of the Second Workshop on Productivity and Performance in High-End Computing.

Thanks to our sponsors.
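
Worked check of the derived measures. The derived measures are straightforward arithmetic over the basic table; recomputing two of them from the reported figures reproduces the published values to within the rounding of the reported times:

    \text{Productivity proxy} = \frac{3320\ \text{LOC}}{152\ \text{hours}} \approx 22\ \text{LOC/hour}

    S(n) = \frac{T(1)}{T(n)}, \qquad
    S(2) = \frac{126}{66} \approx 1.9, \qquad
    S(16) = \frac{126}{39} \approx 3.2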
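
Timing sketch. The poster reports wall-clock times on 1 to 32 processors but does not show how they were measured. Below is a minimal sketch of the usual approach in an MPI program, timing the computation with MPI_Wtime between barriers; run_truss_design is a hypothetical stand-in for the Optimal Truss Design solver and is not part of the original poster.

    #include <mpi.h>
    #include <cstdio>

    // Hypothetical placeholder for the Optimal Truss Design computation.
    static void run_truss_design() { /* ... solver would go here ... */ }

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);

        int rank = 0, size = 0;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        MPI_Barrier(MPI_COMM_WORLD);          // start all ranks together
        double start = MPI_Wtime();

        run_truss_design();

        MPI_Barrier(MPI_COMM_WORLD);          // wait until every rank finishes
        double elapsed = MPI_Wtime() - start;

        if (rank == 0)
            std::printf("wall clock on %d processors: %.1f sec\n", size, elapsed);

        MPI_Finalize();
        return 0;
    }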
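
Parallel/serial SLOC sketch. The size metrics distinguish "parallel" SLOC (lines containing an MPI directive) from "serial" SLOC. The poster does not show Hackystat's sensor logic; the sketch below is one simple way to make the definition operational, classifying each non-blank source line as parallel if it mentions an MPI_ identifier. It does not skip comments, so it only roughly approximates a real SLOC counter.

    #include <fstream>
    #include <iostream>
    #include <string>

    int main(int argc, char** argv) {
        if (argc < 2) {
            std::cerr << "usage: sloc-split <source-file>\n";
            return 1;
        }
        std::ifstream in(argv[1]);
        if (!in) {
            std::cerr << "cannot open " << argv[1] << "\n";
            return 1;
        }
        std::string line;
        long parallel = 0, serial = 0;
        while (std::getline(in, line)) {
            // Blank lines are not source lines of code.
            if (line.find_first_not_of(" \t\r") == std::string::npos)
                continue;
            // "Parallel" SLOC per the poster's definition: the line
            // references an MPI directive; everything else is "serial".
            if (line.find("MPI_") != std::string::npos)
                ++parallel;
            else
                ++serial;
        }
        std::cout << argv[1] << ": " << parallel << " parallel SLOC, "
                  << serial << " serial SLOC\n";
        return 0;
    }

Run over each of the 56 project files (for example, ./sloc-split truss_solver.cpp, a hypothetical file name), the per-file counts would sum to the parallel and serial totals reported above.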