The Path to Multi-core Tools Paul Petersen. Multi-coreToolsThePathTo 2 Outline Motivation Where are we now What is easy to do next What is missing.

Slides:



Advertisements
Similar presentations
.NET Technology. Introduction Overview of.NET What.NET means for Developers, Users and Businesses Two.NET Research Projects:.NET Generics AsmL.
Advertisements

Model Driven Generative Programming Reza Azimi February 6, 2003 ECE1770: Trends in Middleware Systems.
Software & Services Group PinPlay: A Framework for Deterministic Replay and Reproducible Analysis of Parallel Programs Harish Patil, Cristiano Pereira,
Chapter 2 Operating System Overview Operating Systems: Internals and Design Principles, 6/E William Stallings.
Data Marshaling for Multi-Core Architectures M. Aater Suleman Onur Mutlu Jose A. Joao Khubaib Yale N. Patt.
Intel® performance analyze tools Nikita Panov Idrisov Renat.
Tools for applications improvement George Bosilca.
Multi-core Programming Thread Checker. 2 Topics What is Intel® Thread Checker? Detecting race conditions Thread Checker as threading assistant Some other.
1 Tuesday, November 07, 2006 “If anything can go wrong, it will.” -Murphy’s Law.
Vertically Integrated Analysis and Transformation for Embedded Software John Regehr University of Utah.
Visual Debugging Tools for Concurrent Models of Computation Elaine Cheong 15 May 2002 EE290N: Advanced Topics in System Theory.
Using JetBench to Evaluate the Efficiency of Multiprocessor Support for Parallel Processing HaiTao Mei and Andy Wellings Department of Computer Science.
1 1 Profiling & Optimization David Geldreich (DREAM)
- Chaitanya Krishna Pappala Enterprise Architect- a tool for Business process modelling.
Project Proposal (Title + Abstract) Due Wednesday, September 4, 2013.
SEC(R) 2008 Intel® Concurrent Collections for C++ - a model for parallel programming Nikolay Kurtov Software and Services.
Prospector : A Toolchain To Help Parallel Programming Minjang Kim, Hyesoon Kim, HPArch Lab, and Chi-Keung Luk Intel This work will be also supported by.
Multi-core Programming Thread Profiler. 2 Tuning Threaded Code: Intel® Thread Profiler for Explicit Threads Topics Look at Intel® Thread Profiler features.
OpenMP in a Heterogeneous World Ayodunni Aribuki Advisor: Dr. Barbara Chapman HPCTools Group University of Houston.
Advanced Operating Systems CIS 720 Lecture 1. Instructor Dr. Gurdip Singh – 234 Nichols Hall –
ICOM 5995: Performance Instrumentation and Visualization for High Performance Computer Systems Lecture 7 October 16, 2002 Nayda G. Santiago.
Parallel Programming Models Jihad El-Sana These slides are based on the book: Introduction to Parallel Computing, Blaise Barney, Lawrence Livermore National.
CCS APPS CODE COVERAGE. CCS APPS Code Coverage Definition: –The amount of code within a program that is exercised Uses: –Important for discovering code.
Analyzing parallel programs with Pin Moshe Bach, Mark Charney, Robert Cohn, Elena Demikhovsky, Tevi Devor, Kim Hazelwood, Aamer Jaleel, Chi- Keung Luk,
University of Michigan Electrical Engineering and Computer Science 1 Dynamic Acceleration of Multithreaded Program Critical Paths in Near-Threshold Systems.
Software Performance Analysis Using CodeAnalyst for Windows Sherry Hurwitz SW Applications Manager SRD Advanced Micro Devices Lei.
Adventures in Mastering the Use of Performance Evaluation Tools Manuel Ríos Morales ICOM 5995 December 4, 2002.
Accelerating Precise Race Detection Using Commercially-Available Hardware Transactional Memory Support Serdar Tasiran Koc University, Istanbul, Turkey.
Uncovering the Multicore Processor Bottlenecks Server Design Summit Shay Gal-On Director of Technology, EEMBC.
Software & the Concurrency Revolution by Sutter & Larus ACM Queue Magazine, Sept For CMPS Halverson 1.
Support for Debugging Automatically Parallelized Programs Robert Hood Gabriele Jost CSC/MRJ Technology Solutions NASA.
Scalable Analysis of Distributed Workflow Traces Daniel K. Gunter and Brian Tierney Distributed Systems Department Lawrence Berkeley National Laboratory.
John Mellor-Crummey Robert Fowler Nathan Tallent Gabriel Marin Department of Computer Science, Rice University Los Alamos Computer Science Institute HPCToolkit.
Copyright © 2002, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners
4.2.1 Programming Models Technology drivers – Node count, scale of parallelism within the node – Heterogeneity – Complex memory hierarchies – Failure rates.
Martin Schulz Center for Applied Scientific Computing Lawrence Livermore National Laboratory Lawrence Livermore National Laboratory, P. O. Box 808, Livermore,
About Me Microsoft MVP Intel Blogger TechEd Israel, TechEd Europe Expert C++ Book
16 October Reminder Types of Testing: Purpose  Functional testing  Usability testing  Conformance testing  Performance testing  Acceptance.
Colorama: Architectural Support for Data-Centric Synchronization Luis Ceze, Pablo Montesinos, Christoph von Praun, and Josep Torrellas, HPCA 2007 Shimin.
Profiling, Tracing, Debugging and Monitoring Frameworks Sathish Vadhiyar Courtesy: Dr. Shirley Moore (University of Tennessee)
Debugging and Profiling With some help from Software Carpentry resources.
A New Parallel Debugger for Franklin: DDT Katie Antypas User Services Group NERSC User Group Meeting September 17, 2007.
Group 3: Architectural Design for Enhancing Programmability Dean Tullsen, Josep Torrellas, Luis Ceze, Mark Hill, Onur Mutlu, Sampath Kannan, Sarita Adve,
Debugging parallel programs. Breakpoint debugging Probably the most widely familiar method of debugging programs is breakpoint debugging. In this method,
BridgePoint Integration John Wolfe / Robert Day Accelerated Technology.
Allen D. Malony Department of Computer and Information Science TAU Performance Research Laboratory University of Oregon Discussion:
Understanding the Behavior of Java Programs Tarja Systa Software Systems Lab. Tampere Univ. Sookmyung Women’s Univ. PSLAB Choi, yoon jeong.
Lawrence Livermore National Laboratory S&T Principal Directorate - Computation Directorate Tools and Scalable Application Preparation Project Computation.
ICFEM 2002, Shanghai Reasoning about Hardware and Software Memory Models Abhik Roychoudhury School of Computing National University of Singapore.
Thread basics. A computer process Every time a program is executed a process is created It is managed via a data structure that keeps all things memory.
Department of Computer Science Get the Parallelism out of my Cloud Karu Sankaralingam and Remzi H. Arpaci-Dusseau University of Wisconsin-Madison
Survey of Tools to Support Safe Adaptation with Validation Alain Esteva-Ramirez School of Computing and Information Sciences Florida International University.
Shangkar Mayanglambam, Allen D. Malony, Matthew J. Sottile Computer and Information Science Department Performance.
1 How to do Multithreading First step: Sampling and Hotspot hunting Myongji University Sugwon Hong 1.
3/12/2013Computer Engg, IIT(BHU)1 OpenMP-1. OpenMP is a portable, multiprocessing API for shared memory computers OpenMP is not a “language” Instead,
Symbolic Analysis of Concurrency Errors in OpenMP Programs Presented by : Steve Diersen Contributors: Hongyi Ma, Liqiang Wang, Chunhua Liao, Daniel Quinlen,
CEPBA-Tools experiences with MRNet and Dyninst Judit Gimenez, German Llort, Harald Servat
Gauss Students’ Views on Multicore Processors Group members: Yu Yang (presenter), Xiaofang Chen, Subodh Sharma, Sarvani Vakkalanka, Anh Vo, Michael DeLisi,
Tuning Threaded Code with Intel® Parallel Amplifier.
Beyond Application Profiling to System Aware Analysis Elena Laskavaia, QNX Bill Graham, QNX.
Introduction to Performance Tuning Chia-heng Tu PAS Lab Summer Workshop 2009 June 30,
Hongbin Li 11/13/2014 A Debugger of Parallel Mutli- Agent Spatial Simulation.
3/5/2002e-business and Information Systems1 Introduction Computer System Hardware Software HW Kernel/OS API Application Programs SW.
Parallel Software Development with Intel Threading Analysis Tools
Tracing and Performance Analysis Tools for Heterogeneous Multicore System by Soon Thean Siew.
NVIDIA Profiler’s Guide
Linternals SysInternals for Linux
Intel® Parallel Studio and Advisor
Allen D. Malony Computer & Information Science Department
Introduction to Operating Systems
Presentation transcript:

The Path to Multi-core Tools Paul Petersen

Multi-coreToolsThePathTo 2 Outline Motivation Where are we now What is easy to do next What is missing

Multi-coreToolsThePathTo 3 Motivation Look at the way parallel software is written –Threads & Locks This has not really changed for decades –The names and details change Ultimately the languages leave it up to the user to get right –This means the user will sometimes get it wrong

Multi-coreToolsThePathTo 4 Where Are We Now Breakpoint Debuggers –State inspection tools –MS Visual Studio –Totalview, Gdb, Idb Profilers –VTune™ Performance Analyzer Structural –Function or Loop Statistical –HW or SW based Runtime Analysis –Intel® Thread Checker –Intel® Thread Profiler Libraries –OpenMP –Threading Building Blocks Higher-level abstraction for multi-core codes Understandable by analysis tools

Multi-coreToolsThePathTo 5 Intel® Thread Checker Observes the interaction in a concurrent application through memory references and synchronization operations –Compiler or binary instrumentation –Execution driven simulation Detect incorrect threading api usage and asynchronous memory references

Multi-coreToolsThePathTo 6 Thread Checker - UI

Multi-coreToolsThePathTo 7 Intel® Thread Profiler Observes the interactions in a concurrent application through the synchronization operations –Compiler or binary instrumentation –Event trace generation and analysis Detects bottlenecks through critical path analysis –In a concurrent application not all computation is equally important

Multi-coreToolsThePathTo 8 Thread Profiler – Timeline View

Multi-coreToolsThePathTo 9 Tooltips provide object information Thread Profiler – Summary View Let’s filter and group this by object

Multi-coreToolsThePathTo 10 What Is Easy To Do Next Enhance our current tools –Additional serial analysis Detect opportunities for parallel execution –Improve efficiency Focus capabilities –Expand platform coverage Example - managed languages like C# or Java –Mining the data we have now Suggest which problems should be tackled first

Multi-coreToolsThePathTo 11 What Is Missing Performance Projections –What-If analysis is very hard to do accurately –You need a very details system model, and very accurate understanding of what will change by running on a different system –Changing the number of threads, can cause non-linear scaling problems by exposing a bottleneck that did not appear to be significant as smaller thread counts

Multi-coreToolsThePathTo 12 What Else Is Missing Defect Detection –Moving from asynchronous memory access detection To non-atomic object access detection. –Users typically assume that modifications to “objects” are atomic But they have a hard time describing what is the “object” at any point in time.

Multi-coreToolsThePathTo 13 And Everything Else… A tool is, among other things, a device that provides a mechanical or mental advantage in accomplishing a tasktool This talk has focused on a collection of analysis tools specifically designed to aid in understanding threaded applications

Multi-core software poses many challenges that sequential software does not face We have the first round of tools specifically designed for the problems faced by the way parallel software is written These tools are limited to mostly just observing what happened, and reporting interesting facts about the program. They have a hard time generalizing these observations Conclusion

Multi-coreToolsThePathTo Multi-coreToolsThePathTo Multi-coreToolsThePathTo Multi-coreToolsThePathTo