A modeling approach for estimating execution time of long-running Scientific Applications

Presentation transcript:

A modeling approach for estimating execution time of long-running Scientific Applications

Seyed Masoud Sadjadi 1, Shu Shimizu 2, Javier Figueroa 1,3, Raju Rangaswami 1, Javier Delgado 1, Hector Duran 4, Xabriel J. Collazo-Mojica 5
Presented by: Xabriel J. Collazo-Mojica 5

1: Florida International University (FIU), Miami, Florida, USA
2: IBM Tokyo Research Laboratory, Tokyo, Japan
3: University of Miami, Coral Gables, Florida, USA
4: University of Guadalajara, CUCEA, Mexico
5: University of Puerto Rico, Mayagüez Campus, Puerto Rico

Miami, Florida, April 2008

Presentation Outline
- Motivation
- Research Approach
- Research Validation
- Related Work
- Concluding Remarks
- Future Research

HPGC '08 - April 14 - LA Grid

Motivation
- The impact of hurricanes is devastating
- The Weather Research and Forecasting (WRF) model
  - Most popular
  - It is computationally and storage intensive
- We need higher-resolution and more precise forecasts
- Many organizations are willing to share resources
  - But these resources are dynamic and unpredictable

Motivation
- At the time of a hurricane, we need to act fast
  - What resources should we allocate?
  - We need to finish within a strict deadline (i.e., in time for the hurricane forecast)
  - We need to make a decision within seconds
- We need to model the execution time of WRF based on the target resources
  - In our case: clusters with different parameters

Approach to Modeling Resource Usage
[Figure: WRF]

Approach to Modeling Execution Parallelism
- Platform heterogeneity
  - We assume identical individual resource characteristics of computation, communication and storage power.
- Execution scale
  - We add a parameter to model the number of nodes utilized during execution.
[Figure: nodes 1, 2, 3, ..., N]

Application Resource Usage Model
- Characterize applications according to their resource usage characteristics (i.e., application "profiles")
- Assumptions:
  - Execution time is based on contributors
  - Product of contributors determines total execution time
  - Computation nodes are homogeneous (e.g., Beowulf cluster)
  - Non-ad-hoc application characteristics

Application Resource Usage Model - Contributors
- Model aims to allow as many contributors as necessary
- This paper's focus: 2 contributors
- First contributor: Parallelism
  - P_para = degree of parallelism
  - α_0 = constant contribution
  - α_1 = variable contribution
- Second contributor: CPU Performance
  - P_clock = clock speed of compute node
  - β_0 = constant contribution related to CPU performance
  - β_1 = variable contribution related to CPU performance
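The slides define the symbols for the two contributors but the transcript does not preserve the formula itself. The sketch below is one reading, assuming each contributor has the form constant + variable / power and that the contributors are multiplied, as the assumptions slide states. The function names and the coefficient values are hypothetical and chosen only for illustration.

```python
def contributor(constant, variable, power):
    """One contributor: a fixed part plus a part that shrinks as `power` grows.

    ASSUMPTION: the inverse form is not given in the transcript; it is chosen
    to match the deck's statement that time falls with the inverse of power.
    """
    return constant + variable / power


def predicted_time(p_para, p_clock, alpha, beta):
    """Product of the parallelism and CPU-performance contributors.

    alpha = (alpha_0, alpha_1), beta = (beta_0, beta_1), as named on the slide.
    """
    alpha_0, alpha_1 = alpha
    beta_0, beta_1 = beta
    parallelism = contributor(alpha_0, alpha_1, p_para)
    cpu = contributor(beta_0, beta_1, p_clock)
    return parallelism * cpu


if __name__ == "__main__":
    # Hypothetical coefficients, not fitted values from the paper.
    alpha = (1.0, 8.0)
    beta = (2.0, 6.0)
    for nodes in (1, 2, 4, 8):
        print(nodes, predicted_time(nodes, 3.0, alpha, beta))
```

Under this form, adding nodes or raising the clock speed only reduces the variable part of each contributor, so the constant parts set a floor on the predicted time.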

Experimental Approach - Environment
- GCB cluster: Rocks ver. 4.0, 8 nodes, each with 32-bit x86 Intel 3.0 GHz processors, 1 GB of main memory, and a gigabit network connection
- Mind cluster: Rocks ver. 4.0, 16 nodes, each with dual Xeon 3.6 GHz processors, 2 GB of main memory, and a gigabit network connection
- CPU vs. number of nodes: CPU percentages from 100% down to 10%, in intervals of 10%
  - We use CPULimit
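The CPU vs. number-of-nodes sweep can be pictured as a grid of run configurations. The following is a minimal sketch, assuming every node count from 1 up to the cluster size is exercised and that a CPU cap of p% behaves like a clock of p% of the nominal speed. Neither assumption is stated on the slide, and `experiment_grid` is a hypothetical helper name.

```python
from itertools import product


def experiment_grid(max_nodes, clock_ghz, percentages=range(100, 0, -10)):
    """Yield (nodes, cpu_percent, effective_clock_ghz) for every combination.

    ASSUMPTION: effective clock = nominal clock * cpu_percent / 100.
    """
    for nodes, pct in product(range(1, max_nodes + 1), percentages):
        yield nodes, pct, clock_ghz * pct / 100.0


if __name__ == "__main__":
    # GCB figures from the slide: 8 nodes at 3.0 GHz.
    runs = list(experiment_grid(8, 3.0))
    print(len(runs), "configurations")
    print(runs[0], runs[-1])
```

Each tuple would correspond to one WRF run, with the CPU cap enforced on every participating node.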

Experimental Approach - Monitoring and Prediction
- Two tools were used
- Amon: a monitoring tool
  - Daemon-like application that collects and reports exploratory variables
- Aprof: a profiling tool
  - Statistical prediction program
  - Listens to Amon reports from compute nodes
  - Stores collected data as a matrix for each application

Experimental Approach - Monitoring and Prediction
[Figure]

Application Resource Usage Model - Validation
- Intuitive assumption: execution time decreases linearly with the inverse of total computational power.
- Predictions within a cluster (i.e., GCB to GCB)
  - GCB: FE 5.34%, ME 5.86%
  - Mind: FE 5.66%, ME 3.80%
- Predictions across clusters
  - GCB to Mind: FE 9.97%, ME 5.86%
  - Mind to GCB: FE 5.83%, ME 4.13%
- These results validate our simple model.
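A cross-cluster prediction of this kind can be sketched as follows, assuming total computational power is nodes times clock speed and that time = a + b / power is fitted by ordinary least squares. This is an illustration, not the Aprof implementation. The data are synthetic, and the mean relative error computed here is a generic metric that is not claimed to match the FE or ME figures on the slide.

```python
def fit_inverse_power(samples):
    """Least-squares fit of time = a + b / (nodes * clock).

    `samples` is a list of (nodes, clock_ghz, seconds) rows.
    ASSUMPTION: total computational power = nodes * clock.
    """
    xs = [1.0 / (n * c) for n, c, _ in samples]
    ys = [t for _, _, t in samples]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict(coeffs, nodes, clock_ghz):
    """Predicted execution time for a given node count and clock speed."""
    a, b = coeffs
    return a + b / (nodes * clock_ghz)


def mean_relative_error(coeffs, samples):
    """Average of |predicted - measured| / measured over all rows."""
    errors = [abs(predict(coeffs, n, c) - t) / t for n, c, t in samples]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Synthetic, noise-free rows (not measurements from the paper).
    source = [(n, 3.0, 10.0 + 600.0 / (n * 3.0)) for n in (1, 2, 4, 8)]
    target = [(n, 3.6, 10.0 + 600.0 / (n * 3.6)) for n in (2, 4, 16)]
    coeffs = fit_inverse_power(source)
    print(coeffs, mean_relative_error(coeffs, target))
```

Fitting on one cluster's rows and scoring on the other's mirrors the "GCB to Mind" and "Mind to GCB" cases. With real measurements the error would be nonzero.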

Application Resource Usage Model - Mind to GCB prediction
[Figure]

Concluding Remarks
- We've proposed a new approach for modeling resource usage and execution time of a distributed application
- Experimental results using WRF execution on two different clusters show good accuracy: within 10% for cross-cluster predictions
  - Using only two parameters: CPU speed and number of nodes
- WRF-specific: we are one step closer to devising a complete solution for our goal of higher-resolution weather predictions and simulations.

Related Work
- S. Shimizu, R. Rangaswami, and H. A. Duran-Limon. "Platform-independent Modeling and Prediction of Application Resource Usage Characteristics."
  - Basis for prediction model
  - It is limited to one node
- D. M. Swany and R. Wolski. "Multivariate Resource Performance Forecasting in the Network Weather Service."
  - High-accuracy prediction model
  - They emphasize latency and bandwidth

Related Work
- R. Badia, F. Escale, E. Gabriel, J. Gimenez, R. Keller, J. Labarta, M. S. Müller. "Performance Prediction in a Grid Environment."
  - Offline prediction
  - Need to link their library to the application to be profiled

Future Research
- Extend our parallelism model to address heterogeneous resources.
- Include more resource parameters in the model
- Started joint research with Barcelona Supercomputing Center
  - We acknowledge that Amon & Aprof have limitations
  - We will integrate our tools with their simulation application, DIMEMAS

Acknowledgements
- National Science Foundation
  - REU Grant # IIS
  - PIRE Grant # OISE
  - CREST Grant # HRD
  - GCB Grant # OCI
- IBM Research
- LA Grid
- FIU SCIS

Questions?