Intelligent GRID Scheduling Service (ISS)
Vincent Keller, Ralf Gruber, EPFL

Reference: K. Cristiano, A. Drotz, R. Gruber, V. Keller, P. Kunszt, P. Kuonen, S. Maffioletti, P. Manneback, M.-C. Sawley, U. Schwiegelshohn, M. Thiémard, A. Tolou, T.-M. Tran, O. Wäldrich, P. Wieder, C. Witzig, R. Yahyapour, W. Ziegler, "Application-oriented scheduling for HPC Grids", CoreGRID TR-0070 (2007).

Outline
- ISS goals
- Applications & resources characterization
- ISS architecture
- Decision model: CFM (Cost Function Model)
- ISS modules/services
- Implementation status
- Testbeds (HW & SW)

Goals of ISS
1. Find the most suitable computational resources in an HPC Grid for a given application component
2. Make the best use of an existing HPC Grid
3. Predict the best evolution of an HPC Grid

Γ model: characteristic parameters of an application task*
- O: number of operations per node [Flops]
- W: number of main memory accesses per node [Words]
- Z: number of messages to be sent per node
- S: number of words sent by one node [Words]
- V_a = O/W: number of operations per memory access [Flops/Word]
- γ_a = O/S: number of operations per word sent [Flops/Word]
* assuming the parallel subtasks are well balanced

Γ model: characteristic parameters of a parallel machine
- P: number of nodes in a machine
- R_∞: peak performance of a node [Flops/s]
- M_∞: peak main memory bandwidth of a node [Words/s]
- V_M = R_∞ / M_∞: number of operations per memory access [Flops/Word]
- r_a = min(R_∞, M_∞ · V_a): peak task performance on a node [Flops/s]
- t_c = O / r_a: minimum computation time [s]
Note: r_a = R_∞ · min(1, V_a / V_M)

Γ model: characteristic parameters of the internode network
- C_∞: total network bandwidth of a machine [Words/s]
- L: latency of the network [s]
- d: average distance (= number of links traversed)
- V_c = P · R_∞ / C_∞: number of operations per sent word [Flops/Word]
- b = C_∞ / (P · d): inter-node communication bandwidth per node [Words/s]
- t_b = S / b: time needed to send S words through the network [s]
- t_L = L · Z: latency time [s]
- T = t_c + t_b + t_L: minimum turnaround time of a task*
- γ_M = (r_a / b)(1 + t_L / t_b): number of operations per word sent [Flops/Word]
- B = b · L: message size that takes the latency time L to be transferred [Words]
* I/O is not considered and communication cannot be hidden behind computation
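
Taken together, the task, node and network parameters above amount to a few lines of arithmetic. The Python sketch below simply restates those definitions as code; the function and all numeric inputs in the example call are hypothetical illustrations, not measurements from the talk.

# Minimal sketch of the Γ-model bookkeeping (all example numbers are made up).

def gamma_model(O, W, Z, S, R_inf, M_inf, C_inf, P, L, d):
    """Compute the Γ-model quantities for one task on one machine."""
    V_a = O / W                        # operations per memory access [Flops/Word]
    gamma_a = O / S                    # operations per word sent [Flops/Word]
    V_M = R_inf / M_inf                # machine operations per memory access
    r_a = min(R_inf, M_inf * V_a)      # peak task performance on a node [Flops/s]
    b = C_inf / (P * d)                # inter-node bandwidth per node [Words/s]
    t_c = O / r_a                      # minimum computation time [s]
    t_b = S / b                        # time to send S words [s]
    t_L = L * Z                        # latency time [s]
    T = t_c + t_b + t_L                # minimum turnaround time [s]
    gamma_M = (r_a / b) * (1 + t_L / t_b)
    return {"V_a": V_a, "V_M": V_M, "gamma_a": gamma_a, "gamma_M": gamma_M,
            "gamma": gamma_a / gamma_M, "r_a": r_a, "b": b,
            "t_c": t_c, "t_b": t_b, "t_L": t_L, "T": T}

# Hypothetical task/machine pair, everything counted per node:
print(gamma_model(O=1e12, W=2.5e11, Z=1e4, S=1e9,
                  R_inf=4e9, M_inf=1e9, C_inf=1.2e10, P=100, L=5e-6, d=2))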

γ model (one value per application and machine)
- γ = γ_a / γ_M; γ > 1 indicates good speedup and efficiency
- Task/application: γ_a = O / S [Flops / 64-bit word]
- Machine (if LZ/S << 1): γ_M = r_a / b [Flops / 64-bit word]
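
The "Efficiency" item of this slide follows from the definitions two slides earlier. A short reconstruction (this derivation is mine, not spelled out on the slide), neglecting latency (t_L << t_b):

E = \frac{t_c}{t_c + t_b} = \frac{1}{1 + t_b/t_c} = \frac{1}{1 + \gamma_M/\gamma_a} = \frac{\gamma}{1 + \gamma},
\qquad\text{since}\quad \frac{t_b}{t_c} = \frac{S/b}{O/r_a} = \frac{r_a/b}{O/S} = \frac{\gamma_M}{\gamma_a}.

Thus γ = 1 corresponds to about 50% efficiency and γ >> 1 drives the efficiency towards 1, which is why γ > 1 is the speedup criterion quoted above.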

Parameters of some Swiss HPC machines
[Table of per-machine parameters not reproduced in this transcript; footnote values: * 32 for half of C_∞, ** 10, *** decommissioned.]

Example: SpecuLOOS
- Pleiades 2 (GbE): γ = 3.8
- Pleiades 1 (FE): γ = 1.4
- Pleiades 2+ (GbE): γ = 1.6
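
Using the relation E ≈ γ/(1+γ) reconstructed after the γ-model slide (and again neglecting latency), these γ values would translate roughly into the following single-task efficiencies; the numbers below are my arithmetic, not figures from the talk:

E(\gamma = 3.8) \approx \frac{3.8}{4.8} \approx 0.79, \qquad
E(\gamma = 1.6) \approx \frac{1.6}{2.6} \approx 0.62, \qquad
E(\gamma = 1.4) \approx \frac{1.4}{2.4} \approx 0.58.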

ISS/VIOLA environment

ISS: Job Execution Process
Goal: find the most suitable machines in a Grid to run the application components

Cost Function Model

- CPU costs K_e
- Licence fees K_l
- Results waiting time K_w
- Energy costs K_eco
- Data transfer costs K_d
All costs are expressed in Electronic Cost Units (ECU).

Cost Function Model: CPU costs
K_e accounts for investment cost, maintenance fees, bank interest, etc.

Cost Function Model: Broker
- The broker computes a list of machines with their relative costs for a given application component.
- This ordered list is sent to the MSS (MetaScheduling Service) for the final decision and submission.
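
As a concrete illustration of this broker step, here is a minimal Python sketch that orders machines by total cost in ECU. The slides list the five cost components but not how they are combined, so the plain sum, the data layout and every number below are illustrative assumptions only.

# Hypothetical broker-side ranking of machines by total cost (ECU).
# The per-machine component costs would come from the Cost Function Model;
# the values here are placeholders.

COMPONENTS = ("K_e", "K_l", "K_w", "K_eco", "K_d")

def rank_machines(cost_table):
    """Return (machine, total cost) pairs sorted by ascending total cost."""
    totals = {name: sum(costs[c] for c in COMPONENTS)
              for name, costs in cost_table.items()}
    return sorted(totals.items(), key=lambda item: item[1])

costs = {
    "Pleiades 1":  {"K_e": 12.0, "K_l": 0.0, "K_w": 8.0, "K_eco": 3.0, "K_d": 1.0},
    "Pleiades 2":  {"K_e": 10.0, "K_l": 0.0, "K_w": 5.0, "K_eco": 2.5, "K_d": 1.0},
    "Pleiades 2+": {"K_e":  9.0, "K_l": 0.0, "K_w": 4.0, "K_eco": 2.0, "K_d": 1.5},
}

# The resulting ordered list is what would be handed to the MSS.
for machine, total in rank_machines(costs):
    print(f"{machine}: {total:.1f} ECU")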

Other important goal of ISS
- Simulation to evolve the cluster resources in a Grid (uses the same simulator as the one that determines γ, γ_a, γ_M), based on statistical application execution data collected over a long period of time (the same data used to determine γ, γ_a, γ_M)
- Support tool for deciding how to choose new Grid resources

Side products
- VAMOS monitoring service (measurement of R_a, γ)
- Application optimization (increase V_a, R_a)
- Processor frequency adaptation (reduce energy consumption)

What exists?
- Simulator to determine γ, γ_a, γ_M
- VAMOS monitoring service to determine γ
- Cost Function Model

What is in the implementation phase?
- Interface between ISS and MSS (first version ready by end of June 2007)
- R_a monitoring (ready by end of May 2007)
- Cost Function Model (beta version ready by end of 2007)
- Simulator to predict new cluster acquisitions (by end of 2007)

Application testbed
- CFD, MPI: SpecuLOOS (3D spectral element method)
- CFD, OpenMP: Helmholtz (3D solver with spectral elements)
- Plasma physics, single proc: VMEC (3D MHD equilibrium solver)
- Plasma physics, single proc: TERPSICHORE (3D ideal linear MHD stability analysis)
- Climate, POP-C++: Alpine3D (multiphysics, components)
- Chemistry: GAMESS (ab initio molecular quantum chemistry)

First hardware testbed: UNICORE/MSS/ISS Grid
- Pleiades 1 (132 single-processor nodes, FE switch, OpenPBS/Maui)
- Pleiades 2 (120 single-processor nodes, GbE switch, Torque/Maui)
- Pleiades 2+ (99 dual-processor/dual-core nodes, GbE switch, Torque/Maui)
- Condor pool at EPFL (300 single- & multi-processor nodes, no interconnect network)

ISS as a SwissGrid metascheduler
[Diagram: ISS instances at the participating sites, interconnected over the SWITCH network (SWING)]
- CSCS: SMP/vector, low-γ_M cluster
- EPFL: SMP/NUMA, high-γ_M cluster
- ETHZ: SMP/NUMA, high-γ_M cluster
- EIF: NoW
- CERN: EGEE Grid

Conclusions
Automatic:
- Find the best suited machines for a given application
- Monitor application behaviour on single nodes and on the network
Guide towards:
- Better usage of the overall Grid
- Extending the existing Grid with the machines best suited for an application set
- Single-node optimization and better parallelization