Grid Adventures on DAS, GridLab and Grid'5000 Henri Bal Vrije Universiteit Amsterdam Faculty of Sciences.

Slides:



Advertisements
Similar presentations
European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies Scalability.
Advertisements

European Research Network on Foundations, Software Infrastructures and Applications for large scale distributed, GRID and Peer-to-Peer Technologies Experiences.
The First 16 Years of the Distributed ASCI Supercomputer Henri Bal Vrije Universiteit Amsterdam COMMIT/
Henri Bal Vrije Universiteit Amsterdam Faculty of Sciences DAS-1 DAS-2 DAS-3.
Opening Workshop DAS-2 (Distributed ASCI Supercomputer 2) Project vrije Universiteit.
Vrije Universiteit Interdroid: a platform for distributed smartphone applications Henri Bal, Nick Palmer, Roelof Kemp, Thilo Kielmann High Performance.
CCGrid2013 Panel on Clouds Henri Bal Vrije Universiteit Amsterdam.
7 april SP3.1: High-Performance Distributed Computing The KOALA grid scheduler and the Ibis Java-centric grid middleware Dick Epema Catalin Dumitrescu,
The Ibis Project: Simplifying Grid Programming & Deployment Henri Bal Vrije Universiteit Amsterdam.
The Distributed ASCI Supercomputer (DAS) project Henri Bal Vrije Universiteit Amsterdam Faculty of Sciences.
The Ibis model as a paradigm for programming distributed systems Henri Bal Vrije Universiteit Amsterdam (from Grids and Clouds to Smartphones)
Parallel programming in Java. Java has 2 forms of support for parallel programming: –Multithreading Multiple threads of control (sub processes), useful.
Parallel Programming on Computational Grids. Outline Grids Application-level tools for grids Parallel programming on grids Case study: Ibis.
More information Examination (date: 19 december) Written exam based on: - Reader (without paper 12 by Marsland and Campbell) - Awari paper (available from.
Distributed supercomputing on DAS, GridLab, and Grid’5000 Henri Bal Vrije Universiteit Amsterdam Faculty of Sciences.
Virtual Laboratory for e-Science (VL-e) Henri Bal Department of Computer Science Vrije Universiteit Amsterdam vrije Universiteit.
Virtual Laboratory for e-Science (VL-e) Henri Bal Department of Computer Science Vrije Universiteit Amsterdam vrije Universiteit.
Henri Bal Vrije Universiteit Amsterdam vrije Universiteit.
Parallel Programming on Computational Grids. Outline Grids Application-level tools for grids Parallel programming on grids Case study: Ibis.
The Distributed ASCI Supercomputer (DAS) project Henri Bal Vrije Universiteit Amsterdam Faculty of Sciences.
June 3, 2015 Synthetic Grid Workloads with Ibis, K OALA, and GrenchMark CoreGRID Integration Workshop, Pisa A. Iosup, D.H.J. Epema Jason Maassen, Rob van.
Real-World Distributed Computing with Ibis Henri Bal Vrije Universiteit Amsterdam.
Parallel Programming Henri Bal Rob van Nieuwpoort Vrije Universiteit Amsterdam Faculty of Sciences.
Inter-Operating Grids through Delegated MatchMaking Alexandru Iosup, Dick Epema, Hashim Mohamed,Mathieu Jan, Ozan Sonmez 3 rd Grid Initiative Summer School,
Parallel Programming Henri Bal Vrije Universiteit Faculty of Sciences Amsterdam.
The Ibis Project: Simplifying Grid Programming & Deployment Henri Bal, Jason Maassen, Rob van Nieuwpoort, Thilo Kielmann, Niels Drost, Ceriel Jacobs, Frank.
Virtual Laboratory for e-Science (VL-e) Henri Bal Department of Computer Science Vrije Universiteit Amsterdam vrije Universiteit.
Ibis: a Java-centric Programming Environment for Computational Grids Henri Bal Vrije Universiteit Amsterdam vrije Universiteit.
4 december, DAS3-G5K Interconnection Workshop Hosted by the VU (Thilo Kielmann), Amsterdam Dick Epema (TUD) and Franck Cappello (INRIA) Parallel.
Parallel programming in Java. Java has 2 forms of support for parallel programming: –Multithreading Multiple threads of control (sub processes), useful.
The Ibis Project: Simplifying Grid Programming & Deployment Henri Bal Vrije Universiteit Amsterdam.
Parallel Programming on Computational Grids. Outline Grids Application-level tools for grids Parallel programming on grids Case study: Ibis.
The Distributed ASCI Supercomputer (DAS) project Henri Bal Vrije Universiteit Amsterdam Faculty of Sciences.
4 december, The Distributed ASCI Supercomputer The third generation Dick Epema (TUD) (with many slides from Henri Bal) Parallel and Distributed.
Real Parallel Computers. Background Information Recent trends in the marketplace of high performance computing Strohmaier, Dongarra, Meuer, Simon Parallel.
Parallel Programming Henri Bal Vrije Universiteit Amsterdam Faculty of Sciences.
Real Parallel Computers. Modular data centers Background Information Recent trends in the marketplace of high performance computing Strohmaier, Dongarra,
Cluster Computers. Introduction Cluster computing –Standard PCs or workstations connected by a fast network –Good price/performance ratio –Exploit existing.
Going Dutch: How to Share a Dedicated Distributed Infrastructure for Computer Science Research Henri Bal Vrije Universiteit Amsterdam.
Course Outline Introduction in algorithms and applications Parallel machines and architectures Overview of parallel machines, trends in top-500, clusters.
DAS 1-4: 14 years of experience with the Distributed ASCI Supercomputer Henri Bal Vrije Universiteit Amsterdam.
This work was carried out in the context of the Virtual Laboratory for e-Science project. This project is supported by a BSIK grant from the Dutch Ministry.
Panel Abstractions for Large-Scale Distributed Systems Henri Bal Vrije Universiteit Amsterdam.
The Ibis Project: Simplifying Grid Programming & Deployment Henri Bal Vrije Universiteit Amsterdam.
Cluster Computers. Introduction Cluster computing –Standard PCs or workstations connected by a fast network –Good price/performance ratio –Exploit existing.
The Red Storm High Performance Computer March 19, 2008 Sue Kelly Sandia National Laboratories Abstract: Sandia National.
1 Challenge the future KOALA-C: A Task Allocator for Integrated Multicluster and Multicloud Environments Presenter: Lipu Fei Authors: Lipu Fei, Bogdan.
Batch Scheduling at LeSC with Sun Grid Engine David McBride Systems Programmer London e-Science Centre Department of Computing, Imperial College.
Dutch Tier Hardware Farm size –now: 150 dual nodes + scavenging 200 nodes –buildup to ~1500 up-to-date nodes in 2007 Network –now: 2 Gbit/s internatl.
A High Performance Middleware in Java with a Real Application Fabrice Huet*, Denis Caromel*, Henri Bal + * Inria-I3S-CNRS, Sophia-Antipolis, France + Vrije.
More on Adaptivity in Grids Sathish S. Vadhiyar Source/Credits: Figures from the referenced papers.
ICT infrastructure for Science: e-Science developments Henri Bal Vrije Universiteit Amsterdam.
Key prototype applications Grid Computing Grid computing is increasingly perceived as the main enabling technology for facilitating multi-institutional.
Wide-Area Parallel Computing in Java Henri Bal Vrije Universiteit Amsterdam Faculty of Sciences vrije Universiteit.
Parallel Programming Henri Bal Vrije Universiteit Faculty of Sciences Amsterdam.
Parallel Programming Henri Bal Vrije Universiteit Faculty of Sciences Amsterdam.
Cluster Computers. Introduction Cluster computing –Standard PCs or workstations connected by a fast network –Good price/performance ratio –Exploit existing.
Parallel Computing on Wide-Area Clusters: the Albatross Project Aske Plaat Thilo Kielmann Jason Maassen Rob van Nieuwpoort Ronald Veldema Vrije Universiteit.
Primitive Concepts of Distributed Systems Chapter 1.
SCARIe: using StarPlane and DAS-3 Paola Grosso Damien Marchel Cees de Laat SNE group - UvA.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
DutchGrid KNMI KUN Delft Leiden VU ASTRON WCW Utrecht Telin Amsterdam Many organizations in the Netherlands are very active in Grid usage and development,
High level programming for the Grid Gosia Wrzesinska Dept. of Computer Science Vrije Universiteit Amsterdam vrije Universiteit.
AMOEBA study of distributed system
Fault tolerance, malleability and migration for divide-and-conquer applications on the Grid Gosia Wrzesińska, Rob V. van Nieuwpoort, Jason Maassen, Henri.
Real-World Distributed Computing with Ibis
Henri Bal Vrije Universiteit Amsterdam
Parallel programming in Java
Vrije Universiteit Amsterdam
Cluster Computers.
Presentation transcript:

Grid Adventures on DAS, GridLab and Grid'5000 Henri Bal Vrije Universiteit Amsterdam Faculty of Sciences

Background (1) Distributed ASCI Supercomputer (DAS) -Joint distributed infrastructure of ASCI research school Long history and continuity -DAS-1 (1997), DAS-2 (2002), DAS-3 (Summer 2006) Simple Computer Science grid that works -Homogeneous system -Stimulated new lines of CS research

Background (2) DAS is ideally suited for clear, laboratory- like (distributed) experiments -Projects: Albatross, MagPIe, Manta, Awari, …. -Over 200 users, 25 Ph.D. theses Less suited for large-scale, heterogeneous grid experiments -Use GridLab testbed and Grid’5000

DAS-2 (2002-now) VU (72)Amsterdam (32) Leiden (32) Delft (32) SURFnet 1 Gb/s Utrecht (32) Configuration two 1 GHz Pentium-3s >= 1 GB memory GB disk Myrinet interconnect Redhat Enterprise Linux Globus 3.2 PBS => Sun Grid Engine

Outline Example research projects that need grids - Ibis: Java-centric grid middleware - Satin: Divide & conquer parallelism on a grid - Zorilla: P2P supercomputing Grid experiments on DAS-2, GridLab, Grid’5000 Our colorful future: DAS-3

The Ibis system Programming support for distributed supercomputing on heterogeneous grids -Run non-trivially parallel applications on a grid -Fast RMI, group communication, object replication, d&c Use Java-centric approach + JVM technology -Inherently more portable than native compilation -Requires entire system to be written in pure Java -Optimized special-case solutions with native code -Optimizations using bytecode rewriting Several external users: -ProActive, VU medical center, University of Amsterdam, AMOLF, TU Darmstadt

Ibis Overview IPL RMISatin RepMI GMI MPJ ProActive TCPUDP P2P GMPanda Application Java Native Legend :

Satin: a parallel divide-and- conquer system on top of Ibis Divide-and-conquer is inherently hierarchical More general than master/worker Satin: Cilk-like primitives (spawn/sync) in Java Supports replicated shared objects with user- defined coherence semantics Supports malleability (nodes joining/leaving) and fault-tolerance (nodes crashing)

Zorilla: P2P supercomputing Fully distributed Java-based system for running parallel applications Uses P2P techniques Supports malleable (Satin) applications Uses locality-aware flood-scheduling algorithm

Grid experiments DAS is ``ideal’’, laboratory-like environment for doing clean performance experiments GridLab testbed (incl. VU-DAS) is used for doing heterogeneous experiments Grid’5000 is used for large-scale experiments

Performance Ibis on wide-area DAS-2 (64 nodes) Cellular Automaton uses Ibis/IPL, the others use Satin.

GridLab testbed

Testbed sites

Experiences Grid testbeds are difficult to obtain Poor support for co-allocation Firewall problems everywhere Java indeed runs anywhere Divide-and-conquer parallelism can obtain high efficiencies (66-81%) on a grid -See [van Reeuwijk, Euro-Par 2005]

GridLab results ProgramsitesCPUsEfficiency Raytracer (Satin)54081 % SAT-solver (Satin)52888 % Compression (Satin)32267 % Cellular autom. (IPL)32266 % Efficiency normalized to single CPU type (1GHz P3)

French Grid’5000 system ``5000 CPUs distributed in 9 sites for research in Grid Computing, eScience and Cyber-infrastructures ‘’ -(Actually 2000) See Franck’s talk

Grid’5000 experiments Used Grid’5000 for -Nqueens challenge (2 nd Grid Plugtest) -Testing Satin’s shared objects -Large-scale Zorilla experiments Experiences -No DNS-resolution for compute nodes -Using local IP addresses ( x.y) for routing -Setting up connections to nodes with multiple IP addresses -Unable to run on Grid’5000 and DAS-2 simultaneously

Grid’5000 results ProgramsitesCPUsEfficiency SAT solver % Traveling Salesman % VLSI routing % N-queens5960(~ 85 %) Satin programs (using shared objects) Running concurrently on clusters at Sophia-Antipolis, Bordeaux, Rennes, and Orsay

Zorilla on Grid’5000 N-Queens divide-and-conquer Java application Six Grid’5000 clusters 338 machines 676 processors 22-queens in 35 minutes

Processor allocation results

Comparison DAS +Ideal for speedup measurements (homogeneous) -Bad for heterogeneous or long-latency experiments GridLab testbed +Useful for heterogeneous experiments (e.g. Ibis) -Small-scale, unstable Grid’5000 +Excellent for large-scale experiments -Bad connectivity to outside world (DAS)

DAS-3DAS-3 CPU’s R R R R R NOC DWDM backplane -Dedicated optical group of lambdas -Can allocate multiple 10 Gbit/s lambdas between sites

Summary DAS is a shared infrastructure for experimental computer science research It allows controlled (laboratory-like) grid experiments We want to use DAS as part of larger international grid experiments (e.g. with Grid’5000)

Acknowledgements Grid’5000 Andy Tanenbaum Bob Hertzberger Henk Sips Lex Wolters Dick Epema Cees de Laat Aad van der Steen More info: Rob van Nieuwpoort Jason Maassen Kees Verstoep Gosia Wrzesinska Niels Drost Thilo Kielmann Ceriel Jacobs Kees van Reeuwijk Many others