Presentation transcript: DEISA, by Achim Streit (Forschungszentrum Jülich in der Helmholtz-Gemeinschaft)

Slide 1: DEISA
Achim Streit, Forschungszentrum Jülich in der Helmholtz-Gemeinschaft
www.deisa.org
GRID@Large, Lisbon, August 29-30, 2005

Slide 2: Agenda
- Introduction
- SA3: Resource Management
- DEISA Extreme Computing Initiative
- Conclusion

Slide 3: The DEISA Consortium
DEISA is a consortium of leading national supercomputer centers in Europe:
- IDRIS - CNRS, France
- FZJ, Jülich, Germany
- RZG, Garching, Germany
- CINECA, Bologna, Italy
- EPCC, Edinburgh, UK
- CSC, Helsinki, Finland
- SARA, Amsterdam, The Netherlands
- HLRS, Stuttgart, Germany
- BSC, Barcelona, Spain
- LRZ, Munich, Germany
- ECMWF (European organization), Reading, UK
Funded by: European Union FP6
Grant period: May 1, 2004 - April 30, 2008

Slide 4: DEISA Objectives
- To enable Europe's terascale science by integrating Europe's most powerful supercomputing systems. Enabling scientific discovery across a broad spectrum of science and technology is the only criterion for success.
- DEISA is a European supercomputing service built on top of existing national services.
- DEISA deploys and operates a persistent, production-quality, distributed, heterogeneous supercomputing environment with continental scope.

Slide 5: Basic Requirements and Strategies for the DEISA Research Infrastructure
- Fast deployment of a persistent, production-quality, grid-empowered supercomputing infrastructure with continental scope.
- A European supercomputing service built on top of existing national services requires reliability and non-disruptive behavior.
- User and application transparency.
- Top-down approach: technology choices result from the business and operational models of our virtual organization. DEISA's technology choices are fully open.

Slide 6: The DEISA Supercomputing Grid: A Layered Infrastructure
- Inner layer: a distributed super-cluster resulting from the deep integration of similar IBM AIX platforms at IDRIS, FZJ, RZG and CINECA (phase 1), then CSC (phase 2). It appears to external users as a single supercomputing platform.
- Outer layer: a heterogeneous supercomputing grid:
  - IBM AIX super-cluster (IDRIS, FZJ, RZG, CINECA, CSC): close to 24 Tflops
  - BSC: IBM PowerPC Linux system, 40 Tflops
  - LRZ: Linux cluster (2.7 Tflops), moving to an SGI Altix system (33 Tflops in 2006, 70 Tflops in 2007)
  - SARA: SGI Altix Linux cluster, 2.2 Tflops
  - ECMWF: IBM AIX system, 32 Tflops
  - HLRS: NEC SX-8 vector system, close to 10 Tflops

Slide 7: Logical View of the Phase 2 DEISA Network
[Network diagram: the sites are interconnected via GÉANT and the national research networks DFN, RENATER, GARR, SURFnet, UKERNA, RedIRIS and FUnet]

Slide 8: AIX Super-Cluster, May 2005
[Diagram of the super-cluster sites, now extended with CSC and ECMWF]
Services:
- High-performance data grid via GPFS: access to remote files uses the full available network bandwidth.
- Job migration across sites: used to load-balance the global workflow when a huge partition is allocated to a DEISA project at one site.
- Common production environment.

Slide 9: Service Activities
- SA1 - Network Operation and Support (FZJ): deployment and operation of a gigabit-per-second network infrastructure for a European distributed supercomputing platform; network operation and optimization during project activity.
- SA2 - Data Management with Global File Systems (RZG): deployment and operation of global distributed file systems, as basic building blocks of the "inner" super-cluster and as a way of implementing global data management in a heterogeneous grid.
- SA3 - Resource Management (CINECA): deployment and operation of global scheduling services for the European super-cluster, as well as for its heterogeneous grid extension.
- SA4 - Applications and User Support (IDRIS): enabling the adoption by the scientific community of the distributed supercomputing infrastructure as an efficient instrument for the production of leading computational science.
- SA5 - Security (SARA): providing administration, authorization and authentication for a heterogeneous cluster of HPC systems, with special emphasis on single sign-on.

Slide 10: SA3: A Three-Layer Architecture
- Basic services: located closest to the operating system of the computing platforms; they enable the operation of a single or a multiple cluster through local or extended batch schedulers and other cluster-like features.
- Intermediate services: first-level grid services that give access to an enlarged grid-empowered infrastructure, dealing with resource and network monitoring and with information systems.
- Advanced services: use the previous layers to implement the global management of the distributed resources of the infrastructure (see the sketch after this list).
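A minimal Python sketch of how the three layers could stack. All class, method and site names here are hypothetical illustrations of the layering described on this slide, not an actual DEISA API: advanced services consult the intermediate information layer to choose a site, then hand the job down to the basic layer that wraps the local batch scheduler.

```python
# Purely illustrative sketch of the SA3 three-layer architecture.
# Names and numbers are invented; only the layering follows the slide.

class BasicServices:
    """Layer 1: wraps the local batch scheduler (e.g. LoadLeveler, LSF)."""
    def submit(self, site, job):
        print(f"[{site}] submitting to local batch scheduler: {job}")

class IntermediateServices:
    """Layer 2: first-level grid services: resource/network monitoring."""
    def free_cpus(self, site):
        # In reality this would query an information system such as MDS2.
        return {"FZJ": 512, "CINECA": 256}.get(site, 0)

class AdvancedServices:
    """Layer 3: global management built on the two layers below."""
    def __init__(self):
        self.basic = BasicServices()
        self.info = IntermediateServices()

    def broker(self, job, sites):
        # Pick the site with the most free CPUs, then hand the job down.
        best = max(sites, key=self.info.free_cpus)
        self.basic.submit(best, job)

AdvancedServices().broker("cpmd_run.sh", ["FZJ", "CINECA"])  # -> FZJ
```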

Slide 11: Logical Layout
The per-site stack, from bottom to top:
- Hardware
- OS and communication
- Resource manager
- Policy implementation through the scheduler (workload, advance reservation, accounting)
- Services: access, workflow management, co-allocation, brokering, job rerouting, multiple accounting, data staging

Slide 12: UNICORE Infrastructure
[Diagram of the deployed UNICORE components: Gateway 4.1.0, NJS 4.2.0, TSI 4.1.0, running on J2SE 1.4.2]
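To make the roles of these components concrete, here is a conceptual Python sketch of the Gateway -> NJS -> TSI chain: the Gateway is the secure entry point, the NJS translates ("incarnates") an abstract job for one concrete site, and the TSI hands the result to the local batch system. The class names follow the UNICORE components; the method names and job format are invented for illustration only.

```python
# Conceptual sketch of a job passing through the UNICORE component chain.

class TSI:
    """Target System Interface: talks to the local batch system."""
    def run(self, script):
        print(f"TSI: handing script to local scheduler:\n{script}")

class NJS:
    """Network Job Supervisor: incarnates the abstract job for one site."""
    def __init__(self, tsi):
        self.tsi = tsi
    def execute(self, abstract_job):
        script = f"#!/bin/sh\n{abstract_job['command']}"  # incarnation step
        self.tsi.run(script)

class Gateway:
    """Single secure entry point; authenticates and forwards to an NJS."""
    def __init__(self, njs):
        self.njs = njs
    def submit(self, user, abstract_job):
        print(f"Gateway: authenticated {user}")
        self.njs.execute(abstract_job)

Gateway(NJS(TSI())).submit("astreit", {"command": "mpirun ./simulation"})
```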

Slide 13: Physical Layout: Resource Management
All ten sites are accessed through UNICORE; underneath, each runs its local resource manager and scheduler (LL = IBM LoadLeveler):

Site    Platform  OS             Resource manager  Scheduler
IDRIS   Power 4   AIX 5.2        LL                LL backfill
FZJ     Power 4   AIX 5.2        LL                LL backfill
RZG     Power 4   AIX 5.2        LL                LL backfill
CINECA  Power 4   AIX 5.2        LL                LL backfill
CSC     Power 4   AIX 5.2        LL                LL backfill
ECMWF   Power 4   AIX 5.2        LL                LL backfill
SARA    IA64      RHEL + SGI PP  LSF               LSF HPC
LRZ     IA64      RHEL + SGI PP  SGE               SGE
BSC     PPC       SUSE           LL                LL backfill
HLRS    NEC SX    NEC OS         NEC NQE           NEC NQE

Slide 14: Physical Layout: Data Management
Global data management is based on IBM GPFS (General Parallel File System) over WAN:

Site    Platform  OS             GPFS client
IDRIS   Power 4   AIX 5.2        Native
FZJ     Power 4   AIX 5.2        Native
RZG     Power 4   AIX 5.2        Native
CINECA  Power 4   AIX 5.2        Native
CSC     Power 4   AIX 5.2        Native
ECMWF   Power 4   AIX 5.2        Native
SARA    IA64      RHEL + SGI PP  Ad hoc
LRZ     IA64      RHEL + SGI PP  Ad hoc
BSC     PPC       SUSE           Native
HLRS    NEC SX    NEC OS         ??

Slide 15: DEISA Supercomputing Grid Services
- Workflow management: based on UNICORE plus further extensions and services coming from DEISA's JRA7 and other projects (UniGrids, ...).
- Global data management: a well-defined architecture implementing extended global file systems on heterogeneous systems, fast data transfers across sites, and hierarchical data management at a continental scale.
- Co-scheduling: needed to support grid applications running on the heterogeneous environment (a toy sketch of the underlying co-allocation step follows below).
- Science gateways and portals: specific Internet interfaces to hide complex supercomputing environments from end users and to facilitate access for new, non-traditional scientific communities.
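At its core, co-scheduling reduces to finding a time window in which every participating site can provide its share of resources simultaneously. The following toy Python sketch, with invented reservation data, illustrates only that intersection step; a real co-scheduler must additionally negotiate, reserve and fall back.

```python
# Toy illustration of the co-allocation step behind co-scheduling.
# Free windows per site are invented; times are hours of the day.

def common_window(site_windows):
    """Intersect per-site free windows; each window is (start, end)."""
    start = max(w[0] for w in site_windows)
    end = min(w[1] for w in site_windows)
    return (start, end) if start < end else None  # None: no common slot

free = {
    "FZJ": (8, 20),     # free from 08:00 to 20:00
    "CINECA": (12, 24),
    "RZG": (10, 16),
}
print("co-allocate at", common_window(list(free.values())))  # -> (12, 16)
```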

Slide 16: Workflow Application with UNICORE and Global Data Management with GPFS
[Diagram: a client submits a job workflow that visits five sites in turn, each with CPU and GPFS, interconnected by the NRENs]
Job workflow: 1) FZJ, 2) CINECA, 3) RZG, 4) IDRIS, 5) SARA
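A hedged Python sketch of the pattern in this figure: each workflow step is submitted to a different site, and because GPFS over WAN presents one shared namespace, the output of one step is directly readable by the next without explicit staging. The submit() helper and the /deisa path are hypothetical stand-ins for UNICORE submission and the real file system layout.

```python
# Sketch of a five-step, five-site workflow over a shared GPFS namespace.

SITES = ["FZJ", "CINECA", "RZG", "IDRIS", "SARA"]   # order from the slide
SHARED_DIR = "/deisa/project42"   # one namespace, visible from every site

def submit(site, command):
    # Stand-in for a UNICORE job submission to the given site.
    print(f"{site}: {command}")

previous_output = None
for step, site in enumerate(SITES, start=1):
    inp = previous_output or f"{SHARED_DIR}/input.dat"
    out = f"{SHARED_DIR}/step{step}.out"
    # The file written at one site is immediately readable at the next,
    # because GPFS over WAN presents a single shared file system.
    submit(site, f"./solver --in {inp} --out {out}")
    previous_output = out
```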

Slide 17: Resource Management Information System (RMIS)
Goals:
- Deliver up-to-date and complete resource management information about the grid.
- Provide relevant information to system administrators at remote sites and to end users.
Our approach:
- Performed an implementation-independent system analysis.
- Attempted to model the DEISA distributed supercomputing platform designed to operate the grid.
- Identified the resource management part as a sub-system that needs to interface with other sub-systems to get relevant information.
- The other sub-systems use external tools (monitoring tools, databases and batch systems) with which we need to interface.

Slide 18: Implementation
[Diagram: on each cluster, Ganglia gmond daemons and an MDS2 provider (static and almost-static data) run behind the site firewall alongside the batch scheduler; a central web server hosts the RMIS web front-end and aggregates the sites through Ganglia gmetad and an MDS2 back-end, driven by configuration files]
- Based on the Ganglia monitoring tool coupled with MDS2/Globus.
- The published data are split into two groups:
  - static data (MDS2): refresh time ~ hours or days
  - dynamic data (Ganglia): refresh time ~ seconds or minutes (see the polling sketch below)
- The web server, based on the Ganglia web front-end, allows the display of any relevant data from MDS2 or Ganglia.
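For the dynamic-data path: a gmond daemon serves its current cluster state as an XML document to anyone who connects to its TCP port (8649 by default), which is how gmetad, and in this design the RMIS back-end, aggregate the sites. A small Python sketch of such a poll; the host name is hypothetical, while the port and XML format are standard Ganglia behavior.

```python
# Poll a Ganglia gmond daemon and print the 1-minute load of each host.

import socket
import xml.etree.ElementTree as ET

def poll_gmond(host, port=8649):
    """Fetch the XML dump gmond emits on connect and parse it."""
    chunks = []
    with socket.create_connection((host, port), timeout=10) as sock:
        while True:
            data = sock.recv(4096)
            if not data:          # gmond closes the socket when done
                break
            chunks.append(data)
    return ET.fromstring(b"".join(chunks))

tree = poll_gmond("gmond.deisa.example.org")   # hypothetical host
for host in tree.iter("HOST"):
    for metric in host.iter("METRIC"):
        if metric.get("NAME") == "load_one":
            print(host.get("NAME"), metric.get("VAL"))
```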

Slide 19: Portals (Science Gateways)
- Same concept as TeraGrid's Science Gateways.
- Needed to enhance the outreach of supercomputing infrastructures.
- Hiding complex supercomputing environments from end users, providing discipline-specific tools and support, and moving in some cases towards community allocations.
- Work has already been done by DEISA on genomics and materials science portals.
- Intense brainstorming on the design of a global strategy, if possible interoperable with TeraGrid's Science Gateways.

Slide 20: Enabling Science
- Initial "early users" program: a number of Joint Research Activities (JRAs) integrated in the project from the start.
- Moving towards "exceptional users": the DEISA Extreme Computing Initiative.

Activity  Scientific program                                           Partners      Leader
JRA1      Enabling material science: CPMD codes, portals               RZG           Hermann Lederer, RZG
JRA2      Computational environment for applications in cosmology     EPCC          Gavin Pringle, EPCC
JRA3      Enabling the TORB plasma physics code                       RZG           Hermann Lederer, RZG
JRA4      Life sciences: genomics and eHealth applications            IDRIS (BSC)   Victor Alessandrini, IDRIS -> BSC
JRA5      CFD in the automobile industry                              CINECA, CRI   Roberto Tregnago, CRI
JRA6      Coupled applications: astrophysics, combustion, environment IDRIS (HLRS)  Gilles Grasseau, IDRIS

Slide 21: The Extreme Computing Initiative
- Identification, deployment and operation of a number of "flagship" applications in selected areas of science and technology.
- Applications must rely on the DEISA Supercomputing Grid services (application profiles have been clearly defined). They will benefit from exceptional resources from the DEISA pool.
- Applications are selected on the basis of scientific excellence, innovation potential and relevance criteria.
- European call for proposals: April 1 to May 30, 2005.

Slide 22: Evaluation and Allocation of DEISA Resources
- National evaluation committees evaluate the proposals and determine priorities.
- On the basis of this information, the DEISA consortium examines how the applications map onto the resources available in the DEISA pool, and negotiates internally how the resources will be allocated and the final priorities for projects.
- Exceptional DEISA resources will be allocated, as for large scientific instruments, in well-defined time windows (to be negotiated with the users).

Slide 23: DEISA Extreme Computing Initiative (DECI)
- Call for expressions of interest / proposals in April and May 2005.
- 50 proposals submitted; requested CPU time: 32 million CPU-hours.
- European countries involved: Finland, France, Germany, Greece, Hungary, Italy, Netherlands, Russia, Spain, Sweden, Switzerland, UK.
- Proposals by area:
  - Materials science, quantum chemistry, quantum computing: 16
  - Astrophysics (cosmology, stars, solar system): 13
  - Life sciences, biophysics, bioinformatics: 8
  - CFD, fluid mechanics, combustion: 5
  - Earth sciences, climate research: 4
  - Plasma physics: 2
  - QCD, particle physics, nuclear physics: 2

Slide 24: Conclusions
- DEISA adopts grid technologies to integrate national supercomputing infrastructures and to provide a European supercomputing service.
- Service activities are supported by the coordinated action of the national centers' staffs.
- DEISA operates as a virtual European supercomputing centre.
- The big challenge we are facing is enabling new, first-class computational science. Integrating leading supercomputing platforms with grid technologies creates a new research dimension in Europe.

Slide 25: UNICORE Summit 2005
October 11-12, 2005, ETSI Headquarters, Sophia Antipolis, France
http://summit.unicore.org/2005
In conjunction with Grids@work: Middleware, Components, Users, Contest and Plugtests
http://www.etsi.org/plugtests/GRID.htm

