1 Overview of Grid middleware concepts Peter Kacsuk MTA SZTAKI, Hungary Univ. Westminster, UK


2 The goal of this lecture To give an overview of the main trends in the fast evolution of Grid systems To explain the main features of the three generations of Grid systems –1st gen. Grids: Metacomputers –2nd gen. Grids: Resource-oriented Grids –3rd gen. Grids: Service-oriented Grids To show how these Grid systems can be handled by the users

3 Progress in Grid Systems [diagram: client/server, cluster computing and supercomputing evolve into network computing; high-throughput computing (Condor) and high-performance computing (Globus) form the 1st and 2nd generation Grid systems, which converge with Web Services into the 3rd generation OGSA/OGSI and OGSA/WSRF Grid systems]

4 1st Generation Grids Metacomputers

5 Original motivation for metacomputing Grand challenge problems run for weeks or months even on supercomputers and clusters Various supercomputers/clusters must be connected by wide area networks in order to solve grand challenge problems in reasonable time

6 Progress to Metacomputers [diagram: achievable performance (TFlops) over time, progressing from single processor through supercomputer and cluster to metacomputer]

7 Original meaning of metacomputing Metacomputing = supercomputing + wide area network Original goal of metacomputing: distributed supercomputing to achieve higher performance than individual supercomputers/clusters can provide

8 Distributed Supercomputing SF-Express Distributed Interactive Simulation (SC’1995), run across the NCSA Origin, Caltech Exemplar, Argonne SP and Maui SP Issues: –Resource discovery, scheduling –Configuration –Multiple communication methods –Message passing (MPI) –Scalability –Fault tolerance

9 High-throughput computing (HTC) and the Grid Better usage of computing and other resources accessible via wide area network To exploit the spare cycles of various computers connected by wide area networks Two main representatives –SETI –Condor

10 Progress in Grid Systems [same diagram as slide 3, highlighting the 1st generation: high-throughput computing with Condor]

11 The Condor model [diagram: the resource provider publishes its configuration description as ClassAds to the match-maker; the resource requestor sends its resource requirement to the match-maker; after a match, the client program moves to the resource(s) over TCP/IP] Security is a serious problem!

12 ClassAds Resources of the Grid have different properties (architecture, OS, performance, etc.), and these are described as advertisements (ClassAds) When creating a job, we can describe our requirements (and preferences) for these properties Condor tries to match the requirements against the ClassAds to provide the best-matching resources for our jobs
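
The requirements and preferences above are written into a Condor submit description; the following is an illustrative sketch only, with a hypothetical job name and attribute values:

```text
# Illustrative Condor submit description (hypothetical job name and values)
universe     = vanilla
executable   = my_analysis
# "requirements" restricts which machine ClassAds may match the job
requirements = (OpSys == "LINUX") && (Arch == "X86_64") && (Memory >= 1024)
# "rank" expresses a preference among the matching machines
rank         = KFlops
output       = my_analysis.out
error        = my_analysis.err
log          = my_analysis.log
queue
```

condor_submit hands this description to the schedd, which advertises it to the match-maker as the job's ClassAd.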

13 The concept of personal Condor [diagram: jobs from your workstation are managed by your personal Condor]

14 The concept of Condor pool [diagram: your personal Condor forwards jobs from your workstation to the group Condor pool]

15 Architecture of a Condor pool [diagram: a Central Manager coordinating a Submit Machine and several Execution Machines]

16 Components of a Condor pool

17 The concept of Condor flocking [diagram: jobs from your workstation flow through your personal Condor and the group Condor to a friendly Condor pool] Your schedd daemons see the Central Manager of the other pool as if it were part of your pool

18 Condor flocking “grids” [diagram: a client machine submits a job to a Condor pool; if the pool's resources meet the requirements of the job, it is executed there; if they do not, the job is forwarded to a friendly pool]

19 The concept of glide-in [diagram: glide-ins extend your personal and group Condor pools with Grid resources managed by PBS, LSF or Condor]

20 Three levels of scalability in Condor [diagram: (1) scheduling among the nodes of a cluster, (2) flocking among clusters, (3) gliding in to Grid resources]

21 NUG30 - Solved!!! [plot: number of workers over the course of the run] Solved in 7 days instead of 10.9 years

22 NUG30 Personal Grid Managed by one Linux box at Wisconsin Flocking: -- Condor pool at Wisconsin (500 processors) -- Condor pool at Georgia Tech (284 Linux boxes) -- Condor pool at UNM (40 processors) -- Condor pool at Columbia (16 processors) -- Condor pool at Northwestern (12 processors) -- Condor pool at NCSA (65 processors) -- Condor pool at INFN Italy (54 processors) Glide-in: -- Origin 2000 (through LSF) at NCSA (512 processors) -- Origin 2000 (through LSF) at Argonne (96 processors)

23 Problems with Condor flocking “grids” Friendly relationships are defined statically Firewalls are not allowed between friendly pools The client cannot choose resources (pools) directly Private (non-standard) “Condor protocols” are used to connect friendly pools together Not service-oriented

24 2nd Generation Grids Resource-oriented Grid

25 The main goal of 2nd gen. Grids To enable a –geographically distributed community [of thousands] –to perform sophisticated, computationally intensive analyses –on large sets (petabytes) of data To provide –on demand –dynamic resource aggregation –as virtual organizations Example virtual organizations: –Physics community (EDG, EGEE) –Climate community, etc.

26 Resource intensive issues include Harness data, storage, computing and network resources located in distinct administrative domains Respect local and global policies governing what can be used for what Schedule resources efficiently, again subject to local and global constraints Achieve high performance, with respect to both speed and reliability

27 Grid Protocols, Services and Tools Protocol-based access to resources –Mask local heterogeneities –Negotiate multi-domain security, policy –“Grid-enabled” resources speak Grid protocols –Multiple implementations are possible Broad deployment of protocols facilitates creation of services that provide integrated view of distributed resources Tools use protocols and services to enable specific classes of applications

28 The Role of Grid Middleware and Tools [diagram: collaboration tools, data management tools, distributed simulation and other applications built over the network on middleware services such as remote monitoring, remote access, information services, data management and resource management] Credit to Ian Foster

29 Progress in Grid Systems [same diagram as slide 3, highlighting the 2nd generation: resource-oriented Grids built on Globus]

30 Solutions by Globus (GT-2) Creation of Virtual Organizations (VOs) Standard protocols are used to connect Globus sites Security issues are basically solved –Firewalls are allowed between Grid sites –PKI: CAs and X.509 certificates –SSL for authentication and message protection The client does not need an account on every Globus site: –proxies and delegation for secure single sign-on Still: –provides only metacomputing facilities (MPICH-G2) –not service-oriented either
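
The resulting single sign-on workflow can be sketched with the GT2 command-line tools; the site name below is hypothetical and the commands are shown without their output:

```text
# Create a short-lived proxy certificate from the user's long-term credentials;
# the pass phrase is typed once, then the proxy signs on to all Grid sites
grid-proxy-init
# Inspect the proxy (subject, remaining lifetime)
grid-proxy-info
# Run a command on a remote Globus site through GRAM, authenticated by the proxy
globus-job-run grid-site.example.org /bin/hostname
```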

31 The Globus-2 model [diagram: the resource provider publishes its configuration description (resource description) to MDS-2; the resource requestor locates resources through the MDS-2 API and requests execution through the GRAM API; the client program moves to the resource(s)] Security is a serious problem!

32 The Role of the Globus Toolkit A collection of solutions to problems that come up frequently when building collaborative distributed applications Heterogeneity –A focus, in particular, on overcoming heterogeneity for application developers Standards –We capitalize on and encourage use of existing standards (IETF, W3C, OASIS, GGF) –GT also includes reference implementations of new/proposed standards in these organizations

33 Without the Globus Toolkit [diagram: users work with client applications (web browser, data viewer tool, chat tool); application services (web portal, certificate authority, credential repository, registration service) organize VOs & enable access to other services; collective services aggregate &/or virtualize resources; the resources (compute servers, database services, data catalog, simulation tool, camera, telepresence monitor) implement standard access & management interfaces] Components written by: Application Developer 10, Off the Shelf 12, Globus Toolkit 0, Grid Community 0

34 A possibility with the Globus Toolkit [diagram: the same scenario built with Globus components: the web portal uses MyProxy and a CHEF chat teamlet; collective services are provided by Globus MCS/RLS and the Globus Index Service; resources are accessed through Globus GRAM and Globus DAI] Components written by: Application Developer 2, Off the Shelf 9, Globus Toolkit 8, Grid Community 3

35 Globus Toolkit version 2 (GT2) User applications & higher level services built on: Security: Authentication, Authorization (GSI); Data Mgmt: GridFTP; Execution Mgmt: Grid Resource Alloc. Mgmt (GRAM); Info Services: Monitoring & Discovery (MDS); Common Runtime: C Common Libraries

36 Globus Components [diagram: the client uses MDS client API calls to locate resources (Grid Index Info Server) and to query the current status of a resource (Grid Resource Info Server), then GRAM client API calls to request resource allocation and process creation; across the site boundary, the Gatekeeper, protected by the Grid Security Infrastructure, parses the RSL request and passes it to the Job Manager, which allocates & creates processes through the Local Resource Manager, monitors & controls them, and sends GRAM state-change callbacks to the client]
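
GRAM requests such as the one parsed by the Gatekeeper are written in the Resource Specification Language (RSL); a minimal sketch with illustrative values (count is the number of processes, maxWallTime is in minutes):

```text
& (executable = /bin/date)
  (arguments = -u)
  (count = 4)
  (maxWallTime = 10)
  (stdout = date.out)
```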

37 Example 1 for a GT2 Grid: TeraGrid [diagram: TeraGrid resources (IBM SP Blue Horizon, Sun E10K, Origin systems, IA-32/IA-64 clusters, HP X-Class and V-Class, HPSS and UniTree archives, HR display & VR facilities) connected by Myrinet across four sites] NCSA: Compute-Intensive ANL: Visualization Caltech: Data collection and analysis applications SDSC: Data-oriented computing Credit to Fran Berman

38 TeraGrid Common Infrastructure Environment Linux Operating Environment Basic and Core Globus Services –GSI (Grid Security Infrastructure) –GSI-enabled SSH and GSIFTP –GRAM (Grid Resource Allocation & Management) –GridFTP –Information Service –Distributed accounting –MPICH-G2 –Science Portals Advanced and Data Services –Replica Management Tools –GRAM-2 (GRAM extensions) –Condor-G (as brokering “super scheduler”) –SDSC SRB (Storage Resource Broker) Credit to Fran Berman

39 Example 2 for a GT2 Grid: LHC Grid and LCG-2 LHC Grid –A homogeneous Grid developed by CERN –Restrictive policies (global policies overrule local policies) –A Grid dedicated to the Large Hadron Collider experiments LCG-2 –A homogeneous Grid developed by CERN and the EDG and EGEE projects –Restrictive policies (global policies overrule local policies) –A non-dedicated Grid –Works 24 hours/day and has been used in EGEE and EGEE-related Grids (SEEGRID, BalticGrid, etc.)

40 Main Logical Machine Types (Services) in LCG-2 User Interface (UI) Information Service (IS) Computing Element (CE) –Frontend Node –Worker Nodes (WN) Storage Element (SE) Replica Catalog (RC,RLS) Resource Broker (RB)
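
A job is handed to the Resource Broker as a Job Description Language (JDL) file; the following is a minimal sketch with illustrative file names and requirement:

```text
Executable    = "/bin/hostname";
StdOutput     = "std.out";
StdError      = "std.err";
OutputSandbox = {"std.out", "std.err"};
Requirements  = other.GlueCEPolicyMaxCPUTime > 60;
```

The RB matches the Requirements expression against the Computing Element attributes published in the Information Service and forwards the job to a suitable CE.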

41 The LCG-2 Architecture [diagram, bottom-up: Grid Fabric (local computing) with fabric services: Configuration Management, Node Installation & Management, Monitoring and Fault Tolerance, Resource Management, Fabric Storage Management; Underlying Grid Services: Computing Element services, Storage Element services, Replica Catalog, Authorization, Authentication & Accounting, Database services, Information & Monitoring; Collective Services: Information & Monitoring, Replica Manager, Grid Scheduler; Grid Application Layer: Job Management, Data Management, Metadata Management, Logging & Bookkeeping; Local Application and Local Database on top]

42 3rd Generation Grids Service-oriented Grids OGSA (Open Grid Service Architecture) and WSRF (Web Services Resource Framework)

43 Progress in Grid Systems [same diagram as slide 3, highlighting the 3rd generation: service-oriented OGSA/WSRF Grid systems]

44 The Web Services model [diagram: the provider publishes its service description to a UDDI registry; the client discovers the service via UDDI and invokes it (SOAP)] Predefined programs (services) wait for invocation Much more secure than the GT-2 concept
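
The invocation travels as a SOAP message, typically over HTTP; a minimal sketch in which the operation name and its namespace are hypothetical:

```text
<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <!-- hypothetical operation exposed by the service's WSDL -->
    <getTemperature xmlns="http://example.org/weather">
      <city>Budapest</city>
    </getTemperature>
  </soap:Body>
</soap:Envelope>
```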

45 Grid and Web Services: Convergence [diagram: Grid (GT1, GT2, OGSI/GT3) and Web (HTTP, WSDL, WS-*, WSDL 2, WSDM) started far apart in apps & tech and have been converging] However, despite enthusiasm for OGSI, adoption within the Web community turned out to be problematic

46 Concerns about OGSI Too much in one specification Does not work well with existing Web services tooling Too object-oriented

47 Grid and Web Services: Convergence [diagram: with WSRF, the converging Grid (GT1, GT2, OGSI) and Web (HTTP, WSDL, WS-*, WSDL 2, WSDM) lines meet] The definition of WSRF means that the Grid and Web communities can move forward on a common base

48 Layered diagram of OGSA, GT4, WSRF, and Web Services

49 Relationship between OGSA, GT4, WSRF, and Web Services

50 Towards GT4 production Grids [diagram: the UK National Grid Service: a stable, highly available GT2 production Grid with core members Manchester, CCLRC RAL, Oxford and Leeds (data and compute clusters), national HPC services CSAR and HPCx, and partner sites Bristol, Cardiff and Lancaster; extended with a GT4 site and services by UoW (Univ. of Westminster)]

51 gLite Grid Middleware Services [diagram: Access (API, CLI); Security (Authentication, Authorization, Auditing); Information & Monitoring (Information & Monitoring, Application Monitoring); Workload Management (Computing Element, Workload Management, Job Provenance, Package Manager, Accounting, Site Proxy); Data Management (Storage Element, Data Movement, Metadata Catalog, File & Replica Catalog)]

52 Conclusions Fast evolution of Grid systems and middleware: –GT1, GT2, OGSA, OGSI, GT3, WSRF, GT4, … Current production scientific Grid systems are built on 1st and 2nd gen. Grid technologies Enterprise Grid systems are emerging based on the new OGSA and WSRF concepts