Presentation is loading. Please wait.

Presentation is loading. Please wait.

Grid Computing The 21 st Century Paradigm for Service- Oriented Utility Computing 2004

Similar presentations


Presentation on theme: "Grid Computing The 21 st Century Paradigm for Service- Oriented Utility Computing 2004"— Presentation transcript:

1 Grid Computing The 21 st Century Paradigm for Service- Oriented Utility Computing 2004 http://www.info.uvt.ro/~petcu/grid.ppt

2 Overview 1.What is Grid? 2.Grid Projects & Applications 3.Grid Technologies 4.Globus 5.CompGrid

3 What is Grid?

4

5

6 Definition A type of parallel and distributed system that enables the sharing, selection, & aggregation of geographically distributed resources: –Computers – PCs, workstations, clusters, supercomputers, laptops, notebooks, mobile devices, PDA, etc; –Software – e.g., ASPs renting expensive special purpose applications on demand; –Catalogued data and databases – e.g. transparent access to human genome database; –Special devices/instruments – e.g., radio telescope – SETI@Home searching for life in galaxy. –People/collaborators. depending on their availability, capability, cost, and user QoS requirements for solving large-scale problems/applications. thus enabling the creation of “virtual organization” (VOs)

7 (1)Distributed supercomputing support (2)High-throughput computing support (3)On-demand computing support (4)Data-intensive computing support (5)Collaborative computing support (6)Multimedia computing support....

8

9 Resources = assets, capabilities, and knowledge Capabilities (e.g. application codes, analysis tools) Compute Grids (PC cycles, commodity clusters, HPC) Data Grids Experimental Instruments Knowledge Services Virtual Organisations Utility Services

10 Grid‘s main idea To treat CPU cycles and software like commodities. Enable the coordinated use of geographically distributed resources – in the absence of central control and existing trust relationships. Computing power is produced much like utilities such as power and water are produced for consumers. Users will have access to “power” on demand “When the Network is as fast as the computer’s internal links, the machine disintegrates across the Net into a set of special purpose appliances” – Gilder Technology Report June 2000

11 Computational Grids and Electric Power Grids Power Grid analogy – Power producers: machines, software, networks, storage systems – Power consumers: user applications Applications draw power from the Grid the way appliances draw electricity from the power utility. – Seamless, High-performance, Ubiquitous, Dependable Why the Computational Grid is like the Electric Power Grid – Electric power is ubiquitous – Don’t need to know the source of the power (transformer, generator) or the power company that serves it Why the Computational Grid is different from the Electric Power Grid – Wider spectrum of performance – Wider spectrum of services – Access governed by more complicated issues: Security, Performance

12

13

14

15 Distributed Computing Concept has been around for two decades Basic idea: run scheduler across systems to runs processes on least- used systems first – Maximize utilization – Minimize turnaround time Have to load executables and input files to selected resource – Shared file system – File transfers upon resource selection

16 Examples of Distributed Computing Workstation farms etc – Generally share file system SETI@home project, Entropia, etc. – Only one source code; copies correct binary code and input data to each system Napster, Gnutella: file/data sharing NetSolve – Runs numerical kernel on any of multiple independent systems, much like a Grid solution

17

18

19

20

21

22 Internet Computing Projects Area of interestProjects ScienceSETI@Home, eOn, Cimateprediction.com, Distributed Particle Accelerator Design, Analytical Spectroscopy Research, Evolutionary Research Life ScienceFolding@Home, Genome@Home, FightAIDS@Home, Folderol, Distributed Folding, Find-a-Drug, Drig Design Optimization Lab, Community TSC CryptographyDistributed.net, ECCp-109 MathematicsPCP@Home, ZetaGrid, ECMNET, GRISK, Proth/Wilson/Weiferich/Mersenne Prime Search, Factoriztions of Cyclotomic Numbers

23

24

25

26 Why cluster and grid computing Clusters and grids increasingly interesting –more workstations –higher performance per workstation –faster interconnecting networks –price/performance competitive with MPP –enormous unused capacity –cyclic availability

27 Differences parallel/clusters/grids Clusters are inherently inhomogeneous –intrinsic differences in performance, memory, bandwidth –dynamically changing “background load” –ownership of nodes Grids add –differences in administration –disjoint file systems –security etc.

28 P2P, cluster, Internet computing vs. grid computing Peer-to-peer networks (eg Kazaa) fall within the definition of grid computing (the resource shared is the storage capacity of each node) P2P Working Group part of Global Grid Forum A cluster is a resource that can be shared- a grid is a cluster of clusters Internet computing: a VO is assembled for a particular project and disbanded once the project is complete -the shared resource is the Internet connected desktop

29

30

31

32

33

34

35

36

37 Why go Grid? Hot subject Try it, experience it to learn the potential Will enable true ubiquitous computing in future Today, proven in some areas: intraGrids But still long way to World Wide Grid State of art techniques, tools are difficult Short term goals? Use another technology Does your system have Grid characteristics? –Distributed users, large scale and heterogeneous resources, across domains

38 Why? Grids enable much more than apps running on multiple computers – virtual operating system: provides global workspace/address space via a single login – automatically manages files, data, accounts, and security issues – connects other resources (archival data facilities, instruments, devices) and people (collaborative environments) Inevitable (at least in HPC): – leverages computational power of all available systems – manages resources as a single system--easier for users – provides most flexible resource selection and management, load sharing – researchers’ desire to solve bigger problems will always outpace performance increases of single systems; just as multiple processors are needed, ‘multiple multiprocessors’ will be deemed so

39 Why? Resources have different functions, but multiple classes resources are necessary for most interesting problems. Power of any single resource is small compared to aggregations of resources Network connectivity is increasing rapidly in bandwidth and availability Large problems require teamwork and computation

40 What do users want ? Grid Consumers –Execute jobs for solving varying problem size and complexity –Benefit by selecting and aggregating resources wisely –Tradeoff timeframe and cost Grid Providers –Contribute (“idle”) resource for executing consumer jobs –Benefit by maximizing resource utilisation –Tradeoff local requirements & market opportunity

41

42

43

44 Grid projects & applications

45 Proiecte vechi 1996: Ian Foster, Steven Tuecke si Carl Kesselman de la ANL–SUA, infrastructurii de interconectare a celor mai importante centre de calcul de inalta performanta, proiectul I-WAY. 1998: comunitatii de utilizatori internationali si de standarde in grid, Global Grid Forum. RO: 2002 si 2003 cateva proiecte in domeniul grid prin programul InfoSoc EU-FP5: 2000/2001: EuroGrid, DataGrid si Damien (infrastructura,middleware: Geant; Unicore). EU-FP5: 2001/2002: middleware, aplicatii; GridLab – platforma de testare, CrossGrid – interoperabilitate griduri, simulari, EGSO – astro-fizica, GRIA – industrial, DataTag – platforma transatlantica, GRIP - interoperabilitate. EU-FP5: 2002/2003: aplicatii; Avo – astro-fizica, FlowGrid - simulare, OpenMolGrid – molecular, GRACE – cautare, COG – ontologii, MOSES – web semantic, BioGrid – bilogic, GEMSS – medical, SeLeNe – e-learning, MamoGrid – medical, EGEE – securitate, NOMAD – descoperire de servicii (aprox 20 proiecte) EU-FP6: 2003/2004, ‘Grid for complex problem solving’ proiecte nationale: UK - e-Science (80 proiecte) incluzand GridPP, Comb-e-Grid, AstroGrid, MyGrid, GEODISC, DAME, DiscoveryNet, RealityGrid, OGSA-DA; Franta, la INRIA, ruleaza o serie de proiecte: Algorille – management de resurse pe grid, Apache – planificare multicriteriala, Grand – desktop grid, Oasis – grid de increder, Paris – simulari numerice, Remap – servere in retea, Sardies – monitorizare, MPICH-V – MPI pentru grid. In alte tari: Japonia – Grid Data Farm, ITBL, Olanda – VLAM, DutchGrid, Italia – INFN Grid, Irelanda – EireGrid, Polonia – PIONIERGrid, Ungaria – DemoGrid, JiniGrid, Australian –SimGrid, Economy Grid., WWG USA: NASA Information Power Grid, DOE Science Grid, NSF National Virtual Observatory, NSF GriPhyN, DOE Particle Physics Data Grid, NSF DTF TeraGrid, DOE ASCI DISCOM Grid, DOE Earth Systems Grid, DOE FusionGrid, NEESGrid, NIH BIRN, NSF iVDGL. IBM a realizat un Grid Toolkit bazat pe Globus, iar Sun, un One-Grid-Engine.

46 Maturation of Grid Computing Research focus moving from building of basic infrastructure and application demonstrations to – Middleware – Usable production environments – Application performance – Scalability -> Globalization Development, research, and integration happening outside of the original infrastructure groups Grids becoming a first-class tool for scientific communities – GriPhyN (Physics), BIRN (Neuroscience), NVO (Astronomy), Widespread interest from government in developing computational Grid platforms; in US NSF’s Cyberinfrastructure NASA’s Information Power Grid DOE’s Science Grid

47 Grid Applications Distributed HPC (Supercomputing): –Computational science. High-Capacity/Throughput Computing: –Large scale simulation/chip design & parameter studies. Content Sharing (free or paid) –Sharing digital contents among peers (e.g., Napster) Remote software access/renting services: –Application service provides (ASPs) & Web services. Data-intensive computing: –Drug Design, Particle Physics, Stock Prediction... On-demand, real-time computing: –Medical instrumentation & Mission Critical. Collaborative Computing: –Collaborative design, Data exploration, education. Service Oriented Computing (SOC): –Towards economic-based Utility Computing: New paradigm, new applications, new industries, and new business.

48 Grid Projects Australia –Nimrod-G –Gridbus –GridSim –Virtual Lab –DISCWorld –GrangeNet –..new coming up Europe –UNICORE –Cactus –UK eScience –EU Data Grid –EuroGrid –MetaMPI –XtremeWeb –and many more. India –I-Grid  Japan –Ninf –DataFarm Korea... N*Grid USA –Globus –Legion –OGSA –Sun Grid Engine –AppLeS –NASA IPG –Condor-G –Jxta –NetSolve –AccessGrid –and many more... Cycle Stealing &.com Initiatives –Distributed.net –SETI@Home, …. –Entropia, UD, Parabon,…. Public Forums –Global Grid Forum –Australian Grid Forum –IEEE TFCC –CCGrid conference –P2P conference

49 NASA’s IPG Vision for the Information Power Grid is to promote a revolution in how NASA addresses large-scale science and engineering problems by providing persistent infrastructure for – “highly capable” computing and data management services that, on-demand, will locate and coschedule the multi- Center resources needed to address large-scale and/or widely distributed problems – the ancillary services that are needed to support the workflow management frameworks that coordinate the processes of distributed science and engineering problems

50

51

52

53

54 US Grid Projects NASA Information Power Grid DOE Science Grid NSF National Virtual Observatory NSF GriPhyN DOE Particle Physics Data Grid NSF DTF TeraGrid DOE ASCI DISCOM Grid DOE Earth Systems Grid DOE FusionGrid NEESGrid NIH BIRN NSF iVDGL

55 EU GridProjects DataGrid (CERN,..) EuroGrid (Unicore) DataTag (TTT…) Astrophysical Virtual Observatory GRIP (Globus/Unicore) GRIA (Industrial applications) GridLab (Cactus Toolkit) CrossGrid (Infrastructure Components) EGSO (Solar Physics)

56 National Grid Projects UK e-Science Grid Japan – Grid Data Farm, ITBL Netherlands – VLAM, DutchGrid Germany – UNICORE, Grid proposal France – Grid funding approved Italy – INFN Grid Eire – Grid-Ireland Poland – PIONIER Grid Switzerland - Grid proposal Hungary – DemoGrid, Grid proposal ApGrid – AsiaPacific Grid proposal

57 UK e-Science Initiative £75M is for Grid Applications in all areas of science and engineering £10M for Supercomputer upgrade £35M ‘Core Program’ to encourage development of generic ‘industrial strength’ Grid middleware ‘Grid Starter Kit’ Version 1.0 available for distribution from July 2001 Particle Physics and Astronomy (PPARC) –GridPP –– links to EU DataGrid, CERN LHC Computing Project, US GriPhyN and PPDataGrid Projects, and iVDGL Global Grid Project –AstroGrid –– links to EU AVO and US NVO projects Engineering and Physical Sciences (EPSRC) Comb-e-Chem:Structure-Property Mapping DAME: Distributed Aircraft Maintenance Environment Reality Grid: A Tool for InvestigatingCondensed Matter and Materials My Grid: Personalised Extensible Environments for Data Intensive in silicoExperiments in Biology GEODISE: Grid Enabled Optimisation and Design Search for Engineering Discovery Net: High Throughput Sensing Applications Biology, Medical and Environmental Science: Dynamic Brain Atlas, Biodiversity, Chemical Structures, Mouse Genes, Robotic Astronomy. Collaborative Visualisation, Climateprediction.com, Medical Imaging/VR

58 Globus Press Release 12th November 2001 12 Companies adopt Globus Toolkit as Standard Grid Technology Platform 5 new US companies - Compaq, Cray, SGI, Sun, Veridian 3 new Japanese vendors - Fujitsu, Hitachi, NEC 3 US companies increasing commitment - IBM, Microsoft, Entropia Platform will provide commercial version

59

60

61

62 Grid Technologies

63

64

65

66 Grid Requirements Identity & authentication Authorization & policy Resource discovery Resource characterization Resource allocation (Co-)reservation, workflow Distributed algorithms Remote data access High-speed data transfer Performance guarantees Monitoring Adaptation Intrusion detection Resource management Accounting & payment Fault management System evolution Etc.

67 Some Grid Requirements – User Perspective Single allocation: if any at all Single sign-on: authentication to any Grid resources authenticates for all others Single compute space: one scheduler for all Grid resources Single data space: can address files and data from any Grid resources Single development environment: Grid tools and libraries that work on all grid resources

68 The Security Problem Resources being used may be extremely valuable & the problems being solved extremely sensitive Resources are often located in distinct administrative domains – Each resource may have own policies & procedures The set of resources used by a single computation may be large, dynamic, and/or unpredictable – Not just client/server It must be broadly available & applicable – Standard, well-tested, well-understood protocols – Integration with wide variety of tools

69 The Resource Management Problem Enabling secure, controlled remote access to computational resources and management of remote computation – Authentication and authorization – Resource discovery & characterization – Reservation and allocation – Computation monitoring and control

70

71 Some Grid Usage Models Distributed computing: job scheduling on Grid resources with secure, automated data transfer Workflow: synchronized scheduling and automated data transfer from one system to next in pipeline (e.g. compute- viz, storage) Coupled codes, with pieces running on different systems simultaneously Meta- applications: parallel apps spanning multiple systems Some models are similar to models already being used, but are much simpler due to: – single sign-on – automatic process scheduling – automated data transfers But Grids can encompass new resources likes sensors and instruments, so new usage models will arise

72 Grid-based Computation: Challenges Locate “suitable” computers Authenticate with appropriate sites Allocate resources on those computers Initiate computation on those computers Configure those computations Select “appropriate” communication methods Compute with “suitable” algorithms Access data files, return output Respond “appropriately” to resource changes

73 Leading Grid Middleware Developments Globus Toolkit (mainly developed at ANL and USC) Service-oriented toolkit from the Globus project,to be used in Grid applications, not targeted at end- user Services for resource selection and allocation, authentication, file system access and file transfer, … Largest user-base in projects worldwide Open-source software, commercial support by IBM and Platform Computing

74 Leading Grid Middleware Developments UNICORE (European development) Originally developed as a standard gateway for job submission to supercomputers at HPC centers Comfortable GUI for job definition and monitoring (abstracted from system peculiarities) Extended to a Grid job environment, including similar services as provided by the Globus Toolkit Hierarchical job structure, dependencies between tasks at different sites, automatic file transfer GRIP project: Integration of Globus into UNICORE, (task submission into Globus Grid) Open-source software, commercial support by Pallas

75 Leading Grid Middleware Developments LEGION (Univ. of Virginia) Combines distributed resources into a single virtual computer, „OS for a distributed machine“ Legion shell providing services such as naming, file system, security, process generation, inter-process comm., I/O, resource management Open-source software, commercial support by Avaki

76 Examples of Grid Programming Technologies MPICH-G2: Grid-enabled message passing CoG Kits, GridPort: Portal construction GDMP, Data Grid Tools, SRB: replica management, collection management Condor-G: simple workflow management Legion: object models for Grid computing NetSolve: Network enabled solver Cactus: Grid-aware numerical solver framework, application focus

77 Standardization Activities  Web services (under development at W3C), in particular:  Simple Object Access Protocol (SOAP)  Web Service Description Language (WSDL)  Web Service Inspection Language (WSIL)  Web Service Flow Language (WSFL, for workflows)  Open Grid Service Architecture (OGSA, on-going at GGF)  Merger of toolkit model (Globus) with service-oriented approach of Web services  Globus 3.0: partial implementation of OGSA  Semantic Web (W3C)  Resource Description Framework (RDF) for metadata interoperability, based on XML

78 OGSI and OGSA OGSI = Open Grid Service Infrastructure –Specs from GGF OGSI working group –Defines what makes a Grid service –Based on Web service –Naming, life cycle, state, notification –portTypes definitions, WSDL 1.2 draft OGSA = Open Grid Service Architecture –Specs from GGF OGSA working group –Defines a list of fundamental Grid services, and how they cooperate –Work in progress

79 OGSA services Open Grid Service Architecture, being defined by GGF OGSA working group In ubiquitous Grid platform, there is common need for some essential set of interfaces, behaviors, resource models, and bindings OGSA defines the core set of services essential for grid, their functionality and interrelationships Work in progress. Last draft Oct 3, 2003 Core services: service interaction, management, communication, security Non-core: data, program execution, resource management

80 What is a Grid service Defined by OGSI (GGF working group) Is a Web service with extensions, which are: –Name (handle GSH, reference GSR) –Lifetime management (factories, persistent and transient services) –State (Service Data) –Notification as well as querying WSDL 1.2 draft (gwsdl: namespace) Definitions of portTypes

81 Globus Toolkit

82 Globus Grid Services The Globus toolkit provides a range of basic Grid services - Security, information, fault detection, communication, resource management,... These services are simple and orthogonal - Can be used independently, mix and match Programming model independent - For each there are well-defined APIs - Standards are used extensively E.g., LDAP, GSS-API, X.509,... You don‘t program in Globus, it‘s a set of tools like Unix

83

84 The Globus Alliance Globus Project ™, since 1996 –Ian Foster (Argonne National Lab), –Carl Kesselman (University of Southern California’s Information Science Institute) Develop protocols, middleware and tools for Grid computing Globus Alliance, since Sept 2003 International scope –University of Edinburgh’s EPCC –Swedish Center for Parallel Computers (PDC) –Advisory council of Academic Affiliates from Asia- Pacific, Europe, US

85 Globus Toolkit GT2 (2.4 released in 2002): reference implementation of Grid fabric protocols –GRAM for job submissions –MDS for resource discovery –GridFTP for data transfer –GSI security GT3 (3.0 released July 2003): redesign –OGSI based –Grid services, built on SOAP and XML GT3.2 released March 31, 2004

86 Globus Toolkit Services Job submission and management (GRAM) –Uniform Job Submission Security (GSI) –PKI-based Security (Authentication) Service Information services (MDS) –LDAP-based Information Service Remote file management (GASS) and transfer (GridFTP) –Remote Storage Access Service Remote Data Catalogue and Management Tools –Support by Globus 2.0 released in 2002 –Resource selection and allocation (GIIS, GRIS)

87

88

89

90 Resource Specification Language Common notation for exchange of information between components –Syntax similar to MDS/LDAP filters RSL provides two types of information: –Resource requirements: Machine type, number of nodes, memory, etc. –Job configuration: Directory, executable, args, environment API provided for manipulating RSL

91 Job Submission Interfaces Globus Toolkit includes several command line programs for job submission –globus-job-run: Interactive jobs –globus-job-submit: Batch/offline jobs –globusrun: Flexible scripting infrastructure Advanced Grid Job Management Systems –General purpose Nimrod-G, Condor-G, etc –Application specific Active Sheet Cactus, Web portals

92

93 Meaning of GT3 in the community Most commonly referred project Traditionally “de facto standard” Acknowledged leadership by academia and industry (IBM,…) BSD style license allows for commercial usage However, it is only a reference implementation. Now standards = GGF GT undergoes constant changes With business entering grid, commercial implementations may soon catch up

94 Alternatives to GT3 Protocol level interoperability: –“Grid compliant” >= “implements OGSI” Other OGSI implementations –OGSI.NET (U.Virginia) –pyGlobus (LBNL) –.NET (U.Edinburgh) –PERL (U. Manchester) –UNICORE (Fujitsu) Commercial OGSI compliant products by: –Avaki, Platform, Data Synapse, … Web service alternative: Grid App Framework http://www.neresc.ac.uk/ws-gaf

95 Useful References Global Grid Forum: working meeting HPDC: major academic conference Other meetings include IPDPS, CCGrid, EuroGlobus, Globus Retreats Book (Morgan Kaufman): www.mkp.com/grids Perspective on Grids: “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”, IJSA, 2001, www.globus.org/research/papers/anatomy.pdf URLs, especially: www.gridcomputing.com, www.gridbus.org, www.gridforum.org, www.grids-center.org, www.globus.org


Download ppt "Grid Computing The 21 st Century Paradigm for Service- Oriented Utility Computing 2004"

Similar presentations


Ads by Google