Presentation is loading. Please wait.

Presentation is loading. Please wait.

Nguyen Tuan Anh Page 1 Grid Computing – Master Course 2008 Grid Computing Dr. Nguyen Tuan Anh Contacts: Tel: (84-8)

Similar presentations


Presentation on theme: "Nguyen Tuan Anh Page 1 Grid Computing – Master Course 2008 Grid Computing Dr. Nguyen Tuan Anh Contacts: Tel: (84-8)"— Presentation transcript:

1 Nguyen Tuan Anh Page 1 Grid Computing – Master Course 2008 Grid Computing Dr. Nguyen Tuan Anh Contacts: Tel: (84-8) , ext Faculty of Computer Science and Engineering Ho Chi Minh city University of Technology

2 Nguyen Tuan Anh Page 2 Grid Computing – Master Course 2008 Course objectives  Understand “Grid” concepts  Roles of “Grid” in our modern society  Grid architectures  Understand Grid related technologies Security Web services Grid services and standards  Ways to efficiently exploit the Grid power Programming models  “Know-how” Programming the Grid

3 Nguyen Tuan Anh Page 3 Grid Computing – Master Course 2008 Evaluation  Self study and self research Active Efficient  Practical exercises and seminars: 50% Exercises Seminar: each group will study a Grid-related topics and present this in the class  Final exam: 50%

4 Nguyen Tuan Anh Page 4 Grid Computing – Master Course 2008 References  Lecture notes  Books I. Foster and C. Kesselman, The Grid: Blueprint for a New Computing Infrastructure. Morgab Kaufmann Publishers, Maozhen Li, Mark Baker, The Grid Core Technologies, Wiley, Fran Berman, Anthony J. G. Hey and Geoffrey C. Fox, Grid computing: Making the Global Infrastructure a Reality. John Wiley & Sons Ltd, Luis Ferrelra et al. Grid Services Programming and Application Enablement. IBM Redbooks, Mark Endrei et al. Patterns: Service Oriented Architecture and Web Services. IBM Redbooks,  Web links: Globus project: Global Grid Forum:

5 Nguyen Tuan Anh Page 5 Grid Computing – Master Course 2008 Grid introduction

6 Nguyen Tuan Anh Page 6 Grid Computing – Master Course 2008 CERN CERN (founded 1954) = “Conseil Européen pour la Recherche Nucléaire” “European Organisation for Nuclear Reseach” 27 km circumference tunnel L arge H adron C ollider Concorde (15 Km) Balloon (30 Km) CD stack with 1 year LHC data! (~ 20 Km) Mt. Blanc (4.8 Km) 50 CD-ROM = 35 GB 6 cm

7 Nguyen Tuan Anh Page 7 Grid Computing – Master Course 2008 Un little bit of history …  I.Foster and C.Kesselman come from the parallel computing field Parallelism : The pioneers –Vectorial computers (Cray 1, Cray 2,..) : The exuberance Massively Parallel Computers (CrayT3D/T3E, Connection Machine,…) : The profit Cluster (Beowulf, ASCII Red,..) >? The Globalization GRID Global Computing

8 Nguyen Tuan Anh Page 8 Grid Computing – Master Course 2008 Overview  Grid concepts  Challenges and issues  Grid middleware  Grid applications  Grid projects  Future directions

9 Nguyen Tuan Anh Page 9 Grid Computing – Master Course 2008 Part I: concepts

10 Nguyen Tuan Anh Page 10 Grid Computing – Master Course 2008 What is “Grid”?  1001 definitions HPC researchers Connect supercomputing sites Distributed supercomputing Web engineers Stateful web services for accessing resources Next generation of the Internet and Web Companies Services accessible by other users Portal to access local resources Users Like electrical grid …

11 Nguyen Tuan Anh Page 11 Grid Computing – Master Course 2008 Grid environment 11

12 Nguyen Tuan Anh Page 12 Grid Computing – Master Course 2008 « Grid » domains  Originally To link together supercomputing sites  Today Collaborative engineering Data exploration High-throughput computing Distributed high performance computing

13 Nguyen Tuan Anh Page 13 Grid Computing – Master Course 2008 A little bit of vocabulary  Usually one distinguishes three types of GRID Knowledge GRID To share knowledge Idea: Virtual distributed laboratory Data GRID To share the data Idea: Large scale data storage Computing GRID To share computing power Idea: Alternative to supercomputers

14 Nguyen Tuan Anh Page 14 Grid Computing – Master Course 2008 The GRID today -1  Knowledge GRID Virtual distributed Institutes/Laboratories allowing remote people to collaborate as they would be at the same location Purpose: to share knowledge, to collaborate for researches or specific activities Few specific tries Remote surgery ….

15 Nguyen Tuan Anh Page 15 Grid Computing – Master Course 2008 The GRID today - 2  Data GRID Also referenced as « peer-to-peer » technology Versus to the « Client/Server » technology who dominated the market of distributed systems during last decades, with this approach every participant is both client and server Nobody knows exactly where the data are The system figures to find the nearest requested data out The requested data could be distributed on several location and same data could be located at several places (data coherence problem) The pursuit of the data is transparent to the user Some examples: Napster, Gnutella, pirate software, Project DATAGRID (CERN),…

16 Nguyen Tuan Anh Page 16 Grid Computing – Master Course 2008 The GRID today - 3  Computing GRID A supercomputer on you the table of your kitchen …  Some facts Supercomputers are very expensive and age very rapidly Except for some very specific persons, we do not need a supercomputer every day… … but when we need one it is very frustrating not to have one ! The total time personal and other computer waste time not doing anything is huge I am (almost) permanently connected to a lot of computers. Potentially I am permanently connected to a supercomputer First try to exploit this fact: Projet

17 Nguyen Tuan Anh Page 17 Grid Computing – Master Course 2008 GRID computing : different views  Virtual Supercomputing An association of several supercomputers geographically distributed ( ) Each node is a parallel machine managed through a scheduling system (batch) To provide a virtual supercomputer  Internet computing A combination of a large number of PCs ( ) To exploit unused computing  Metacomputing An association of several servers providing a set of applications

18 Nguyen Tuan Anh Page 18 Grid Computing – Master Course 2008 “Grid” vs. cluster computing  “Partial view” vs. “Global view” of the environment  Large scale, geographically distributed  Non-structured  Heterogeneous  Dynamic, unstable  Multiple administrative domains

19 Nguyen Tuan Anh Page 19 Grid Computing – Master Course 2008 Supercomputer PC Supercomputer My laptop Workstation SMP Cluster Computing Grid: our goal ?

20 Nguyen Tuan Anh Page 20 Grid Computing – Master Course 2008 The dream  To make computers user friendly: To switch from a program centric view… « Execute this program on this (these) machine(s)» to a service centric view… « Provide me this service with this QoS level » Today the WEB provides access to a large amount of documents and data BUT…. Access is not transparent…. URL To find the correct information is a tedious work… search engines The GRID will change our way to use computer as many drastically that WEB did more than 10 years ago

21 Nguyen Tuan Anh Page 21 Grid Computing – Master Course 2008 New characteristics  Large-scale Need for dynamic selection Partial view of the environment  Heterogeneity Hardware, OS, network, software environments (languages, libraries, tools...)  Complex unpredictable structure  Dynamic unpredictable behavior  Multiple administrative domain no centralized control

22 Nguyen Tuan Anh Page 22 Grid Computing – Master Course 2008 Challenges  Efficient exploitation of Grid performance New programming models Service-centric vision  Resource connectivity Middleware services Applications  Interoperability vs. performance Good interoperability -> (very) slow Fast -> not so interoperable  Grid scalability Resource management mechanisms/models Large number of heterogeneous resources  Grid infrastructure and Grid application evaluations How to benchmark Grid systems and applications  Trust and security How to trust the environment How to protect sensitive data

23 Nguyen Tuan Anh Page 23 Grid Computing – Master Course 2008 Part II: Grid Middleware & components

24 Nguyen Tuan Anh Page 24 Grid Computing – Master Course 2008 What is Grid middleware  System software between applications and OS Provide services to applications Discovery Execution Storage Data movements Information Service integration... Failure detection and recovery Resource monitoring... Hide all complexities of the Grid environment

25 Nguyen Tuan Anh Page 25 Grid Computing – Master Course 2008 Grid layered architecture: principle

26 Nguyen Tuan Anh Page 26 Grid Computing – Master Course 2008 Grid architecture  Fabric layer Physical resources Computers/ computing platforms Data storages Sensors Network Layered on top of local OS  Resource & connectivity protocols layer Provide access to resources and services in the fabric layer Security management Authentication Authorization Secure communication

27 Nguyen Tuan Anh Page 27 Grid Computing – Master Course 2008 Grid architecture  Collective service layers Facilitate the use of resources/services Provide meta-services Resource discovery Resource brokers Failure detection...  User application layers Grid applications Parallel applications Service-oriented applications...

28 Nguyen Tuan Anh Page 28 Grid Computing – Master Course 2008 Community Grid Architecture Model

29 Nguyen Tuan Anh Page 29 Grid Computing – Master Course 2008 Service-centric view: Grid environment

30 Nguyen Tuan Anh Page 30 Grid Computing – Master Course 2008 Virtual organization (VO)  Grid environment can be organized as VOs  VO: groups of services/resources Similar characteristics Location/site Functions A VO can a part of a bigger VO

31 Nguyen Tuan Anh Page 31 Grid Computing – Master Course 2008 Grid evolution 1996 GT1.0 Proof of concept 1999 GUSTO testbed 2002 GT3.0 OGSA 2005 GT4.x WSRF GGF Standardization -Basic services? -Security issues - Grid architecture - Grid interoperability Globus Toolkit !!!!

32 Nguyen Tuan Anh Page 32 Grid Computing – Master Course 2008 Globus layered architecture Applications Core Services Metacomputing Directory Service GRAM Globus Security Interface Heartbeat Monitor Nexus Gloperf Local Services LSF CondorMPI NQEEasy TCP SolarisIrixAIX UDP High-level Services and Tools POP-C++ globusrunMPINimrod/GProActiveCC++ GlobusViewTestbed Status GASS

33 Nguyen Tuan Anh Page 33 Grid Computing – Master Course 2008 Globus Toolkit 4x  Sustainable changes on the services interoperability and infrastructure Open Grid Services Architecture (OGSA) Stateful Web Services Enable the integration of user specific Grid services Define standard interfaces How to access Grid services  Disadvantage Slow

34 Nguyen Tuan Anh Page 34 Grid Computing – Master Course 2008 GT: Core service architecture

35 Nguyen Tuan Anh Page 35 Grid Computing – Master Course 2008 Globus Toolkit  Grid Service Specification How to write, publish and use a Grid service  GT components: GT core Meta-services used to implement other services and service behaviors (e.g. service creation, destruction) GT base services Use the GT core to implement Grid capacities: resource management, information services, data transfer, etc. Other Grid services Implemented by user to enable some enhancement capacities

36 Nguyen Tuan Anh Page 36 Grid Computing – Master Course 2008 GT: from other perspectives GT container GT security Local-level services  WS GRAM Fork PBS LSF  GridFTP  RFTP … VO-level services  Information service  …??? create

37 Nguyen Tuan Anh Page 37 Grid Computing – Master Course 2008 What GT DOES NOT address  GT focus on accessing local resources  Things still missing Coordination services Resource/service discovery Information collection Resource connectivity Programming models/tools  Things to be improved Performance!

38 Nguyen Tuan Anh Page 38 Grid Computing – Master Course 2008 Part III: Grid security

39 Nguyen Tuan Anh Page 39 Grid Computing – Master Course 2008 Grid security: a scenario

40 Nguyen Tuan Anh Page 40 Grid Computing – Master Course 2008 Symmetric key

41 Nguyen Tuan Anh Page 41 Grid Computing – Master Course 2008 Asymmetric cryptography Unique Mathematical relation

42 Nguyen Tuan Anh Page 42 Grid Computing – Master Course 2008 How to guarantee the information integrity?

43 Nguyen Tuan Anh Page 43 Grid Computing – Master Course 2008 Digital Signatures  A message signed by the private-key

44 Nguyen Tuan Anh Page 44 Grid Computing – Master Course 2008 How to identify users? (User authentication)

45 Nguyen Tuan Anh Page 45 Grid Computing – Master Course 2008 Authentication models Direct-trusted modelThird-party trusted model SenderReceiver 1: Register sender’s public key 2: store trusted public key 3: message M 4: sign(M) with sender’s private key Key repository 5: public key existed? SenderReceiver Trusted authority Digital certificate

46 Nguyen Tuan Anh Page 46 Grid Computing – Master Course 2008 Digital certificate  Based on asymmetric cryptography  Consist of Public key User identity information (user name, organization, address, …) One or more digital signature signed by some well known Certificate Authority (CA)

47 Nguyen Tuan Anh Page 47 Grid Computing – Master Course 2008 Certificate authority  Responsible for Positively identify entities requesting certificates Issuing, removing, and archiving certificates Protecting the Certificate Authority server Maintaining a namespace of unique names for certificate owners Serve signed certificates to those needing to authenticate entities Logging activity

48 Nguyen Tuan Anh Page 48 Grid Computing – Master Course 2008 Access the Grid

49 Nguyen Tuan Anh Page 49 Grid Computing – Master Course 2008 Grid authentication

50 Nguyen Tuan Anh Page 50 Grid Computing – Master Course 2008 Grid Delegation

51 Nguyen Tuan Anh Page 51 Grid Computing – Master Course 2008 Delegation

52 Nguyen Tuan Anh Page 52 Grid Computing – Master Course 2008 Grid authentication and delegation Source: Ian Foster et al: A security architecture for computational Grids

53 Nguyen Tuan Anh Page 53 Grid Computing – Master Course 2008 Part IV: Grid applications

54 Nguyen Tuan Anh Page 54 Grid Computing – Master Course 2008 Computing GRID: the issue ….  Supercomputer, cluster,… How to extract the 99,999999% of the computing power of my limited powered expensive environment  GRID environment How to extract the very power I need from the theoretically infinite powered cheap environment  Consequence Speedup/efficiency curves are not any more relevant information..

55 Nguyen Tuan Anh Page 55 Grid Computing – Master Course 2008 Grid vs. Cluster computing from application view  Cluster Have applications, build a cluster for those applications High efficiency but expensive  Grid infrastructure Have existing platforms, find applications that can efficiently run on those platforms Cheap but not well tailored to every application

56 Nguyen Tuan Anh Page 56 Grid Computing – Master Course 2008 Types of Grid applications  Type 1: Traditional HPC applications running within a site (VO) Using traditional models (MPI, PVM,…) Ready-to-run, no need to modify/re-compile Role of the Grid middleware –Resource discovery –Deploy and run the application remotely, securely on the discovered resource

57 Nguyen Tuan Anh Page 57 Grid Computing – Master Course 2008 Types of Grid applications  Type 1: Traditional HPC applications running within a site (VO) Using traditional models (MPI, PVM,…) Ready-to-run, no need to modify/re-compile Role of the Grid middleware –Resource discovery –Deploy and run the application remotely, securely on the discovered resource

58 Nguyen Tuan Anh Page 58 Grid Computing – Master Course 2008 Types of Grid applications  Type 2: New HPC applications running across multiple sites (VOs) Require new programming models/tools –Multiple level parallelism –Embracing parallelism Example: bio-informatics, parameter sweeping Huge speedup can be achieved Very few applications Role of the Grid middleware Resource discovery Resource allocation and co-allocation Application supporting services Dynamic deployments and executions of application components

59 Nguyen Tuan Anh Page 59 Grid Computing – Master Course 2008 Issues  Missing high-level services QoS of resources  Heterogeneity  Code portability Binary/Byte code or source code?  Resource connectivity Firewall/NAT/ Virtual IP  Fault tolerance Resource volatility  Data protection Protect sensitive data from stealing

60 Nguyen Tuan Anh Page 60 Grid Computing – Master Course 2008 Part V: Grid in the world

61 Nguyen Tuan Anh Page 61 Grid Computing – Master Course 2008 United Kingdom  E-Science, 215M€ over 5 years e-Science will refer to the large scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet. 1 - National Network of Grid Centers 2 - Development of Generic Grid Middleware 3 - Support for e-Science Projects. e-Science support centre Grid Network Team Grid Engineering Task Force Newcastle Edinburgh Oxford Glasgow Manchester Cardiff Southampton London Belfast

62 Nguyen Tuan Anh Page 62 Grid Computing – Master Course 2008 Germany  D-GRID, 100M€ over 5 years ( ) More than 100 academia and industry partners  Phase 1 -> 2008: Construction of a GRID infrastructure, development of the middleware and a pilot to run GRID applications  Phase 2: the GRID infrastructure will be used to run e-science (enhanced science) projects In close collaboration with IBM 10 Gbit/s 2,4 Gbit/s 622 Mbit/s

63 Nguyen Tuan Anh Page 63 Grid Computing – Master Course 2008 Nederland  Virtual Laboratory for e-science (VL-e), 55 M€ over 5 years 21 partners in 19 institutions  The mission of the VL-E project is: To boost e-Science by the creation of an e-Science environment and doing research on methodologies.  The strategy will be: To carry out concerted research along the complete e-Science technology chain, ranging from applications to networking, focused on new methodologies and reusable components.  The essential components of the total e-Science technology chain are: e-Science development areas, a Virtual Laboratory development area, a Large Scale Distributed computing development area, consisting of high performance networking and grid parts.

64 Nguyen Tuan Anh Page 64 Grid Computing – Master Course 2008 France  ACI-GRID, ~8 M€ Based on « call for proposals » Use RENATER network  GRID 5000 Building a nation wide experimental platform for Grid researches (like a particle accelerator for the computer scientists) 10/11 geographically distributed sites, every site hosts a cluster (from 256 CPUs to 1K CPUs) All sites are connected by RENATER (French Academ. Network) RENATER hosts probes to trace network condition load Design and develop a system/middleware environment for safely test and repeat experiments Only experimental platform (no production)

65 Nguyen Tuan Anh Page 65 Grid Computing – Master Course 2008 Europe - CERN  DATAGRID 10M€, ended beginning partners Feasibility project, final test bed 1000 computers, 15 Terabytes on 25 sites  Followed by…  EGEE, 4 years, 40 M€ for the first two years 70 partners in 27 countries To provide the necessary storage and computing infrastructure to LHC (and others..)

66 Nguyen Tuan Anh Page 66 Grid Computing – Master Course 2008 …and at HCMUT….

67 Nguyen Tuan Anh Page 67 Grid Computing – Master Course 2008 VN-Grid: toward a national-scale computing Grid  Main focus: infrastructure High-level services Resource discovery and reservation Scheduling VO and policy management OGSA and WSRF compliance Programming support MPI POP-C++  We do not develop from scratch! Using GT for providing base services

68 Nguyen Tuan Anh Page 68 Grid Computing – Master Course 2008 A VN-Grid scenario Site Resource Discovery Resource Allocation/Reservation Scheduling Learning Data warehouse Monitoring

69 Nguyen Tuan Anh Page 69 Grid Computing – Master Course 2008 Our first prototype-to-be-built  Keep in mind the heterogeneity the dynamics “Virtual site” concept (VSite) Combine the flexibility of P2P technologies (partial view assumption) with the efficiency of centralized management on each VSite Flexible to involve more resources Flexible security management Multiple level authentication and authorization (VO, user,…)  Programming supports Parallel object model (POP-C++) MPI  Applications Oil exploitation (geo-physic data computation of oil fields) Supraconductor study Aviation Chip Design

70 Nguyen Tuan Anh Page 70 Grid Computing – Master Course 2008 VN-Grid testbed

71 Nguyen Tuan Anh Page 71 Grid Computing – Master Course 2008 Other projects  Factors that could lead to the success of VN-Grid SuperNode POP-C++ EDAGrid

72 Nguyen Tuan Anh Page 72 Grid Computing – Master Course 2008 Supernode project  Building HPC system with programming supports Supernode 1 ( ) Supernode 2 ( ) 4 Nodes Configurable network topology MPI interface 64 Nodes (dual Xeon) Fast network (32 links, 1Gb/s each) 330 GFlops effective performance MPI interface

73 Nguyen Tuan Anh Page 73 Grid Computing – Master Course 2008 User Management and Billing User Proxy Resource Management Job Execution Job Management MonitoringScheduling Supernode II Architechture Transparency!

74 Nguyen Tuan Anh Page 74 Grid Computing – Master Course 2008 Supernode II: tools

75 Nguyen Tuan Anh Page 75 Grid Computing – Master Course 2008 Object oriented application POP-C++: Parallel Object Programming Grid environment Object Obj Object Heterogeneous Large scale Unstructured Dynamic and unknown topology Distributed objects Heterogeneous execute Object Obj Object

76 Nguyen Tuan Anh Page 76 Grid Computing – Master Course 2008 Parallel object by extending C++ parclass Foo { od.power(100);}; async conc void MyMethod(); }; parclass Foo { od.power(100);}; async conc void MyMethod(); }; class Foo { Foo(); void MyMethod(); }; class Foo { Foo(); void MyMethod(); }; Foo::Foo() {... } void Foo::MyMethod(){...} Foo::Foo() {... } void Foo::MyMethod(){...} POP-C++ C++ Shared implementation Grid Programming Grid infrastructure Transparent, requirement driven object allocation Various method invocation semantics Architecture

77 Nguyen Tuan Anh Page 77 Grid Computing – Master Course 2008 EDAGRID  Computing Grid for microchip design Synthesis and optimization Smaller Faster More power efficient Self verification

78 Nguyen Tuan Anh Page 78 Grid Computing – Master Course 2008 Behavioral Specification RTL Description TranslationLogic Optimization Technology Mapping Optimized Logic Description Unoptimized Logic Description Microchip design process Product Requirement Physical Design (back-end) Library Constraints

79 Nguyen Tuan Anh Page 79 Grid Computing – Master Course 2008 Future directions  OGSA/WSRF based services  Target Grid environment Data Grid or service Grid or computing Grid?  Global services Grid topology Resource discovery Distributed information system Resource connectivity Fault tolerance  Local services Acquire resource information Security management  Grid enabled Applications Programming models Supporting tools

80 Nguyen Tuan Anh Page 80 Grid Computing – Master Course 2008

81 Nguyen Tuan Anh Page 81 Grid Computing – Master Course 2008 Globus Toolkit (GT) Grid-enabled Applications Core Services Information Service GRAM Globus Security Heartbeat Monitor Nexus Gloperf Local Services LSF CondorMPI NQEEasy TCP SolarisIrixAIX UDP High-level Services and Tools DUROCglobusrunMPINimrod/GMPI-IOCC++ POP-C++Testbed Status GASS

82 Nguyen Tuan Anh Page 82 Grid Computing – Master Course 2008 Globus Toolkit 4x  Sustainable changes on the services interoperability and infrastructure Open Grid Services Architecture (OGSA) Stateful Web Services Enable the integration of user specific Grid services Define standard interfaces How to access Grid services  Disadvantage Slow

83 Nguyen Tuan Anh Page 83 Grid Computing – Master Course 2008 GT: Core service architecture

84 Nguyen Tuan Anh Page 84 Grid Computing – Master Course 2008 Globus Toolkit  Grid Service Specification How to write, publish and use a Grid service  GT components: GT core Meta-services used to implement other services and service behaviors (e.g. service creation, destruction) GT base services Use the GT core to implement Grid capacities: resource management, information services, data transfer, etc. Other Grid services Implemented by user to enable some enhancement capacities

85 Nguyen Tuan Anh Page 85 Grid Computing – Master Course 2008 What GT DOES NOT address  GT focus on accessing local resources  Things still missing Coordination services Resource/service discovery Information collection Resource connectivity Programming models/tools  Things to be improved Performance!


Download ppt "Nguyen Tuan Anh Page 1 Grid Computing – Master Course 2008 Grid Computing Dr. Nguyen Tuan Anh Contacts: Tel: (84-8)"

Similar presentations


Ads by Google