An Introduction to Grid Computing BEAM Workshop December 2004 Mark Servilla LTER Network Office.

1 An Introduction to Grid Computing
BEAM Workshop, December 2004
Mark Servilla, LTER Network Office

2 SEEK-BEAM Workshop Dec 2004: Presentation Agenda
- Definitions
- Evolution of the Grid
- Characteristics
- Computing Model
- Protocols
- Examples
- References

3 Definitions of a Grid
- “… a network of conductors for distribution of electric power; also: a network of radio or television stations” – Merriam-Webster
- “… the illusion of a simple yet large and powerful self-managing virtual computer out of a large collection of connected heterogeneous systems sharing various combinations of resources” – IBM Redbooks
- “Grid Computing enables virtual organizations to share geographically distributed resources as they pursue common goals, assuming the absence of central location, central control, omniscience, and an existing trust relationship.” – Globus Alliance
- “The Web provides us information – the grid allows us to process it.” – Ahmar Abbas

4 The Evolution of Grid Technology
- High-Performance Computing
- Cluster Computing
- Peer-to-Peer Computing
- Internet Computing

5 High-Performance Computing
- Traditionally known as supercomputing
- Specialized for parallel processing algorithms
- Shared equally among academia, research, and commercial sectors

6 Cluster Computing
- Originated in 1994 with the Beowulf cluster at NASA
- High-performance
- Massively parallel (2 to 1000 nodes)
- Commodity hardware (Intel, AMD)
- Low-cost software (Linux, FreeBSD)
- Interconnected via high-speed private networks
- Shared storage (SAN/NAS)
- Example: AMD Athlon cluster at the University of Heidelberg, Germany: 825 Gflops, 35th fastest high-performance computer in the world

7 Cluster Computing

8 Peer-to-Peer Computing
- Primarily used for distributed storage and file-sharing
- Early models (rcp, scp, ftp)
  - Restricted to LANs, or
  - Limited to known peers
- Internet-based models
  - Centralized (Napster, Kazaa*)
  - Decentralized (Gnutella)
* 100,000,000 downloads by 2004; 2 million new downloads a week

9 Centralized Peer-to-Peer
(Diagram: peers searching for an .mp3 file)

10 Decentralized Peer-to-Peer
(Diagram: peers searching for an .mp3 file)

11 Internet Computing
- Volunteer or philanthropic computing; utilizes personal desktop computers connected to the Internet
- Desktop computers are idle for approximately 95% of their lifespan
- Divide-and-conquer approach
  - Tasks are broken into smaller subtasks
  - Desktop executes subtasks during idle time
  - Desktop sends data back to a central server, which aggregates the results
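The divide-and-conquer workflow on the Internet Computing slide can be sketched in a few lines of Python. This is a minimal, single-process illustration only: the function names (`split_task`, `run_subtask`, `aggregate`) and the sum-of-squares workload are assumptions made for the example and do not come from any particular volunteer-computing framework.

```python
def split_task(numbers, chunk_size):
    """Central server: break one large task into smaller subtasks."""
    return [numbers[i:i + chunk_size] for i in range(0, len(numbers), chunk_size)]


def run_subtask(chunk):
    """Desktop: work on one subtask during idle time (here, a sum of squares)."""
    return sum(x * x for x in chunk)


def aggregate(partial_results):
    """Central server: combine the partial results sent back by the desktops."""
    return sum(partial_results)


task = list(range(1, 101))
subtasks = split_task(task, chunk_size=25)
# In a real deployment each subtask would be shipped to a different desktop;
# here they simply run one after another in the same process.
partials = [run_subtask(chunk) for chunk in subtasks]
total = aggregate(partials)
```

Because the subtasks share no state, the order in which desktops return their partial results does not affect the aggregated total.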

12 Synthesis entrée Grid
- High-performance computing
  - pioneered the use of “parallel” algorithms
- Cluster computing
  - demonstrated the nature of shared computing and storage
  - load balancing protocols
- Peer-to-peer computing
  - distributed storage resource with no central authority
- Internet computing
  - geographically distributed virtual organization
  - fabric of the project vanishes with completion of the task

13 Grid Characteristics
- Resources that
  - are connected via a network
  - are geographically distributed
  - may consist of heterogeneous hardware and/or software
  - are managed transparently for performance and fault tolerance
- Creates the illusion of virtual organizations and projects without the presence of a central authority or central control
- Explicit trust relationships between users and resources
- A system that scales in space and time

14 Types of Resources
- Computation
  - utilization of computing cycles found on processors of the machines on the grid
- Storage
  - to increase capacity, performance, sharing, and reliability of data
- Communication
  - to increase capacity, performance, and reliability of data communication
- Collaboration
  - tools to facilitate collaboration through conferencing, visualization, and data sharing
- Software and Licenses
  - to share site-specific software and/or licenses
- Special equipment, capacities, architectures, and policies
  - printers, imaging, sensors, or other local specialty resources

15 Grid Ingredients

16 Grid Topologies
- Departmental Grids
  - localized to a specific group of people
  - generally, same hardware and software
  - designed for high throughput and high performance over a dedicated network
- Enterprise Grids
  - service to numerous groups within a single company or campus
  - resource heterogeneity increases
  - company-wide local area network
- Extraprise Grids
  - service to multiple companies, partners, and customers within a particular domain
  - domain-based private network
- Global Grids
  - established over the public Internet

17 Resource-based Grids
- Compute Grids
  - desktop nodes
  - server nodes
  - high-performance computing clusters
- Data Grids
  - performance-based distributed storage
  - replication for fault tolerance
- Collaboration Grids
  - support for video-conferencing, visualization, and data sharing
- Utility Grids
  - maintained and managed by a commercial service provider
  - compute resources acquired on a per-need basis
  - application resources that are purchased on a per-use or per-minute basis

18 Application Characteristics
- Perfect Parallelism – computations run autonomously (Monte Carlo simulations)
- Data Parallelism – operations performed on data simultaneously (database searches)
- Functional Parallelism – multiple operations are performed simultaneously
- Optimized for parallel execution
- Not capable of parallel computation
  - Fibonacci series (1, 1, 2, 3, 5, 8, 13, 21, …): F(k+2) = F(k+1) + F(k)
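The contrast between perfectly parallel work and work that cannot be parallelized can be illustrated with a short Python sketch. The Monte Carlo workload (estimating pi) and the names `monte_carlo_hits`, `estimate_pi`, and `fibonacci` are assumptions chosen for illustration; the slide names only the two categories and the Fibonacci recurrence.

```python
import random


def monte_carlo_hits(seed, samples):
    """One autonomous trial: count random points that fall inside the unit quarter circle."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(num_workers, samples_per_worker):
    """Perfect parallelism: no call depends on another, so each could run on its own node."""
    hits = [monte_carlo_hits(seed, samples_per_worker) for seed in range(num_workers)]
    return 4.0 * sum(hits) / (num_workers * samples_per_worker)


def fibonacci(n):
    """First n terms of 1, 1, 2, 3, 5, ...; each term needs the two terms before it."""
    series = []
    a, b = 1, 1
    for _ in range(n):
        series.append(a)
        a, b = b, a + b   # F(k+2) = F(k+1) + F(k): the loop is inherently sequential
    return series
```

In `estimate_pi` the list comprehension could be replaced by any mechanism that farms the calls out to separate machines, because the trials never communicate. The loop in `fibonacci` offers no such opportunity when the recurrence is evaluated directly.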

19 Questions to Ask When Thinking Grid
- Identity and Authentication: Is this user who he says he is? Is this program the right program?
- Authorization and Policy: What can the user do on the grid? What can the application do on the grid? What resources are the user and/or application allowed to access?
- Resource Discovery: Where are the resources?
- Resource Characterization: What types of resources are available?
- Resource Allocation: What policy is applied when assigning the resources? What is the actual process of assigning the resources? Who gets how much?
- Resource Management: Which resource can be used at what time and for what purpose?
- Accounting/Billing/Service Level Agreement (SLA): How much of the resources is being used? What is the rating schedule? What is the SLA?
- Security: How do I make sure that this is done securely? How do we know if we have been compromised? What steps are taken once a security breach is detected?

20 A Grid Computing Model (the Globus view)
- Software stack consisting of
  - Standards
  - Protocols
  - APIs and SDKs
- Loosely based on the Internet model

21 A detailed view…
- Fabric – protocols and interfaces to the resources being shared
- Connectivity – protocols for grid-specific network transactions (IP, DNS, WSDL); security implementation (GSI)
- Resource – protocols to initiate and control sharing of local resources (GRAM, GridFTP, GRIS)
- Collective – protocols for system-wide (versus local) deployment
- Application – protocols targeted at a specific application or class of applications

22 Grid Protocols
- Grid Security Infrastructure (GSI)
- Grid Resource Allocation and Management (GRAM)
- Grid File Transfer Protocol (GridFTP)
- Grid Information Services (GIS)

23 Grid Security Infrastructure
- Extended from SSL/TLS and X.509 protocols
- Utilizes PKI for Certificate Authority
  - Primary objective is “Authorization”
  - Generates primary credential
  - Generates temporary proxy credential
- Certificate Authority
  - Positively identify entities requesting certificates
  - Issuing, removing, and archiving certificates
  - Protecting the Certificate Authority server
  - Maintaining a namespace of unique names for certificate owners
  - Serve signed certificates to those needing to authenticate entities
  - Logging activity

24 Public Key Infrastructure
User A (sender):
1. Encrypts the message with his private key
2. Obtains User B’s public key from the CA
3. Encrypts the message with B’s public key
4. Sends the message
User B (receiver):
1. Decrypts the message with his private key
2. Obtains User A’s public key from the CA
3. Decrypts A’s message with A’s public key
4. Knows the message is from A
(Diagram: users “A” and “B” each hold a private key; the Certificate Authority holds the public keys and supplies B’s public key to A and A’s public key to B. Other labels: Authentication, Credential.)
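The exchange on the Public Key Infrastructure slide can be traced with textbook RSA on small integers. This is a toy sketch for following the steps, not a secure implementation and not how GSI is coded: the primes, the exponent, the dictionary standing in for the Certificate Authority, and the helper names `make_keypair` and `apply_key` are all assumptions made for the example. Real systems use large keys, padding, and signed certificates.

```python
def make_keypair(p, q, e=17):
    """Build a toy RSA key pair from two small primes (illustrative only, not secure)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)          # modular inverse of e; needs Python 3.8 or later
    return (e, n), (d, n)        # (public key, private key)


def apply_key(value, key):
    """The RSA primitive: modular exponentiation with either half of a key pair."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


a_public, a_private = make_keypair(61, 53)   # A's modulus is 3233
b_public, b_private = make_keypair(89, 97)   # B's modulus is 8633, larger than A's

# Hypothetical stand-in for the Certificate Authority: a directory of public keys
ca_directory = {"A": a_public, "B": b_public}

message = 1234   # must be smaller than A's modulus

# User A: encrypt with own private key, then with B's public key obtained from the CA
signed = apply_key(message, a_private)
sealed = apply_key(signed, ca_directory["B"])

# User B: decrypt with own private key, then with A's public key obtained from the CA
unsealed = apply_key(sealed, b_private)
recovered = apply_key(unsealed, ca_directory["A"])
```

Only B's private key can undo the outer layer, and only a value produced with A's private key yields the original message under A's public key, which is how B concludes the message came from A. The toy relies on A's modulus being smaller than B's so that the inner value fits inside the outer encryption.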

25 Grid Security Infrastructure

26 Grid Resource Allocation and Management
- Allows programs to be started on remote resources
- Resource Specification Language (RSL)
  - Resource requirements: machine type, number of nodes, memory, etc.
  - Job configuration: directory, executable, arguments, environment
- Communication protocols
  - HTTP-based RPC (early protocol)
  - Web services (WSDL, SOAP)
- Example request: “create 5-10 instances of myprog, each on a machine with at least 64MB memory that is available to me for 4 hours, or 10 instances, on a machine with at least 32MB of memory”
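The quoted request on the GRAM slide has two alternatives, and a resource manager has to decide which one it can satisfy. The Python sketch below models that decision with plain data classes. It is not RSL syntax and not the GRAM API: the `Alternative` and `Machine` classes, the `plan` function, the one-instance-per-machine rule, and the sample machine list are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Alternative:
    min_count: int        # fewest instances that are acceptable
    max_count: int        # most instances that are wanted
    min_memory_mb: int
    min_hours: int = 0    # 0 means no availability requirement was stated


@dataclass
class Machine:
    name: str
    memory_mb: int
    hours_available: int


def plan(alternatives, machines):
    """Return (alternative, machines to use) for the first alternative that can be met, else None."""
    for alt in alternatives:
        eligible = [m for m in machines
                    if m.memory_mb >= alt.min_memory_mb
                    and m.hours_available >= alt.min_hours]
        # Assumption: one instance per machine
        if len(eligible) >= alt.min_count:
            return alt, eligible[:alt.max_count]
    return None


# "5-10 instances ... at least 64MB ... for 4 hours, or 10 instances ... at least 32MB"
request = [
    Alternative(min_count=5, max_count=10, min_memory_mb=64, min_hours=4),
    Alternative(min_count=10, max_count=10, min_memory_mb=32),
]

# Made-up pool: three large machines and eight small ones
machines = ([Machine(f"big{i}", 128, 6) for i in range(3)]
            + [Machine(f"small{i}", 32, 1) for i in range(8)])

chosen_alt, chosen = plan(request, machines)
```

With this pool only three machines meet the first alternative, so the planner falls back to the second and selects ten machines with at least 32 MB each.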

27 Grid File Transfer Protocol
- Provides high-speed and reliable transfer of large-volume data (petabytes)
- Extension of standard FTP to include
  - striped/parallel data channels
  - partial files
  - automatic and manual TCP buffer size settings
  - progress monitoring
  - extended restart functionality
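The idea behind parallel data channels and partial-file transfer can be shown with an in-memory sketch: the file is divided into byte ranges and each range is moved independently. This is not GridFTP and involves no network I/O; the functions `byte_ranges` and `parallel_copy` and the use of threads as stand-ins for data channels are assumptions made for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def byte_ranges(total_size, channels):
    """Split [0, total_size) into contiguous (offset, length) ranges, one per channel."""
    base, extra = divmod(total_size, channels)
    ranges, offset = [], 0
    for i in range(channels):
        length = base + (1 if i < extra else 0)   # spread the remainder over the first ranges
        ranges.append((offset, length))
        offset += length
    return ranges


def parallel_copy(source, channels):
    """Copy `source` (bytes) into a new buffer, one byte range per worker thread."""
    destination = bytearray(len(source))

    def transfer(byte_range):
        offset, length = byte_range
        # Each worker writes only its own slice, so the ranges never overlap
        destination[offset:offset + length] = source[offset:offset + length]

    with ThreadPoolExecutor(max_workers=channels) as pool:
        list(pool.map(transfer, byte_ranges(len(source), channels)))
    return bytes(destination)
```

Because every range carries its own offset, a failed range can be retried without resending the rest, which is the intuition behind partial-file transfer and restart.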

28 Grid Information Services
- Grid Resource Information Service (GRIS)
  - provides resource-specific information
- Grid Resource Registration (GRR)
  - updates GRIS about resource status
- Grid Index Information Service (GIIS)
  - an aggregate directory service
  - provides a collection of information that has been gathered from multiple GRIS servers
- Grid Resource Inquiry (GRI)
  - queries GRIS server for resource information
  - queries GIIS server for information
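The relationship between the four information-service pieces can be sketched with two small in-memory classes. This models only the roles named on the slide and is not the Globus information-service interface: the class layout, the method names `register`, `inquire`, and `gather`, and the attribute names in the usage example are hypothetical.

```python
class GRIS:
    """Resource-level information service: holds the status of a single resource."""

    def __init__(self, resource_name):
        self.resource_name = resource_name
        self.info = {}

    def register(self, **status):
        # Stands in for Grid Resource Registration (GRR): the resource updates its GRIS
        self.info.update(status)

    def inquire(self):
        # Stands in for a Grid Resource Inquiry (GRI) addressed to one GRIS
        return dict(self.info)


class GIIS:
    """Aggregate directory built from information gathered from multiple GRIS servers."""

    def __init__(self):
        self.index = {}

    def gather(self, gris_servers):
        for gris in gris_servers:
            self.index[gris.resource_name] = gris.inquire()

    def inquire(self, **required):
        # GRI addressed to the GIIS: names of resources matching every requirement
        return sorted(name for name, info in self.index.items()
                      if all(info.get(key) == value for key, value in required.items()))


node1, node2 = GRIS("node1"), GRIS("node2")
node1.register(os="linux", cpus=4)
node2.register(os="freebsd", cpus=2)

directory = GIIS()
directory.gather([node1, node2])
linux_nodes = directory.inquire(os="linux")
```

A client that already knows a resource asks its GRIS directly, while a client looking for any suitable resource asks the GIIS, which answers from the aggregated index.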

29 Open Grid Services Architecture
- Marriage of grid protocols with web service protocols
- Specifications for
  - How Grid Services are created and discovered
  - How Grid Service instances are named and referenced
  - Interfaces that define any Grid Service
- Initial release with GT 3.0 in mid-2003; GT 4.0 in Jan 2005

30 Grid Examples
- Network for Earthquake Engineering and Simulation (NEESGrid)
- Biomedical Informatics Research Network (BIRN)
- EcoGrid

31 NEESGrid (Network for Earthquake Engineering and Simulation)
- Linking scientists and facilities
  - observation of an experiment in progress
  - observation before and after an experiment
  - remote operation of an experiment
- Linking facilities and data
  - hybrid operation of physical simulations with other simulations, both physical and numerical
  - automatic archiving of raw data, calibration data, and processed data
- Linking scientists and data
  - collaborative views (static) of time-synchronized data visualizations
  - collaborative views of time-synchronized data visualizations with video and audio recordings
- Linking scientists and other scientists
  - synchronous communication, such as with colleagues during an experiment
  - asynchronous communication, such as with colleagues over the course of preparing a publication resulting from an experiment

32 NEESGrid (Network for Earthquake Engineering and Simulation)

33 NEESGrid (Network for Earthquake Engineering and Simulation)
Network Architecture Diagram

34 BIRN (Biomedical Informatics Research Network)
- Testbed for a biomedical knowledge infrastructure
- Federated database of neuro-imaging data
- Fusion of diverse data sources (location; level of aggregation)
- Grid access to computational resources
- Data-mining software
- Scalable and extensible
- Driven by research needs, not by technology pull or technology push

35 BIRN (Biomedical Informatics Research Network)

36 BIRN (Biomedical Informatics Research Network)

37 EcoGrid
- Metadata Standardization
  - Ecological Metadata Language – “EML”
  - Integrate diverse data networks from ecology, biodiversity, and environmental sciences
- Standardized interfaces to data resources
  - Metacat
  - SRB
  - DiGIR
  - Xanthoria
- Metadata-mediated data access (application-based)
  - Supports multiple metadata standards
  - EML, Darwin Core as foci
- Computational services
  - Pre-defined analytical services
  - On-the-fly analytical services

38 EcoGrid
* EML facilitates semi-automatic data binding

39 EcoGrid

40 Grid Organizations
- Globus Alliance
  - Globus Toolkit™ – reference implementation of the grid architecture and grid protocols
- NSF Middleware Initiative (NMI)
  - Supports the design, development, testing, and deployment of middleware for HPC
- GRIDS Center
  - Grid Research Integration Deployment and Support Center – part of NMI
- Global Grid Forum
  - Main standards body governing the worldwide grid community

41 Recommended Texts
- Grid Computing: A Practical Guide to Technology and Applications. Ahmar Abbas. Charles River Media, © 2004.
- Introduction to Grid Computing with Globus. Luis Ferreira et al. IBM Redbooks, © 2004.
- Enabling Applications for Grid Computing with Globus. Bart Jacob et al. IBM Redbooks, © 2003.
- Grid Services Programming and Application Enablement. Luis Ferreira et al. IBM Redbooks, © 2004.
