Published by Randy Havis. Modified over 9 years ago.
1
Grid Computing Hakan ÜNLÜ CMPE 511 Presentation Fall 2004
2
Overview: General Introduction to Grid Computing (Introduction: Why Grids?; Applications for Grids; Basic Grid Architecture; Grid Platforms & Standards). Issues in Grid Computing (Hardware: Blade Computers; System Management: Globus Toolkit; Software: Scheduling).
3
What is Grid Computing? A computational and networking infrastructure designed to provide pervasive, uniform and reliable access to data, computational and human resources distributed over wide-area environments.
4
Grids Are By Definition Heterogeneous: It’s about legacy resources, infrastructure, applications, policies, and procedures. The grid and its administrators must integrate in stealth mode with: firewalls; filesystems; queuing systems; grumpy systems administrators; tried and true applications.
5
A Grid Example
6
Challenges in Grid Computing: Reliable performance; Trust relationships between multiple security domains; Deployment and maintenance of grid middleware across hundreds or thousands of nodes; Access to data across WANs; Access to state information of remote processes; Workflow / dependency management; Distributed software and license management; Accounting and billing.
7
Applications for a Grid: Generally, apps that work well on clusters can work well on grids. Non-interactive / batch jobs. Parallel computations with minimal interprocess communication and workflow dependencies. Reasonable data transfer requirements. Sensible economics: Productivity Gains > Cost of Building Grid + Opportunity Costs of Resources.
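The "sensible economics" test on this slide is a single inequality, which can be written out as a check. This is only a sketch: the function name and the figures are hypothetical, and all three quantities are assumed to be expressed in the same unit over the same period.

```python
def grid_is_worthwhile(productivity_gains, build_cost, opportunity_cost):
    """Return True when the gains exceed the combined costs.

    Encodes the slide's rule of thumb:
    Productivity Gains > Cost of Building Grid + Opportunity Costs of Resources.
    All arguments must use the same unit (e.g. dollars per year).
    """
    return productivity_gains > build_cost + opportunity_cost


# Hypothetical figures, for illustration only.
worth_it = grid_is_worthwhile(500_000, 300_000, 150_000)
not_worth_it = grid_is_worthwhile(400_000, 300_000, 150_000)
```

The opportunity cost term is the one most easily forgotten: it covers what the same resources would have returned if used elsewhere.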
8
Non-Interactive / Batch Jobs: Difficult to get a real-time UI for jobs running on the grid (a possible interactive application: spreadsheet computation). Want to take advantage of off-peak free cycles. Jobs run for several days, weeks or months; the user might prefer to be sleeping while the job runs! Running processes might need to be interrupted or re-prioritized based on the current load on a grid compute engine. Idle thread / “screensaver” computing.
9
Seti@Home
10
Parallel Computations: Application needs to be able to run as multiple, mostly independent pieces. Can’t depend on the network’s Quality of Service. Can’t rely upon the order of execution and completion. Apps that need these things are better suited for tightly coupled compute platforms (e.g. SMP systems). Grid can still be useful as a meta-scheduler and data source for such apps, e.g. the user submits the job to the grid queue and asks for the best available SMP resource.
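A small sketch of what "multiple, mostly independent pieces" with no reliance on completion order can look like. Local threads stand in for grid nodes here, which is an assumption made purely for illustration; the function names and sample data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_chunk(chunk_id, values):
    """One independent piece of work: it needs no data from any other chunk."""
    return chunk_id, sum(v * v for v in values)


def run_independent(chunks, workers=4):
    """Run every chunk and collect partial results keyed by chunk id.

    Results are gathered in whatever order they finish, so nothing here
    depends on the order of execution or completion.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(process_chunk, cid, vals) for cid, vals in chunks.items()]
        for future in as_completed(futures):
            cid, partial = future.result()
            results[cid] = partial
    return results


# Hypothetical input: three chunks of a sum-of-squares computation.
chunks = {0: [1, 2], 1: [3, 4], 2: [5]}
partials = run_independent(chunks)
total = sum(partials.values())
```

Because each partial result is keyed by its chunk id and the final reduction is order-independent, a slow or late piece changes the running time but not the answer.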
11
Some Costs and Benefits. Costs: Grid middleware architects and developers; User training; Infrastructure hardware; Opportunity costs (would a big SMP box return better results for your problem?). Benefits: Better utilization of existing capital resources; More efficient users; Ability to complete more work in the same amount of time; Performance near or sometimes as good as the big SMP box.
12
Basic Grid Architecture: Clusters and how grids are different from clusters; Departmental Grid Model; Enterprise Grid Model; Global Grid Model.
13
What Makes a Cluster a Cluster? Uses a Distributed Resource Manager (DRM) to manage job scheduling. Tightly coupled: high-speed, low-latency interconnect network. Fairly homogeneous: configuration management is important! Single administrative domain.
14
The Cluster Model [Diagram: a master node with the user interface/API and cluster DRM; cluster nodes, each with operating system, storage, compute and cluster DRM; all joined by a high-speed interconnect, with shared storage and configuration management.]
15
How is an Enterprise Grid Different from a Cluster? Heterogeneous: clusters, SMP, even workstations of dissimilar configurations, but all are tied together through a grid middleware layer. Lightly coupled: connected via 100 or 1000 Mbps Ethernet. Introduces a resource registry and grid security service, but usually only a single registry and security service for the grid. Not necessarily a single administrative domain.
16
The Enterprise Grid Model [Diagram: a user interface/API, SMP systems and clusters (each cluster with its own cluster DRM and cluster-interface nodes), every one attached through a grid interface to an enterprise LAN or WAN, together with a security infrastructure and a resource registry.]
17
How is a Global Grid Different from an Enterprise Grid? "Grid of Grids": a collection of enterprise grids. Loosely coupled between sites: not much control over Quality of Service. Mutually distrustful administrative domains. Multiple grid resource registries and grid security services.
18
The Global Grid Model [Diagram: Sites A, B and C, each with its own grid LAN, UI/API, RRSI, clusters and SMP systems behind grid interfaces, connected to one another over a grid WAN.]
19
Grid Platforms & Standards: The Global Grid Forum (http://www.gridforum.org/); Globus Toolkit; DCML (Data Center Markup Language).
20
Globus Toolkit V2 “Pillars”: Information Services (MDS); Data Management (GASS); Resource Management (GRAM); Grid Security Infrastructure (GSI).
21
Globus Toolkit V2 Stack [Diagram, top to bottom: MDS, GASS/GridFTP, GRAM; GSI; HTTP, LDAP, FTP; TLS/SSL; TCP/IP.]
22
Globus Toolkit V2 Key Components: GRAM, MDS and GASS. Grid Resource Allocation Manager (GRAM): server-side, a “gatekeeper” process that controls execution of job managers; client-side, the “globusrun” UI to launch jobs. Monitoring and Directory Service (MDS): GRIS, the Grid Resource Information Service, collects local info; GIIS, the Grid Index Information Service, collects GRIS info. Global Access to Secondary Storage (GASS): GridFTP, implemented through the “in.ftpd” daemon and the “globus-url-copy” command; files are accessed through a URI, e.g. gsiftp://node1.ncbiogrid.org/data/ncbi/ecoli.nt
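To make the URI-based file access concrete, the sketch below splits the slide's example gsiftp URI into its parts using Python's standard library, then builds (without running) an argument list for "globus-url-copy". The helper names and the local destination path are hypothetical, and the source-then-destination argument order is assumed from the command's usual form rather than taken from the slides.

```python
from urllib.parse import urlsplit


def describe_grid_uri(uri):
    """Split a gsiftp URI into scheme, host and path."""
    parts = urlsplit(uri)
    if parts.scheme != "gsiftp":
        raise ValueError(f"expected a gsiftp URI, got scheme {parts.scheme!r}")
    return {"scheme": parts.scheme, "host": parts.hostname, "path": parts.path}


def copy_command(source_uri, local_path):
    """Build, but do not execute, an argv list for globus-url-copy.

    Assumes the form: globus-url-copy <source URL> <destination URL>.
    local_path must be absolute so that the file:// URL is well formed.
    """
    return ["globus-url-copy", source_uri, "file://" + local_path]


uri = "gsiftp://node1.ncbiogrid.org/data/ncbi/ecoli.nt"
info = describe_grid_uri(uri)
argv = copy_command(uri, "/tmp/ecoli.nt")  # hypothetical local destination
```

Actually running the command would also require a valid GSI proxy credential on the client, which this sketch does not attempt to create.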
23
Globus Toolkit V2 Additional Components. Grid Packaging Tools (GPT): used to build (“gpt-build”), install (“gpt-install”) and localize (“gpt-postinstall”) Globus components. MPICH-G2: a Globus V2 enabled version of MPI (Message Passing Interface); based on MPICH; utilizes GSI, MDS and GRAM.
24
Globus Toolkit V2 Network Services [Diagram: a certificate authority, a GIIS server, a client node running the GRAM client, and several grid nodes each running GRIS, gatekeeper and in.ftpd, all connected by the network.]
25
GRAM, MDS and GASS Interactions [Diagram: the client (user / proxy) talks to GRAM (gatekeeper process, job manager, resource process) over RSL/DUROC/HTTP 1.1 for job allocation and job management; to MDS (GIIS, GRIS, resource) over LDAP for resource discovery; and to GASS (GridFTP, in.ftpd) over gsiftp for data transfer and data control.]
26
Globus Toolkit V2 Strengths and Weaknesses. Strengths: Mindshare and collaboration in both industry & academia; Open source; Standards-based underpinnings (e.g. SSL, LDAP); Flexibility and CoG APIs; Driving OGSA with heavy resource commitment from IBM. Weaknesses: Significant effort required to get applications working on a grid; Not production quality at this time; No “metascheduler”: users have to explicitly tell their jobs where to run.
27
Issues in Grid Computing. Hardware: Blades
28
Hardware Trends. Hardware trends that enable grids and distributed processing: there is a lot of idle computing power; computers are now better connected; there are many different brands and configurations in any environment. Distributed computing in turn gives rise to new hardware architectures: blade computers.
29
What is a blade? An inclusive, chassis-based modular computing system that includes processors, memory, network interface cards and local storage on a single board. [Pictures: Blade; Chassis & Blades; Blade Farm.]
30
Anatomy of a blade
31
How far can it go?
32
Advantages & Disadvantages. Advantages: Low cost (power, heat, data center space); Physical server consolidation (save space, eliminate cables); High availability; Integrated systems management. Disadvantages: Not suitable in small numbers; Need for standardization (for network connection and management).
33
Blades & Grid: Each blade is a server that can run jobs. Blades can be used to form clusters or grids. With efficient management, different configurations of blades can be used in a single grid computer. Easy to expand. Protects investment.
34
Issues in Grid Computing. System Management: Globus Toolkit
42
Issues in Grid Computing. Software: Scheduling
43
Superscheduling: Superscheduling means scheduling resources in multiple administrative domains. Various models: submitting a job to a specific single machine; submitting a job to single machines at multiple sites (with cancellation option); scheduling a single job to use multiple resources. Most common superscheduler: USERS.
44
Phases Of Superscheduling. Resource Discovery: Authorisation Filtering; Application Requirement Definition; Minimal Requirement Filtering. System Selection: Gathering Information (Query); Select Systems to run on. Run the Job: Make an Advance Reservation (Optional); Submit Job to Resources; Preparation Tasks; Monitor Progress; Job Completion; Completion Tasks. Source: Global Grid Forum, Scheduling Working Group, 10 Actions When Scheduling, Schopf, 2001
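The three phases can be sketched as a small pipeline. Everything concrete below is a hypothetical stand-in: the Resource fields, the "shortest queue" selection rule and the sample sites are assumptions for illustration, and the run phase only records the listed steps instead of contacting any real scheduler.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    authorized: bool   # may this user submit jobs here?
    cpus: int          # static capability, used for minimal requirement filtering
    queue_length: int  # dynamic state, gathered by a query during system selection


def resource_discovery(resources, min_cpus):
    """Phase 1: authorisation filtering, then minimal requirement filtering."""
    authorized = [r for r in resources if r.authorized]
    return [r for r in authorized if r.cpus >= min_cpus]


def system_selection(candidates):
    """Phase 2: choose a system from gathered information (here: shortest queue)."""
    return min(candidates, key=lambda r: r.queue_length) if candidates else None


def run_job(job, resource):
    """Phase 3: record the execution steps; advance reservation is skipped as optional."""
    return [
        f"submit {job} to {resource.name}",
        "preparation tasks",
        "monitor progress",
        "job completion",
        "completion tasks",
    ]


def superschedule(job, resources, min_cpus):
    """Chain the three phases; return (None, []) when no resource qualifies."""
    chosen = system_selection(resource_discovery(resources, min_cpus))
    if chosen is None:
        return None, []
    return chosen, run_job(job, chosen)


resources = [
    Resource("siteA", True, 16, 7),
    Resource("siteB", False, 64, 0),  # best hardware, but the user is not authorized
    Resource("siteC", True, 32, 2),
    Resource("siteD", True, 4, 0),    # idle, but below the minimal requirement
]
chosen, log = superschedule("blast-run", resources, min_cpus=8)
```

Filtering on static facts first keeps the more expensive dynamic queries of phase 2 limited to resources that could actually run the job.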
45
Scheduling Framework (Ranganathan & Foster 2003): External Scheduler; Local Scheduler; Dataset Scheduler.
46
Scheduling And Replication Algorithms. External Scheduler: JobRandom; JobLeastLoaded; JobDataPresent; JobLocal. Dataset Scheduler: DataDoNothing (no active replication; everything is on demand); DataRandom (popular datasets are replicated to random sites); DataLeastLoaded (popular datasets are sent to the least loaded sites).
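The four external-scheduler policies can be sketched as interchangeable site-selection functions. The slide gives only the policy names, so the behaviour below is an interpretation inferred from those names; the Site class, its load measure and the sample data are hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Site:
    name: str
    load: int                                   # e.g. number of waiting jobs
    datasets: set = field(default_factory=set)  # datasets currently stored at the site


def job_random(sites, local, dataset, rng=random):
    """JobRandom: send the job to a randomly chosen site."""
    return rng.choice(sites)


def job_least_loaded(sites, local, dataset):
    """JobLeastLoaded: send the job to the site with the lowest load."""
    return min(sites, key=lambda s: s.load)


def job_data_present(sites, local, dataset):
    """JobDataPresent: prefer sites that already hold the dataset.

    Ties are broken by load; falls back to all sites when nobody holds the data.
    """
    holders = [s for s in sites if dataset in s.datasets]
    return min(holders or sites, key=lambda s: s.load)


def job_local(sites, local, dataset):
    """JobLocal: always run at the site where the job was submitted."""
    return local


# Hypothetical sites for illustration.
site_a = Site("A", load=5, datasets={"d1"})
site_b = Site("B", load=1)
site_c = Site("C", load=3, datasets={"d1"})
sites = [site_a, site_b, site_c]

least = job_least_loaded(sites, site_a, "d1")
with_data = job_data_present(sites, site_a, "d1")
local = job_local(sites, site_a, "d1")
```

In this example JobLeastLoaded picks B and would have to fetch the dataset, while JobDataPresent picks C and moves no data, which is the trade-off between response time and data transferred that the two policies represent.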
47
Simulation Results [Figures: Average Response Times; Average Data Transferred.]
48
Grid Computing Thank You and Questions?