1 Grid Computing ECI, July 2005

2 Living in an Exponential World
  - Moore’s Law: transistor count x2 in 18 months
  - Storage density x2 in 12 months
  - Online data x10 in 12 months (current = 10 PB)
  - Telescope to generate > 10 PB by 2008
  - Network speed x2 in 9 months
  - 1986-2000: CPU x500, network x340000
  - 2001-2010: CPU x60, network x4000
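
The doubling periods above compound quickly. As a rough check, the Python sketch below converts a doubling period into a growth factor (the helper name `growth_factor` is illustrative, not from the slides). Assuming the 2001-2010 span is counted as nine years, an 18-month and a 9-month doubling give factors close to the rounded x60 and x4000 quoted for CPU and network.

```python
def growth_factor(elapsed_months: float, doubling_months: float) -> float:
    """Growth multiple after elapsed_months if capacity doubles every doubling_months."""
    return 2.0 ** (elapsed_months / doubling_months)


nine_years = 9 * 12  # assumption: 2001-2010 counted as nine years

cpu = growth_factor(nine_years, 18)     # transistor count doubles in 18 months
network = growth_factor(nine_years, 9)  # network speed doubles in 9 months

print(cpu, network)  # prints: 64.0 4096.0
```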

3 What is a Grid (informal)
  Three key criteria:
  - Coordinates resources not under centralized control
  - Using standard, open, general-purpose protocols and interfaces
  - To deliver non-trivial quality of service
  What is not a Grid?
  - A cluster, a network-attached storage device, a scientific instrument, a network (though these are important components)

4 So…
  We’ve got:
  - Fast computers (but not fast enough…)
  - Bigger storage (but not big enough…)
  - Fast networks (well, not speedy enough…)
  And we want to:
  - Solve big computational problems…
  In that case:
  - How about joining resources together?
  That’s GRID!

5 Why “Grid”?
  Analogy with the Power Grid
  Service with known characteristics:
  - Stable voltage (~220 V)
  - Contracted power
  - Pay for the installed capacity and the consumed power
  - Standard sockets, outlets, devices
  - Available 24/7 (usually…)

6 And in Computers
  “Computer Grid” similar to “Power Grid”
  - Special socket to get connected
  - Pay a subscription and the power consumed
  - If you need more, contract more

7 Definitions of Grid
  - A paradigm/infrastructure that enables the sharing, selection, & aggregation of geographically distributed resources to solve large-scale problems/applications
  - Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
  - Computers, software, catalogue data and databases, special devices/instruments, people

8 What is a Grid (informal)
  Three key criteria:
  - Coordinates resources not under centralized control
  - Using standard, open, general-purpose protocols and interfaces
  - To deliver non-trivial quality of service
  What is not a Grid?
  - A cluster, a network-attached storage device, a scientific instrument, a network (though these are important components)

9 Grid and the Hype
  The classic hype curve HERE!

10 Types of Grids
  Grid systems can be classified depending on their usage:
  Grid Systems
  - Computational Grid
      - Distributed Supercomputing
      - High Throughput
  - Data Grid
  - Services Grid
      - On Demand
      - Collaborative
      - Multimedia

11 Types of Grids
  Computational Grids
  - Distributed Supercomputing: grand challenge apps
  - High-Throughput: parametric modeling, independent tasks
  Data Grids
  - Data mining, analysis, data processing
  Service Grids
  - Collaborative: connects users, apps and devices
  - Multimedia: real-time multimedia, virtual reality
  - On Demand: aggregate more resources if required

12 A Typical Grid Computing Environment
  [Diagram labels: Application, Grid Resource Broker, Grid Information Service, database, resources R1 to R6 … RN]

13 How it Really Happens (A Simplified View)
  - Resources implement standard access & management interfaces
  - Collective services aggregate &/or virtualize resources
  - Users work with client applications
  - Application services organize VOs & enable access to other services
  [Diagram labels: Web Browser, Compute Server (x2), Data Catalog, Data Viewer Tool, Certificate Authority, Chat Tool, Credential Repository, Web Portal, Database service (x3), Simulation Tool, Camera, Telepresence Monitor, Registration Service]

14 How it Really Happens (without Grid Software)
  - Resources implement standard access & management interfaces
  - Collective services aggregate &/or virtualize resources
  - Users work with client applications
  - Application services organize VOs & enable access to other services
  [Diagram labels: Web Browser, Compute Server (x2), Data Catalog, Data Viewer Tool, Certificate Authority, Chat Tool, Credential Repository, Web Portal, Database service (x3), Simulation Tool, Camera, Telepresence Monitor, Registration Service; markers A, B, C, D, E]
  Components by source: Application Developer 10, Off the Shelf 12, Globus Toolkit 0, Grid Community 0

15 How it Really Happens (with Grid Software)
  - Resources implement standard access & management interfaces
  - Collective services aggregate &/or virtualize resources
  - Users work with client applications
  - Application services organize VOs & enable access to other services
  [Diagram labels: Web Browser, Compute Server (x2), Globus MCS/RLS, Data Viewer Tool, Certificate Authority, Chat Tool, MyProxy, CHEF, Database service (x3), Simulation Tool, Camera, Telepresence Monitor, Globus Index Service, Globus GRAM, OGSA DAI]
  Components by source: Application Developer 2, Off the Shelf 9, Globus Toolkit 5, Grid Community 3

16 Grid Characteristics
  * Resource Management  * Application Construction

  Entities/Issues                    Characteristics
  Users, Resources, Owners           Geographically distributed
  User, Resources, Applications      Heterogeneous
  Resource Availability/Capability   Varies with time
  Policies and strategies            Heterogeneous & decentralised
  QoS requirements                   Heterogeneous
  Cost / Price                       Varies: different resources, users, time

17 Why is it Complex?
  - Size (nodes, providers, consumers)
  - Heterogeneity of resources
  - Heterogeneity of fabric management
      - Systems, policies
  - Heterogeneity of applications
      - Type, requirements, patterns
  - Geographic distribution, varying time zones
  - Non-secure and unreliable environment

18 Layered Grid Architecture
  [Diagram, layers listed from top to bottom]
  - APPLICATIONS
      - Applications and Portals: Scientific, Engineering, Collaboration, Prob. Solving Env., Web enabled Apps, …
  - USER LEVEL MIDDLEWARE
      - Development Environments and Tools: Languages/Compilers, Libraries, Debuggers, Monitors, Web tools, …
      - Resource Management, Selection, and Aggregation (BROKERS)
  - CORE MIDDLEWARE
      - Distributed Resources Coupling Services: Security, Information, Data, Process, Trading, QoS, …
      - SECURITY LAYER
  - FABRIC
      - Local Resource Managers: Operating Systems, Queuing Systems, Libraries & App Kernels, Internet Protocols, …
      - Networked Resources across Organizations: Computers, Networks, Storage Systems, Data Sources, Scientific Instruments, …

19 Resource/Service Integration as a Fundamental Challenge
  Discovery
  - Many sources of data, services, computation
  - Registries organize services of interest to a community
  Access
  - Data integration activities may require access to, & exploration/analysis of, data at many locations
  - Exploration & analysis may involve complex, multi-step workflows
  Resource management
  - Resource management is needed to ensure progress & arbitrate competing demands
  Security & policy
  - Security & policy must underlie access & management decisions
  [Diagram labels: R (registry, x2), RM, Security service (x2), Policy service (x2)]

20 Grid Middleware Technologies
  - Globus: Argonne National Lab and ISI
  - Gridbus: University of Melbourne
  - Unicore: Germany
  - Legion: University of Virginia

21 The Globus Toolkit

22 Globus Toolkit Services
  - Security (GSI): PKI-based security (authentication) service
  - Job submission and management (GRAM): uniform job submission
  - Information services (MDS): LDAP-based information service
  - Remote file management (GASS): remote storage access service
  - Remote data catalogue and management tools

23 Security
  - Resources and users belong to organizations
  - An authentication infrastructure is needed
  - Both users and owners should be protected from each other
  - Ensure security and privacy of:
      - Data
      - Code
      - Message

24 Grid Security Infrastructure (GSI)
  GSI is:
  - PKI (CAs and certificates)
  - SSL/TLS
  - Proxies and delegation
  How the pieces are used:
  - PKI for credentials
  - SSL for authentication and message protection
  - Proxies and delegation (GSI extensions) for secure single sign-on

25 Simple job submission
  globus-job-run provides a simple RSH-compatible interface:
    % grid-proxy-init
    Enter PEM pass phrase: *****
    % globus-job-run host program [args]
  Authentication test:
    % globusrun -a -r hostname
  Running a job on a remote node:
    % globusrun hostname
    globus-job-run belle.anu.edu.au /bin/dat

26 Authorization
  GSI handles authentication, but not authorization
  Authorization issues:
  - Management of authorization on a multi-organization grid is still an interesting problem
  - Mapping resources to users does not scale well
  - Large communities that share resources...

27 Globus Resource Allocation Manager (GRAM)
  - Resource Specification Language (RSL)
  - GRAM allows programs to be started on remote resources
  - A layered architecture allows app-specific resource brokers and co-allocators to be defined as services

28 Resource Management Architecture
  [Diagram labels: Application, Broker, Co-allocator, Information Service, GRAM (x3), local resource managers (LSF, EASY-LL, NQE)]
  [Arrow labels: RSL, RSL specialization, Queries & Info, Ground RSL, Simple ground RSL]

29 GRAM Components
  [Diagram labels: Client, MDS: Grid Index Info Server, MDS: Grid Resource Info Server, Site boundary, Globus Security Infrastructure, Gatekeeper, Job Manager, RSL Library, Local Resource Manager (e.g., PBS, Condor, or OS-fork()), Process]
  [Arrow labels:]
  - MDS client API calls to locate resources
  - MDS client API calls to get resource info
  - GRAM client API calls to request resource allocation and process creation
  - GRAM client API state change callbacks
  - Query current status of resource
  - Create, Parse, Request, Allocate & create processes, Monitor & control

30 A simple run
  Interactive run/output:
    > globus-job-run belle.anu.edu.au /bin/date
    Mon May 3 15:05:42 EST 2004
    > globusrun -o -r belle.anu.edu.au "&(executable=/bin/date)"
    Sun May 22 17:27:22 EST 2005
  Batch commands:
    > globusrun -b -r belle.anu.edu.au "&(executable=/bin/date)(stdout=MyOutputFile)"
    > gsincftpget belle.anu.edu.au . MyOutputFile
    (Pull output file to local directory)

31 Resource Specification Language (RSL)
  Common notation for information exchange
  Provides two types of information:
  - Resource requirements: machine type, number of nodes, memory, etc.
  - Job configuration: directory, executable, args, environment
  API provided for manipulating RSL

32 RSL Syntax
  Elementary form: parenthesis clauses
    (attribute op value [ value … ] )
  Operators supported: <, <=, =, >=, >, !=
  Some supported attributes: executable, arguments, environment, stdin, stdout, stderr
  Unknown attributes are passed through
  - May be handled by subsequent tools
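
To make the clause form concrete, here is a minimal Python sketch that renders `(attribute op value)` clauses and joins them into a request string like the ones passed to globusrun elsewhere in this deck. The helper names `rsl_clause` and `rsl_conjunction` are illustrative and are not part of the Globus RSL API. The sketch does not quote values containing spaces or parentheses.

```python
from typing import Iterable, Tuple

Clause = Tuple[str, str, str]  # (attribute, operator, value)


def rsl_clause(attribute: str, op: str, value: str) -> str:
    """Render one parenthesized relation, e.g. (executable=/bin/date)."""
    return f"({attribute}{op}{value})"


def rsl_conjunction(clauses: Iterable[Clause]) -> str:
    """Join relations under a leading '&', the form used in the deck's examples."""
    return "&" + "".join(rsl_clause(*clause) for clause in clauses)


batch_rsl = rsl_conjunction([
    ("executable", "=", "/bin/date"),
    ("stdout", "=", "MyOutputFile"),
])
print(batch_rsl)  # prints: &(executable=/bin/date)(stdout=MyOutputFile)
```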

33 Constraints: “&”
    globusrun -o -r belle.anu.edu.au "&(executable=/bin/date)"
  For example:
    & (count>=5) (count<=10) (max_time=240) (memory>=64) (executable=myprog)
  “Create 5-10 instances of myprog, each on a machine with at least 64 MB memory that is available to me for 4 hours”
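
One way to read a flat "&" request is as a list of relations that must all hold. The Python sketch below parses such a request with a regular expression and checks it against a dictionary describing what a resource can offer. This is a toy matcher written for illustration, not how GRAM evaluates RSL. The names `parse_relations` and `satisfies` and the offer dictionaries are assumptions. Attributes the offer does not mention are skipped, and the offer is assumed to hold numeric attributes only.

```python
import operator
import re

OPS = {">=": operator.ge, "<=": operator.le, "!=": operator.ne,
       "=": operator.eq, ">": operator.gt, "<": operator.lt}

# One relation: (attribute op value), two-character operators tried first.
RELATION = re.compile(r"\(\s*(\w+)\s*(>=|<=|!=|=|>|<)\s*([^()\s]+)\s*\)")


def parse_relations(rsl: str):
    """Return the (attribute, operator, value) triples found in a flat request."""
    return RELATION.findall(rsl)


def satisfies(offer: dict, rsl: str) -> bool:
    """True if every relation on an attribute the offer describes holds."""
    for attribute, op, value in parse_relations(rsl):
        if attribute not in offer:
            continue  # e.g. executable: not a numeric property of the offer
        if not OPS[op](offer[attribute], float(value)):
            return False
    return True


request = "& (count>=5) (count<=10) (max_time=240) (memory>=64) (executable=myprog)"
print(satisfies({"count": 8, "memory": 128, "max_time": 240}, request))
print(satisfies({"count": 8, "memory": 32, "max_time": 240}, request))
```

The first offer meets every relation; the second fails the `memory>=64` clause.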

34 Running job as batch job
    globusrun -b -r belle.anu.edu.au '&(executable=/bin/date)(stdout=filename)'
  It prints a "handle" that you can use to interrogate the job while it is running:
    https://belle.anu.edu.au:4029/288/1116418550/
  Check job status:
    > globusrun -status https://belle.anu.edu.au:4029/288/1116418550/
  Terminate job execution:
    > globusrun -kill https://belle.anu.edu.au:4029/288/1116418550/
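
A script that drives these commands mostly needs to build the argument lists and carry the handle from the submit step to the later steps. The Python sketch below does only that and does not execute anything. The function names are illustrative, and `extract_handle` assumes the submit output contains the handle as a whitespace-separated token starting with `https://`, as in the example above.

```python
from typing import List


def submit_batch_argv(host: str, rsl: str) -> List[str]:
    """Argument list for a batch submission (globusrun -b -r host rsl)."""
    return ["globusrun", "-b", "-r", host, rsl]


def status_argv(handle: str) -> List[str]:
    """Argument list for checking the status of a submitted job."""
    return ["globusrun", "-status", handle]


def kill_argv(handle: str) -> List[str]:
    """Argument list for terminating a submitted job."""
    return ["globusrun", "-kill", handle]


def extract_handle(submit_output: str) -> str:
    """Return the first https:// token in the text printed by the submit step."""
    for token in submit_output.split():
        if token.startswith("https://"):
            return token
    raise ValueError("no job handle found in submit output")


handle = extract_handle("https://belle.anu.edu.au:4029/288/1116418550/\n")
print(status_argv(handle))
```

Each list could be handed to `subprocess.run` on a machine where the Globus client tools are installed.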

35 Disjunction: “|”
  For example:
    & (executable=myprog)
      ( | (&(count=5)(memory>=64))
          (&(count=10)(memory>=32)))
  Create 5 instances of myprog on a machine that has at least 64 MB of memory, or 10 instances on a machine with at least 32 MB of memory
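
Nesting "|" inside "&" turns the request into a small boolean tree. The Python sketch below represents that tree with nested tuples and evaluates it against a candidate offer. The tuple encoding, the `holds` function, and the offer dictionaries are assumptions made for illustration and are not a Globus data structure. In this sketch a relation on an attribute missing from the offer counts as not satisfied.

```python
import operator

OPS = {"=": operator.eq, "!=": operator.ne, ">": operator.gt,
       ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def holds(node, offer: dict) -> bool:
    """Evaluate ('&', [...]), ('|', [...]) or ('rel', attribute, op, value)."""
    kind = node[0]
    if kind == "&":
        return all(holds(child, offer) for child in node[1])
    if kind == "|":
        return any(holds(child, offer) for child in node[1])
    _, attribute, op, value = node
    return attribute in offer and OPS[op](offer[attribute], value)


# The example above: myprog with (5 instances, >=64 MB) or (10 instances, >=32 MB).
request = ("&", [
    ("rel", "executable", "=", "myprog"),
    ("|", [
        ("&", [("rel", "count", "=", 5), ("rel", "memory", ">=", 64)]),
        ("&", [("rel", "count", "=", 10), ("rel", "memory", ">=", 32)]),
    ]),
])

print(holds(request, {"executable": "myprog", "count": 10, "memory": 48}))
print(holds(request, {"executable": "myprog", "count": 5, "memory": 48}))
```

The first offer matches through the second branch. The second offer fails both branches, because 48 MB is too little for the 5-instance alternative.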

36 Multirequest: “+”
  A multi-request allows us to specify multiple resource needs, for example:
    + (& (count=5)(memory>=64) (executable=p1))
      (&(network=atm) (executable=p2))
  - Execute 5 instances of p1 on a machine with at least 64 MB of memory
  - Execute p2 on a machine with an ATM connection
  Multirequests are central to co-allocation
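
A co-allocator has to take a multi-request apart before it can hand each piece to a resource manager. The Python sketch below splits a "+" request into its top-level sub-requests by tracking parenthesis depth. The function name `split_multirequest` is illustrative and this is not the Globus co-allocation library. Quoted values that contain parentheses are not handled.

```python
from typing import List


def split_multirequest(rsl: str) -> List[str]:
    """Return the top-level parenthesized sub-requests of a '+' multi-request."""
    text = rsl.strip()
    if not text.startswith("+"):
        raise ValueError("not a multi-request: expected a leading '+'")
    parts: List[str] = []
    depth = 0
    start = 0
    for index, char in enumerate(text):
        if char == "(":
            if depth == 0:
                start = index + 1  # first character inside the outer parenthesis
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced parentheses")
            if depth == 0:
                parts.append(text[start:index].strip())
    if depth != 0:
        raise ValueError("unbalanced parentheses")
    return parts


multi = "+ (& (count=5)(memory>=64) (executable=p1)) (&(network=atm) (executable=p2))"
for sub_request in split_multirequest(multi):
    print(sub_request)
```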

37 Job Submission Interfaces
  Command-line programs for job submission:
  - globus-job-run: interactive jobs
  - globus-job-submit: batch/offline jobs
  - globusrun: flexible scripting infrastructure
  Other high-level interfaces:
  - General purpose: Nimrod-G, Condor-G, Gridbus Broker, PBS, etc.
  - Application specific: Web portals

38 globus-job-run
  For running interactive jobs
  Additional functionality beyond rsh
  - Ex: Run a 2-process job with executable staging
      globus-job-run -: host -np 2 -s myprog arg1 arg2
  - Ex: Run 5 processes across 2 hosts
      globus-job-run \
        -: host1 -np 2 -s myprog.linux arg1 \
        -: host2 -np 3 -s myprog.aix arg2
  - For a list of arguments, run:
      globus-job-run -help

39 globus-job-submit
  For running batch/offline jobs
  - globus-job-submit: submit job
      - Same interface as globus-job-run
      - Returns immediately
  - globus-job-status: check job status
  - globus-job-cancel: cancel job
  - globus-job-get-output: get job stdout/stderr
  - globus-job-clean: clean up after job
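
Because globus-job-submit returns immediately, a script normally polls the job until it finishes before fetching output and cleaning up. The Python sketch below shows such a polling loop with the status lookup injected as a function, so it runs without any Globus installation. The state names `PENDING`, `ACTIVE`, `DONE` and `FAILED` are assumed here to be what the status command reports. The function name `wait_for_job`, the fake replies, and the placeholder contact URL are illustrative.

```python
import time
from typing import Callable

TERMINAL_STATES = {"DONE", "FAILED"}  # assumption: states that end the job


def wait_for_job(contact: str,
                 get_status: Callable[[str], str],
                 poll_seconds: float = 5.0,
                 max_polls: int = 100,
                 sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll get_status(contact) until a terminal state or max_polls is reached."""
    for _ in range(max_polls):
        status = get_status(contact)
        if status in TERMINAL_STATES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"job {contact} still running after {max_polls} polls")


# Stand-in for running globus-job-status: replays a fixed sequence of states.
fake_replies = iter(["PENDING", "ACTIVE", "ACTIVE", "DONE"])
final_state = wait_for_job("https://example.invalid:1/0/0/",
                           get_status=lambda contact: next(fake_replies),
                           sleep=lambda seconds: None)
print(final_state)  # prints: DONE
```

In a real script, `get_status` would run globus-job-status on the contact string, and a `DONE` result would be followed by globus-job-get-output and globus-job-clean.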

40 Resource Brokers
  [Diagram: requests are refined as they pass through brokers to resource managers; an Information Service is also shown]
  - “Run a distributed interactive simulation involving 100,000 entities” -> DIS-Specific Broker
  - “Perform a parameter study involving 10,000 separate trials” -> Parameter study specific broker -> "..."
  - “Create a shared virtual space with participants X, Y, and Z” -> Collaborative environment-specific resource broker -> "..."
  - “Supercomputers providing 100 GFLOPS, 100 GB, < 100 msec latency” -> Supercomputer resource broker
  - “80 nodes on Argonne SP, 256 nodes on CIT Exemplar, 300 nodes on NCSA O2000” -> Simultaneous start co-allocator
  - “Run SF-Express on 80 nodes” -> Argonne Resource Manager
  - “Run SF-Express on 256 nodes” -> CIT Resource Manager
  - “Run SF-Express on 300 nodes” -> NCSA Resource Manager

41 Remote I/O and Data Access
  - Tell GRAM to pull executable from remote
  - Access files from a remote location
  - stdin/stdout/stderr from a remote location

42 What is GASS?
  - GASS file access API
      - Replace open/close with globus_gass_open/close; read/write calls can then proceed directly
  - RSL extensions
      - URLs used to name executables, stdout, stderr
  - Remote cache management utility
  - Low-level APIs for specialized behaviors

43 GASS File Naming
  URL encoding of resource names:
    https://quad.mcs.anl.gov:9991/~bester/myjob
    (protocol, server address, file name)
  Other examples:
    https://pitcairn.mcs.anl.gov/tmp/input_dataset.1
    https://pitcairn.mcs.anl.gov:2222/./output_data
    http://www.globus.org/~bester/input_dataset.2
  Supports http & https
  Supports ftp & gsiftp
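
The three parts of a GASS-style name map directly onto standard URL components. The Python sketch below uses the standard library's `urllib.parse.urlsplit` to pull them apart and checks the scheme against the four protocols listed above. The function names `describe` and `is_supported` are illustrative and not part of GASS.

```python
from urllib.parse import urlsplit

SUPPORTED_SCHEMES = {"http", "https", "ftp", "gsiftp"}  # as listed on the slide


def describe(url: str) -> dict:
    """Split a URL into the protocol, server address (host, port) and file name."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "server": parts.hostname,
        "port": parts.port,  # None when the URL gives no explicit port
        "file": parts.path,
    }


def is_supported(url: str) -> bool:
    """True if the URL's scheme is one of the protocols named on the slide."""
    return urlsplit(url).scheme in SUPPORTED_SCHEMES


print(describe("https://quad.mcs.anl.gov:9991/~bester/myjob"))
print(is_supported("https://pitcairn.mcs.anl.gov/tmp/input_dataset.1"))
```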

44 Example GASS Applications
  - On-demand, transparent loading of data sets
  - Caching of data sets
  - Automatic staging of code and data to remote supercomputers
  - (Near) real-time logging of application output to remote server

45 GASS File Access API
  Minimum changes to application
  - globus_gass_open(), globus_gass_close()
      - Same as open(), close() but use URLs instead of filenames
      - Caches URL in case of multiple opens
      - Return descriptors to files in local cache or sockets to remote server
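
The idea of opening by URL and caching repeated opens can be illustrated without the C library. The Python sketch below is an analogy only and does not call or reproduce `globus_gass_open()`. The class `UrlCache`, its `fetches` counter, and the `fake_fetch` stand-in are all hypothetical. The sketch never touches the network.

```python
import io
from typing import Callable, Dict


class UrlCache:
    """Open-by-URL with a local cache: each URL is fetched at most once."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch
        self._data: Dict[str, bytes] = {}
        self.fetches = 0  # how many times the remote side was contacted

    def open(self, url: str) -> io.BytesIO:
        """Return a readable file-like object for url, fetching only on a miss."""
        if url not in self._data:
            self._data[url] = self._fetch(url)
            self.fetches += 1
        return io.BytesIO(self._data[url])


def fake_fetch(url: str) -> bytes:
    """Stand-in for a remote transfer; returns deterministic bytes."""
    return ("contents of " + url).encode()


cache = UrlCache(fake_fetch)
url = "https://pitcairn.mcs.anl.gov/tmp/input_dataset.1"
first = cache.open(url)
second = cache.open(url)  # served from the cache, no second fetch
print(cache.fetches)  # prints: 1
```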

46 GASS File Access API (cont)
  Support for different access patterns:
  - Read-only (from local cache)
  - Write-only (to local cache)
  - Read-write (to/from local cache)
  - Write-only, append (to remote server)

47 GRAM & GASS
  1. Derive contact string
  2. Build RSL string
  3. Start up GASS server
  4. Submit request
  5. Return output
  [Diagram labels: globus-job-run, host name, contact string, command-line args, RSL string, GASS server, gatekeeper, jobmanager, program, stdout]

48 Example: A Simple Broker
  - Select machines based on availability
      - Use MDS queries to get current host loads
      - Look at output and figure out what machines to use
  - Generate RSL based on selection
      - globus-job-run -dumprsl can assist
  - Execute globusrun, feeding it the RSL generated in previous step
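
The three broker steps can be sketched in a few lines of Python. Here the host loads are given as a plain dictionary standing in for the result of the MDS queries, which are not shown. The host names, the function names `least_loaded` and `globusrun_argv`, and the choice of "lowest load first" as the selection rule are assumptions for illustration. The commands are built but not executed.

```python
from typing import Dict, List


def least_loaded(loads: Dict[str, float], wanted: int) -> List[str]:
    """Step 1: pick the `wanted` hosts with the lowest reported load."""
    return sorted(loads, key=lambda host: (loads[host], host))[:wanted]


def globusrun_argv(host: str, executable: str) -> List[str]:
    """Steps 2 and 3: build the RSL and the globusrun command that carries it."""
    rsl = f"&(executable={executable})"
    return ["globusrun", "-o", "-r", host, rsl]


# Hypothetical loads, as if gathered from an information service.
loads = {
    "node-a.example.org": 0.90,
    "node-b.example.org": 0.10,
    "node-c.example.org": 0.40,
}

chosen = least_loaded(loads, 2)
commands = [globusrun_argv(host, "/bin/date") for host in chosen]
for argv in commands:
    print(" ".join(argv))
```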

49 GRAM Components
  [Diagram repeated from slide 29: Client, MDS Grid Index Info Server, MDS Grid Resource Info Server, Gatekeeper, Job Manager, RSL Library, Local Resource Manager (e.g., PBS, Condor, or OS-fork()), Process, within the Globus Security Infrastructure]

50 MDS: Monitoring and Discovery Service
  General information infrastructure
  - Locate and determine characteristics of resources
  Locate resources
  - Where are resources with required architecture, installed software, available capacity, network bandwidth, etc.?
  Determine resource characteristics
  - What are the physical characteristics, connectivity, capabilities of a resource?

51 Examples of Useful Information
  - Characteristics of a compute resource
      - IP address, software available, system administrator, networks connected to, OS version, load
  - Characteristics of a network
      - Bandwidth and latency, protocols, logical topology
  - Characteristics of the Globus infrastructure
      - Hosts, resource managers

52 MDS
  Stores information in distributed directories
  - Directory stored in a collection of servers
  - Each server optimized for a particular function
  Directory can be updated by:
  - Information providers and tools
  - Applications (i.e., users)
  - Backend tools which generate info on demand
  Information dynamically available to:
  - Tools
  - Applications

53 Directory Service Functions
  - White Pages: look up the IP number, amount of memory, etc., associated with a particular machine
  - Yellow Pages: find all the computers of a particular class or with a particular property
  - Temporary inconsistencies may be okay
      - A distributed system may be imprecise about the state of a resource, until you actually use it
      - Information is often used as “hints”
      - Information itself can contain ttl, etc.
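
The two lookup styles and the "hint with a ttl" idea fit in a small in-memory model. The Python sketch below is not MDS or LDAP. The classes `Entry` and `Directory`, the attribute names, and the host names are hypothetical. Time is passed in explicitly so the behaviour is easy to follow: a record older than its ttl is simply not returned.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Entry:
    host: str
    attrs: Dict[str, object]
    published_at: float  # seconds, on any consistent clock
    ttl: float           # how long the record may be used as a hint

    def fresh(self, now: float) -> bool:
        return now - self.published_at <= self.ttl


class Directory:
    def __init__(self) -> None:
        self._entries: Dict[str, Entry] = {}

    def publish(self, entry: Entry) -> None:
        """Information providers add or overwrite a host's record."""
        self._entries[entry.host] = entry

    def white_pages(self, host: str, now: float) -> Optional[Dict[str, object]]:
        """Look up the attributes of one named machine, if the record is fresh."""
        entry = self._entries.get(host)
        return entry.attrs if entry is not None and entry.fresh(now) else None

    def yellow_pages(self, wanted: Callable[[Dict[str, object]], bool],
                     now: float) -> List[str]:
        """Find all hosts whose fresh record satisfies the predicate."""
        return sorted(host for host, entry in self._entries.items()
                      if entry.fresh(now) and wanted(entry.attrs))


directory = Directory()
directory.publish(Entry("big.example.org", {"memory_mb": 512, "os": "Linux"}, 0.0, 60.0))
directory.publish(Entry("old.example.org", {"memory_mb": 256, "os": "Linux"}, 0.0, 10.0))

print(directory.white_pages("big.example.org", now=30.0))
print(directory.yellow_pages(lambda attrs: attrs["os"] == "Linux", now=30.0))
```

At `now=30.0` the second record has outlived its 10-second ttl, so only `big.example.org` is returned by the yellow-pages query.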

54 GRAM Components
  [Diagram repeated from slide 29: Client, MDS Grid Index Info Server, MDS Grid Resource Info Server, Gatekeeper, Job Manager, RSL Library, Local Resource Manager, Process, within the Globus Security Infrastructure]

55 What users want?
  Grid Consumers
  - Execute jobs for solving problems of varying size and complexity
  - Benefit by selecting and aggregating resources wisely
  - Trade off timeframe and cost: minimize expenses
  Grid Providers
  - Contribute (“idle”) resources for consumer jobs
  - Benefit by maximizing resource utilization
  - Trade off local requirements & market opportunity: maximize return on investment

56 What’s Wrong with Cluster Methods?
  - They use a centralised policy that needs
      - complete state information
      - a common fabric management policy
    or a decentralised consensus-based policy
  - Too many heterogeneous parameters:
      - define a system-wide performance matrix?
      - define a common fabric management policy?
  - A “distributed computational economy”
      - proved successful in human economies
      - can leverage proven economic principles/techniques
      - can regulate demand and supply
      - offers incentive (money?) for being part of the grid!

57 Grid Economy: “Incentive” as a Design Parameter
  Grids aim at exploiting synergies that result from cooperation of autonomous distributed entities:
  - Creation of Virtual Organisations/Enterprises
  - Resource sharing
  - Aggregation of resources on demand
  For this cooperation to be sustainable, all participants need to have an (economic) incentive.
  Therefore, “incentive” mechanisms should be considered one of the key design parameters of Grid computing.

58 Gridbus Architecture Layer

59 Gridbus and Complementary Grid Technologies
  [Diagram, layered stack]
  Layers: Grid Applications; User-Level Middleware (Grid Tools); Core Grid Middleware; Grid Fabric Software; Grid Fabric Hardware
  Component labels:
  - Applications: Portals, Science, Commerce, Engineering, Collaboratories, …
  - Tools and brokers: Grid Brokers, X-Parameter Sweep Lang., Gridbus Data Broker, Nimrod-G, MPI, Workflow, Workflow Engine, ExcellGrid, Gridscape, GRIDSIM
  - Economy services: Grid Bank, Grid Exchange & Federation, Grid Market Directory, Grid Storage Economy, Grid Economy
  - Middleware: Globus, Unicore, NorduGrid, XGrid, Alchemi, …
  - Local managers and runtimes: Condor, SGE, PBS, Libra, Tomcat, JVM, .NET, PDB, CDB
  - Operating systems: AIX, Solaris, Windows, Linux, IRIX, OSF1, Mac
  - Hardware: Worldwide Grid

60 Putting them All Together: On Demand Assembly of Services
  [Diagram with numbered steps 1 to 12]
  Labels: Visual Application Composer, Application Code, Explore data, Grid Resource Broker, Job, Results, Results + Cost Info, Bill, Data Source (Instruments/distributed sources), Data Replicator (GDMP), ASP Catalogue, Data Catalogue, Grid Info Service, Grid Market Directory, GSP (Accounting Service), Gridbus GridBank, Grid Service Provider (GSP) (e.g., CERN), GSP (e.g., UofM), GSP (e.g., VPAC), GSP (e.g., IBM), Data, PE, CPU or PE, Cluster Scheduler, Grid Service (GS) (Globus), Alchemi, GS, GTS

61 Grid Brokers
  Perform a parameter sweep (bag of tasks), utilizing distributed resources, within “T” hours or earlier and at a cost not exceeding $M.
  Three options:
  - Using pure Globus commands
  - Build your own distributed app & scheduler
  - Use Nimrod-G / Gridbus (Resource Broker)

62 Remote Execution Steps
  - Choose Resource
  - Transfer Input Files
  - Set Environment
  - Start Process
  - Pass Arguments
  - Monitor Progress
  - Read/Write Intermediate Files
  - Transfer Output Files
  [Figure labels: Summary View, Job View, Event View, GRID]

63 Grid Service Broker (GSB)
  - Scheduling task farming (Data Grid apps) with static or dynamic parameter sweeps
  - Employ computational economy for selection of services, depending on quality, cost, and availability, and users’ requirements (deadline, budget)
  - A single window to manage & control experiment
  - Programmable task farming engine
  - Resource discovery and resource trading
  - Transportation of data & sharing of results
  - Accounting
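
To show what selecting services under a deadline and a budget can look like, here is a small Python sketch of a cost-minimizing plan for a bag of independent tasks. It is not the Nimrod-G or Gridbus Broker algorithm. The `Offer` fields, the prices, and the assumption that each provider runs its tasks one after another at a fixed time and price per task are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Offer:
    name: str
    hours_per_task: float  # assumption: tasks run sequentially on a provider
    price_per_task: float


def cheapest_plan(offers: List[Offer], tasks: int, deadline_hours: float,
                  budget: float) -> Optional[Tuple[Dict[str, int], float]]:
    """Fill the cheapest providers first, each up to what fits in the deadline.

    Returns (tasks per provider, total cost), or None if the tasks cannot all
    finish by the deadline or the cheapest feasible plan exceeds the budget.
    """
    plan: Dict[str, int] = {}
    remaining = tasks
    cost = 0.0
    for offer in sorted(offers, key=lambda o: (o.price_per_task, o.name)):
        if remaining == 0:
            break
        capacity = int(deadline_hours // offer.hours_per_task)
        take = min(capacity, remaining)
        if take > 0:
            plan[offer.name] = take
            cost += take * offer.price_per_task
            remaining -= take
    if remaining > 0 or cost > budget:
        return None
    return plan, cost


offers = [
    Offer("cheap", hours_per_task=2.0, price_per_task=1.0),
    Offer("mid", hours_per_task=1.0, price_per_task=3.0),
    Offer("fast", hours_per_task=0.5, price_per_task=5.0),
]
print(cheapest_plan(offers, tasks=10, deadline_hours=8, budget=30))
print(cheapest_plan(offers, tasks=10, deadline_hours=8, budget=20))
```

With an 8-hour deadline the cheapest provider can complete only 4 tasks, so the remaining 6 go to the next cheapest, for a total of 22. The same request with a budget of 20 has no feasible plan.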

64 Example Grid Schedulers
  - Nimrod-G (Monash University): Computational Grid & economic based
  - Condor-G (University of Wisconsin): Computational Grid & system centric
  - Gridbus Broker (Melbourne University): Data Grid & economic based

65 Key Steps in Grid Scheduling
  Phase I: Resource Discovery
    1. Authorization Filtering
    2. Application Definition
    3. Min. Requirement Filtering
  Phase II: Resource Selection
    4. Information Gathering
    5. System Selection
  Phase III: Job Execution
    6. Advance Reservation
    7. Job Submission
    8. Preparation Tasks
    9. Monitoring Progress
    10. Job Completion
    11. Clean-up Tasks

