
1 CoreGRID Summer School, Bonn, July 25, 2006
Resource Orchestration in Grids
Wolfgang Ziegler, Department of Bioinformatics, Fraunhofer Institute SCAI

2 Outline
o What is Orchestration and (why) do we need it?
o Overview of existing Brokers & MetaSchedulers
  - General Architecture
  - GridWay
  - EGEE Workload Manager Service
  - Condor-G
  - Nimrod & Co
  - Grid Service Broker
  - Calana
  - MP-Synergy
  - SUN N1 & SGE
  - MOAB
  - Platform CSF
  - KOALA
  - MetaScheduling Service
o Grid Scheduling Architecture
  - OGF Standardisation Activities
  - GSA Research Group
  - OGSA-BES & OGSA-RSS Working Groups
  - Examples (VIOLA-ISS / PHOSPHORUS / BODEGA / SAROS)
o VIOLA MetaScheduling Service
o What next

3 What is Orchestration and (why) do we need it?
o It is one of the Grid/Web buzz words
o Here is a definition of the term: Orchestration or arrangement is the study and practice of arranging music for an orchestra or musical ensemble. In practical terms it consists of deciding which instruments should play which notes in a piece of music.
o In the Web-Service domain the term Orchestration is used for the execution of specific business processes; WS-BPEL is a language for defining such processes
o In the Grid domain the term Orchestration is used for the coordination (usually including reservation) of multiple resources for use by one single application or a workflow
o No (de-facto) standards are available so far for description and management; WS-Agreement + JSDL + other term languages could become one (see the JSDL sketch below)
o The basic idea is: planning through negotiation for controlled (better) QoS instead of queuing for best effort
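To make the description problem concrete, here is a minimal sketch of a JSDL-style job description assembled in Python. The element names follow the JSDL 1.0 specification; the concrete executable path and resource values are hypothetical.

```python
# Minimal JSDL-style job description (sketch; values are hypothetical).
import xml.etree.ElementTree as ET

JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"          # JSDL 1.0 namespace
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"   # POSIX application extension

def make_job_definition(executable, cpu_count):
    """Build a JobDefinition document for a simple batch job."""
    job = ET.Element(f"{{{JSDL}}}JobDefinition")
    desc = ET.SubElement(job, f"{{{JSDL}}}JobDescription")

    app = ET.SubElement(desc, f"{{{JSDL}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX}}}Executable").text = executable

    res = ET.SubElement(desc, f"{{{JSDL}}}Resources")
    total = ET.SubElement(res, f"{{{JSDL}}}TotalCPUCount")
    ET.SubElement(total, f"{{{JSDL}}}Exact").text = str(cpu_count)
    return job

print(ET.tostring(make_job_definition("/usr/local/bin/simulate", 32), encoding="unicode"))
```

A description like this only says what the job needs; the orchestration question, which resources provide it and when, is what the negotiation protocols below address.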

4 Scenarios where Orchestration is needed
o Running applications with time constraints
  - Applications must deliver results at a certain time
  - Applications may not start before a fixed time
o Co-allocation: applications requiring more than one resource, e.g.
  - distributed visualisation
  - multi-physics simulations
  - data-intensive applications requiring more storage than locally available
  - dedicated QoS of network connections for distributed applications
o Workflows
  - Individual interdependent components

5 Example: What might happen without Orchestration
[Diagram: UNICORE Client, UNICORE Gateways, Primary NJS/NJS, TSIs and Local Schedulers with job queues at Site A and Site B]
- The user describes his Job
- The Job is passed to the UNICORE System
- The Primary NJS distributes the Job to all sites
- The Job is submitted to the local batch-queues of all systems
- The components of the Job are started depending on the state of the local batch-queues
- The quality of the network connections depends on the actual load

6 Constraints of Orchestration
o The process has to respect site autonomy and site policies
  - done through negotiation and use of the local RMS/scheduling systems
o Reservation on behalf of the requesting user
  - done through mapping to the local id of the user
o Without an SLA there is no guarantee at all; with an SLA guarantees are in place, but they may be cancelled
  - failure of resources or a service provider's decision to prefer another, e.g. more profitable, job may cause an unforeseen break of contract
  - if one SLA fails, what happens to the other ones?
o Penalties agreed upon as part of the SLA can cut one's losses
o Need to identify acting and responsible parties beforehand
  - e.g. broker, local scheduling system, adapter, other instance of service/resource provider, client-side process/component
o Need a tool/service to manage the orchestration - might be either local or remote
o Local RMS/schedulers must provide support for advance reservation

7 Crucial Properties of local Scheduling Systems
- Full backfill algorithm (see the sketch below)
- Estimation of worst-case start/stop for each job (preview)
- Node range specification
- Start time specification
- Special resource requirement specification
- Very low priority jobs (Z-jobs)
- Communication-friendly node allocation strategy
- Portable: available on different parallel machines
- Graphical user interface
- Status information available via WEB interface
- Priority scheme (project, resources, waited time)
- Reserved slots are fixed and are no longer subject to scheduling
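Backfilling is the key property here for orchestration: a job further back in the queue may be pulled forward as long as it does not delay the projected start of any earlier job or a fixed reservation. A minimal sketch of conservative backfilling, assuming a single homogeneous cluster and exact runtime estimates (all names are illustrative):

```python
# Sketch of conservative backfilling on one cluster (illustrative only).
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    nodes: int      # nodes requested
    runtime: int    # estimated runtime (minutes)

def nodes_free(allocations, total_nodes, time):
    used = sum(n for s, e, n in allocations if s <= time < e)
    return total_nodes - used

def fits(allocations, total_nodes, job, start):
    # usage only changes at allocation boundaries, so checking the start
    # plus every boundary inside the interval is sufficient
    checkpoints = [start] + [s for s, e, n in allocations
                             if start < s < start + job.runtime]
    return all(nodes_free(allocations, total_nodes, t) >= job.nodes
               for t in checkpoints)

def backfill_schedule(queue, total_nodes):
    """Give every job the earliest start that does not disturb jobs already
    placed (assumes each job fits on the machine at all)."""
    allocations, schedule = [], {}
    for job in queue:  # queue order = priority order
        candidates = sorted({0} | {e for s, e, n in allocations})
        start = next(t for t in candidates
                     if fits(allocations, total_nodes, job, t))
        allocations.append((start, start + job.runtime, job.nodes))
        schedule[job.name] = start
    return schedule

jobs = [Job("A", 8, 60), Job("B", 8, 30), Job("C", 2, 10)]
print(backfill_schedule(jobs, 10))   # C backfills beside A: {'A': 0, 'B': 60, 'C': 0}
```

Note how job C starts immediately in the gap left by A without delaying B; reserved slots would simply be entered as fixed allocations up front.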

8 Overview of Brokers & MetaSchedulers
[Diagram: possible components/interactions of a scheduling infrastructure for Global Grids (OGF GSA-RG)]

9 Overview - GridWay
Environment: Globus GT2.4, GT4
Features & Scope:
- Works on top of multiple local schedulers (LSF, PBS (Open, Pro), SGE, N1)
- Supports migration of jobs based on monitoring of job performance
- Support for self-adaptive applications (modification of requirements and migration request)
- Provides the OGF DRMAA API for local systems able to provide the DRMAA bindings (currently SGE and N1); see the sketch below
License: Open Source, GPL v2
Support: GridWay on-line support forum
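DRMAA is worth a concrete look, since it is the one OGF interface several of the systems in this overview share. A minimal sketch using the Python DRMAA binding, assuming the `drmaa` package and a configured DRMAA library on the submit host; the executable path is hypothetical:

```python
# Submit one job through DRMAA and wait for it to finish (sketch).
import drmaa

def run_job():
    with drmaa.Session() as session:                    # initialises the DRMAA session
        jt = session.createJobTemplate()
        jt.remoteCommand = "/usr/local/bin/simulate"    # hypothetical executable
        jt.args = ["--steps", "1000"]
        job_id = session.runJob(jt)
        info = session.wait(job_id, drmaa.Session.TIMEOUT_WAIT_FOREVER)
        session.deleteJobTemplate(jt)
        return info.exitStatus

if __name__ == "__main__":
    print("job finished with exit status", run_job())
```

The same client code runs unchanged against any local RMS that ships DRMAA bindings, which is exactly why metaschedulers expose it.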

10 Overview - EGEE Workload Manager Service
Environment: LCG, gLite (Globus GT2.4, GT4)
Features & Scope:
- Two modes of job-scheduling: push mode (jobs are submitted through Condor-G) and pull mode (the computational Grid takes the jobs from the queue)
- Eager or lazy policy for job scheduling: early binding to resources (one job/multiple resources) or matching against one resource becoming free (one resource/multiple jobs)
- Works on top of multiple local schedulers (LSF, PBS (Open, Pro), SGE, N1)
- Supports definition of workflows with specification of dependencies
- Support for VOs, accounting
License: Open Source license
Support: mailing lists and bug reporting tools

11 Overview - Condor-G
Environment: Globus GT2.4 - GT4, UNICORE, NorduGrid
Features & Scope:
- Fault-tolerant job-submission system; supports Condor's ClassAd match-making for resource selection (see the sketch below)
- Can submit jobs to local scheduling systems (Condor, PBS (Open, Pro), SGE, N1)
- Supports workflow interdependency specification through DAGMan
- Allows query of job status and provides callback mechanisms for termination or problems
- Provides the OGF DRMAA API for local systems with DRMAA bindings
License: Open Source
Support: free (mailing list) & fee-based (telephone, email)
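ClassAd match-making pairs a job advertisement with machine advertisements whose constraints are mutually satisfied, then picks the best-ranked match. A much-simplified sketch of the idea; real ClassAds are a full expression language, and the attributes and rank function here are illustrative:

```python
# Toy ClassAd-style match-making: both sides advertise attributes and a
# Requirements predicate; the highest-Rank mutually acceptable machine wins.
job = {
    "ImageSize": 2048,                                      # MB of memory needed
    "Requirements": lambda m: m["Memory"] >= 2048 and m["Arch"] == "X86_64",
    "Rank": lambda m: m["Mips"],                            # prefer faster machines
}

machines = [
    {"Name": "node01", "Memory": 4096, "Arch": "X86_64", "Mips": 3000,
     "Requirements": lambda j: j["ImageSize"] <= 4096},
    {"Name": "node02", "Memory": 1024, "Arch": "X86_64", "Mips": 5000,
     "Requirements": lambda j: j["ImageSize"] <= 1024},
]

def matchmake(job, machines):
    # symmetric match: job accepts machine AND machine accepts job
    acceptable = [m for m in machines
                  if job["Requirements"](m) and m["Requirements"](job)]
    return max(acceptable, key=job["Rank"], default=None)

best = matchmake(job, machines)
print(best["Name"] if best else "no match")   # -> node01
```

The symmetry is the point: providers keep their autonomy by advertising their own Requirements, which is the same site-policy constraint discussed on slide 6.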

12 Overview - Nimrod/G & Co
Environment: Globus, Legion, Condor
Features & Scope:
- Focused on parametric experiments
- Collaborates with multiple local schedulers (LSF, PBS (Open, Pro), SGE)
- Follows an economic approach based on auctioning mechanisms including resource providers and resource consumers (see the auction sketch below)
- API to write user-defined scheduling policies
License: Royalty-free license for non-commercial use. EnFuzion is a commercially licensed variant provided by Axceleon.
Support: Limited support by email
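Both Nimrod/G and Calana (two slides below) select resources economically: the scheduler announces a job, providers bid, and the cheapest bid that satisfies the user's constraints wins. A minimal sealed-bid sketch; the prices, deadline handling and bidding rule are invented for illustration:

```python
# Toy sealed-bid auction for one job (illustrative prices and policies).
from dataclasses import dataclass

@dataclass
class Bid:
    provider: str
    price: float        # cost offered for running the job
    finish_time: int    # promised completion time (hours from now)

def auction(bids, deadline, budget):
    """Auctioneer: accept the cheapest bid that meets deadline and budget."""
    feasible = [b for b in bids if b.finish_time <= deadline and b.price <= budget]
    return min(feasible, key=lambda b: b.price, default=None)

bids = [Bid("siteA", 12.0, 8), Bid("siteB", 9.5, 20), Bid("siteC", 15.0, 4)]
winner = auction(bids, deadline=10, budget=14.0)
print(winner)   # -> Bid(provider='siteA', price=12.0, finish_time=8)
```

siteB is cheaper but too slow and siteC too expensive, so the deadline/budget pair, not price alone, drives the outcome - the same trade-off the Grid Service Broker (next slide) schedules on.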

13 Overview - Grid Service Broker
Environment: Globus GT2.4, GT4, UNICORE in preparation
Features & Scope:
- MetaScheduler from the GridBus project supporting multiple heterogeneous local schedulers (Condor, PBS (Open, Pro), SGE)
- Interface to local systems either via SSH or GRAM
- Supports integration of user-defined custom schedulers
- Supports scheduling based on deadline and budget constraints
License: GNU Lesser General Public License

14 Overview - Calana
Environment: Globus Toolkit 4, UNICORE
Features & Scope:
- Agent-based MetaScheduler for research and commercial environments
- Follows an economic approach based on auctioning mechanisms including resource providers (auctioneer) and resource consumers (bidders)
- Collaboration with different local resources possible through implementation of appropriate agents
License: Research prototype used in the Fraunhofer Resource Grid

15 Overview - MP-Synergy
Environment: Relies on Globus Toolkit 2.4; Enterprise Grids
Features & Scope:
- Scheduling decisions based on various parameters, e.g. load of the local scheduling systems, data transfer time
- Accounting mechanism
- Supports other local schedulers (LSF, PBS (Open/Pro), UD GridMP, SGE, LoadLeveler, Condor)
- Job submission may respect availability of e.g. licenses
License: Proprietary commercial license
Support: Paid support by United Devices

16 Overview - SUN N1 & SGE
Environment: Stand-alone, can optionally be integrated with other Grid middleware
Features & Scope:
- Allows exchange of the built-in scheduler with a user-provided scheduler
- Supports advance reservation if the built-in scheduler is replaced by an appropriate scheduler
License: Proprietary commercial license for N1; SGE is Open Source
Support: Paid support by SUN for N1

17 Overview - MOAB Grid Scheduler
Environment: Stand-alone, can optionally rely on Globus Toolkit middleware for security and/or user account management; Enterprise Grids
Features & Scope:
- The bundle of MOAB Grid Scheduler, Torque, and MOAB Workload Manager builds a complete stack for computational Grids
- Supports other local schedulers (LSF, OpenPBS, SGE, N1)
- Supports advance reservation and query of reservations of various resources, e.g. hosts, software licenses, network bandwidth
- Local schedulers must support advance reservation
License: Proprietary commercial license; the Maui Cluster Scheduler (a limited variant) is available under a specific Open Source license
Support: Paid support by Cluster Resources Inc.

18 Overview - Platform CSF & CSF Plus
Environment: Globus GT4
Features & Scope:
- Coordinates communication among multiple heterogeneous local schedulers (LSF, OpenPBS, SGE, N1)
- Supports advance reservation and query of reservations of various resources, e.g. hosts, software licenses, network bandwidth
- Local schedulers must support advance reservation
- API to write user-defined scheduling policies
License: Open Source
Support: Paid support by Platform and other companies

19 Overview - KOALA
Environment: Globus Toolkit
Features & Scope:
- MetaScheduler of the Dutch DAS-2 multicluster system; supports co-allocation of compute and disk resources
- Collaboration with the local schedulers OpenPBS and SGE
- Local schedulers must support advance reservation
- Support for MPI jobs (MPICH-G2)
License:

20 Overview - MetaScheduling Service
Environment: UNICORE, GT4 in preparation
Features & Scope:
- MetaScheduling Web-Service supporting reservation, co-allocation and SLAs between the MetaScheduler and the client
- Collaboration with different local schedulers through adapters (EASY, PBS (Open, Pro); SGE in preparation)
- Local schedulers must support advance reservation
- Supports orchestration of arbitrary resources, e.g. compute resources and network; storage and licenses in preparation
- Multiple MSS may be organised hierarchically
- Support for MPI jobs (MetaMPICH)
- Support for workflows under work
- End-to-end SLAs between service provider and service consumer in the next version
License: First version used in the VIOLA Grid testbed, available for collaborating partners
Support: by email and bug reporting tools

21 Grid Scheduling Architectures (1)
Integrating Calana with other Schedulers
[Diagram 1: Calana submits jobs to another scheduler - Broker with PhastGrid Agent/Resource and an Adapter Agent in front of the other scheduler's Workload Manager]
[Diagram 2: Another scheduler submits jobs to Calana - Source Scheduler with a Unicore Agent in front of the Calana Broker and its PhastGrid Agent/Resource]

22 Grid Scheduling Architectures (2)
GridWay: Federation of Grid Infrastructures with GridGateWays (Grid4Utility Project)
[Diagram: two Globus Grid infrastructures (VO A and VO B), each with SGE, PBS and LSF clusters behind GRAM/MDS/RFT services and a GridWay VO meta-scheduler (CLI & API, GridWay core, scheduling module, execution/transfer/information drivers); a GridGateWay connects the infrastructures through standard protocols & interfaces (GT GRAM, OGSA BES, ...); users access via globus-job-run, Condor/G, Nimrod/G, ...]

23 Grid Scheduling Architectures (3)
VIOLA MetaScheduling Service
[Diagram: multi-level MetaScheduling]

24 Open Grid Forum Standardisation Activities (1)
o Grid Scheduling Architecture Research Group (GSA-RG), addressing
  - the definition of a scheduling architecture supporting all kinds of resources,
  - interaction between resource management and data management,
  - co-allocation and the reservation of resources, including the integration of user- or provider-defined scheduling policies
o Two sub-groups of the Open Grid Service Architecture Working Group:
  - Basic Execution Service Working Group (OGSA-BES-WG)
  - OGSA Resource Selection Services Working Group (OGSA-RSS-WG)
    - will provide protocols and interface definitions for the Selection Services portion of the Execution Management Services (components: CSG and EPS)
o Grid Resource Allocation Agreement Protocol Working Group (GRAAP-WG)
  - Addressing a proposed recommendation for Service Level Agreements
  - WS-Agreement template and protocol (see the sketch below)
  - Allows definition of Guarantee Terms, e.g. SLOs, Business Values, KPIs, Penalties
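To give a feel for what a WS-Agreement template carries, here is a sketch that assembles a minimal agreement skeleton in Python. The structure (Name, Context, Terms holding service description terms and guarantee terms) follows the WS-Agreement model; the namespace URI and all concrete values are assumptions for illustration, not taken from the slides or the specification.

```python
# Skeleton of a WS-Agreement-style template (structure per the WS-Agreement
# model; the namespace and all values are illustrative assumptions).
import xml.etree.ElementTree as ET

WSAG = "http://example.org/ws-agreement"   # placeholder namespace, not normative

def make_template():
    tpl = ET.Element(f"{{{WSAG}}}AgreementTemplate")
    ET.SubElement(tpl, f"{{{WSAG}}}Name").text = "co-allocation-demo"
    ET.SubElement(tpl, f"{{{WSAG}}}Context")          # parties, expiry time, ...
    terms = ET.SubElement(tpl, f"{{{WSAG}}}Terms")

    # Service description term: what is to be delivered (nodes, start time).
    sdt = ET.SubElement(terms, f"{{{WSAG}}}ServiceDescriptionTerm",
                        Name="compute", ServiceName="cluster-job")
    ET.SubElement(sdt, "NodeCount").text = "32"
    ET.SubElement(sdt, "StartTime").text = "2006-07-25T09:00:00Z"

    # Guarantee term: the SLO plus a business value (penalty) if violated.
    gt = ET.SubElement(terms, f"{{{WSAG}}}GuaranteeTerm", Name="on-time-start")
    ET.SubElement(gt, f"{{{WSAG}}}ServiceLevelObjective").text = \
        "job starts within 5 minutes of StartTime"
    bv = ET.SubElement(gt, f"{{{WSAG}}}BusinessValueList")
    ET.SubElement(bv, f"{{{WSAG}}}Penalty").text = "100 EUR per violation"
    return tpl

print(ET.tostring(make_template(), encoding="unicode"))
```

The split matters for orchestration: the service description terms are what the MetaScheduler negotiates, while the guarantee terms are what makes the result an SLA rather than a best-effort promise.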

25 Open Grid Forum Standardisation Activities (2)
o Execution Management Services (OGSA-EMS-WG), focusing on the creation of jobs
o Subset: Basic Execution Service Working Group (OGSA-BES-WG)

26 Resource Pre-selection
o Resource pre-selection is necessary to reduce the number of resources/service providers to negotiate with
o The RSS can exploit multiple criteria, e.g.
  - User/Application-supplied selection criteria
o The Orchestration Service focuses on negotiation, reservation and the resulting SLAs
o Final selection of resources from the set provided by the RSS, based on e.g. (see the sketch below)
  - Availability of resources
  - Costs depending on possible reservation times or computing environment
  - Costs caused by delay
  - Monitoring data from Grid monitoring services
o Ongoing or planned national/EU projects
  - VIOLA-ISS (pre-selection based on monitoring data of previous application runs)
  - PHOSPHORUS / BODEGA (pre-selection based on semantic annotation of applications)
  - SAROS (pre-selection based on actual Grid monitoring data)
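Pre-selection can be pictured as a filter-and-rank pass over candidate resources before any negotiation starts. A minimal sketch; the resource attributes and the cost model are invented for illustration:

```python
# Toy resource pre-selection: filter by hard criteria, rank by estimated cost.
candidates = [
    {"site": "siteA", "free_nodes": 64, "arch": "x86_64", "cost_per_hour": 2.0, "delay_h": 1},
    {"site": "siteB", "free_nodes": 16, "arch": "x86_64", "cost_per_hour": 1.2, "delay_h": 6},
    {"site": "siteC", "free_nodes": 128, "arch": "ppc64", "cost_per_hour": 0.8, "delay_h": 0},
]

def preselect(candidates, nodes, arch, runtime_h, delay_penalty=0.5, top=2):
    """Keep resources meeting the hard requirements, rank by total cost
    (reservation cost plus a penalty per hour of expected start delay)."""
    feasible = [c for c in candidates
                if c["free_nodes"] >= nodes and c["arch"] == arch]
    cost = lambda c: c["cost_per_hour"] * runtime_h + delay_penalty * c["delay_h"]
    return sorted(feasible, key=cost)[:top]

for c in preselect(candidates, nodes=32, arch="x86_64", runtime_h=10):
    print(c["site"])   # -> siteA (siteB lacks nodes, siteC has the wrong arch)
```

Only the short list that survives this pass goes into the comparatively expensive negotiation and reservation step.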

27 VIOLA MetaScheduling Service
o Developed in a German project for the evaluation of the next generation of NREN
o Focus on co-allocation and support for MPI applications
  - Compute resources (nodes of different geographically dispersed clusters)
  - End-to-end network bandwidth between cluster nodes
o Implements WS-Agreement for SLAs
o The negotiation protocol will be incorporated in an OGF draft (WS-Negotiation, extending the WS-Agreement protocol)

28 Allocation Agreement Protocol

29 MetaScheduler - Integration of local Schedulers
[Diagram: MetaScheduler connected via HTTPS/XML to adapters in front of the local schedulers at Site 1..n and to a network RMS adapter at the switch/router; the UNICORE client submits the WS-Agreement request and the job data of the partial jobs]
- Negotiation of timeslot & nodes with the local schedulers for each job (see the sketch below)
- UNICORE initiates the reservation and submits the job data
- UNICORE Client / MetaScheduler Service interface using the WS-Agreement protocol
- Interface MetaScheduler / Adapters based on HTTPS/XML (SOAP)
- Interface between MetaScheduler Service and local RMS implemented with the adapter pattern
- Authentication and communication between adapter and local scheduler via ssh
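The co-allocation step can be pictured as the MetaScheduler asking every adapter for its free slots and probing for a common start time. A minimal sketch, assuming each adapter returns a sorted list of free (start, end) windows in minutes; all names are illustrative:

```python
# Sketch: find the earliest start time all sites can honour for `duration`.
def earliest_common_start(windows_per_site, duration):
    """windows_per_site: for each site, a sorted list of free (start, end) windows."""
    # the earliest feasible start always coincides with some window start
    candidates = sorted({s for ws in windows_per_site for s, _ in ws})
    for t in candidates:
        if all(any(s <= t and t + duration <= e for s, e in ws)
               for ws in windows_per_site):
            return t
    return None   # no common slot; the MetaScheduler would renegotiate

site_a = [(0, 60), (120, 300)]
site_b = [(30, 200)]
print(earliest_common_start([site_a, site_b], duration=45))   # -> 120
```

Once a common slot is found, each adapter turns it into an advance reservation at its local RMS, which is why the slides keep insisting that local schedulers must support advance reservation.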

30 Example: What happens with an SLA
[Diagram: UNICORE Client, UNICORE Gateways and NJS/TSI chains at Site A and Site B, each local scheduler fronted by an adapter and job queue; MetaScheduler coordinating the adapters and the network RMS ARGON, which controls link usage]
- The user describes his Job
- MetaScheduling request (WS-Agreement template)
- MetaScheduler response (WS-Agreement)
- Adapter negotiations and reservations
- The Job is passed to the UNICORE System
- All components of the Job are started at the point in time agreed upon; at the same time the network connections are switched on

31 MetaScheduler - Running Jobs
- Request for MetaScheduling: the reservation is made at the local site
- UNICORE submits the job data and generates the UNICORE wrapper with the job data
- The local adapter generates a local wrapper for the MetaScheduler and for the execution of the UNICORE job
- The local adapter submits the job with the MetaScheduler wrapper
- The local scheduler generates a wrapper for the execution of the MetaScheduler wrapper
[Diagram: nesting of local wrapper, MetaScheduler wrapper and UNICORE wrapper at the local site]

32 Network Resource Management System
[Diagram: MetaScheduler talking to the local schedulers and the Network Resource Manager; Sites A, B, ..., n connected through routers with 1-2 GB/s links]
1) Reservation of required resources
- Submission of a reservation with QoS specification to the Network Resource Manager
- Acknowledgement of the reservation
2) Bind of IP addresses at run-time
- IP addresses are published at run-time of the job through the local adapter
- Bind of the IP addresses by the Network Resource Manager
- Without an explicit bind, the QoS parameters for the site-to-site interconnection are used

33 Network Resource Manager - Supported Features
[Figure: available bandwidth over time; reservations occupy slots between the current time and the end of the book-ahead time]
o Immediate reservation / advance reservation
  - Reservation: within the book-ahead timeframe (i.e. the timeframe in which the system manages future reservations)
  - Class: determines the QoS level
  - Network user: id of the user the QoS shall be guaranteed for
o Data for a reservation: Job ID, start time, duration, network user, list of 3-tuples {start-/endpoint, class} (see the sketch below)
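A book-ahead reservation table can be kept as a list of accepted (start, end, bandwidth) entries per link: a request is admitted if it lies inside the book-ahead horizon and the residual bandwidth never drops below the requested class over the whole interval. A minimal sketch, with invented capacities and class-to-bandwidth mapping:

```python
# Sketch of book-ahead admission control for one link (illustrative values).
CLASS_BW = {"gold": 1000, "silver": 500}   # Mbit/s per QoS class (assumption)

class Link:
    def __init__(self, capacity, book_ahead):
        self.capacity = capacity        # Mbit/s
        self.book_ahead = book_ahead    # how far into the future we accept (minutes)
        self.reservations = []          # accepted (start, end, bw) entries

    def used_at(self, t):
        return sum(bw for s, e, bw in self.reservations if s <= t < e)

    def admit(self, start, duration, cls, now=0):
        """Accept the reservation iff it lies inside the book-ahead window and
        capacity suffices at every point of [start, start+duration)."""
        bw, end = CLASS_BW[cls], start + duration
        if not (now <= start and end <= now + self.book_ahead):
            return False
        # usage changes only at reservation boundaries
        checkpoints = [start] + [s for s, e, b in self.reservations if start < s < end]
        if any(self.used_at(t) + bw > self.capacity for t in checkpoints):
            return False
        self.reservations.append((start, end, bw))
        return True

link = Link(capacity=1000, book_ahead=24 * 60)
print(link.admit(60, 120, "gold"))     # True: the link is still empty
print(link.admit(100, 30, "silver"))   # False: overlaps the gold reservation
```

A full request with several {start-/endpoint, class} tuples would simply run this admission test on every link of each requested end-to-end path.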

34 Network Resource Manager - Application Interface
Necessary functions (see the interface sketch below):
- ResourceAvailableAt (preview): returns time slots when a resource (end-to-end connection with QoS level) will be available
- Submit: start time, duration, class, start-/end-point (site), user; returns a resource identifier (RESID)
- Cancel: the Resource Manager frees the resources attached to the resource identifier (RESID)
- Status: returns the state of a connection (submitted, active, released, class, start time, end time, user, etc.)
- Bind: binding of IP addresses of nodes
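The five operations map naturally onto an abstract client interface. A sketch of how such a binding might look in Python; the method names follow the slide, everything else is an assumption:

```python
# Abstract client interface to the Network Resource Manager (sketch).
from abc import ABC, abstractmethod

class NetworkResourceManager(ABC):
    @abstractmethod
    def resource_available_at(self, endpoints, qos_class):
        """Preview: return free (start, end) slots for the end-to-end connection."""

    @abstractmethod
    def submit(self, start, duration, qos_class, endpoints, user):
        """Reserve the connection; return a resource identifier (RESID)."""

    @abstractmethod
    def cancel(self, resid):
        """Free all resources attached to the reservation."""

    @abstractmethod
    def status(self, resid):
        """Return the connection state (submitted/active/released) and its parameters."""

    @abstractmethod
    def bind(self, resid, ip_addresses):
        """Bind node IP addresses at run-time; without it, site-to-site QoS applies."""
```

Splitting preview (ResourceAvailableAt) from commitment (Submit) is what lets the MetaScheduler probe several resource managers for a common slot before reserving anything, as in the co-allocation sketch above.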

35 MetaScheduler - UNICORE Client Interface

36 What Next?
o Grid - SOA convergence
  - Supporting resources as services
  - Composition of small services needs agile and lightweight orchestration services
o Timing problems with dynamic apps, e.g. when doing parallel IO on demand
  - Currently high latency
o Full support for workflows - based on which description language?
o Semantic support
  - GSO: a scheduling ontology for automatic determination of scheduler capabilities and selection of an appropriate one

37 Acknowledgements
Some of the work presented in this lecture is funded by the German Federal Ministry of Education and Research through the VIOLA project under grant #01AK605F. This presentation also includes work carried out jointly within the CoreGRID Network of Excellence funded by the European Commission's IST programme under grant #004265.

