1 CoreGrid Summer School, Budapest, Hungary, 3-7 September 2007. Grid Interoperability Issues in Resource Management: Questions and Solutions. Attila Kertész, MTA SZTAKI, CoreGRID Institute on Resource Management and Scheduling
2 Overview. Introduction: heterogeneity in Grids -> need for interoperability. Solutions for Grid Interoperability: it can be targeted at different levels of Grid systems. Regarding resource management, we see 3 approaches: extending current Resource Management Systems (RMSs); interfacing RMSs from portals; developing a higher-level mediator to utilize RMSs. Conclusions and future directions.
3 Current situation and trends in Grid Computing. Fast evolution of Grid systems and middleware: Globus Toolkit (GT2->3->4), EGEE (LCG-2->gLite), UNICORE, … Many production Grid systems are built with them: EGEE (LCG-2, gLite), UK NGS (GT2), Open Science Grid (GT2, GT4), NorduGrid (~GT2). Although the same set of core services is available everywhere, they are implemented in different ways: certificate management, job submission, file management.
4 Grid Utilization
5 How to achieve Grid Interoperability? GRID architecture (figure): level 1: Operating Systems; level 2: Grid Middleware; level 3: Higher level services.
6 Which levels should we target? Establishing interoperability at levels 1-2 would be the smartest, but also the hardest. Level 3 is preferable, since it requires the fewest modifications to the main architecture.
7 How can we use existing Resource Managers for Grid Interoperability? Three possible directions at the resource management level of current grids: I. Enable Resource Brokers to access resources of different Grids. II. Interface different brokers from Portals. III. Enable communication among Resource Brokers, or coordinate them with a higher-level tool.
8 I. Extending Current RMSs. The most obvious way to provide interoperability among different grid systems is to extend the existing, widely used Resource Brokers with support for multiple grid middleware. This approach has both advantages and disadvantages. It would probably favor the users most, since they would not need to change their habits or submission methods. On the other hand, it requires considerable effort from the developers to interface new middleware services, so it is definitely a time-consuming solution. Moreover, the more systems a broker supports, the bulkier and harder to manage it becomes.
9 Related works. The Gridbus Grid Service Broker is designed for computational and data-grid applications. Although it supports all Globus middleware versions, UNICORE and NorduGrid, and provides an interface that can be implemented to support other middleware, it is mainly used in Globus grids. GridWay is being developed in a Globus incubation project; it therefore supports all Globus versions, and it also supports the EGEE middleware. JSS is a decentralized resource broker that is able to utilize both GT4 and NorduGrid resources. The UniGrids (GRIP) project aims at supporting interoperability through semantic matching of resource descriptions, enabling job submissions to Globus and UNICORE sites.
10 Demonstration: GTbroker. The first widespread and stable grid middleware was the Globus Toolkit 2. Since it lacked a Resource Broker, we developed a tool called GTbroker. It uses GT2 C API functions to interact with Globus resources and perform job submissions. To determine the available hosts in the grid it queries the MDS. Job submission to resources is done through GRAM, and a GASS server is used to transfer the files needed for the job to the remote host and to retrieve the result files, if there are any. These tools enable the broker to work without additional software on Globus grids (GT2, GT3 and pre-WS GT4). Since most of the current production grids use this kind of middleware, its simple adaptation made this broker relevant.
11 Extension to EGEE middleware. To extend an RMS to support other types of middleware, we need to learn how to interact with the new system. Brokers need to gather resource information, move files, perform job submissions, track job states and retrieve output files. Most of these activities need interaction with different middleware services. GTbroker was redesigned to support the LCG-2 (EGEE) middleware by modifying the following parts: the information query, so that it can gather data from the BDII, and the RSL, to which special attributes were added to enable job submission in EGEE VOs. Since file movement, job description and job state tracking can also be done through the same Globus services in LCG-2 grids, we did not modify these parts (for an entirely different middleware they would have to be changed as well).
12 First step towards Grid Interoperability. Figure labels: User Portal, GTbroker, VOCE (LCG-2), SEEGRID (LCG-2), NGS (GT2), Austrian Grid (GT4).
13 Comparison tests. To prove usability we evaluated broker usage on LCG-2 Grids (VOCE, SEEGRID). The brokers were invoked by scripts that handled multiple invocation, state checking, log gathering, and output staging back for the LCG-2 broker. We performed the tests in 4 phases, varying the job types and the number of jobs started at the same time.
14 LCG-2 broker usage. In EGEE the Workload Management System (WMS) is responsible for brokering. Job properties are given in JDL, resource information comes from the BDII, and job states come from Logging and Bookkeeping. Default matchmaking: only Production state resources are taken from the BDII, and the rank used in resource selection is the response time.
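The default matchmaking described on this slide can be sketched roughly as follows. This is only an illustration, not the WMS implementation: the `Resource` record, its field names and the state labels are hypothetical stand-ins for the data published in the BDII.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    # All fields are illustrative assumptions, not real BDII attribute names.
    name: str
    state: str                 # e.g. "Production" or "Draining"
    est_response_time: float   # estimated response time in seconds


def default_matchmaking(resources):
    """Keep only Production-state resources, best (lowest) response time first."""
    candidates = [r for r in resources if r.state == "Production"]
    return sorted(candidates, key=lambda r: r.est_response_time)
```

The first element of the returned list is the resource such a broker would try first; everything outside the Production state is never considered.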
15 GTbroker features. Quality of Service features: GTbroker uses an extended RSL file that should contain the user requirements and job properties. Regarding information systems: in Globus grids it queries the MDS, in LCG-2 grids the BDII. During matchmaking a ranked list is created from the matching resources found in the BDII. Fault tolerance is supported by resubmissions: should a job fail or stay pending for too long on a resource (this time interval can be set in the broker), the broker cancels it and resubmits it to another highly ranked resource.
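The resubmission policy on this slide can be illustrated with a small sketch. It assumes a hypothetical `submit` callable that hides the actual GRAM interaction and reports one of three outcomes; none of these names come from GTbroker itself.

```python
def run_with_resubmission(ranked_resources, submit, max_pending_s):
    """Try resources in rank order until one of them completes the job.

    `submit(resource, max_pending_s)` is a hypothetical callable returning
    "done", "failed" or "pending_timeout". On the last two outcomes the job
    is considered cancelled and is resubmitted to the next ranked resource.
    """
    attempts = []
    for resource in ranked_resources:
        outcome = submit(resource, max_pending_s)
        attempts.append((resource, outcome))
        if outcome == "done":
            return resource, attempts
    # Every ranked resource was tried without success.
    return None, attempts
```

Returning the attempt history alongside the chosen resource keeps the information that a log-gathering script would need.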
16 Test Phases. 1. phase: 20 small single and MPI jobs to VOCE; 2. phase: min jobs to both VOs, min MPI jobs to SEEGRID; 3. phase: min jobs to SEEGRID, 20 at a time, 5 min intervals; 4. phase: 60 ~15 min jobs to SEEGRID, 10 at a time, 4 min intervals.
17 2. phase results
18 3. phase results
19 All phase results
20 Test summary. Sometimes the LCG-2 broker selected slowly responding or even non-responding resources, and its resubmission did not always work. GTbroker made reliable resubmissions, and the hidden non-responding or draining resources were skipped. For jobs with short running times GTbroker produced better results; for larger jobs the two brokers performed about the same, but GTbroker was more reliable. Because GTbroker's matchmaking is eager, it usually sends most of the jobs to the same (best) resource. The user can set a random selection within a range of resources, but this can reduce performance.
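The difference between eager selection and random selection within a range can be shown with a short sketch. The function and its `top_n` parameter are assumptions made for illustration; the slides do not describe how GTbroker exposes this setting beyond the "random resource" RSL attribute.

```python
import random


def select_resource(ranked, top_n=1, rng=None):
    """Pick one resource from the top_n best-ranked candidates.

    top_n=1 reproduces eager selection (always the best resource);
    a larger top_n spreads jobs over several good resources.
    """
    if not ranked:
        raise ValueError("no matching resources")
    rng = rng or random.Random()
    window = ranked[:max(1, top_n)]
    return rng.choice(window)
```

Spreading jobs this way avoids overloading the single best resource, at the cost of sometimes choosing a lower-ranked one, which matches the trade-off noted in the summary.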
21 I. Conclusions. We have shown how additional middleware support can be achieved by redesigning an existing Resource Broker. The results show that existing resource brokers can be extended to use other middleware systems, but in this way developers need to redesign the system to support the services of the additional middleware.
22 II. Multi-broker Utilization. To exploit the advantages of various brokers and grids at the same time, we need to use several grid Resource Management Systems. In this situation we need to learn various job specification languages and broker capabilities. Grid portals are the currently available tools that try to hide the details of low-level middleware utilization by providing a transparent, uniform interface. In this kind of grid utilization we do not expect grid brokers to support more middleware, but to do their best on their own.
23 Related works. The well-known related works are Pegasus, GridFlow, the K-Wf Grid portal and the SPA portal of the HPC-Europa Project. Though the first 3 examples provide high-level access to grid services, they usually operate on only one middleware. The SPA is a portal component that enables brokers to be utilized through plug-in interfaces. These interface methods need to be supported by all brokers, providing the same abstract functionality; therefore, the broker would also have to be modified during integration. Only the P-GRADE Portal supports the execution of multi-grid workflows in both Globus-based and EGEE-based production Grids.
24 Demonstration: The P-GRADE Portal. General purpose, workflow-oriented computational Grid portal. Supports the development and execution of workflow-based Grid applications. Based on GridSphere-2. Easy to expand with new portlets (e.g. application-specific portlets). Easy to tailor to end-user needs. Support for multi-grid workflows.
25 What is a P-GRADE Portal workflow? A Directed Acyclic Graph, where: nodes represent jobs (batch programs to be executed on a computing element); ports represent the input/output files the jobs expect/produce; arcs represent file transfer operations. Semantics of the workflow: a job can be executed if all of its input files are available.
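The workflow semantics stated on this slide (a job can run once all of its input files are available) can be sketched as a small scheduler. The dictionary layout is a hypothetical representation chosen for the example, not the portal's internal workflow format.

```python
def runnable_jobs(jobs, available_files, finished):
    """Names of unfinished jobs whose input files are all available."""
    return sorted(
        name for name, job in jobs.items()
        if name not in finished and job["inputs"] <= available_files
    )


def run_workflow(jobs, initial_files):
    """Return an execution order that respects the file dependencies.

    `jobs` maps a job name to {"inputs": set, "outputs": set}.
    """
    available = set(initial_files)
    finished = []
    while len(finished) < len(jobs):
        ready = runnable_jobs(jobs, available, set(finished))
        if not ready:
            raise RuntimeError("remaining jobs have unavailable input files")
        for name in ready:
            finished.append(name)
            # Output files of a finished job become inputs for later jobs.
            available |= jobs[name]["outputs"]
    return finished
```

Jobs that become ready in the same round are independent of each other, so a real workflow manager could submit them in parallel.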
26 Defining broker jobs. The user can choose a broker for the job. No resource should be selected! Further requirements can be specified in job description editors, which have similar interfaces.
27 JDL and RSL Editor. Additional job-related requirements can be set in job description editors. JDL Editor: creates a JDL file for the WMS; the user can set JDL attributes such as Rank and Requirements, environment variables, … RSL Editor: creates an RSL file; basic and special RSL attributes can be set, such as random resource, skip time…
28 Workflow Execution. The P-GRADE Portal contains a DAGMan-based workflow manager subsystem. DAGMan decomposes workflows into elementary file transfer and job submission tasks, and schedules these tasks according to their dependencies. The submission is done by its pre/post scripts: when a broker is used for job submission, the pre script invokes the broker, and the post script waits until the execution is finished and provides information about the actual job status to the portal.
29 Broker invocation. The portal can invoke different brokers to reach resources of different Grids. While DAGMan schedules the workflow nodes, the brokers do the actual job submissions.
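The division of work between the pre and post scripts can be sketched as below. The broker object, its `submit` and `status` methods, and the status strings are hypothetical; `ScriptedBroker` is an in-memory stand-in so that the sketch runs without any grid middleware.

```python
import time


class ScriptedBroker:
    """Hypothetical stand-in broker that replays a fixed list of statuses."""

    def __init__(self, statuses):
        self._statuses = list(statuses)

    def submit(self, job_description):
        return "job-1"  # illustrative job handle

    def status(self, handle):
        # Return the next scripted status; repeat the last one afterwards.
        if len(self._statuses) > 1:
            return self._statuses.pop(0)
        return self._statuses[0]


def pre_script(broker, job_description):
    """Invoke the chosen broker and return its job handle."""
    return broker.submit(job_description)


def post_script(broker, handle, report, poll_s=0.0):
    """Poll until the job reaches a final state, reporting each status."""
    while True:
        status = broker.status(handle)
        report(status)  # e.g. forward the status to the portal
        if status in ("done", "failed"):
            return status
        time.sleep(poll_s)
```

In this arrangement the workflow manager only sees the final result of the post script, while the `report` callback is what keeps the portal's status display up to date.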
30 Second step towards Grid Interoperability. Figure labels: User; P-GRADE Portal; EGEE WMS, GTbroker, NorduGrid broker; NGS GT2, EGEE: VOCE / SEEGRID, SwissGrid; Manchester, Lausanne, Poznan, Budapest.
31 II. Conclusions. Portals provide uniform access to grids. Managing multiple brokers simultaneously in a transparent way seems to be a good solution for establishing Grid Interoperability. Though current portals provide transparent access to grids, users still need to set up workflows manually and choose an RMS for each job in the workflow. By examining the available brokers, users can learn their capabilities, but they still lack dynamic information such as the successful submission rate, the background load of the brokers' VOs, the reliability of the brokers and so on.
32 III. Meta-Brokering approaches. Users can have certificates to access several Grids or VOs. A new problem arises in this situation: which VO and which broker should be chosen for a specific application? Just as users needed Resource Brokers to choose proper resources within a VO, they now need a Meta-Brokering service to decide which broker (and VO) is the best for them, and also to hide the differences in utilizing them.
33 Related works. Meta-brokering is quite a new topic, though the need for interoperable grid networks has already been identified by different research groups. The InterGrid vision is to operate so-called Gateways communicating with IntraGrid RMs, which should be implemented in all the Grids participating in the network. This vision cannot be realized with current technologies. Researchers of the HPC-Europa Project and of the LA Grid Project have also considered taking steps towards meta-brokering. Both are thinking of an intercommunicating peer-to-peer architecture of their current RMSs, which also takes time and requires a redesign of their brokers.
34 Interacting with the Meta-broker. Figure labels: User, Meta-Broker, Grid X, Grid Y, VO 1, VO 2, VO 3, VO 4, with numbered interaction steps.
35 Languages of the Meta-Broker. Job Submission Description Language (JSDL): for specifying job requirements; extension for special attributes. Broker Property Description Language (BPDL): for storing the properties of the utilized brokers; updating the performance data of the brokers.
36 Languages of the Meta-Broker
37 JSDL extension – undefined attributes
39 BPDL – Data Model
42 Third step towards Grid Interoperability (architecture figure). Figure labels: User, Portal, ...; Meta-Broker Core: Parser, Matchmaker, Translator, Invoker, Information Collector, IS Agent; BPDL List, VO Load, MB Languages, MB Health; EGEE WMS, GTbroker, NorduGrid Broker; EGEE grid, GT2 grid, SwissGrid, ... Message labels: Job description (JSDL); Submission results or Broker name, its JDL; Job status, output.
43 Scenarios, a.) (figure). Figure labels: User or Portal; Meta-Broker Core: Parser, MatchMaker, Translator, Information Collector, IS Agent; BPDL List, VO Load, MB Languages, MB Health. Message labels: Job request (JSDL); Broker name/ID, Middleware/VO, its JobDL, proxyname; Broker ID, Middleware/VO, JobDL; BrokerID, Submission results.
44 Scenarios, b.) (figure). Figure labels: User; Meta-Broker Core: Parser, MatchMaker, Translator, Invoker, Information Collector, IS Agent; BPDL List, VO Load, MB Languages, MB Health; Grid Broker … Grid Broker. Message labels: Job description (JSDL), Input files; Submission result, Output files.
45 Components of the architecture I. The Meta-Broker is the core component: it communicates with the other components. The Translators are responsible for transforming the user request into the language of the selected broker (JSDL -> JDL, RSL, xRSL…). The Invokers hand over the job to the brokers and wait for the results. They also provide additional information about the submissions to the Information Collector.
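The role of a Translator can be illustrated with a deliberately small sketch. It assumes the JSDL document has already been parsed into a plain dictionary with the hypothetical keys `executable`, `arguments` and `stdout`, and it covers only a handful of JDL and RSL attributes; a real translator would handle far more of each language.

```python
def to_jdl(job):
    """Render the job as a minimal JDL description."""
    lines = [f'Executable = "{job["executable"]}";']
    if job.get("arguments"):
        lines.append('Arguments = "{}";'.format(" ".join(job["arguments"])))
    if job.get("stdout"):
        lines.append(f'StdOutput = "{job["stdout"]}";')
    return "\n".join(lines)


def to_rsl(job):
    """Render the job as a minimal GT2-style RSL string."""
    parts = [f'(executable={job["executable"]})']
    if job.get("arguments"):
        parts.append("(arguments={})".format(" ".join(job["arguments"])))
    if job.get("stdout"):
        parts.append(f'(stdout={job["stdout"]})')
    return "&" + "".join(parts)


# One translator per broker language; xRSL would be added the same way.
TRANSLATORS = {"JDL": to_jdl, "RSL": to_rsl}


def translate(job, target_language):
    """Translate the parsed job into the language of the selected broker."""
    return TRANSLATORS[target_language](job)
```

Keeping the translators in a lookup table means that supporting a new broker language only requires registering one more function, without touching the matchmaking or invocation code.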
46 Components of the architecture II. The Information Collector stores the properties of the connected brokers and historical data of the previous submissions. This information shows: whether the chosen broker is available and how reliable it is; what kind of jobs can be submitted to which broker (some brokers provide QoS agreements, some better data handling, …); what the current load is on the resources reachable through the utilized brokers. These load values are regularly updated by IS Agents.
47 Matchmaking. The Matchmaker compares the JSDL of the actual job to the BPDL of the registered resource brokers. First the basic attributes are matched against the basic properties: this selection determines the group of brokers that are able to submit the job. In the second phase only those brokers are kept that can fulfill the special requirement attributes of the job. Finally a priority list of the remaining brokers is created, taking into account the ranks (stored for the requested features) and the load of the underlying grid of each broker.
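The three matchmaking phases can be sketched as follows. The dictionary layout for jobs and brokers is hypothetical, and the scoring formula (sum of the stored ranks for the requested features minus the grid load) is an assumption made only to have something concrete; the slides do not specify how ranks and load are combined.

```python
def matchmake(job, brokers):
    """Return broker names in priority order for the given job.

    job:     {"basic": set of required basic attributes,
              "special": set of required special features}
    brokers: list of {"name": str, "basic": set,
                      "ranks": {feature: rank}, "load": float}
    """
    # Phase 1: basic attributes against basic broker properties.
    able = [b for b in brokers if job["basic"] <= b["basic"]]
    # Phase 2: keep brokers that offer every requested special feature.
    capable = [b for b in able if job["special"] <= set(b["ranks"])]

    # Phase 3: priority list from feature ranks and grid load (assumed formula).
    def score(broker):
        rank = sum(broker["ranks"][feature] for feature in job["special"])
        return rank - broker["load"]

    return [b["name"] for b in sorted(capable, key=score, reverse=True)]
```

An empty result means no registered broker can take the job, which a caller would report back to the user or portal instead of attempting a submission.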
48 Meta-Broker Utilization by Portals. Figure labels: User; P-GRADE Portal; Meta-Broker; EGEE WMS, GTbroker, NorduGrid broker; NGS GT2, EGEE: VOCE / SEEGRID, SwissGrid; Manchester, Lausanne, Poznan, Budapest.
49 III. Conclusions. The introduced meta-brokering approach opens a new way for interoperability support. The design and the architecture of the Grid Meta-Broker enable higher-level resource management by utilizing resource brokers of different grid middleware systems. This service can act as a bridge among the separated islands of the current production Grids; it therefore addresses Grid Interoperability at the level of resource management. We expect that, by integrating the Grid Meta-Broker into the portal, we will be able to provide better application execution with a simplified and more interoperable service in the future.
50 Grid Interoperability levels Meta Portal
51 Final Conclusions. We introduced three different approaches in current grid research that contribute to enabling Grid Interoperability at the level of Resource Management. We have also demonstrated, with a solution for each of these approaches, that interoperability can be achieved. Though the first two approaches, RMS extension and multi-brokering, offer interoperable utilization of heterogeneous resources, the final solution lies in the third approach. The meta-brokering approach opens a novel way for Grid Interoperability support. The presented Meta-Broker is a standalone Web Service that can serve both users and portals. We have shown how such a service can be realized based on the latest Web and OGF standards. The design and the architecture of the Grid Meta-Broker enable higher-level interoperable brokering by utilizing existing resource brokers of different grid middleware.
Thank You for Your Attention!