WP9 Resource Management Current status and plans for future Juliusz Pukacki Krzysztof Kurowski Poznan Supercomputing.


1 WP9 Resource Management Current status and plans for future Juliusz Pukacki Krzysztof Kurowski pukacki@man.poznan.pl kikas@man.poznan.pl Poznan Supercomputing And Networking Center

2 Introduction Final goal of WP9: GRMS, the GridLab Resource Management System. First prototype implementation: Scenario Broker.

3 Scenario Broker functionality Ability to choose "the best" resource for job execution, according to the Job Description and the chosen mapping algorithm; Ability to submit a Simple Job according to the provided Job Description; Ability to migrate a Simple Job to a better resource, according to the provided Job Description; Ability to cancel a job; Provides information about job status; Provides other information about a job (name of the host where the job is/was running, start time, finish time); Provides a list of candidate resources for job execution (according to the provided Job Description); Provides a list of jobs submitted by a given user; Ability to transfer input and output files (GridFTP, GASS).

4 Scenario Broker - overview [Diagram labels: Client (Application, Portal); Web Services Interface; Scenario Broker: Broker, Resource Discovery, Job Manager; Globus Infrastructure: Information System, GRAM, GridFTP, GASS]

5 Scenario Broker modules Broker: steering the process of job submission; choosing the best resources for job execution (scheduling algorithm); transferring input and output files for the job's executable. Resource Discovery: finding resources that fulfil the requirements described in the Job Description; providing the information about resources that is required for job scheduling.

6 Scenario Broker modules (2) Job Manager: ability to check the current status of a job; ability to cancel a running job; monitoring for status changes of a running job.

7 Job Description Job executable: file location; arguments; file arguments (files which have to be present in the working directory of the running executable); environment variables; standard input; standard output; standard error; checkpoint files.

8 Job Description (2) Resource requirements of the executable: name of the host for job execution (if provided, no scheduling algorithm is used); operating system required; local resource management system; minimum memory required; minimum number of CPUs required; minimum CPU speed; other parameters passed directly to Globus GRAM.
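The resource requirements on this slide amount to a matchmaking filter over known machines. The following is a minimal Python sketch of that idea, not the GRMS implementation: the `Resource` and `Requirements` classes and all their field names are hypothetical and are not taken from the GRMS Job Description schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    # Hypothetical description of a machine, as an information system might report it.
    host: str
    os: str
    memory_mb: int
    cpus: int
    cpu_mhz: int


@dataclass
class Requirements:
    # Hypothetical mirror of the "resource requirements" part of a Job Description.
    host: Optional[str] = None   # if set, no scheduling is performed
    os: Optional[str] = None
    min_memory_mb: int = 0
    min_cpus: int = 1
    min_cpu_mhz: int = 0


def candidates(req: Requirements, resources: List[Resource]) -> List[Resource]:
    """Return the resources that satisfy every stated requirement."""
    if req.host is not None:
        # A named host bypasses matchmaking, as the slide describes.
        return [r for r in resources if r.host == req.host]
    return [
        r for r in resources
        if (req.os is None or r.os == req.os)
        and r.memory_mb >= req.min_memory_mb
        and r.cpus >= req.min_cpus
        and r.cpu_mhz >= req.min_cpu_mhz
    ]
```

Calling `candidates(Requirements(os="Linux", min_memory_mb=128, min_cpus=2), pool)` would return the subset of `pool` that a scheduling algorithm could then rank.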

9 Job Description - example Linux 128 2 gsiftp://rage.man.poznan.pl/~/Apps/MyApp 12 abc gsiftp://rage.man.poznan.pl/~/Apps/appstdin.txt gsiftp://rage.man.poznan.pl/~/Apps/appstdout.txt

10 Collaboration Scenario Broker; Adaptive Components; Data Management; Information Services; Portals; Monitoring; Security

11 Collaboration - working Data Management (WP8): the broker can use the Replication System and the Data Transfer System. Adaptive Components (WP7): the broker gets additional parameters for job scheduling. Information Services (WP10): the broker uses the Information System to get information about resources.

12 Collaboration - started Security (WP6): work on scenarios of cooperation with the Authorization Service. Portals (WP4): discussion of interfaces.

13 Implementation Programming language: Java. Interface: GSI-enabled web service based on the Axis toolkit. System: components implemented in CORBA technology. Lower-level requirements: Globus 2.0 installed on managed machines; GridFTP; resources registered in the Information System (MDS).

14 RM strategies behind the GRMS We want to provide GRMS interfaces that let application developers as well as end-users focus on high-level application design without sacrificing application performance. To achieve this goal: we have to ‘somehow’ take into account application requirements, specific characteristics and end-user preferences during initial application scheduling (submission phase); we assume that application requirements can change during the execution phase, and we have to react ‘somehow’ to such application behaviour; we should also ‘somehow’ consider the preferences and objectives of administrators and resource owners, e.g. time-reservation approaches (research part of WP9).

15 Observations System-level schedules focus on throughput. Application-level schedules are not easily applied to new applications. Many end-users, applications and administrative domains are considered at the same time. We need a balance between specific and generic approaches to scheduling. GAT + GRMS = a bridge between application-level and system-level scheduling. Two phases: submission and execution.

16 Submission phase (1) Application requirements: an XML-based resource specification language as a flexible way to express specific application needs and requirements, including: hard constraints, e.g. OS = Linux, Mem > 512 MB, 4 CPUs, etc.; performance models, e.g. in the form of the AART model; analytical, test and empirical models, e.g. ET = 2.5(x * y) + CPU; end-user preferences, e.g. time-based ("application response time is very important for me") or cost-based ("I have a lot of free time (I am on vacation :-)), cost is very important for me").
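One way end-user preferences could enter resource selection is as a weight that trades estimated execution time against cost. The sketch below illustrates that idea only. It is not the GRMS method: the function name `rank_by_preference`, the `(estimated_time, cost)` inputs and the normalised weighted sum are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def rank_by_preference(
    offers: Dict[str, Tuple[float, float]],
    time_weight: float,
) -> List[str]:
    """Order hosts from best to worst for one user.

    offers maps a host name to (estimated_time, cost). Both values are
    hypothetical inputs that a performance model and a pricing policy
    would have to supply. time_weight in [0, 1] says how much the user
    cares about response time; the remainder goes to cost.
    """
    if not 0.0 <= time_weight <= 1.0:
        raise ValueError("time_weight must be between 0 and 1")
    if not offers:
        return []
    # Fall back to 1.0 when every value is zero, to avoid dividing by zero.
    max_time = max(t for t, _ in offers.values()) or 1.0
    max_cost = max(c for _, c in offers.values()) or 1.0

    def score(host: str) -> float:
        t, c = offers[host]
        # Normalise each criterion so that the weights are comparable.
        return time_weight * (t / max_time) + (1.0 - time_weight) * (c / max_cost)

    return sorted(offers, key=score)
```

A user for whom response time matters most would pass a `time_weight` close to 1, while the user "on vacation" would pass one close to 0.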

17 Submission phase (2) Matchmaking techniques: select a set of resources which meet the applications' requirements (in general, many applications are submitted at the same time and many resources are available). The scheduling problem is NP-hard: the number of possible solutions (schedules) increases exponentially with the size of the problem instance. A schedule with the best value of a chosen criterion, e.g. execution time, is selected (in this case application execution time is assumed to be known a priori to the scheduler). Criteria: Cmax, Avg Cmax, Tardiness, etc. Scheduling algorithms: complete enumeration or heuristic. (research) Time reservation (task workload is assumed to be known a priori to the scheduler); as a general assumption, an appropriate system, e.g. Maui or LSF, must be installed on the resources. (research) Multi-objective resource management (many parameters are assumed to be known a priori to the multi-objective scheduler).
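The complete-enumeration approach and the Cmax (makespan) criterion mentioned on this slide can be made concrete with a short sketch. It assumes, as the slide does, that execution times are known in advance. The function name `best_schedule` and the `exec_time[job][host]` table layout are hypothetical, and jobs placed on the same host are assumed to run one after another.

```python
from itertools import product
from typing import Dict, Tuple


def best_schedule(
    exec_time: Dict[str, Dict[str, float]],
) -> Tuple[Dict[str, str], float]:
    """Try every job-to-host assignment and keep the one with the smallest Cmax.

    exec_time[job][host] is the (assumed known) run time of job on host.
    Jobs assigned to the same host are assumed to run sequentially.
    """
    jobs = list(exec_time)
    hosts = sorted({h for times in exec_time.values() for h in times})
    best_assignment: Dict[str, str] = {}
    best_cmax = float("inf")
    # len(hosts) ** len(jobs) assignments: this is the exponential growth.
    for choice in product(hosts, repeat=len(jobs)):
        load = dict.fromkeys(hosts, 0.0)
        for job, host in zip(jobs, choice):
            # A host with no known run time for this job is treated as infeasible.
            load[host] += exec_time[job].get(host, float("inf"))
        cmax = max(load.values(), default=0.0)
        if cmax < best_cmax:
            best_assignment, best_cmax = dict(zip(jobs, choice)), cmax
    return best_assignment, best_cmax
```

With three jobs and two hosts there are only 8 assignments to check, but the count grows as hosts to the power of jobs, which is why heuristics are listed as the alternative.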

18 Execution phase Even an optimal schedule (e.g. a schedule with the shortest execution time) may need to be modified because resources as well as application requirements change dynamically. Rescheduling and adaptive techniques are desirable: Zakopane migration scenario: the first step on the painful road (e.g. how to estimate the cost of migration processes?); WP7 Adaptive Components: adaptive strategies that let applications use the given resources efficiently; WP8 Data Management: due to input/output file locality requirements (typical for data-intensive applications) it is often undesirable to transport or replicate all databases, files, etc. to all compute resources in a distributed grid environment; WP4 Portals: to visualize the GRMS functionality; WP6 Security, WP10, ...
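The slide's open question about estimating migration cost can be illustrated with a simple decision rule: migrate only if the job would finish sooner on the new resource once the cost of moving it is included. This is a sketch under stated assumptions, not the Zakopane scenario's actual logic. The function `should_migrate`, its parameters, and the cost model (checkpoint transfer time plus a fixed restart overhead) are all hypothetical.

```python
def should_migrate(
    remaining_here_s: float,
    remaining_there_s: float,
    checkpoint_size_mb: float,
    bandwidth_mb_per_s: float,
    restart_overhead_s: float = 0.0,
) -> bool:
    """Return True if migrating is expected to shorten the time to completion.

    Assumed cost model: time to ship the checkpoint files to the new
    resource plus a fixed restart overhead. All inputs are estimates
    that some monitoring or prediction component would have to supply.
    """
    if bandwidth_mb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    migration_cost_s = checkpoint_size_mb / bandwidth_mb_per_s + restart_overhead_s
    return remaining_there_s + migration_cost_s < remaining_here_s
```

For example, a job with 1000 s left locally and 400 s left on a faster host is worth moving if shipping a 500 MB checkpoint at 10 MB/s plus a 30 s restart costs only 80 s. A job with 100 s left locally is not worth moving for a 20 s gain.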

19 Plans Work on integration with other GridLab services: Authorization Service. Work with the client side: Portals, GAT API. Testing with GAT-enabled applications in the GridLab testbed. Extensions to the Scenario Broker in the context of resource management issues.

