GridFlow: Workflow Management for Grid Computing Kavita Shinde.


Outline
- Introduction
- Grid Resource Management
- Grid Workflow Management
- An Example Scenario
- Conclusion

Introduction
GridFlow
- given a set of workflow tasks and a set of resources, how do we map the tasks to Grid resources?
- a workflow management system developed at the University of Warwick
- developed on top of ARMS, an agent-based resource management system for Grid computing
- focus is on service-level scheduling and workflow management

Grid Resource Management
Three layers of resource management within the GridFlow system:
- Grid resource
  - a high-end computing or storage resource accessed remotely
  - multiprocessors, or clusters of workstations or PCs with large disk storage space
- Local Grid
  - multiple grid resources that belong to one organization
  - resources are connected by high-speed networks
- Global Grid
  - consists of all local Grids

Grid Resource Management
PACE
- a toolset for resource performance and usage analysis
- takes separate resource and application models as inputs and predicts the execution time of a task prior to run time
- scalability (execution time vs. level of parallelism) can be determined, which helps prevent over-occupying resources
- useful when trying to interleave sub-workflows as much as possible

Grid Resource Management
Titan
- a grid resource manager
- locates a suitable resource set and passes the sub-workflow to a local scheduler
- utilizes free processors to minimize idle time and improve throughput
- supported by the PACE performance prediction data

Grid Resource Management
ARMS
- main component: the agent
- an agent is the representative of a local grid at the global level of grid resource management
- agents cooperate with each other to find the available resources and their characteristics
- agents dispatch requests that cannot be satisfied locally to neighboring agents

Grid Workflow Management
The implementation of grid workflow management is carried out at multiple layers:
- Tasks
  - the basic building blocks of an application
  - e.g. MPI (Message Passing Interface) and PVM (Parallel Virtual Machine) jobs running on multiple processors
- Sub-workflows
  - a flow of closely related tasks that is to be executed in a predefined sequence on the grid resources of a local grid
  - there is usually significant communication between tasks
  - resource conflicts may occur when multiple sub-workflows require the same resource simultaneously
- Workflows
  - a flow of several different sub-workflows

- GridFlow user portal: provides a graphical user interface to compose workflow elements and access additional grid services
- LGSS: handles conflicts, since scheduled sub-workflows may belong to different workflows
- ARMS: represents a local Grid at the global level of Grid resource management, and conducts local Grid sub-workflow scheduling
- Globus MDS: provides information about the available resources on the Grid and their status
- Titan: utilizes performance data obtained from PACE for resource scheduling

Grid Workflow Management
GGWM (Global Grid Workflow Management)
- Simulation
  - takes place before a grid workflow is actually executed and produces the workflow schedule
  - returns the simulation results to the GridFlow portal for user agreement
- Execution
  - the workflow is executed according to the simulated schedule
  - the actual execution may differ because of the dynamic nature of the grid
  - on delays, the workflow is sent back to the simulation engine and rescheduled
- Monitoring
  - provides access to real-time status reports of task or sub-workflow execution

Global Grid Workflow Management
Scheduling Algorithm
- initialize all properties of each sub-workflow to null
- look for a schedulable sub-workflow, i.e. one whose pre-sub-workflows have all been scheduled
- set the start time of the chosen sub-workflow to the latest end time of its pre-sub-workflows
- submit the start time and the sub-workflow to a grid-level agent (ARMS), which finds a suitable local grid using LGSS

Global Grid Workflow Management
- ARMS reschedules the less critical sub-workflows
- the algorithm relies heavily on the simulation results of LGSS

- Workflow $W$: a set of sub-workflows $S_i$ $(i = 1, \dots, n)$
- $S_1$ and $S_n$: starting and ending points
- $p_i$: number of pre-sub-workflows of $S_i$
- $q_i$: number of post-sub-workflows of $S_i$
- $G$: global grid, a set of local grids $L_j$ $(j = 1, \dots, m)$
- $k$: true if the sub-workflow is scheduled, else false
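
The scheduling loop on the slides above can be sketched roughly as follows. This is a minimal illustration, not the GridFlow implementation: it uses plain numbers instead of fuzzy times, and `schedule_workflow`, `estimate_end`, and the sample workflow are hypothetical names standing in for the call to an ARMS agent and LGSS.

```python
def schedule_workflow(pre, estimate_end):
    """Schedule sub-workflows in dependency order.

    pre: dict mapping each sub-workflow to the list of its pre-sub-workflows.
    estimate_end(name, start) -> (local_grid, end_time); a hypothetical
    stand-in for submitting the sub-workflow to an ARMS agent / LGSS.
    """
    start, end, grid = {}, {}, {}
    scheduled = {s: False for s in pre}  # the property k from the notation
    while not all(scheduled.values()):
        # A sub-workflow is schedulable once all of its predecessors are scheduled.
        ready = [s for s in pre
                 if not scheduled[s] and all(scheduled[p] for p in pre[s])]
        if not ready:
            raise ValueError("cyclic dependency between sub-workflows")
        s = ready[0]
        # Start time = latest end time of the pre-sub-workflows (0 if none).
        start[s] = max((end[p] for p in pre[s]), default=0)
        grid[s], end[s] = estimate_end(s, start[s])
        scheduled[s] = True
    return start, end, grid


# Illustrative workflow: S1 -> {S2, S3} -> S4, with made-up durations.
pre = {"S1": [], "S2": ["S1"], "S3": ["S1"], "S4": ["S2", "S3"]}
durations = {"S1": 2, "S2": 3, "S3": 5, "S4": 1}
start, end, grid = schedule_workflow(
    pre, lambda s, t: ("L1", t + durations[s]))
print(start, end)
```

In the real system the end time would come back as a fuzzy time function from the local grid simulation rather than as a single number.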

Local Grid Sub-Workflow Scheduling
Scheduling Algorithm
- very similar to GGWM
- has to deal with multiple tasks that may belong to different workflows
- because of resource conflicts, the start time of the chosen task cannot be set directly to the latest end time of its pre-tasks
- executes the task with the higher priority first
- gives higher priority to a task that is possibly enabled earlier
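
The priority rule above can be illustrated with a small sketch. It assumes crisp (non-fuzzy) times and a single contested resource; `schedule_on_resource` and the task values are hypothetical and only show why a task's start time may be later than its enabling time.

```python
def schedule_on_resource(tasks):
    """Order tasks competing for one resource by enabling time.

    tasks: list of (name, enable_time, duration) tuples.
    Returns a list of (name, start_time, end_time) in execution order.
    """
    free_at = 0
    plan = []
    # The earlier-enabled task gets the higher priority.
    for name, enable, duration in sorted(tasks, key=lambda t: t[1]):
        # A task cannot start before it is enabled or before the resource is free.
        start = max(enable, free_at)
        free_at = start + duration
        plan.append((name, start, free_at))
    return plan


# Two tasks from different workflows that need the same resource.
print(schedule_on_resource([("A3", 5, 6), ("A4", 3, 12)]))
```

Here the later-enabled task has to wait until the resource is released, which is the conflict the fuzzy timing operations are used to estimate.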

Fuzzy Time Operations
- the LGSS and GGWM algorithms are implemented using fuzzy timing techniques
- fuzzy time function $\pi(\tau)$: gives a numerical estimate of the possibility that an event arrives at time $\tau$
- advantages: can be computed very fast, which makes it suitable for scheduling time-critical applications
- fuzzy time functions do not necessarily provide the best scheduling solution

$\pi_1(\tau) = 0.5(0, 2, 6, 7)$
$\pi_2(\tau) = (2, 4, 4, 6)$
- a: possibility distributions of $\pi_1$ and $\pi_2$
- b: latest arrival distribution of $\pi_1$ and $\pi_2$
- c: earliest enabling time
- d: operator min, the intersection of $\pi_1$ and $\pi_2$
- e: operator max, the union of $\pi_1$ and $\pi_2$
- f: sum of $\pi_1$ and $\pi_2$: $\min(0.5, 1)(0+2,\ 2+4,\ 6+4,\ 7+6) = 0.5(2, 6, 10, 13)$

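The sum operation from the slide above can be written down in a few lines. This sketch assumes the usual trapezoidal reading of the notation, where h(a, b, c, d) has height h on [b, c] and falls linearly to zero at a and d; the `FuzzyTime` class and its method names are illustrative, not part of GridFlow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FuzzyTime:
    """Trapezoidal possibility distribution h(a, b, c, d) (assumed representation)."""
    a: float
    b: float
    c: float
    d: float
    h: float = 1.0

    def __add__(self, other):
        # Sum: add the four points pairwise and keep the smaller height.
        return FuzzyTime(self.a + other.a, self.b + other.b,
                         self.c + other.c, self.d + other.d,
                         min(self.h, other.h))

    def possibility(self, t):
        """Possibility that the event arrives at time t."""
        if t < self.a or t > self.d:
            return 0.0
        if self.b <= t <= self.c:
            return self.h
        if t < self.b:
            return self.h * (t - self.a) / (self.b - self.a)
        return self.h * (self.d - t) / (self.d - self.c)


pi1 = FuzzyTime(0, 2, 6, 7, h=0.5)
pi2 = FuzzyTime(2, 4, 4, 6)
print(pi1 + pi2)  # the four points are (2, 6, 10, 13) with height 0.5
```

The min, max, latest and earliest operators need more care because their results are not always trapezoids, so they are left out of this sketch.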
An Example Scenario
- $W_1$, $W_2$: workflows
- $L_1$, $L_2$: local grids
- task $A_2$ of sub-workflow $S_3$ from $W_1$ is being executed
- $S_3$ from $W_2$ is to be scheduled
- there is a resource conflict between $A_3$ and $A_4$
- the schedule aims to find $\pi_{e_5}(\tau)$

An Example Scenario
- task enabling times: from pre-task end times
- task execution times: from the Titan system, supported by PACE functions
- $\pi_{a_3}(\tau) = (3, 5, 5, 7)$
- $\pi_{d_3}(\tau) = (5, 6, 7, 8)$
- $\pi_{a_4}(\tau) = (0, 3, 3, 5)$
- $\pi_{d_4}(\tau) = (10, 12, 14, 16)$
- $\pi_{d_5}(\tau) = (2, 5, 6, 9)$

An Example Scenario
Using LGSS:
- $\pi_{s_3}(\tau) = \min\{(3,5,5,7),\ \mathrm{earliest}\{(3,5,5,7), (0,3,3,5)\}\} = \min\{(3,5,5,7), (0,3,3,5)\} = 0.5(3,4,4,5)$
- $\pi_{s_4}(\tau) = \min\{(0,3,3,5),\ \mathrm{earliest}\{(3,5,5,7), (0,3,3,5)\}\} = \min\{(0,3,3,5), (0,3,3,5)\} = (0,3,3,5)$
- $\pi_{e1_3}(\tau) = \mathrm{sum}\{0.5(3,4,4,5), (5,6,7,8)\} = 0.5(8,10,11,13)$

An Example Scenario
- $\pi_{e1_4}(\tau) = \mathrm{sum}\{\mathrm{latest}\{0.5(8,10,11,13), (0,3,3,5)\}, (10,12,14,16)\} = \mathrm{sum}\{0.5(8,10,11,13), (10,12,14,16)\} = 0.5(18,22,25,29)$
- $\pi_{e2_4}(\tau) = \mathrm{sum}\{(0,3,3,5), (10,12,14,16)\} = (10,15,17,21)$
- $\pi_{e2_3}(\tau) = \mathrm{sum}\{\mathrm{latest}\{(10,15,17,21), 0.5(3,4,4,5)\}, (5,6,7,8)\} = \mathrm{sum}\{0.5(10,12.5,19,21), (5,6,7,8)\} = 0.5(15,18.5,26,29)$
- $\pi_{e_4}(\tau) = \max\{0.5(18,22,25,29), (10,15,17,21)\} = (10,15,17,29)$

An Example Scenario
- $\pi_{e_5}(\tau) = \mathrm{sum}\{(10,15,17,29), (2,5,6,9)\} = (12,20,23,38)$
- so $S_3$ from $W_2$ will most likely complete on local grid $L_1$ between 20 and 23
- this data is submitted to GGWM, which decides whether local grid $L_1$ should be allocated the sub-workflow $S_3$ from $W_2$
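
The sum and latest steps of this example can be re-derived mechanically. The sketch below is an approximation chosen to reproduce the numbers on the slides, not the authors' code: distributions are tuples `(h, a, b, c, d)`, and `latest` is assumed to truncate both inputs to the smaller height and then take the pointwise maximum. The start times and the union $\pi_{e_4}$ are taken from the slides as given.

```python
def fsum(x, y):
    """Sum: add the points pairwise, keep the smaller height."""
    return (min(x[0], y[0]),) + tuple(p + q for p, q in zip(x[1:], y[1:]))


def truncate(x, g):
    """Cut the trapezoid (h, a, b, c, d) down to height g <= h."""
    h, a, b, c, d = x
    r = g / h
    return (g, a, a + (b - a) * r, d - (d - c) * r, d)


def latest(x, y):
    """Assumed approximation: truncate to a common height, then pointwise max."""
    g = min(x[0], y[0])
    tx, ty = truncate(x, g), truncate(y, g)
    return (g,) + tuple(max(p, q) for p, q in zip(tx[1:], ty[1:]))


# Values from the slides, written as (h, a, b, c, d).
s3, s4 = (0.5, 3, 4, 4, 5), (1, 0, 3, 3, 5)
a4 = (1, 0, 3, 3, 5)
d3, d4, d5 = (1, 5, 6, 7, 8), (1, 10, 12, 14, 16), (1, 2, 5, 6, 9)

e1_3 = fsum(s3, d3)                # A3 runs first
e1_4 = fsum(latest(e1_3, a4), d4)  # A4 waits for A3 to finish
e2_4 = fsum(s4, d4)                # A4 runs first
e2_3 = fsum(latest(e2_4, s3), d3)  # A3 waits for A4 to finish
e4 = (1, 10, 15, 17, 29)           # union of e1_4 and e2_4, taken from the slide
e5 = fsum(e4, d5)
print(e1_3, e1_4, e2_4, e2_3, e5)
```

The middle two points of `e5`, 20 and 23, bound the interval in which completion is most possible.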

Conclusion
- the fuzzy timing technique provides a good solution to the conflict-solving problem that arises in grid workflow management
- results indicate that local and global grid workflow management can coordinate with each other to optimize workflow execution time and resolve conflicts of interest
- useful in highly dynamic grid environments, where large network latencies exist and application performance is difficult to predict accurately
- needs more flexible cooperation among different grid services and components, which poses security challenges