More on Adaptivity in Grids Sathish S. Vadhiyar Source/Credits: Figures from the referenced papers.

Fault-Tolerance, Malleability and Migration for Divide-and-Conquer Applications on the Grid Wrzesinska et al.

Fault-Tolerance, Malleability and Migration for Divide-and-Conquer Applications on the Grid  3 general classes of divisible applications  Master-worker paradigm – 1 level  Hierarchical master-worker grid system – 2 levels  Divide-and-conquer paradigm – allows the computation to be split up in a general way, e.g. search algorithms, ray tracing  The work presents mechanisms for handling processors that leave  Handling partial results from leaving processors  Handling orphan work  2 cases of processors leaving  Processors leave gracefully (e.g. when a processor reservation comes to an end)  Processors crash  Restructuring of the computation tree

Introduction  Divide-and-conquer  Recursive subdivision; after solving subproblems, their results are recursively combined until the final solution is reached  Work is distributed across processors by work-stealing  When a processor runs out of work, it picks another processor at random and steals a job from its work queue  After computing the job, the result is returned to the originating processor  Uses a work-stealing algorithm called CRS (Cluster-aware Random Stealing) that overlaps intra-cluster steals with inter-cluster steals (see the sketch below)
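To make the stealing mechanism concrete, here is a minimal sketch of random work stealing. All class and method names (Job, Worker, rememberStolen) are illustrative assumptions, not Satin's actual code; in a real Grid deployment steals are remote messages rather than direct queue accesses.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Random;

// Illustrative sketch of random work stealing (names are assumptions,
// not the paper's actual identifiers).
class Job {
    long id;          // unique job identifier
    Worker owner;     // processor to which the result must be returned
    boolean restarted;
}

class Worker {
    final int id;
    final Deque<Job> workQueue = new ArrayDeque<>();
    final Random rng = new Random();

    Worker(int id) { this.id = id; }

    // Called when the local queue runs dry: pick a random victim and steal.
    Job steal(Worker[] all) {
        Worker victim = all[rng.nextInt(all.length)];
        if (victim == this || victim.workQueue.isEmpty()) return null; // retry
        Job stolen = victim.workQueue.pollLast();   // steal from the "old" end
        if (stolen != null) {
            stolen.owner = victim;                  // result goes back to the victim
            victim.rememberStolen(stolen, id);      // victim records (job, thief id);
        }                                           // used for recovery (later slides)
        return stolen;
    }

    void rememberStolen(Job job, int thiefId) { /* stolen-jobs list, see below */ }
}
```

CRS refines this loop by issuing one asynchronous inter-cluster steal attempt while continuing to steal synchronously within the local cluster, so wide-area latency is overlapped with useful work.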

Malleability  Adding a new machine to a divide-and-conquer computation is simple  The new machine starts stealing jobs from other machines  When a processor leaves, the computation tree is restructured to reuse as many partial results as possible  How leaves are detected  The remaining processors are notified by the leaving processor (when processors leave gracefully)  Detected by the communication layer (for unexpected leaves)

Recomputing jobs stolen by leaving processors  Each processor maintains a list of jobs stolen from it and the processor IDs of the thieves  When processors leave  Each of the remaining processors traverses its stolen-jobs list and searches for jobs stolen by leaving processors  Such jobs are put back in the work queues of their owners, marked as “restarted”  Children of “restarted” jobs are also marked as “restarted” when they are spawned (see the sketch below)
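A minimal sketch of this recovery step, reusing the Job type from the previous sketch. StolenEntry, handleLeaving and the flag propagation in spawn are illustrative names for what the slide describes:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Sketch of recovery from leaving processors (illustrative names only).
class Recovery {
    record StolenEntry(Job job, int thiefId) {}

    final Deque<Job> workQueue = new ArrayDeque<>();
    final List<StolenEntry> stolenJobs = new ArrayList<>();

    // Invoked once the set of leaving/crashed processor ids is known.
    void handleLeaving(Set<Integer> leftIds) {
        Iterator<StolenEntry> it = stolenJobs.iterator();
        while (it.hasNext()) {
            StolenEntry e = it.next();
            if (leftIds.contains(e.thiefId())) {
                e.job().restarted = true;     // children will inherit this flag
                workQueue.addFirst(e.job());  // put back in the owner's queue
                it.remove();
            }
        }
    }

    // When a restarted job spawns children, the flag propagates.
    void spawn(Job parent, Job child) {
        child.restarted = parent.restarted;
        workQueue.addFirst(child);
    }
}
```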

Example (figure from the referenced paper; not reproduced in this transcript)

Example (Contd…) (figure from the referenced paper; not reproduced in this transcript)

Orphan Jobs  Orphan jobs: jobs stolen from leaving processors  Existing approaches  A processor working on an orphan job must discard the result, since it does not know where to return it  It would need to know the new address to which the result should be returned  Salvaging orphan jobs requires creating the link between the orphan and its restarted parent

Orphan Jobs (Contd…)  For each finished orphan job  Broadcast a small message containing the jobID of the orphan and the processorID that computed it  Abort unfinished intermediate nodes of orphan subtrees  Each processor stores the (jobID, processorID) tuples in a local orphan table (sketched below)
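A minimal sketch of the local orphan table, assuming some broadcast transport delivers the announcements; the names are not the paper's:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the local orphan table (illustrative names; broadcast
// transport abstracted away).
class OrphanTable {
    // jobID -> processor that holds the finished orphan's result
    private final Map<Long, Integer> table = new ConcurrentHashMap<>();

    // Called on every processor when an orphan announcement arrives.
    void onAnnouncement(long jobId, int processorId) {
        table.put(jobId, processorId);
    }

    // Consulted before recomputing a "restarted" job (next slide).
    Integer lookup(long jobId) {
        return table.get(jobId); // null if no finished orphan has this id
    }
}
```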

Orphan Jobs (Contd…)  When a processor tries to recompute a “restarted” job  It performs a lookup in the orphan table  If the jobIDs match, the processor removes the job from the work queue and puts it in its list of stolen jobs  It sends a message to the orphan owner requesting the result of the job  The orphan owner marks the job as stolen by the sender of the request  The link between the restarted parent and the orphaned child is thus restored  Reusing orphans improves the performance of the system (see the sketch below)
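Continuing the sketch, the recompute path consults the orphan table before doing any work. Job, OrphanTable and requestResult follow the earlier sketches and are illustrative, not the paper's actual code:

```java
import java.util.Deque;
import java.util.List;

// Sketch (continued): reuse a finished orphan instead of recomputing it.
class RestartHandler {
    final OrphanTable orphans = new OrphanTable();

    // Returns true if the restarted job could be linked to a finished orphan.
    boolean tryReuseOrphan(Job job, Deque<Job> workQueue, List<Job> stolenJobs) {
        Integer holder = orphans.lookup(job.id);
        if (holder == null) return false;  // no orphan: recompute normally
        workQueue.remove(job);             // do not recompute locally
        stolenJobs.add(job);               // treat it as if stolen by 'holder'
        requestResult(holder, job.id);     // the orphan owner then marks the job
        return true;                       // as stolen by us, restoring the
    }                                      // parent-child link

    void requestResult(int processorId, long jobId) { /* send message (abstracted) */ }
}
```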

Example (figure from the referenced paper; not reproduced in this transcript)

Partial Results on Leaving Processors  If a processor knows it has to leave:  It chooses another processor at random  It transfers all results of finished jobs to that processor  The jobs are then treated as orphan jobs  The processor receiving the finished jobs broadcasts a (jobID, processorID) tuple for each of them  Partial results are thereby linked to the restarted parents (see the sketch below)
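A minimal sketch of the graceful-leave handler on the departing processor, with the network transport abstracted away (names are assumptions):

```java
import java.util.List;
import java.util.Random;

// Sketch of the graceful-leave handler (illustrative names only).
class GracefulLeave {
    final Random rng = new Random();

    void leave(List<Job> finishedJobs, int[] remainingIds) {
        // Pick a random surviving processor to adopt the finished results.
        int target = remainingIds[rng.nextInt(remainingIds.length)];

        // Transfer all finished results; the receiver treats them as orphans
        // and announces a (jobID, processorID) tuple per job.
        for (Job job : finishedJobs) {
            sendResult(target, job);
        }
    }

    void sendResult(int processorId, Job job) { /* network send (abstracted) */ }
}
```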

Special Cases  Master leaving – a special case; the master owns the root job, which was not stolen from anyone  The remaining processors elect a new master, which respawns the root job  The new run reuses partial results of orphan jobs from the previous run  Adding processors  A new processor downloads an orphan table from one of the other processors  Orphan-table requests are piggybacked on steal requests  Message combining  One small (broadcast) message would have to be sent for each orphan and for each computed job on the leaving processor  These messages are combined (sketched below)
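As an illustration of the message-combining point above, a sketch that batches the per-orphan tuples into a single broadcast (names and the group-send transport are assumptions):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of message combining: batch the per-orphan (jobID, processorID)
// announcements into one broadcast instead of one message per orphan.
class CombinedAnnouncements {
    record Tuple(long jobId, int processorId) {}

    private final List<Tuple> pending = new ArrayList<>();

    void add(long jobId, int processorId) {
        pending.add(new Tuple(jobId, processorId));
    }

    // One broadcast for the whole batch.
    void flush() {
        broadcast(List.copyOf(pending));
        pending.clear();
    }

    void broadcast(List<Tuple> batch) { /* group send (abstracted) */ }
}
```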

Results  3 types of experiments  Overhead when no processors are leaving  Comparison with the traditional approach that does not save orphans  Demonstration that the mechanism can be used for efficient migration of the computation  Testbeds  DAS-2 system – 5 clusters at five Dutch universities  European GridLab – 24 processors at 4 sites in Europe  8 in Leiden and 8 in Delft (DAS-2)  4 in Berlin  4 in Brno

Overhead during Normal Execution  4 applications run on a system with and without the mechanisms  RayTracer, TSP, SAT solver, Knapsack problem  The overhead is negligible

Impact of Salvaging Partial Results  RayTracer application  2 DAS-2 clusters with 16 processors each  One cluster was removed in the middle of the computation, after half of the time the run would take on 2 clusters without processors leaving  Comparison of  The traditional approach (without saving partial results)  Recomputing trees when processors leave unexpectedly  Recomputing trees when processors leave gracefully  Runtime on 1.5 clusters (16 processors in one cluster and 8 processors in the other)  The difference between the last two gives the overhead of transferring the partial results from the leaving processors and the work lost because of the leaving processors

Results (figure from the referenced paper; not reproduced in this transcript)

Migration  One cluster was replaced with another  RayTracer application on 3 clusters  In the middle of the computation, one cluster was gracefully removed and another identical cluster was added  Comparison with a run without migration  Overhead of migration – 2%

References  Jon B. Weissman. Predicting the cost and benefit of adapting data parallel applications in clusters. Journal of Parallel and Distributed Computing, 62(8), August 2002.  G. Wrzesinska, R. V. van Nieuwpoort, J. Maassen, and H. E. Bal. Fault-Tolerance, Malleability and Migration for Divide-and-Conquer Applications on the Grid. Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS), April 2005.

Predicting the Cost and Benefit of Adapting Data Parallel Applications in Clusters – Jon Weissman  Library of adaptation techniques  Migration  Involves remote process creation followed by transmission of the old worker’s data to the new worker  Dynamic load balancing  Collecting load indices, determining the redistribution and initiating data transmission  Addition or removal of processors  Followed by data transmission to maintain load balance  Library calls to detect and initiate adaptation actions within the applications  An adaptation event is sent from an external detector to all workers (a hypothetical call pattern is sketched below)
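The slides do not show the library's actual API; the sketch below is a purely hypothetical call pattern for a worker that polls for adaptation events from the external detector between iterations. None of these names come from Weissman's library.

```java
// Hypothetical call pattern only (all names are assumptions): a worker loop
// that checks for an adaptation event from the external detector and reacts
// between iterations.
class AdaptiveWorker {
    enum Event { MIGRATE, REBALANCE, ADD_PROCESSORS, REMOVE_PROCESSORS }

    // Non-blocking check for an event from the external detector (assumption).
    Event pollAdaptationEvent() { return null; /* transport abstracted */ }

    void computeLoop() {
        while (moreWork()) {
            doOneIteration();
            Event ev = pollAdaptationEvent();
            if (ev == null) continue;
            switch (ev) {
                case MIGRATE -> migrate();        // remote process creation, then
                                                  // transfer of the old worker's data
                case REBALANCE -> redistribute(); // use collected load indices
                case ADD_PROCESSORS,
                     REMOVE_PROCESSORS -> repartition(); // then transfer data to
                                                         // restore load balance
            }
        }
    }

    boolean moreWork()    { return false; }
    void doOneIteration() {}
    void migrate()        {}
    void redistribute()   {}
    void repartition()    {}
}
```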