November 19th, GDS meeting, LIP6, Paris
Slide 1: Hierarchical Synchronization and Consistency in GDS
Sébastien Monnet, IRISA, Rennes

Slide 2: JuxMem Consistency Protocol: Currently Home-Based
- Home node: a node responsible for a piece of data
- Actions on the piece of data require communication with the home node
- (Figure: a client communicating with the home node)
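
To make the home-based scheme concrete, below is a minimal, single-process Java sketch of the idea that every access to a piece of data funnels through its home node. The names (HomeNode, Client, read, write) are illustrative assumptions rather than JuxMem's API, and remote communication is reduced to direct method calls.

    import java.util.HashMap;
    import java.util.Map;

    class HomeNode {
        private final Map<String, byte[]> store = new HashMap<>();

        // Every read of a piece of data goes through its home node.
        synchronized byte[] read(String dataId) {
            return store.get(dataId);
        }

        // Every write does too: the home node always holds the reference copy.
        synchronized void write(String dataId, byte[] value) {
            store.put(dataId, value);
        }
    }

    class Client {
        private final HomeNode home;  // in JuxMem this would be a remote peer, not a local object

        Client(HomeNode home) { this.home = home; }

        byte[] get(String dataId)         { return home.read(dataId); }
        void put(String dataId, byte[] v) { home.write(dataId, v); }
    }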

Slide 3: Replicated Home
- The home node is replicated to tolerate failures
- Thanks to active replication, all replicas are kept up to date

Slide 4: Replication
- Two-layered architecture
- Replication based on classical fault-tolerant distributed algorithms
  - Implies a consensus among all nodes
- Need for replicas in several clusters (locality)
- (Figure: layer stack: communications, failure detector, consensus, group communication and group membership, atomic multicast, adapter; the fault-tolerance and consistency layers are joined by a junction layer)
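
Slides 3 and 4 together describe active replication built on top of an atomic multicast layer: every replica of the home applies the same updates in the same delivery order, so all copies stay up to date. The Java sketch below illustrates that idea; the AtomicMulticast interface and the key=value update encoding are assumptions made for brevity, not the actual group-communication API.

    import java.util.HashMap;
    import java.util.Map;
    import java.util.function.Consumer;

    // Assumed interface: delivers every broadcast message in the same total
    // order on every member of the replica group.
    interface AtomicMulticast {
        void broadcast(String update);
        void onDeliver(Consumer<String> handler);
    }

    class ReplicatedHome {
        private final Map<String, String> store = new HashMap<>();
        private final AtomicMulticast group;

        ReplicatedHome(AtomicMulticast group) {
            this.group = group;
            // Active replication: each replica applies the same updates in the
            // same delivery order, so no copy of the home ever lags behind.
            group.onDeliver(update -> {
                String[] kv = update.split("=", 2);
                store.put(kv[0], kv[1]);
            });
        }

        // Writes are not applied locally first; they are broadcast and every
        // replica, including this one, applies them on delivery.
        void write(String key, String value) {
            group.broadcast(key + "=" + value);
        }

        String read(String key) {
            return store.get(key);
        }
    }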

Slide 5: Hierarchical Organization
- GDG: Global Data Group
- LDG: Local Data Group
- (Figure: a client attached to a local data group; the local data groups are federated into a global data group)
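
One way to picture the GDG/LDG hierarchy as a data structure: one local data group per cluster, all of them federated into a single global data group per piece of data. The Java classes below only illustrate this layout; the field names are assumptions.

    import java.util.List;

    class LocalDataGroup {
        String clusterId;        // one LDG per cluster
        List<String> providers;  // the replicas of the data inside that cluster

        LocalDataGroup(String clusterId, List<String> providers) {
            this.clusterId = clusterId;
            this.providers = providers;
        }
    }

    class GlobalDataGroup {
        String dataId;
        List<LocalDataGroup> localGroups;  // the GDG federates one LDG per cluster

        GlobalDataGroup(String dataId, List<LocalDataGroup> localGroups) {
            this.dataId = dataId;
            this.localGroups = localGroups;
        }
    }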

Slide 6: Synchronization Point of View
- Naturally similar to data management
- One lock per piece of data
- Pieces of data are strongly linked to their locks
- (Figure: a client interacting with the synchronization manager (SM))
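
Because each piece of data is strongly linked to its lock, the synchronization manager can be sketched as a map handing out one lock per data identifier, with the read/write distinction matching the non-exclusive/exclusive accesses (acquireR/acquire) summarized on slide 11. This is a minimal local sketch with assumed names, not JuxMem's actual manager.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.locks.ReentrantReadWriteLock;

    class SynchronizationManager {
        // The lock lives next to the data it protects: acquiring the lock and
        // accessing the data go through the same (replicated) manager.
        private final Map<String, ReentrantReadWriteLock> locks = new ConcurrentHashMap<>();

        ReentrantReadWriteLock lockFor(String dataId) {
            return locks.computeIfAbsent(dataId, id -> new ReentrantReadWriteLock());
        }

        void acquire(String dataId)  { lockFor(dataId).writeLock().lock();   }  // exclusive
        void acquireR(String dataId) { lockFor(dataId).readLock().lock();    }  // non-exclusive
        void release(String dataId)  { lockFor(dataId).writeLock().unlock(); }
        void releaseR(String dataId) { lockFor(dataId).readLock().unlock();  }
    }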

Slide 7: Synchronization Point of View
- The synchronization manager is replicated in the same way as the home node

Slide 8: Synchronization Point of View
- (Figure: a client and the replicated, hierarchical synchronization manager)

Slide 9: In Case of Failure
- Failure of a provider (group member)
  - Handled by the proactive group membership: the faulty provider is replaced by a new one
- Failure of a client
  - Holding a lock: regenerate the token
  - Not holding a lock: do nothing
- Failure of a whole local group
  - Very low probability
  - Handled as if it were a failed client (which is how it appears to the global group)
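
The failure cases above boil down to a small decision procedure. The sketch below restates them in Java; the enum values and the replaceProvider/regenerateToken helpers are hypothetical names used purely for illustration.

    enum FailedEntity { PROVIDER, CLIENT_WITH_LOCK, CLIENT_WITHOUT_LOCK, LOCAL_GROUP }

    class FailureHandler {
        void handle(FailedEntity who) {
            switch (who) {
                case PROVIDER:
                    // The proactive group membership replaces the faulty provider.
                    replaceProvider();
                    break;
                case CLIENT_WITH_LOCK:
                    // The lock token died with the client: regenerate it.
                    regenerateToken();
                    break;
                case CLIENT_WITHOUT_LOCK:
                    // Nothing to do.
                    break;
                case LOCAL_GROUP:
                    // From the global group's point of view, a failed LDG is treated
                    // like a failed client; here we assume it was holding the lock.
                    regenerateToken();
                    break;
            }
        }

        private void replaceProvider() { /* trigger a group membership change */ }
        private void regenerateToken() { /* re-create the lock token */ }
    }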

Slide 10: False Detection
- Blocking unlock with a return code
- To be sure that an operation has been performed, a client has to do something like:

    do {
        lock(data)
        process(data)
    } while (unlock(data) is not ok)
    // at this point we are sure the operation has been taken into account

Slide 11: Current JuxMem Synchronization (Summary)
- Authorization-based
  - Exclusive (acquire)
  - Non-exclusive (acquireR)
- Centralized (active replication)
- Strongly coupled with data management
- Hierarchical and fault-tolerant

Slide 12: Data Updates: When?
- Eager (current version):
  - When a lock is released, update all replicas
  - High level of fault tolerance / low performance
- (Figure: a client's update propagated to all replicas on release)

Slide 13: Data Updates: When?
- Lazy (possible implementation):
  - Update a local data group when a lock is acquired
- (Figure: a client pulling the latest version into its local data group on acquire)

Slide 14: Data Updates: When?
- Intermediate (possible implementation):
  - Allow a limited number of local updates before propagating all the updates to the global level
- (Figure: updates kept local for a while, then propagated globally)
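
The three propagation policies of slides 12 to 14 differ only in when an update leaves the local data group. The sketch below contrasts them in Java; UpdatePolicy, the propagate* helpers and the release-time hook are assumptions chosen to keep the example short.

    enum UpdatePolicy { EAGER, LAZY, INTERMEDIATE }

    class UpdatePropagator {
        private final UpdatePolicy policy;
        private final int maxLocalUpdates;  // threshold used by INTERMEDIATE
        private int localUpdates = 0;

        UpdatePropagator(UpdatePolicy policy, int maxLocalUpdates) {
            this.policy = policy;
            this.maxLocalUpdates = maxLocalUpdates;
        }

        // Called when a client releases the lock on the piece of data.
        void onRelease(byte[] newValue) {
            switch (policy) {
                case EAGER:
                    // Current JuxMem behaviour: push to every replica right away
                    // (high fault-tolerance level, lower performance).
                    propagateToGlobalGroup(newValue);
                    break;
                case LAZY:
                    // Keep the update local; a remote local data group fetches the
                    // latest version when one of its clients acquires the lock.
                    propagateToLocalGroup(newValue);
                    break;
                case INTERMEDIATE:
                    // Allow a bounded number of local updates before pushing
                    // everything to the global level.
                    propagateToLocalGroup(newValue);
                    if (++localUpdates >= maxLocalUpdates) {
                        propagateToGlobalGroup(newValue);
                        localUpdates = 0;
                    }
                    break;
            }
        }

        private void propagateToLocalGroup(byte[] v)  { /* update the LDG replicas */ }
        private void propagateToGlobalGroup(byte[] v) { /* update all LDGs through the GDG */ }
    }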

Slide 15: Data Updates: When?
- A hierarchical consistency model?
  - Local lock
  - Global lock

Slide 16: Distributed Synchronization Algorithms
- Naïmi-Trehel's algorithm
  - Token-based
  - Mutual exclusion
- Extended by REGAL
  - Hierarchical (Marin, Luciana, Pierre)
  - Fault-tolerant (Julien)
  - Both?
- A fault-tolerant, grid-aware synchronization module used by JuxMem?
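
For reference, here is a compact, event-driven Java sketch of the basic Naïmi-Trehel algorithm: each node keeps a "last" pointer toward the probable token owner and a "next" pointer to the node that should receive the token when it leaves the critical section. Message passing is simulated by direct method calls, and the hierarchical and fault-tolerant extensions mentioned on the slide are not shown.

    class NTNode {
        final int id;
        NTNode last;         // probable token owner (null if this node is the root of the tree)
        NTNode next;         // node to hand the token to after releasing the critical section
        boolean hasToken;
        boolean requesting;

        NTNode(int id, NTNode probableOwner) {
            this.id = id;
            this.last = probableOwner;          // null on the node that initially holds the token
            this.hasToken = (probableOwner == null);
        }

        void requestCS() {
            requesting = true;
            if (last == null) {                 // already the root: it holds the idle token
                enterCS();
            } else {
                NTNode target = last;
                last = null;                    // the requester becomes the new root
                target.receiveRequest(this);
            }
        }

        void receiveRequest(NTNode requester) {
            if (last == null) {                 // this node is the current root
                if (requesting) {
                    next = requester;           // it will receive the token on release
                } else {
                    hasToken = false;
                    requester.receiveToken();
                }
            } else {
                last.receiveRequest(requester); // forward along the chain of probable owners
            }
            last = requester;                   // path compression: the requester is the new probable owner
        }

        void receiveToken() {
            hasToken = true;
            enterCS();
        }

        void releaseCS() {
            requesting = false;
            if (next != null) {
                hasToken = false;
                NTNode successor = next;
                next = null;
                successor.receiveToken();
            }
        }

        void enterCS() {
            System.out.println("node " + id + " enters the critical section");
        }
    }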

Slide 17: Open Questions and Future Work
- Interface between JuxMem providers and the synchronization module
  - Providers have to be informed of synchronization operations in order to perform updates
- Future work (Julien & Sébastien)
  - Centralized data / distributed locks?
    - Data may become distributed in JuxMem (epidemic protocols, migratory replication, etc.)
  - Algorithms for token-based non-exclusive locks?
    - Could allow more flexibility in replication techniques (passive or quorum-based)
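
One possible shape for the interface mentioned above: the synchronization module exposes acquire/acquireR/release, and each provider registers a listener so that it is told when locks move and can push or pull updates accordingly. All names below are assumptions, not an existing JuxMem API.

    interface SynchronizationListener {
        void onAcquire(String dataId, boolean exclusive);  // e.g. pull the latest copy (lazy updates)
        void onRelease(String dataId);                     // e.g. propagate local modifications
    }

    interface SynchronizationModule {
        void register(String dataId, SynchronizationListener provider);
        void acquire(String dataId);     // exclusive lock
        void acquireR(String dataId);    // non-exclusive (read) lock
        boolean release(String dataId);  // false if the release could not be acknowledged (see slide 10)
    }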

Slide 18: Other Open Issues in JuxMem

Slide 19: Junction Layer
- Decoupled design
- Need to refine the junction layer
- (Figure: the consistency layer and the fault-tolerance layer exchanging send/receive calls through the junction layer)
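
A possible refinement of the junction layer is a thin adapter that turns consistency-protocol sends into fault-tolerant group multicasts and hands group deliveries back to the consistency layer as upcalls. The interfaces below are hypothetical and only illustrate the decoupling.

    import java.util.function.Consumer;

    interface FaultToleranceLayer {
        void multicast(byte[] message);                   // reliable, ordered delivery to the group
        void setDeliveryHandler(Consumer<byte[]> handler);
    }

    interface ConsistencyLayer {
        void deliver(byte[] message);                     // consistency-protocol event
    }

    class JunctionLayer {
        private final FaultToleranceLayer ft;

        JunctionLayer(ConsistencyLayer consistency, FaultToleranceLayer ft) {
            this.ft = ft;
            // Upcalls: messages delivered by the fault-tolerant group are handed
            // back to the consistency protocol.
            ft.setDeliveryHandler(consistency::deliver);
        }

        // Downcalls: the consistency protocol sends through the fault-tolerant group.
        void send(byte[] message) {
            ft.multicast(message);
        }
    }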

Slide 20: Replication Degree
- Current features: the client specifies
  - The global data group cardinality (i.e., the number of clusters)
  - The local data group cardinality (i.e., the number of replicas in each cluster)
- Desirable features: the client specifies
  - The criticality degree of the piece of data
  - The access needs (access model, required performance)
- A monitoring module
  - Integrated with Marin's failure detectors?
  - Current MTBF, message losses, etc.
  - Could allow JuxMem to dynamically deduce the replication degree for each piece of data
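
As a rough illustration of how a monitoring module could turn data criticality and observed MTBF into a replication degree, the sketch below assumes an exponential failure model and picks the smallest number of replicas whose probability of simultaneous loss stays under a criticality-dependent threshold. The formula and the thresholds are illustrative assumptions, not part of JuxMem.

    class ReplicationAdvisor {
        // Probability that a single replica survives the data's expected lifetime,
        // estimated from the monitored MTBF (exponential failure model assumed).
        static double survivalProbability(double mtbfHours, double lifetimeHours) {
            return Math.exp(-lifetimeHours / mtbfHours);
        }

        // Smallest number of replicas such that the probability of losing all of
        // them stays below the tolerance derived from the criticality level.
        static int replicationDegree(double mtbfHours, double lifetimeHours, double maxLossProbability) {
            double pFail = 1.0 - survivalProbability(mtbfHours, lifetimeHours);
            int replicas = 1;
            while (Math.pow(pFail, replicas) > maxLossProbability) {
                replicas++;
            }
            return replicas;
        }
    }

For example, replicationDegree(720, 48, 1e-6) returns 6 for these illustrative numbers: a 48-hour data lifetime on nodes with a 720-hour MTBF and a one-in-a-million loss tolerance.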

Slide 21: Application Needs
- Access model
  - Data grain?
  - Access patterns
  - Multiple readers?
  - Locks shared across multiple clusters?
- Data criticality
  - Are there different levels of criticality?
- What kind of advice can the application give concerning these two aspects?
- Duration of the application?
- Traces: latency, crashes, message losses?