Filterfresh: Fault-tolerant Java Servers Through Active Replication
Arash Baratloo
www.cs.nyu.edu/phd_students/baratloo

Presentation transcript:

Filterfresh
Investigation of failure models in distributed Java applications
Provide transparent fault-masking (to users and to programmers)
Support highly available services in the presence of failures
Remove single points of failure

Motivating Factors
Remote Method Invocation (RMI): 100% Java, hot, new, easy-to-use
Reliable Object Services (ROS)
Interest in providing:
– support for active-active replication
– support for Java objects

Roadmap
Motivation
– RMI Registry & crash failures
– RMI Server Architecture & crash failures
– A Unified Solution -- process group approach
– Fault-tolerant Registry
– Fault-tolerant RMI
– Conclusion

RMI in a Nutshell
Servers register with the local registry
Clients look up a server at a well-known registry
Given a remote reference, the client performs a remote method invocation
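
The three steps on this slide can be sketched with the standard java.rmi API. This is a minimal illustration, not code from Filterfresh: the Greeter interface, the binding name "greeter", the port number, and running the server and the client inside one JVM are all assumptions made to keep the example self-contained.

```java
import java.rmi.NoSuchObjectException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

class RmiNutshell {
    /** Remote interface (hypothetical service used only for this sketch). */
    public interface Greeter extends Remote {
        String greet(String who) throws RemoteException;
    }

    /** Server-side implementation of the remote interface. */
    static class GreeterImpl implements Greeter {
        public String greet(String who) {
            return "hello, " + who;
        }
    }

    /** Runs the three steps in one JVM and returns the reply seen by the client. */
    static String demo(int port) {
        // Advertise the loopback address in stubs so the demo does not depend on DNS.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        Registry registry = null;
        GreeterImpl impl = new GreeterImpl();
        try {
            // Step 1: the server exports its object and registers it with the local registry.
            registry = LocateRegistry.createRegistry(port);
            Greeter stub = (Greeter) UnicastRemoteObject.exportObject(impl, 0);
            registry.rebind("greeter", stub);

            // Step 2: the client looks the server up at the well-known registry address.
            Registry wellKnown = LocateRegistry.getRegistry("127.0.0.1", port);
            Greeter remote = (Greeter) wellKnown.lookup("greeter");

            // Step 3: given the remote reference, the client invokes a method on it.
            return remote.greet("client");
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            // Unexport everything so the RMI runtime lets the JVM exit.
            unexportQuietly(impl);
            unexportQuietly(registry);
        }
    }

    private static void unexportQuietly(Remote obj) {
        try {
            if (obj != null) UnicastRemoteObject.unexportObject(obj, true);
        } catch (NoSuchObjectException ignored) {
            // The object was never exported; nothing to clean up.
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(45099));
    }
}
```

Note that the client must know the registry's host and port in advance, and that the stub it receives is tied to one exported object.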

Limitations of RMI Registry
The “well-known registry” requirement is too restrictive for failure recovery
Single point of failure
Cannot support replicated servers and, thus, highly available servers

FT Registry requires...
Distribute and replicate registry servers
Replication strategy to maintain a consistent state
Failure detection and removal of failed registry servers
Failed objects must be restarted automatically
Dynamic addition of registry servers

RMI Architecture
The Remote Reference Layer (RRL) assumes a stream-oriented transport
Transport layer implemented on TCP/IP

Architecture (cont…)

A transparent FT system implies implementing fault tolerance at the RRL or below

FT Servers Require...
Distribute and replicate servers
Replication strategy to maintain a consistent state
Failure detection and removal of failed servers
Dynamic addition of servers
Object reference must remain valid after the associated object has failed

A Unified Solution...
Process Group Approach where all non-faulty objects
– form a group
– have a consistent view of the group
– interact through reliable group primitives -- all or nothing
– total order on group primitives

Fortunately
Process Group Membership is
– a well-understood problem with well-understood protocols
– well tested (ISIS, Transis, Amoeba, etc.)
– the basis for virtual synchrony
Equivalent Problems* (implement one, get all)
– Group Membership
– Reliable Failure Detectors
– Reliable and ordered multicast
* Chandra and Toueg. Unreliable Failure Detectors for Reliable Distributed Systems. JACM, March 96.

Unfortunately
Process Group Membership is
– as hard as distributed consensus
– impossible in purely asynchronous systems with crash failures*
Our solution
– the standard “timeout” assumption
– a variation of the protocol used in the Amoeba OS**
* Chandra, Toueg, Hadzilacos and Charron-Bost. Impossibility of Group Membership in Asynchronous Systems.
** Oey, Langendoen and Bal. Comparing Kernel-level and User-level Communication Protocols on Amoeba. ICDCS 95.
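
The “timeout” assumption can be illustrated with a small failure detector that suspects any member it has not heard from within a fixed interval. This is only a sketch: the class name, the heartbeat model, and the explicit clock parameter are assumptions for illustration, not the protocol used in Filterfresh or Amoeba.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Timeout-based failure suspicion; names and the clock model are illustrative assumptions. */
class TimeoutFailureDetector {
    private final long timeoutMillis;
    private final Map<String, Long> lastHeard = new HashMap<>();

    TimeoutFailureDetector(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Records that a heartbeat (or any message) from the member arrived at time nowMillis. */
    void heartbeat(String member, long nowMillis) {
        lastHeard.put(member, nowMillis);
    }

    /** Returns the members that have been silent for longer than the timeout. */
    Set<String> suspects(long nowMillis) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Long> e : lastHeard.entrySet()) {
            if (nowMillis - e.getValue() > timeoutMillis) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    /** Drops a member once the group has agreed to remove it from the view. */
    void remove(String member) {
        lastHeard.remove(member);
    }

    public static void main(String[] args) {
        TimeoutFailureDetector fd = new TimeoutFailureDetector(1000);
        fd.heartbeat("A", 0);
        fd.heartbeat("B", 0);
        fd.heartbeat("A", 900);                 // A keeps sending, B goes silent
        System.out.println(fd.suspects(1500));  // only B has exceeded the 1000 ms timeout
    }
}
```

A detector like this can be wrong about a slow but live member, which is why the timeout is an assumption about the system and not a guarantee.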

What We Provide...
A Group Manager Class
– 100% Java
– built on top of UDP/IP
Implements
– group creation
– join operation (with state transfer)
– leave operation
– failure detection and recovery
– reliable multicast
All events are atomic and totally ordered
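
One common way to obtain totally ordered delivery is to have a sequencer stamp every multicast and have each member hold back messages until all earlier ones have arrived. The sketch below simulates this in memory, together with a join that uses state transfer. It is an assumption-laden illustration: the Member and Sequencer classes are hypothetical, there is no UDP transport, and it is not the Filterfresh Group Manager API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory sketch of sequencer-based total ordering; not the Filterfresh GroupManager API. */
class TotalOrderSketch {
    /** A group member that delivers messages strictly in sequence-number order. */
    static class Member {
        final List<String> delivered = new ArrayList<>();
        private final Map<Long, String> holdBack = new HashMap<>();
        private long nextSeq;

        /** State transfer on join: copy the delivered history and resume after it. */
        Member(List<String> stateSnapshot, long nextSeq) {
            delivered.addAll(stateSnapshot);
            this.nextSeq = nextSeq;
        }

        /** Accepts a sequenced message in any arrival order and delivers what is now in order. */
        void receive(long seq, String msg) {
            if (seq < nextSeq) return;            // duplicate of an already delivered message
            holdBack.put(seq, msg);
            while (holdBack.containsKey(nextSeq)) {
                delivered.add(holdBack.remove(nextSeq));
                nextSeq++;
            }
        }
    }

    /** The sequencer stamps each multicast with the next global sequence number. */
    static class Sequencer {
        private long next = 0;
        long stamp() { return next++; }
        long peek() { return next; }
    }

    public static void main(String[] args) {
        Sequencer sequencer = new Sequencer();
        Member a = new Member(new ArrayList<>(), 0);
        Member b = new Member(new ArrayList<>(), 0);

        long s0 = sequencer.stamp();
        long s1 = sequencer.stamp();
        a.receive(s0, "bind x");
        a.receive(s1, "bind y");
        b.receive(s1, "bind y");                  // network reordering at b
        b.receive(s0, "bind x");

        // A late joiner receives a's state and continues from the current sequence number.
        Member c = new Member(a.delivered, sequencer.peek());
        long s2 = sequencer.stamp();
        for (Member m : new Member[] {a, b, c}) m.receive(s2, "unbind x");

        System.out.println(a.delivered.equals(b.delivered) && b.delivered.equals(c.delivered));
    }
}
```

Because every member delivers the same messages in the same order, any deterministic state built from those messages stays identical across the group.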

Multicast Performance
Pentium Pro 200, Linux RedHat 4.0, Fast Ethernet hub

FT Registry Architecture
A registry on each host/domain
Group managers ensure reliable, ordered events
Support for dynamic joins with state transfer

FT Registry Architecture (cont…)
Lookup becomes a local operation
Detect and remove failed objects
Consistent global state

FT Registry Performance
Pentium Pro 200, Linux RedHat 4.0, Fast Ethernet, Ethernet hub

RMI & FT Registry
Support multiple servers registering with the same name
Can now support recovery from server failure
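
The FT registry slides above describe replicas that apply the same ordered events, answer lookups locally, and allow several servers under one name. A minimal sketch of one such replica follows. The class, its method names, and the use of plain string server identifiers are assumptions for illustration; delivery of events in total order is taken as given.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch of one registry replica; event delivery is assumed to be totally ordered. */
class RegistryReplica {
    private final Map<String, List<String>> bindings = new HashMap<>();

    /** Applied on every replica when a server registers under a name. */
    void onBind(String name, String serverId) {
        List<String> servers = bindings.computeIfAbsent(name, k -> new ArrayList<>());
        if (!servers.contains(serverId)) servers.add(serverId);
    }

    /** Applied on every replica when the group reports that a server has failed. */
    void onFailure(String serverId) {
        for (List<String> servers : bindings.values()) servers.remove(serverId);
        bindings.values().removeIf(List::isEmpty);
    }

    /** Lookup is answered from local state, with no network round trip. */
    List<String> lookup(String name) {
        return new ArrayList<>(bindings.getOrDefault(name, new ArrayList<>()));
    }

    public static void main(String[] args) {
        RegistryReplica r1 = new RegistryReplica();
        RegistryReplica r2 = new RegistryReplica();
        // Both replicas see the same events in the same order.
        for (RegistryReplica r : new RegistryReplica[] {r1, r2}) {
            r.onBind("bank", "server-1");
            r.onBind("bank", "server-2");
            r.onFailure("server-1");
        }
        System.out.println(r1.lookup("bank") + " " + r2.lookup("bank"));
    }
}
```

Keeping more than one server per name is what lets a client be redirected to a surviving server after a failure.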

What if...
In the event of server failure...

Failure Recovery
The old connection is patched with a connection to a non-faulty server
Illusion of a valid object reference
Transparent!
A “reverse” lookup returns a name given a wire connection
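
The effect of patching a connection can be imitated at the application level with a dynamic proxy that re-targets itself when the current server fails. This is a stand-in technique, not the mechanism described on the slide, which works inside the RMI layers. The Echo interface, the Replica class, and the fixed replica list (standing in for the result of a reverse lookup followed by a registry lookup) are assumptions for illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class FailoverSketch {
    /** Hypothetical service interface used only for this sketch. */
    public interface Echo {
        String handle(String request);
    }

    /** Replica stand-in that can be switched to "crashed". */
    static class Replica implements Echo {
        private final String id;
        boolean crashed;

        Replica(String id) { this.id = id; }

        public String handle(String request) {
            if (crashed) throw new IllegalStateException("connection lost");
            return id + ":" + request;
        }
    }

    /** Returns a reference that silently re-targets itself when the current replica fails. */
    static Echo failoverReference(List<? extends Echo> replicas) {
        Deque<Echo> remaining = new ArrayDeque<>(replicas);
        InvocationHandler handler = (proxy, method, args) -> {
            while (!remaining.isEmpty()) {
                try {
                    return method.invoke(remaining.peekFirst(), args);
                } catch (InvocationTargetException e) {
                    // Simplification: any exception from the target counts as a crash.
                    remaining.pollFirst();   // drop the failed replica and retry on the next one
                }
            }
            throw new IllegalStateException("no replica left");
        };
        return (Echo) Proxy.newProxyInstance(
                Echo.class.getClassLoader(), new Class<?>[] {Echo.class}, handler);
    }

    public static void main(String[] args) {
        Replica r1 = new Replica("r1");
        Replica r2 = new Replica("r2");
        Echo ref = failoverReference(Arrays.asList(r1, r2));
        System.out.println(ref.handle("a"));   // served by r1
        r1.crashed = true;
        System.out.println(ref.handle("b"));   // served by r2; the caller sees no exception
    }
}
```

The caller keeps using the same reference throughout, which is the illusion of a valid object reference that the slide refers to.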

Failure Recovery Performance?
Working, but measurements have not been made

FT Server Architecture
Client has the illusion of a single server
In reality, we have actively replicated servers
Highly available?

Highly Available Servers
Group managers ensure reliable ordering of events across all servers
Guarantees servers have a consistent state
Failure detection and removal of failed servers
Dynamic addition of servers with state transfer
Illusion of a valid server reference even after the associated object has failed

Conclusions