1 © 2002-2003 Hein Meling and Alberto Montresor The Jgroup/ARM Dependable Computing Toolkit Hein Meling Stavanger University College – Norway Department of Electrical and Computer Engineering Alberto Montresor University of Bologna - Italy Department of Computer Science

2. Context
- (Distributed) systems that require
  - Reliable and high-availability operation
  - Fault tolerance
  - (Load balancing)
- Based on "cheap" hardware and software
  - Commercial off-the-shelf, not custom hardware
  - Heterogeneous software (OS) architectures
- Middleware architectures for distributed computing
  - Middleware: between the application and the OS

3. Types of Failures
- Processor failures
  - Crash failures
  - Value failures (very expensive)
- Network failures
- Operating system hangs
- Memory leaks
- Software design errors (beyond state-of-the-art)

4. Overview
- Jgroup
  - A toolkit aimed at supporting the development of reliable and highly available applications.
- Autonomous Replication Management (ARM)
  - A framework for server replica deployment and recovery without user intervention.
- History
  - Formal specification (1996-97)
  - Algorithm description and Jgroup implementation
  - Integration with existing technologies (Java RMI / Jini)
  - The ARM framework (2000-03)
  - Development of Jgroup-based applications

6. The Problem
- Some environments supporting distributed computing:
  - CORBA (OMG)
  - DCOM / .NET (Microsoft)
  - Java RMI / Jini / EJB (Sun)
- Characteristics:
  - Object-oriented
  - Based on client-server remote method invocations
  - Promote modularity, reusability, interoperability, portability

7. Java Remote Method Invocations
- Java RMI protocol: enables objects residing in different JVMs to communicate through remote method invocations
[Diagram labels: Client, Stub (JVM1); Network; Server-side RMI Runtime, Server (JVM2); method(), return x]
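The stub-based call path on this slide can be tried out in a single JVM. The following is a minimal sketch using the standard `java.rmi` API; the `Greeter` interface, `GreeterImpl`, and `RmiSketch` are hypothetical names, and the loopback host setting is an assumption so that the stub connects back to the local machine.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

/** Hypothetical remote interface: remotely callable methods declare RemoteException. */
interface Greeter extends Remote {
    String greet(String name) throws RemoteException;
}

/** Server object; in a real deployment it would live in a separate JVM. */
class GreeterImpl implements Greeter {
    @Override
    public String greet(String name) {
        return "Hello, " + name;
    }
}

class RmiSketch {
    /** Exports the server, invokes it through its stub, then unexports it. */
    static String callThroughStub(String name) {
        // Assumption: use the loopback address so the stub reaches this machine.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        GreeterImpl server = new GreeterImpl();
        try {
            // Port 0 requests an anonymous port; the return value is the client-side stub.
            Greeter stub = (Greeter) UnicastRemoteObject.exportObject(server, 0);
            try {
                return stub.greet(name);
            } finally {
                // Unexport so the RMI runtime does not keep the JVM alive.
                UnicastRemoteObject.unexportObject(server, true);
            }
        } catch (RemoteException e) {
            throw new IllegalStateException("remote call failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callThroughStub("Jgroup"));
    }
}
```

In a two-JVM setup the stub would typically be obtained from a registry or lookup service rather than directly from `exportObject`.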

9. The Problem
- Distributed computing environments did not provide adequate support for developing reliable and highly available applications
- Lack of reliable "one-to-many" interaction primitives
  - From the client's point of view: non-transparent access to replicated servers
  - From the server's point of view: no support for maintaining consistency

10. The Solution: The Object Group Paradigm
- Object group:
  - A dynamic collection of server objects that cooperate in order to deliver some service and maintain shared state
- Group method invocations:
  - The act of invoking a method on an object group
  - The method is executed by a certain number of servers in the object group, depending on the invocation semantics
[Diagram labels: Client, Server, Object Group]

11. The Solution: The Object Group Paradigm
- From the client's point of view:
  - Groups must be transparent, like standard remote objects
  - Clients need not be aware that they are interacting with an object group instead of a single server
- From the server's point of view:
  - Server implementation should be as transparent as possible
  - Servers forming a group must cooperate to maintain shared state and to appear as a single object

12. Group Communication
- Group communication has been shown to be a powerful paradigm for supporting the development of dependable applications in distributed systems
  - Management of dynamic groups (join/leave operations)
  - Failure monitoring (crashes / partitionings)
  - "One-to-many" communication
  - Ordering of events (FIFO, Causal, Atomic)
  - State synchronization tools
[Diagram labels: Group Membership Service, Reliable Multicast Service, State Transfer Service]

13. Other Object Group Systems
- CORBA
  - Electra [Cornell, Zurich]
  - Object Group Service (OGS) [EPFL, Lausanne]
  - Eternal [UC Santa Barbara, Eternal Systems]
  - Newtop [Newcastle, UK]
- Java RMI
  - Filterfresh [Bell Labs]
  - JavaGroups [Cornell]
  - Aroma [UC Santa Barbara]
- DCOM
  - Quintet [Cornell]

14. Jgroup: "Yet Another Object Group Service"?
- Support for partition-awareness:
  - Modern wide-area communication networks are often characterized as highly partitionable
  - Jgroup supports the development of reliable and highly available applications in partitionable systems
- Moreover, Jgroup:
  - Extends modern technologies like Java RMI and Jini
  - Is completely written in Java (portability)
  - Supports a complex merging service
  - Is extensible: deployment, recovery and upgrade facilities

15. Autonomous Replication Management
- Support for transparent replica deployment
  - Placing server replicas on machines in the network
  - Selecting machines so that each application can tolerate both network and machine failures
- Support for replica recovery
  - Jgroup detects and reports failures
  - ARM replaces any crashed server replica with a new instance

17. Group Membership
- The group membership service tracks both voluntary and involuntary changes in the group's membership
- Variations are reported to group members through the installation of views
- Installed views
  - Consist of a collection of members
  - Correspond to the group's current membership as perceived by the members included in the view
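The idea of reporting membership changes by installing views can be illustrated with a small single-process sketch. `GroupView` and `MembershipSketch` are hypothetical names, not Jgroup classes, and the sketch leaves out the agreement protocol that a real membership service needs to make all members install the same view.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical immutable view: an identifier plus the members it contains. */
final class GroupView {
    final int viewId;
    final Set<String> members;

    GroupView(int viewId, Set<String> members) {
        this.viewId = viewId;
        this.members = Collections.unmodifiableSet(new TreeSet<>(members));
    }
}

/** Hypothetical membership tracker that installs a new view on every change. */
class MembershipSketch {
    private GroupView current = new GroupView(0, new TreeSet<>());

    GroupView currentView() {
        return current;
    }

    /** Voluntary change: a member joins the group. */
    GroupView join(String member) {
        Set<String> next = new TreeSet<>(current.members);
        next.add(member);
        return install(next);
    }

    /** Voluntary leave or involuntary change (crash, partition): a member is excluded. */
    GroupView exclude(String member) {
        Set<String> next = new TreeSet<>(current.members);
        next.remove(member);
        return install(next);
    }

    /** Installing a view replaces the current one and increments the view identifier. */
    private GroupView install(Set<String> members) {
        current = new GroupView(current.viewId + 1, members);
        return current;
    }

    public static void main(String[] args) {
        MembershipSketch group = new MembershipSketch();
        group.join("S1");
        group.join("S2");
        GroupView v = group.exclude("S1");
        System.out.println("view " + v.viewId + " = " + v.members);
    }
}
```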

19. Partition-awareness
- What kind of behavior can we expect from fault-tolerant applications in the presence of network partitioning?
- The primary-partition approach:
[Diagram labels: "No service available!", "How can I help you?", "No service available!"]

20. Support for partition-awareness
- Jgroup supports dependability in partitionable systems
  - Development of applications aware of the existence of partitions (on the server side)
  - Partition-aware applications take advantage of their semantics in order to be more available
  - Computations continue in all partitions of the system
[Diagram label: "How can I help you?"]

23. Comparison
- Primary-partition approach
  (+) Easy to maintain a single, coherent shared state (strong consistency)
  (-) Servers in non-primary partitions are unable to serve requests (low availability)
- Partition-aware approach
  (+) Servers in multiple partitions may be able to serve requests (high availability)
  (-) Partitions evolve independently, possibly leading to inconsistent states (loose consistency)

24. Comparison (Cont.)
- Primary-partition approach
  (+) Development of fault-tolerant applications is simpler (active replication of existing non-fault-tolerant servers)
  (-) Developers cannot exploit application semantics in order to provide a more available service
- Partition-aware approach
  (+) Applications adapt their behavior and remain available in many partitions (perhaps by reducing their quality of service)
  (-) Development of fault-tolerant applications is more complex (case-by-case design is needed)

25. The State Merging Problem
- During partitioning, the state of servers belonging to distinct partitions may become inconsistent
- When the partition disappears, an application-specific state merging protocol may be needed
- Servers participating in the protocol try to define a new shared state that reconciles (when possible) the divergences
[Diagram labels: Server, Task, Server, Task]

27. The State Merging Problem
- State merging protocols are based on the exchange of information among servers that have been partitioned
- Jgroup provides a state merging service (SMS) that simplifies the development of state merging protocols
- NOTE
  - Determining what information needs to be exchanged and how to use it to construct a new consistent shared state is an application-dependent problem

28. General Schema for State Merging Protocols
- In each of the merging partitions, a coordinator is selected
- SMS interrogates each coordinator to obtain information about its current state
- State information from a coordinator is passed to servers that used to be partitioned from it
- Each of the servers merges information from coordinators with its own state
[Diagram labels: S1, S2, S3, S4; getState(), putState()]
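The four steps of this schema can be sketched in a few lines. This is only an illustration under stated assumptions: `MergeServer` and `StateMergeSketch` are hypothetical, the state is a set of strings merged by union (the real merge rule is application-specific), the first server of each partition is taken as coordinator, and servers inside one partition are assumed to already agree with their coordinator. Only `getState()` and `putState()` are names taken from the slide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical replica whose state is a set of strings; merging is set union. */
class MergeServer {
    final String name;
    private final Set<String> state = new TreeSet<>();

    MergeServer(String name, String... initial) {
        this.name = name;
        state.addAll(Arrays.asList(initial));
    }

    /** Called on a coordinator: returns a copy of its current state. */
    Set<String> getState() {
        return new TreeSet<>(state);
    }

    /** Called on servers of other partitions: merges the received state locally. */
    void putState(Set<String> remote) {
        state.addAll(remote);
    }
}

class StateMergeSketch {
    /** Merges partitions; the first server of each partition acts as coordinator. */
    static void merge(List<List<MergeServer>> partitions) {
        // Steps 1 and 2: select a coordinator per partition and obtain its state.
        List<Set<String>> coordinatorStates = new ArrayList<>();
        for (List<MergeServer> partition : partitions) {
            coordinatorStates.add(partition.get(0).getState());
        }
        // Steps 3 and 4: pass each coordinator's state to the servers outside its partition.
        for (int from = 0; from < partitions.size(); from++) {
            for (int to = 0; to < partitions.size(); to++) {
                if (from == to) {
                    continue;
                }
                for (MergeServer server : partitions.get(to)) {
                    server.putState(coordinatorStates.get(from));
                }
            }
        }
    }

    public static void main(String[] args) {
        MergeServer s1 = new MergeServer("S1", "a");
        MergeServer s2 = new MergeServer("S2", "a");
        MergeServer s3 = new MergeServer("S3", "b", "c");
        MergeServer s4 = new MergeServer("S4", "b", "c");
        merge(Arrays.asList(Arrays.asList(s1, s2), Arrays.asList(s3, s4)));
        for (MergeServer s : Arrays.asList(s1, s2, s3, s4)) {
            System.out.println(s.name + " -> " + s.getState());
        }
    }
}
```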

30. Full Object-Orientation
- Existing object group systems fail to provide a completely object-oriented environment for software developers
[Diagram labels: Client, Stub, Server; Remote method invocations; Message multicasting]

31. View Synchrony
- View synchrony (1)
  - If a correct server S executes an invocation during a view, then all servers within the view will also execute the invocation, or S will install a new view
- View synchrony does not admit executions like this:
[Diagram: execution of servers S1, S2, S3, S4]

32. View Synchrony
- View synchrony (2)
  - All servers that survive from one view to the same next view execute the same set of invocations in the original view
- View synchrony does not admit executions like this:
[Diagram: execution of servers S1, S2, S3, S4]

33. Internal Group Method Invocations
- Synchronous invocations:
  - The method invocation terminates by returning a vector of return values, one from each server at which the method was executed
- Asynchronous invocations:
  - The method invocation terminates immediately; replies (if any) are returned to a callback object
  - Can be used to simulate message multicasting through void methods (one-way)

35. Internal Invocations: example
[Diagram labels: S1, S2, S3]
Caller:
    ValuesCallback cb;
    group.getValue(cb);
    …
    int[] values = cb.getResults();
Callback class:
    public class ValuesCallback implements Callback {
        void result(Object value);
        int[] getResults();
    }
Server method:
    int getValue() { return value; }
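The slide's fragment is shorthand, so here is a self-contained local simulation of the two invocation styles. It is a sketch, not the Jgroup API: `ValueServer`, `ResultCallback`, `CollectingCallback`, and `InternalInvocationSketch` are hypothetical, the group members are plain local objects, and a thread pool stands in for the asynchronous delivery of replies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

interface ValueServer {
    int getValue();
}

interface ResultCallback {
    void result(Object value);
}

/** Callback that collects replies as they arrive. */
class CollectingCallback implements ResultCallback {
    private final List<Integer> results = new ArrayList<>();

    @Override
    public synchronized void result(Object value) {
        results.add((Integer) value);
    }

    synchronized int[] getResults() {
        int[] out = new int[results.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = results.get(i);
        }
        return out;
    }
}

class InternalInvocationSketch {
    private final List<ValueServer> members;

    InternalInvocationSketch(List<ValueServer> members) {
        this.members = members;
    }

    /** Synchronous style: blocks and returns one value per member. */
    int[] getValue() {
        int[] values = new int[members.size()];
        for (int i = 0; i < values.length; i++) {
            values[i] = members.get(i).getValue();
        }
        return values;
    }

    /** Asynchronous style: returns immediately; replies are delivered to the callback. */
    void getValue(ResultCallback cb, ExecutorService pool) {
        for (ValueServer s : members) {
            pool.execute(() -> cb.result(s.getValue()));
        }
    }

    static InternalInvocationSketch demoGroup() {
        return new InternalInvocationSketch(Arrays.<ValueServer>asList(() -> 1, () -> 2, () -> 3));
    }

    /** Runs the asynchronous variant, waits for all replies, and sorts them. */
    static int[] asyncDemo() {
        CollectingCallback cb = new CollectingCallback();
        ExecutorService pool = Executors.newFixedThreadPool(3);
        demoGroup().getValue(cb, pool);
        pool.shutdown();
        try {
            if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("replies did not arrive in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int[] values = cb.getResults();
        Arrays.sort(values); // reply order is not deterministic
        return values;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demoGroup().getValue()));
        System.out.println(Arrays.toString(asyncDemo()));
    }
}
```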

36. External Group Method Invocations
- Anycast invocations:
  - Are executed by at least one server in the object group (unless the client is partitioned from the group)
  - Efficient (same cost as standard RMI interactions)
  - Useful for "read" methods on replicated databases
- Multicast invocations:
  - Are executed by all servers in a view, following the view synchrony semantics
  - More costly (involve several servers)
  - Useful for "write" methods on replicated databases
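The cost difference between the two semantics can be made concrete with a local sketch. `RegisterReplica` and `ExternalInvocationSketch` are hypothetical names; the round-robin choice of the anycast target is an assumption made for illustration, since the slide does not say how a server is selected, and failures and view changes are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical replica holding one integer and counting how often it is invoked. */
class RegisterReplica {
    private int value;
    int invocations;

    int read() {
        invocations++;
        return value;
    }

    void write(int newValue) {
        invocations++;
        value = newValue;
    }
}

/** Hypothetical client-side group proxy over the replicas in the current view. */
class ExternalInvocationSketch {
    private final List<RegisterReplica> view;
    private int next; // round-robin index used for anycast (an assumption)

    ExternalInvocationSketch(List<RegisterReplica> view) {
        this.view = new ArrayList<>(view);
    }

    /** Anycast: a single replica executes the read. */
    int read() {
        RegisterReplica target = view.get(next);
        next = (next + 1) % view.size();
        return target.read();
    }

    /** Multicast: every replica in the view executes the write. */
    void write(int newValue) {
        for (RegisterReplica replica : view) {
            replica.write(newValue);
        }
    }

    public static void main(String[] args) {
        RegisterReplica r1 = new RegisterReplica();
        RegisterReplica r2 = new RegisterReplica();
        RegisterReplica r3 = new RegisterReplica();
        ExternalInvocationSketch group = new ExternalInvocationSketch(Arrays.asList(r1, r2, r3));
        group.write(7);
        System.out.println("read = " + group.read());
        System.out.println("invocations = " + (r1.invocations + r2.invocations + r3.invocations));
    }
}
```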

39. Replication Management – The Problem
- Object group systems support replication transparency:
  - Membership management
  - Reliable multicast
- But they do not support full failure transparency:
  - Application or manual support is needed to distribute replicas
  - Application support or manual intervention is required to recover from replica failures
- Complicated tasks
  - Application implementations are prone to contain errors
  - These tasks should not be left to the application developer

40. Solution: Autonomous Replication Management
- Support for creating object groups
  - By placing individual members on distinct machines
  - Each application may specify a replication policy
  - For example, redundancy level = 3
- Support for failure recovery
  - Jgroup detects and reports failures to ARM
  - ARM reacts by creating a replacement member for each failed member, perhaps on a different machine
  - Each application may specify a recovery policy
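The deployment and recovery behavior described here can be sketched as a small manager that keeps the number of replicas at the redundancy level. `ReplicationManagerSketch`, its placement rule (first free host in list order), and the host names are hypothetical; only `notifyViewChange()` and `createReplica()` echo names that appear in the deck's diagrams, and their signatures here are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical manager that keeps a group at its configured redundancy level. */
class ReplicationManagerSketch {
    private final List<String> hosts;                           // machines available for placement
    private final int redundancyLevel;                          // replication policy, e.g. 3
    private final Set<String> replicas = new LinkedHashSet<>(); // hosts running a replica

    ReplicationManagerSketch(List<String> hosts, int redundancyLevel) {
        this.hosts = new ArrayList<>(hosts);
        this.redundancyLevel = redundancyLevel;
    }

    /** Initial deployment: one replica per distinct host, up to the redundancy level. */
    void createGroup() {
        for (String host : hosts) {
            if (replicas.size() >= redundancyLevel) {
                break;
            }
            createReplica(host);
        }
    }

    /** Recovery: invoked with the hosts that are members of the newly installed view. */
    void notifyViewChange(Set<String> viewHosts) {
        Set<String> failed = new LinkedHashSet<>(replicas);
        failed.removeAll(viewHosts);
        replicas.retainAll(viewHosts);
        // Replace each failed member on a host that is neither in use nor just failed.
        for (String host : hosts) {
            if (replicas.size() >= redundancyLevel) {
                break;
            }
            if (!replicas.contains(host) && !failed.contains(host)) {
                createReplica(host);
            }
        }
    }

    /** Stand-in for starting a server replica on the given host. */
    private void createReplica(String host) {
        replicas.add(host);
    }

    Set<String> replicaHosts() {
        return new LinkedHashSet<>(replicas);
    }

    public static void main(String[] args) {
        ReplicationManagerSketch arm = new ReplicationManagerSketch(
                Arrays.asList("host-a", "host-b", "host-c", "host-d"), 3);
        arm.createGroup();
        arm.notifyViewChange(new LinkedHashSet<>(Arrays.asList("host-a", "host-c")));
        System.out.println(arm.replicaHosts());
    }
}
```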

43. ARM: Recovery from Crash Failure
[Diagram labels: ExecDaemon, Router, ux.his.no, item.ntnu.no, ReplicationManager, Management Client, NettBankServer, Group Leader; notifyViewChange(); View agreement protocol]

44. ARM: Recovery from Crash Failure
[Diagram labels: ExecDaemon, Router, ux.his.no, item.ntnu.no, ReplicationManager, Management Client, NettBankServer, Group Leader; notifyViewChange(), createReplica(); NettBankServer]

46. Introduction to Jini
- Jini is an API built on top of the Java 2 platform that:
  - enables spontaneous networks of devices/software services to assemble into federations of objects
  - addresses the distribution problems in these federations through a set of simple interfaces and protocols
[Diagram label: Jini Network]

47. Jini Architecture
- The components of the Jini architecture may be divided into three categories:
  - Infrastructure, i.e. the components that enable building a federated Jini system
  - Model that "supports and encourages the production of reliable distributed services"
  - Services that can be made part of a federated Jini system and which offer functionality to any other member of the federation
[Diagram label: JavaSpaces]

48. Jini Infrastructure
- The infrastructure is composed of:
  - Java RMI protocol: enables objects residing in different JVMs to communicate through remote method invocations
[Diagram labels: Client, Stub (JVM1); Network; Server-side RMI Runtime, Server (JVM2); method(), return x]

49. Jini Infrastructure
- The infrastructure is composed of:
  - Lookup Service: defines how services may become part of a Jini system and how clients retrieve services by their types and attributes
[Diagram labels: Client, Lookup Service, Server, Stub; Discovery, Join, Lookup, Invocation]

50. The Jini Programming Model
- The programming model is based on three distinct paradigms for distributed computing:
  - Leases extend the Java programming model by adding time to the notion of holding a reference to a resource
  - Transactions allow a set of operations on one or more remote participants to be grouped in such a way that either all succeed or all fail
  - Events enable objects to register interest in changes of the abstract state of remote objects
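To make the lease idea concrete, here is a minimal sketch of a time-bounded reference that must be renewed before it expires. `LeaseSketch` is a hypothetical class and not the Jini lease interface; the injectable clock is an assumption that keeps the example deterministic.

```java
import java.util.function.LongSupplier;

/** Hypothetical lease: access to a resource is valid only until an expiration time. */
class LeaseSketch {
    private final LongSupplier clock; // returns the current time in milliseconds
    private long expiration;

    LeaseSketch(LongSupplier clock, long durationMillis) {
        this.clock = clock;
        this.expiration = clock.getAsLong() + durationMillis;
    }

    /** A lease that is not renewed in time expires, which signals a failed or departed holder. */
    boolean isExpired() {
        return clock.getAsLong() >= expiration;
    }

    /** Extends the lease from the current time; an expired lease cannot be renewed. */
    void renew(long durationMillis) {
        if (isExpired()) {
            throw new IllegalStateException("lease already expired");
        }
        expiration = clock.getAsLong() + durationMillis;
    }

    public static void main(String[] args) {
        long[] now = {0};
        LeaseSketch lease = new LeaseSketch(() -> now[0], 100);
        now[0] = 50;
        lease.renew(100);   // now valid until t = 150
        now[0] = 150;
        System.out.println("expired at t=150: " + lease.isExpired());
    }
}
```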

51. Jini and Fault Tolerance
- Jini fault tolerance is based on leases and transactions
  - Leases enable the detection of service failures
  - Transactions provide consistency by guaranteeing "all-or-nothing" semantics
- Unfortunately, no support for high availability is present in Jini
  - No support for replication
  - Failure of the transaction manager → clients and participants must wait for the recovery of the manager before serving further requests

52. Enhancing Jini with Fault-Tolerance
- Extending Jini with the Object Group Paradigm:
  - Infrastructure
    - Extending Java RMI for Group Method Invocations
    - Extending the Lookup Service for dealing with Group Proxies
  - Programming Model
    1. Object Group Paradigm as alternative programming model
    2. Integration between transactional and object group model
  - Services
    - Replicated JavaSpaces

53. Extending Java RMI
- The RMI group at JavaSoft designed Java RMI to be extensible
  - The RemoteRef interface enables programmers to write their own references to remote objects on the client side
- Unfortunately, RemoteRefs are not sufficient
  - There is no possibility to modify the behavior of RMI on the server side
[Diagram labels: Client, Stub, RemoteRef, Server-side RMI Runtime, Server]

54. The Jgroup Approach (Current Version)
[Diagram labels: Client, Client Proxy, Server Proxy, Server, Method dispatchers, RMI Stub, Server-side RMI Runtime, RMI, Multicast]
[Diagram annotations: "Statically or dynamically generated – implements the remote interface"; "Fixed stub for server proxy"]

55. Designing a New Java RMI API
- We have cooperated with Sun Microsystems to design a new RMI API:
  - Fully customizable, on both the client side and the server side
  - Based on Dynamic Proxy Classes (JDK 1.3); no need for static stub generators
- Two different versions:
  - One-to-one (remote method invocations)
    - Voted down in JSR-078
    - Being included in the "Davis" release of Jini
  - One-to-many (group method invocations)
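Dynamic proxy classes are what remove the need for a static stub generator: the proxy for a remote interface is built at run time. The sketch below uses the standard `java.lang.reflect.Proxy` API to build a one-to-many proxy over local objects. `Counter`, `LocalCounter`, and `GroupProxySketch` are hypothetical, and returning the last member's reply is an arbitrary choice for illustration, not the API designed with Sun.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.List;

interface Counter {
    int increment();
}

class LocalCounter implements Counter {
    private int count;

    @Override
    public int increment() {
        return ++count;
    }
}

class GroupProxySketch {
    /** Builds, at run time, a proxy that forwards every call to all members. */
    @SuppressWarnings("unchecked")
    static <T> T groupProxy(Class<T> iface, List<? extends T> members) {
        InvocationHandler handler = (proxy, method, args) -> {
            Object reply = null;
            for (T member : members) {
                // One-to-many: the same method is invoked on each member in turn.
                reply = method.invoke(member, args);
            }
            return reply; // assumption: the last reply is returned to the caller
        };
        return (T) Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }

    public static void main(String[] args) {
        LocalCounter a = new LocalCounter();
        LocalCounter b = new LocalCounter();
        Counter group = groupProxy(Counter.class, Arrays.asList(a, b));
        System.out.println("group reply = " + group.increment());
    }
}
```

The caller only sees the `Counter` interface, which is the transparency property the object group paradigm asks for on the client side.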

56. Jgroup with 1-to-1 Customizable RMI
[Diagram labels: Client, Client Proxy, Server Proxy, Server, Method dispatchers, RMI Stub, Server-side RMI Runtime, RMI, Multicast]
[Diagram annotation: "Statically or dynamically generated – implements the remote interface"]

58. Extending the Lookup Service
- Jini enables the registration of customized proxies for services
  - This feature can be used to register group proxies using any implementation of the lookup service
- Group proxies, however, differ from standard proxies as their contents may be dynamic
  - Server registration → server reference added to group proxy
  - Server removal, lease expired → server reference removed from group proxy
- We have developed an alternative implementation of the lookup specification capable of dealing with group proxies

60. Extending the Jini Programming Model
- Jgroup + Jini programming model for fault tolerance
  - Leases + transactions
  - Object group communication
- Problem: transactions and group communication are considered separate aspects of fault tolerance
  - Their composition does not result in any meaningful combination of their respective strengths
- We need the possibility of using replication in transactions:
  - Transaction managers
  - Participants
  - Clients

62. Applications (Research)
- Jgroup/ARM is being used for
  - A distributed auction system
    - Partitionable auctions [Panzieri, Amoroso et al., University of Bologna, 2002]
  - An online-upgrade service for active replication [Solarski, GMD Fokus]
  - A replication management framework
    - Application-specific replication and recovery strategies [Meling, HiS]
  - Dependable naming service
    - Support for extensible group proxies (JERI) [Meling et al., HiS]

63. Applications (Education)
- Jgroup is being used at
  - Stavanger University College in the "Advanced Programming" course
  - University of Bologna in the "Distributed System" course
  - Norwegian University of Science and Technology in the "Dependable Systems" course
- Source for several projects and theses:
  - Low-level communication protocols (Bologna)
  - Replication services (Bologna)
  - Wide-area distributed services (Padova)
  - Management and deployment issues (HiS)
