Pontus Boström and Marina Waldén Åbo Akademi University/ TUCS Development of Fault Tolerant Grid Applications Using Distributed B.

Slides:



Advertisements
Similar presentations
웹 서비스 개요.
Advertisements

Construction process lasts until coding and testing is completed consists of design and implementation reasons for this phase –analysis model is not sufficiently.
What is RMI? Remote Method Invocation –A true distributed computing application interface for Java, written to provide easy access to objects existing.
COM vs. CORBA.
Web Service Ahmed Gamal Ahmed Nile University Bioinformatics Group
Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services Authored by: Seth Gilbert and Nancy Lynch Presented by:
Remote Procedure Call (RPC)
Remote Procedure Call Design issues Implementation RPC programming
Distributed Objects and Remote Invocation
Distributed Object & Remote Invocation Vidya Satyanarayanan.
Reliability on Web Services Presented by Pat Chan 17/10/2005.
Formal Methods in Software Engineering Credit Hours: 3+0 By: Qaisar Javaid Assistant Professor Formal Methods in Software Engineering1.
Objektorienteret Middleware Presentation 2: Distributed Systems – A brush up, and relations to Middleware, Heterogeneity & Transparency.
Transparent Robustness in Service Aggregates Onyeka Ezenwoye School of Computing and Information Sciences Florida International University May 2006.
Distributed components
1 Ivan Lanese Computer Science Department University of Bologna Italy On the Interplay between Fault Handling and Request-response Service Invocations.
1 The SOCK SAGA Ivan Lanese Computer Science Department University of Bologna Italy Joint work with Gianluigi Zavattaro.
CS 582 / CMPE 481 Distributed Systems Fault Tolerance.
Tutorials 2 A programmer can use two approaches when designing a distributed application. Describe what are they? Communication-Oriented Design Begin with.
CS533 - Concepts of Operating Systems 1 Remote Procedure Calls - Alan West.
Java for High Performance Computing Jordi Garcia Almiñana 14 de Octubre de 1998 de la era post-internet.
OCT 1 Master of Information System Management Organizational Communications and Distributed Object Technologies Lecture 5: Distributed Objects.
.NET Mobile Application Development Remote Procedure Call.
Service Broker Lesson 11. Skills Matrix Service Broker Service Broker, provides a solution to common problems with message delivery and consistency that.
Messaging Technologies Group: Yuzhou Xia Yi Tan Jianxiao Zhai.
Beyond DHTML So far we have seen and used: CGI programs (using Perl ) and SSI on server side Java Script, VB Script, CSS and DOM on client side. For some.
FALL 2005CSI 4118 – UNIVERSITY OF OTTAWA1 Part 4 Other Topics RPC & Middleware.
1 Chapter 38 RPC and Middleware. 2 Middleware  Tools to help programmers  Makes client-server programming  Easier  Faster  Makes resulting software.
CS 390- Unix Programming Environment CS 390 Unix Programming Environment Topics to be covered: Distributed Computing Fundamentals.
CS425 /CSE424/ECE428 – Distributed Systems – Fall Nikita Borisov - UIUC1 Material derived from slides by I. Gupta, M. Harandi, J. Hou, S.
Lecture 15 Introduction to Web Services Web Service Applications.
EEC 688/788 Secure and Dependable Computing Lecture 7 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University
Advanced Computer Networks Topic 2: Characterization of Distributed Systems.
RPC Design Issues Presented By Gayathri Vijay S-8,CSE.
Cloud Age Time to change the programming paradigm?
Fault-Tolerant Parallel and Distributed Computing for Software Engineering Undergraduates Ali Ebnenasir and Jean Mayo {aebnenas, Department.
Grid Services I - Concepts
Copyright © 2013 Curt Hill SOAP Protocol for exchanging data and Enabling Web Services.
IS473 Distributed Systems CHAPTER 5 Distributed Objects & Remote Invocation.
Hwajung Lee.  Interprocess Communication (IPC) is at the heart of distributed computing.  Processes and Threads  Process is the execution of a program.
CORBA1 Distributed Software Systems Any software system can be physically distributed By distributed coupling we get the following:  Improved performance.
Shuman Guo CSc 8320 Advanced Operating Systems
S imple O bject A ccess P rotocol Karthikeyan Chandrasekaran & Nandakumar Padmanabhan.
Distributed Objects & Remote Invocation
Grid Computing Environment Shell By Mehmet Nacar Las Vegas, June 2003.
Jini Architecture Introduction System Overview An Example.
© 2006 Pearson Addison-Wesley. All rights reserved 2-1 Chapter 2 Principles of Programming & Software Engineering.
- Manvitha Potluri. Client-Server Communication It can be performed in two ways 1. Client-server communication using TCP 2. Client-server communication.
© FPT SOFTWARE – TRAINING MATERIAL – Internal use 04e-BM/NS/HDCV/FSOFT v2/3 JSP Application Models.
CSI 3125, Preliminaries, page 1 SERVLET. CSI 3125, Preliminaries, page 2 SERVLET A servlet is a server-side software program, written in Java code, that.
1 Chapter 38 RPC and Middleware. 2 Middleware  Tools to help programmers  Makes client-server programming  Easier  Faster  Makes resulting software.
Web Services An Introduction Copyright © Curt Hill.
Reliable Client-Server Communication. Reliable Communication So far: Concentrated on process resilience (by means of process groups). What about reliable.
Middleware for Fault Tolerant Applications Lihua Xu and Sheng Liu Jun, 05, 2003.
Distributed Systems Lecture 8 RPC and marshalling 1.
Distributed Computing & Embedded Systems Chapter 4: Remote Method Invocation Dr. Umair Ali Khan.
Topic 4: Distributed Objects Dr. Ayman Srour Faculty of Applied Engineering and Urban Planning University of Palestine.
Object Interaction: RMI and RPC 1. Overview 2 Distributed applications programming - distributed objects model - RMI, invocation semantics - RPC Products.
Object Interaction: RMI and RPC 1. Overview 2 Distributed applications programming - distributed objects model - RMI, invocation semantics - RPC Products.
03 – Remote invoaction Request-reply RPC RMI Coulouris 5
Sabri Kızanlık Ural Emekçi
What is RMI? Remote Method Invocation
Programming Models for Distributed Application
Inventory of Distributed Computing Concepts and Web services
Fault Tolerance Distributed Web-based Systems
Inventory of Distributed Computing Concepts
Middleware for Fault Tolerant Applications
Lecture 6: RPC (exercises/questions)
Lecture 6: RPC (exercises/questions)
Abstractions for Fault Tolerance
Presentation transcript:

Pontus Boström and Marina Waldén Åbo Akademi University/ TUCS Development of Fault Tolerant Grid Applications Using Distributed B

Motivation Grids have become widespread in organizations Handle large amount of information Manage computational resources Difficult to implement “correct” Grid applications Formal methods useful in order to ensure correctness of specifications Can be difficult to implement The specification language should take into account the features of the underlying platform Fault tolerance also important for correctness due to the nature of the grid environment

Grids Used for large-scale distributed systems Scientific computing, e.g., in Physics and engineering Business applications Share information and computational resources over organizational boundaries Loosely coupled systems SOA Client – Server architecture

Grid services Services in a grid environment that can be accessed by clients Similar to remote objects in CORBA and RMI Remote procedures used for communication Host Client Grid service

Grid Services Based on Web Services XML SOAP WSDL Extends Web services with Potentially transient services containing state Service data Notifications Globus Toolkit middleware ClientGrid service Remote procedure call Notification

The architecture of a grid application OS Globus T. Client Grid service RPC Notification Host 1 Host 2

Faults Remote procedure calls can fail in four different ways 1.The server grid service instance has crashed before the call 2.The network connection fails when calling a remote procedure 3.The server instance fails during the call 4.The network connection fails when returning the result A notification can fail to arrive for two reasons 1.The sending grid service crashed before sending the notification 2.The network connection fails during sending The client crashes when using a server grid service The server grid service becomes an orphan

Fault tolerance using GT No support for advanced fault tolerance mechanisms such as replication or check-pointing Exception is raised in the caller when a call to a remote procedure fails Not easy to know what caused the exception to be raised That a notifications is lost can be discovered with timers in the client The most difficult error to handle is removal of orphan grid service instances

Orphan control Need to remove orphan grid service instances, since they waste resources New remote procedure isAlive Timer in both client and server grid service

Orphan control Need to remove orphan grid service instances, since they waste resources New remote procedure isAlive Timer in both client and server grid service An exception is raised in the client when a call to isAlive fails The server grid service is deleted when a timeout occurs in it

Event B Extension of the B Method Developed by J. R. Abrial Based on Action Systems by Back and Kurki-Suonio Related To B Action Systems SYSTEM C VARIABLES x INVARIANT Inv_C INITIALISATION x := x0 EVENTS C_Evt1 = ANY u WHERE G1(u,x) THEN S1 END; C_Evt2 = SELECT G2 THEN S2 END; END

Formal development of Grid applications We like to have a formal method suitable for developing fault tolerant grid applications Difficult to create implementable specifications of grid applications in Event B No grid communication mechanisms such as remote procedures and notifications No fault tolerance mechanisms Difficult to implement due to synchronization issues and atomicity of events We need to extend Event B with constructs for Specifying grid services Remote procedure calls and notifications Fault tolerance Extensions should be introduced in a manner that simplifies implementation

Distributed B Provides two new types of B machines GRIDSERVICE GRID_REFINEMENT Take into account grid specific features Remote procedures Notifications Timeouts due to lost notifications Exceptions due to failed calls to isAlive Enables us to prove properties about the entire system Are translated to ordinary B for verification New constructs get their semantics from the translation Automatic generation of proof obligations Enable automatic or semi-automatic translation of the specification to a programming language

Grid service machine Abstract specification of a grid service A grid service machine is a template that clients obtain instances of Compare to Classes in OO Remote procedures Ordinary B procedures called from a client Events Executed independently of a client Notifications Sent when all events have become disabled Proc(p) (J1  J2)  Q J2T2 J1T1 Grid service Remote procedures: Events: Notifications:

Grid refinement machine (1) A client that uses grid service machine instances Refines GRIDSERVICE, ordinary SYSTEM or REFINEMENT Clause for enabling dynamic management of grid service machine instances Instances are used as variables When a failed instance is discovered it is marked as no longer in use and deleted from the application Clause for refining remote procedures Clause for refining events

Grid refinement machine (2) Special substitution used in events for making remote procedures calls Enables the exceptions for failed calls to be handled Special events that consists of two parts for handling notifications First part enabled when a notification has been sent from a grid service Second part enabled when a timeout occurs Executed once for each notification/timeout Special event for handling failed calls to isAlive Enabled for each grid service instance in use. Non-deterministically models failures of instances

The behaviour of grid components Proc(p) (J1  J2)  Q G2S2 NotifHandler J2T2G3S3 J1T1 Grid serviceGrid refinement G1S1 Remote procedures: Events: Notifications: Notification handlers: Events:

Grid service machine GRIDSERVICE A VARIABLES y INVARIANT Inv_A INITIALISATION y := y0 REMOTE_PROCEDURES Proc(p) = PRE P(p) THEN T END EVENTS A_Evt1 = ANY u WHERE J1(u,y) THEN T1 END; A_Evt2 = SELECT J2 THEN T2 END; NOTIFICATIONS Notif = GUARANTEES Q END END

Grid refinement machine (1) GRID_REFINEMENT C2 REFINES C1 REFERENCES A VARIABLES z,x,a_inst INVARIANT a_inst:A & Inv_C’ INITIALISATION x := x0 || z:=z0 || a_inst::A EVENTS C_Evt1 = SELECT G1’ THEN CALL a_inst.Proc(x) EXCEPTION S E END || S1’ END; C_Evt2 = SELECT G2’ THEN S2’ END

Grid refinement machine (2) NOTIFICATION_HANDLERS NotifHandler = NOTIFICATION Notif SOURCE v:A THEN S3 TIMEOUT ST END IS_ALIVE_HANDLERS IAHandler = SOURCE v:A THEN S IA END

Conclusions Enables construction of correct fault tolerant grid applications Automatic generation of proof obligations Implementable architecture by construction These Event B extensions can also use other middleware for distributed systems