Industry Day Paris 10.09.07 Rodin Methodology for Developing Fault Tolerant Systems Elena Troubitsyna Åbo Akademi University, Turku, Finland.


1 Industry Day Paris 10.09.07 Rodin Methodology for Developing Fault Tolerant Systems Elena Troubitsyna Åbo Akademi University, Turku, Finland

2 Motivation. Formal methods and fault tolerance complement each other in achieving system dependability. Formal methods help us clean up the architecture, handle complexity, and facilitate verification. Fault tolerance provides techniques to cope with failures of physical components. RODIN integrates fault tolerance and formal methods in a systems approach.

3 Talk outline. Systems approach; fault tolerant control systems; distributed systems (fault tolerance in service-oriented development); middleware for fault tolerant multi-agent systems; replicated database systems (fault tolerant transactions).

4 Systems approach. The systems approach assumes that while developing software (SW) we keep a picture of the whole system in mind. Software faults are of two kinds: a "bug" is a bad implementation of good requirements; a design fault is a good implementation of bad requirements. We cannot obtain "good" requirements if we do not understand how the whole system works (and fails). References: M. Butler, E. Sekerinski, and K. Sere. An Action System Approach to the Steam Boiler Problem. In J.-R. Abrial, E. Börger and H. Langmaack (Eds.), Formal Methods for Industrial Applications: Specifying and Programming the Steam Boiler Control, LNCS 1165, 1996. I. Hayes, M. Jackson, C. Jones. Determining the specification of a control system from that of its environment. In K. Araki, S. Gnesi, D. Mandrioli (Eds.), FME 2003: Formal Methods, LNCS 2805, 2003.

5 Control Systems. [Diagram: Computer, Sensors, Actuators, Plant] The environment (plant) evolves. Sensors "register" the state of the plant. The controller reads the sensors and calculates how to set the actuators to achieve the desired behaviour. The controller sets the actuators. Traditional development: focus on the controller (SW).

6 Control Systems. [Diagram: Computer, Sensors, Actuators, Plant] The environment (plant) evolves. Sensors "register" the state of the plant. The controller reads the sensors and calculates how to set the actuators to achieve the desired behaviour. The controller sets the actuators. Traditional development: focus on the controller (SW). Systems approach: model the entire system and derive the controlling SW by refinement and decomposition.
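
The control cycle described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tank-level plant, the thresholds, and the names Plant, read_sensor, controller and run are hypothetical and do not come from the talk. The sketch models plant and controller together and checks a safety invariant on every cycle, which is the spirit of the systems approach.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical safety invariant: the tank level must stay in this range.
SAFE_LOW, SAFE_HIGH = 2.0, 8.0


@dataclass
class Plant:
    """Hypothetical plant: a tank that drains unless a pump refills it."""
    level: float = 5.0
    pump_on: bool = False

    def evolve(self) -> None:
        # Environment step: the level rises while the pump runs, else falls.
        self.level += 1.0 if self.pump_on else -1.0


def read_sensor(plant: Plant) -> float:
    """Sensor "registers" the state of the plant (assumed fault free here)."""
    return plant.level


def controller(reading: float, pump_on: bool,
               low: float = 3.0, high: float = 7.0) -> bool:
    """Decide the next actuator setting from the sensor reading."""
    if reading <= low:
        return True       # level too low: switch the pump on
    if reading >= high:
        return False      # level too high: switch the pump off
    return pump_on        # otherwise keep the current setting


def run(steps: int) -> List[float]:
    """Run the plant/sensor/controller/actuator cycle and record the level."""
    plant = Plant()
    trace: List[float] = []
    for _ in range(steps):
        plant.evolve()                                       # plant evolves
        reading = read_sensor(plant)                         # sensors register
        plant.pump_on = controller(reading, plant.pump_on)   # set actuator
        assert SAFE_LOW <= plant.level <= SAFE_HIGH, "safety invariant broken"
        trace.append(plant.level)
    return trace
```

Because the plant is part of the model, the safety invariant can be stated over the plant state rather than over controller variables only.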

7 Developing fault tolerant control systems by refinement. (1) Abstract specification of the entire system: abstract model of plant, routine control and failure; safety invariant. (2) Specification with refined error detection mechanism: elaborated description of the plant's dynamics, representation of components, failures, error detection. (3) Specification of the system supplemented with redundancy: representation of redundant components, refinement of error detection, criticality of errors, error recovery. (4) Decomposition: the specification of the overall system is split into specifications of the controller and the plant. (5) Implementation: executable code of the controller is produced.

8 Benefits of systems approach. Well-structured, correct-by-construction development. Assumptions about environment behaviour and fault assumptions are documented as part of the model, which allows us to model fault tolerance rigorously. Abstraction helps to tackle complexity. Stepwise requirement capturing. Reference: L. Laibinis and E. Troubitsyna. Refinement of fault tolerant control systems in B. In M. Heisel, P. Liggesmeyer, S. Wittmann (Eds.), Proc. of SAFECOMP 2004, LNCS 3219.

9 Transient faults in control systems. Rigorous development of a controller subsystem (called Failure Management System in avionics) implementing a mechanism for tolerating transient sensor faults. Formal patterns for detecting sensor failures and recovering from them. Tolerating transient faults: neither overreact nor neglect. Reference: D. Ilic, E. Troubitsyna, L. Laibinis and C. Snook. Formal Development of Mechanisms for Tolerating Transient Faults. In M. Butler, C. Jones, A. Romanovsky and E. Troubitsyna (Eds.), Rigorous Development of Complex Fault-Tolerant Systems. LNCS 4157, 2006.
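
One common way to "neither overreact nor neglect" is an error counter per sensor: a single bad reading is masked, while repeated bad readings eventually mark the sensor as failed. The sketch below is an assumption made for illustration, not the formal pattern from the cited paper; the class SensorMonitor, its range check and the limit parameter are hypothetical.

```python
class SensorMonitor:
    """Hypothetical counter-based filter for transient sensor faults."""

    def __init__(self, low: float, high: float,
                 limit: int = 3, initial: float = 0.0) -> None:
        self.low, self.high = low, high   # plausible range of readings
        self.limit = limit                # error count that means "failed"
        self.errors = 0
        self.last_good = initial
        self.failed = False

    def step(self, reading: float) -> float:
        """Process one reading and return the value passed to the controller."""
        if self.failed:
            return self.last_good         # sensor is no longer trusted
        if self.low <= reading <= self.high:
            # Good reading: accept it and slowly forgive earlier errors.
            self.errors = max(0, self.errors - 1)
            self.last_good = reading
        else:
            # Bad reading: mask it with the last good value (no overreaction),
            # but count it so that a persistent fault is not neglected.
            self.errors += 1
            if self.errors >= self.limit:
                self.failed = True
        return self.last_good
```

With limit=3, an isolated out-of-range reading is masked and later forgiven, while three bad readings without enough good ones in between mark the sensor as failed.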

10 Fault tolerance in service-oriented development. Lyra is a service-oriented method for developing distributed communicating systems. The design flow is based on the concepts of decomposition and preservation of externally observable behaviour. The system behaviour is modularised and organised into layers according to external communication interfaces. The distributed network architecture is derived from functional requirements via model transformations.

11 Lyra Design Method. [Diagram: Service, Service Specification, Service Implementation] Service specification: system-level services and interfaces are defined.

12 Lyra Design Method. [Diagram: Service, Service Specification, Service Implementation] Service decomposition: the abstract model is decomposed into a set of service components and the interfaces between them.

13 Lyra Design Method. [Diagram: Service, Service Specification, Service Implementation] Service distribution: the logical architecture of services is distributed over a given network.

14 Lyra Design Method. [Diagram: Service, Service Specification, Service Implementation] Service implementation: low-level implementation details are added and platform-specific code is generated.

15 Formalizing Lyra. A service component is a coherent piece of functionality that provides its services to a service consumer via a PSAP. It is formalized as an ACC (Abstract Communicating Component) consisting of a "kernel", i.e., the provided functionality, and a "communication wrapper", i.e., the communication channels via which data are supplied to and consumed from the component. B skeleton: SYSTEM ACC …. EVENTS /* communicational */ input output /* functional */ calculate END. The model covers not only success but also service failure.
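
The ACC pattern can be mimicked in Python to show how the kernel and the communication wrapper fit together. This is a rough analogy under assumptions, not a translation of the B model: the class layout, the queue-based channels and the ("success" / "failure", payload) result pairs are hypothetical. Only the event names input, calculate and output come from the slide.

```python
from collections import deque
from typing import Any, Callable, Deque, Optional, Tuple


class ACC:
    """Hypothetical Python analogue of an Abstract Communicating Component."""

    def __init__(self, kernel: Callable[[Any], Any]) -> None:
        self.kernel = kernel                               # provided functionality
        self.in_channel: Deque[Any] = deque()              # communication wrapper
        self.out_channel: Deque[Tuple[str, Any]] = deque()

    def input(self, request: Any) -> None:
        """Communicational event: a request is supplied to the component."""
        self.in_channel.append(request)

    def calculate(self) -> None:
        """Functional event: the kernel processes one pending request."""
        if not self.in_channel:
            return                                         # event is not enabled
        request = self.in_channel.popleft()
        try:
            self.out_channel.append(("success", self.kernel(request)))
        except Exception as exc:
            # Service failure is an explicit, observable outcome.
            self.out_channel.append(("failure", str(exc)))

    def output(self) -> Optional[Tuple[str, Any]]:
        """Communicational event: a result is consumed from the component."""
        return self.out_channel.popleft() if self.out_channel else None
```

For example, with kernel `lambda x: 10 // x`, the request 2 yields ("success", 5), while the request 0 yields a "failure" result instead of crashing the component.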

16 Service Decomposition Phase. [Diagram: service S executed as subservices SS1, SS2, SS3, …, SS N-1, SS N] To provide service S, the system should execute subservices SS1..SS N. In the B model, decomposition is represented as a refinement of the initial abstract pattern ACC. New event: Service_Director orchestrates the execution flow.

17 Service decomposition: faults in execution flow. [Diagram: service S executed as subservices SS1, SS2, SS3, …, SS N-1, SS N] Error recovery by retrying execution of the failed subservice.

18 Service decomposition: faults in execution flow. [Diagram: service S executed as subservices SS1, SS2, SS3, …, SS N-1, SS N] Error recovery by rollback.

19 Service decomposition: faults in execution flow. [Diagram: service S executed as subservices SS1, SS2, SS3, …, SS N-1, SS N, with outcomes "Service failure" and "Success"] Unrecoverable error: abort service execution.

20 Convergence of error recovery? [Diagram: service S executed as subservices SS1, SS2, SS3, …, SS N-1, SS N] Error recovery by retrying: infinite retry.

21 Convergence of error recovery? [Diagram: service S executed as subservices SS1, SS2, SS3, …, SS N-1, SS N] Error recovery by rollback: domino effect. We introduce the Maximal Service Response Time (Max_SRT). New event: Time decrements the execution time left.

22 Abort of service due to timeout. [Diagram: service S executed as subservices SS1, SS2, SS3, …, SS N-1, SS N] Execution_time > Max_SRT.
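
The recovery options from the preceding slides (retry, rollback, abort) and the Max_SRT bound that makes recovery converge can be combined in one small sketch. It rests on assumptions: the function service_director, the outcome constants, and the rule that each subservice execution costs one time unit are hypothetical simplifications, not the events of the B model.

```python
from typing import Callable, List

# Hypothetical outcomes a subservice may report.
OK, RETRY, ROLLBACK, ABORT = "ok", "retry", "rollback", "abort"


def service_director(subservices: List[Callable[[], str]], max_srt: int) -> str:
    """Run subservices in order with error recovery bounded by max_srt."""
    time_left = max_srt
    index = 0
    while index < len(subservices):
        if time_left == 0:
            return "failure: timeout"     # execution time exceeded Max_SRT
        time_left -= 1                    # analogue of the Time event
        outcome = subservices[index]()
        if outcome == OK:
            index += 1                    # proceed to the next subservice
        elif outcome == RETRY:
            continue                      # re-execute the failed subservice
        elif outcome == ROLLBACK:
            index = max(0, index - 1)     # step back to the previous subservice
        else:
            return "failure: aborted"     # unrecoverable error
    return "success"
```

Without the time budget, a subservice that always asks for a retry would loop forever and repeated rollbacks could cascade; with it, every run ends in success or in an explicit service failure.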

23 Service Distribution (B Model). The Service Distribution phase of Lyra corresponds to one or several B refinements. Refinement steps introduce separate B components modelling external service components. All new B components are specified according to the same (ACC) pattern.

24 The proposed approach establishes a basis for automating service-oriented development of fault tolerant communicating systems. Formal verification will be hidden behind a UML facade (hence smooth integration into the existing development process). Reference: L. Laibinis, E. Troubitsyna, S. Leppänen, J. Lilius and Q. Malik. Formal Service-Oriented Development of Fault Tolerant Communicating Systems. In M. Butler, C. Jones, A. Romanovsky and E. Troubitsyna (Eds.), Rigorous Development of Complex Fault-Tolerant Systems. LNCS 4157, 2006.

25 Fault tolerance in open systems: challenges. Openness of multi-agent systems; mobility and autonomy of agents; asynchrony and anonymity of communication; complex types of faults, such as temporary loss of connectivity and mismatching interfaces.

26 Interoperability of agents. Formal specification of middleware for mobile location-based systems. Location is an abstraction of context. Systems approach: start from a specification of location and agents together and arrive at the specification of the entire middleware. Decompose it into the parts to be implemented by the location and by the agents. Individual agents can be developed independently but must preserve the "standard" part.

27 Abstract specification. Implicit modelling of normal termination and failure.

28 Handling agent failure. Refining Disengage: the agent can complete its activity and disengage; the agent's activity can be terminated by a (detectable) crash; the agent can silently crash (e.g., disconnect or become slow).

29 Compatibility on functional level. Scopes partition the coordination space. Each scope supports a certain set of roles, where a role is an abstraction of agent functionality. Formal definition of scope properties. Compatibility on the level of agent functionality.

30 System approach in MAS. Ensuring interoperability of independently developed agents and supporting this by top-down stepwise development methods. Identifying and verifying system properties that express specific fault tolerance and mobility-related characteristics. Formal specifications in B provide input to model checking of dynamic properties. References: L. Laibinis, E. Troubitsyna, A. Iliasov and A. Romanovsky. Rigorous Development of Fault-Tolerant Agent Systems. In M. Butler, C. Jones, A. Romanovsky and E. Troubitsyna (Eds.), Rigorous Development of Complex Fault-Tolerant Systems. LNCS 4157, 2006. A. Iliasov, V. Khomenko, M. Koutny, A. Romanovsky. On Specification and Verification of Location-Based Fault Tolerant Mobile Systems. In M. Butler, C. Jones, A. Romanovsky and E. Troubitsyna (Eds.), Rigorous Development of Complex Fault-Tolerant Systems. LNCS 4157, 2006.

31 Fault tolerant transactions in replicated database systems. Replication improves the availability of distributed database systems when the transaction workload is predominantly read-only. Keeping replicas identical during updates is difficult due to site failures and conflicting transactions. One Copy Serializability criterion: the interleaved execution of transactions on the replicas should be equivalent to a serial execution of those transactions on one copy of the database. Reference: D. Yadav, M. Butler. Rigorous design of Fault Tolerant Transactions for replicated database systems using Event B. In M. Butler, C. Jones, A. Romanovsky and E. Troubitsyna (Eds.), Rigorous Development of Complex Fault-Tolerant Systems. LNCS 4157, 2006.

32 Approach. Verify through refinement that the design of the replicated database satisfies the One Copy Serializability criterion. The abstract model is based on a single-copy database. The refinement is based on a replicated database. Gluing invariants discovered with the B tools define the relationship between the abstract single-copy database and the replicated database. The gluing invariants provide a deeper understanding of the system and facilitate further development of more complex replica control mechanisms.
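
The role of a gluing invariant can be illustrated with a toy Python model. It is a deliberately simplified sketch under strong assumptions: transactions are plain dictionaries of writes, every replica applies committed transactions in the same order, and there are no site failures. None of the names below come from the cited Event B development.

```python
from typing import Dict, List

Database = Dict[str, int]
Transaction = Dict[str, int]   # hypothetical: a transaction is a set of writes


def apply_txn(db: Database, txn: Transaction) -> Database:
    """Return the database state after committing one update transaction."""
    return {**db, **txn}


def run_single_copy(txns: List[Transaction]) -> Database:
    """Abstract model: serial execution on one copy of the database."""
    db: Database = {}
    for txn in txns:
        db = apply_txn(db, txn)
    return db


def run_replicated(txns: List[Transaction], n_replicas: int) -> List[Database]:
    """Concrete model: every replica applies each committed transaction."""
    abstract: Database = {}
    replicas: List[Database] = [{} for _ in range(n_replicas)]
    for txn in txns:
        abstract = apply_txn(abstract, txn)
        replicas = [apply_txn(replica, txn) for replica in replicas]
        # Toy gluing invariant: each replica agrees with the abstract copy.
        if any(replica != abstract for replica in replicas):
            raise AssertionError("gluing invariant violated")
    return replicas
```

In a refinement proof the invariant is discharged once for all executions; the runtime check here only tests it on one particular run.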

33 Other topics in RODIN methodology. Extension of the Event B language by J.-R. Abrial; library of case studies in J.-R. Abrial's book; extension of Event B to represent records by N. Evans and M. Butler; model checking of mobile fault tolerant systems by M. Koutny et al.; methods for rigorous development of generic requirements patterns by C. Snook, M. Poppleton and I. Johnson; formalization of UML in B by C. Snook, M. Butler, M. Walden; design of various exception handling approaches by A. Iliasov and A. Romanovsky.

