Presentation is loading. Please wait.

Presentation is loading. Please wait.

International Service Availability Symposium (ISAS) 2007

Similar presentations


Presentation on theme: "International Service Availability Symposium (ISAS) 2007"— Presentation transcript:

1 International Service Availability Symposium (ISAS) 2007
MDDPro: Model-Driven Dependability Provisioning in Distributed Real-time and Embedded Systems Sumant Tambe* Jaiganesh Balasubramanian Aniruddha Gokhale Thomas Damiano Vanderbilt University, Nashville, TN, USA Contact : International Service Availability Symposium (ISAS) 2007 May 21-22, 2007, University of New Hampshire, Durham, New Hampshire, USA This work is supported by subcontracts from LMCO, BBN & Raytheon

2 Component-based DRE Systems
Characteristics of component-based enterprise DRE systems Applications composed of one or more “operational string” of services or systems of systems Simultaneous QoS (Availability, Time Critical) requirements Dynamic (re)-deployment of components into operational strings Examples of DRE systems Advanced air-traffic control systems Continuous patient monitoring systems Add yellow box stating goal of RT FT in EDRE. Goal: Simplify and automate Fault-Tolerance provisioning in the DRE systems

3 Fault-Tolerance Design Considerations in DRE Systems
Per-component concern – choice of implementation Depends of resources, compatibility with other components in assembly Availability concern – what is the degree of redundancy? What replication styles to use? Does it apply to whole assembly? Failure recovery concern – what is the unit of failover? State synchronization concerns – What is data-sync rate? Deployment concern – how to place components? Minimize failure risk to the system Sumant, when we show the slide on data sync rate, we are showing state sync between the same operational string. We need to show it among replicas. So show state synch happening between FOUs or something like this.

4 Tangled Fault-Tolerance Concerns
Implementation determines replication style and vice-versa Replication degree affects resources and deployment Replication style determines state synchronization style Availability of domain artifacts determines deployment Significant sources of variability that affect end-to-end QoS (performance + availability) Too many animations Separation of Concerns using higher level abstractions is the key Design-time Deployment-time Run-time

5 Model-Driven Engineering – A Promising Approach
Higher level of abstraction than third generation programming languages Modeling each concern separately alleviates system complexity Deployment model Component assembly model System structural model Different QoS models e.g., Fault-tolerance Generative and model transformation techniques to weave in appropriate glue code Complex System

6 Fault-tolerance Modeling Abstractions in MDDPro
CQML (Component QoS Modeling Language) A DSML in the CoSMIC tool suite Fail-over Unit (FOU): Abstracts away details of granularity of protection (e.g., Component, Assembly, App-string) Replica Group (RPG): Abstracts away fault-tolerance policy details (e.g., Active/passive replication, rate and topology of state-synchronization) Shared Risk Group (SRG): Captures associations related to failure risk. (e.g., shared power supply among processors, shared LAN) Protection granularity concerns State-synchronization concerns Component Placement constraints Interpreter (component placement constraint solver): Encapsulates an algorithm for component-node assignment based on replica distance metric Replica Distance Metric

7 Fault-Tolerance Model in CQML
CQML (Component QoS Modeling Language) A graphical QoS modeling language on top of a system composition language (e.g., PICML) Enhances system structure with QoS annotations (e.g., FOUs for granularity of protection) A FOU itself is a model and captures heartbeat frequency and replication groups A Replication group captures per component replication style, data synchronization rate

8 Fail-over Unit Example
Primary Component A B C primary IOR “Client” container/component server container/component server container/component server Replica FOU A’ B’ C’ secondary IOR container/component server Primary FOU Do we also call it a primary FOU Replica Component

9 Shared Risk Group Example
Ship_SRG DataCenter1_SRG DataCenter2_SRG Rack1_SRG Rack2_SRG Node1 (blade31) Node2 (blade32) Shelf1_SRG Shelf2_SRG Shelf1_SRG Blade30 Blade34 Blade29 Blade33 Blade36

10 Formulation of Replica Placement Problem
Define N orthogonal vectors, one for each of the distance values computed for the N components (with respect to a primary) and vector-sum these to obtain a resultant.  Compute the magnitude of the resultant as a representation of the composite distance captured by the placement .  Compute the distance from each of the replicas to the primary for a placement.  Record each distance as a vector, where all vectors are orthogonal. Add the vectors to obtain a resultant. Compute the magnitude of the resultant. Use the resultant in all comparisons (either among placements or against a threshold) Apply a penalty function to the composite distance (e.g. pair wise replica distance or uniformity) R1 P R2 R3 Too many animations. There will not be enough time for this.

11 Component Placement Example using SRGs
Replica 1 2 3 Ship_SRG DataCenter1_SRG DataCenter2_SRG Rack1_SRG Rack2_SRG Node1 (blade31) Node2 (blade32) Composite Distance Primary Shelf1_SRG Shelf2_SRG Shelf1_SRG Blade30 Blade34 Blade29 Blade33 Blade36

12 FT Modeling & Generative Steps
Model components and application strings in PICML Model Fail Over Units (FOUs) and Shared Risk Groups (SRGs) Determine deployment of primary components GME/PICML Model Information Domain, Deployment, SRG, and FOU injection Replica Placement Algorithm model FT Interpreter Augmented Deployment Plan Interpreter automatically injects replicas and associated CCM IOGRs 5. Distance-based constraint algorithm determines replica placement in deployment descriptors.

13 Fault-Tolerance Model in CQML (1/2)
Replica = 3 Min Distance = 4

14 Shared Risk Group Model in CQML

15 Generative Capabilities for Provisioning FT
Automatic injection of replicas Augmentation of deployment plan based on number of replicas Automatic injection of FT infrastructure components E.g. Collocated “heartbeat” (HB) component with every protected component. Automatic injection of connection meta-data Specialized connection setup for protected components (e.g. Interoperable Group References IOGR) HB Container M x N

16 Example of Automated Heartbeat Component Injection
Collocated heartbeat component Primary Component FPC intra-FOU heartbeat HB HB HB A B C “client” primary IOR container/component server container/component server container/component server IOGR Primary FOU periodic FPC heartbeat FPC secondary IOR HB HB HB Connection Injection A’ B’ C’ Replica Component container/component server container/component server container/component server Replica FOU

17 Configurable FT Infrastructure
Future Work Developing advanced constraint solver algorithms to incorporate multiple dimensions of constraints in component placement decision (e.g. resources, communication latency) Optimizing the number of generated heartbeat components for collocated, protected application components. Enhancing the DSL and the tools to capture the configurability required by the new Lightweight RT/FT CORBA specification. e.g. Enhancing the model interpreter to support a wide spectrum of established fault-tolerance mechanisms Enhancing working prototypes and evaluating them in representative DRE systems HB Container Try to see if you can give a flavor of what you want to do more advanced than these simpler things. Can you generate architectures, say more about multi QoS modeling, etc. Configurable FT Infrastructure

18 Tools available for download from
Concluding Remarks Model-Driven Engineering separates dependability concerns from other system development concerns Separation of concerns helps alleviate system complexity Model-based generative capabilities “compile” FT infrastructure (e.g. heartbeat components and connections) during model interpretation time and synthesize meta-data Tools available for download from

19 Questions?


Download ppt "International Service Availability Symposium (ISAS) 2007"

Similar presentations


Ads by Google