Investigating Survivability Strategies for Ultra-Large Scale (ULS) Systems Vanderbilt University Nashville, Tennessee Institute for Software Integrated.


1 Investigating Survivability Strategies for Ultra-Large Scale (ULS) Systems
Institute for Software Integrated Systems
Vanderbilt University, Nashville, Tennessee
  Jaiganesh Balasubramanian, jai@dre.vanderbilt.edu, www.dre.vanderbilt.edu/~jai
  Dr. Aniruddha Gokhale, gokhale@dre.vanderbilt.edu, www.dre.vanderbilt.edu/~gokhale
  Dr. Douglas C. Schmidt, schmidt@dre.vanderbilt.edu, www.dre.vanderbilt.edu/~schmidt
  Dr. Sherif Abdelwahed, sherif@isis.vanderbilt.edu, www.isis.vanderbilt.edu/~sherif

2 Ultra-Large Scale (ULS) System Characteristics
Key characteristics of the problem space
  Network-centric, dynamic, very large-scale “systems of systems”
  Stringent simultaneous QoS demands, e.g., “never die,” time-critical, etc.
  Highly diverse, complex, & increasingly integrated/autonomous application domains
Key characteristics of the solution space
  Vast accidental & inherent complexities
  Continuous evolution & change
  Highly heterogeneous (& legacy constrained) platform, language, & tool environments
Mapping & integrating problem artifacts to solution artifacts is hard

3 Motivating Scenario for ULS
Impact of Service-Oriented Architectures on enterprise distributed real-time & embedded (DRE) ULS systems
  Applications composed of an “operational string” of services
  A service is an assembly of components
  Dynamic (re)deployment of services into operational strings is necessary
  Performability = performance + survivability requirements
Key challenges
  Regulating & adapting to (dis)continuous changes in runtime environments, e.g., online prognostics, dependable upgrades
  Satisfying tradeoffs between multiple (often conflicting) QoS demands, e.g., secure, real-time, reliable, etc.
  Satisfying QoS demands in the face of fluctuating and/or insufficient resources, e.g., mobile ad hoc networks (MANETs)

4 Some Performability Challenges for ULS Systems
Performability challenges in dynamic provisioning of operational strings & services
  Service workloads & resource capacity issues: service placement depends on workloads & available resources
  Service accessibility patterns: a service's survivability depends on its sharing degree
  Differentiated levels of QoS: affects resource provisioning & survivability strategies
  Operational string & service failover: different failover possibilities, e.g., failing over a whole operational string, part of one, or one service at a time
  No one-size-fits-all dependability strategy: cannot dictate one survivability strategy for all services & operational strings
Application performability is addressed by resolving the service placement & survivability problems

5 Model of Approach
Model addresses various concerns:
  Per-service concern: Choice of implementation
    Depends on resources, compatibility with other components in assembly
  Coupling concern: Choice of invocation & communication mechanism used
  Sharing concern: Shared services will need proactive survivability, since the failure of a shared service affects several services simultaneously
  Failure recovery concern: What is the unit of failover?
  Availability concerns: What is the degree of redundancy? What replication styles to use? Does it apply to the whole assembly?
  Deployment concerns: How to select resources? How much sharing?
  Assembly concerns: What components to assemble dynamically? Configurations & optimizations for end-to-end performability?
Service placement & service survivability strategies address these concerns
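One way to make the "degree of redundancy" question concrete is a back-of-the-envelope calculation. The sketch below is not from the slides: it assumes replicas fail independently and that each replica is available with the same probability, so n replicas give an availability of 1 - (1 - a)^n. The function name replicas_needed and its parameters are hypothetical.

```python
def replicas_needed(replica_availability: float, target_availability: float) -> int:
    """Smallest replica count n such that 1 - (1 - a)^n >= target.

    Assumption (not from the slides): replicas fail independently and
    each one is up with probability `replica_availability`.
    """
    if not 0.0 < replica_availability < 1.0:
        raise ValueError("replica_availability must be strictly between 0 and 1")
    if not 0.0 < target_availability < 1.0:
        raise ValueError("target_availability must be strictly between 0 and 1")

    n = 1
    # Probability that at least one of n replicas is up is 1 - (1 - a)^n.
    while 1.0 - (1.0 - replica_availability) ** n < target_availability:
        n += 1
    return n


if __name__ == "__main__":
    # Less reliable replicas need a higher degree of redundancy.
    print(replicas_needed(0.5, 0.9))
    print(replicas_needed(0.9, 0.995))
```

Correlated failures (for example, replicas sharing a node or link) break the independence assumption, which is one reason the survivability indices of nodes and links matter when choosing where replicas go.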

6 Addressing the Service Placement Problem
Service placement algorithms must consider tradeoffs between providing performance to applications & providing survivability to applications, allocating resources either to primaries or replicas
Service placement problem must consider:
  Set of computation nodes attributed by:
    Processing index or capacity
    Memory index or capacity
    Survivability index
  Set of communication links attributed by:
    Bandwidth index
    Survivability index
  Set of components attributed by:
    Different implementations offering performance tradeoffs across quality dimensions
    Different implementations consuming various amounts of resources
    Constraints on being deployed as an assembly to offer a complete service
Replica placement issues involve:
  Different availability requirements for different assemblies of components:
    Multiple replicas needed; tolerate non-availability of replicas based on importance of assemblies
  Replica resource provisioning depending on replication schemes used
  Load balancing across replicas if resources are available, though this introduces run-time consistency problems
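The inputs listed on this slide can be sketched as a small data model with a simple placement heuristic. This is an illustrative sketch only, not the authors' algorithm: the class names, the single "quality" score, and the greedy rule (prefer the highest-quality implementation, then the most survivable node with enough capacity) are all assumptions, and communication links and replicas are left out for brevity.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Node:
    name: str
    cpu: float            # processing capacity
    mem: float            # memory capacity
    survivability: float  # survivability index (higher is better; assumed scale)


@dataclass(frozen=True)
class Implementation:
    name: str
    cpu: float      # processing demand
    mem: float      # memory demand
    quality: float  # hypothetical single score standing in for the quality dimensions


@dataclass(frozen=True)
class Component:
    name: str
    implementations: Tuple[Implementation, ...]


def place_assembly(
    components: Sequence[Component], nodes: Sequence[Node]
) -> Optional[Dict[str, Tuple[str, str]]]:
    """Greedy placement: returns {component: (implementation, node)} or None.

    The assembly is placed all-or-nothing, reflecting the constraint that
    components must be deployed together to offer a complete service.
    """
    free = {n.name: [n.cpu, n.mem] for n in nodes}  # remaining capacity per node
    by_survivability = sorted(nodes, key=lambda n: n.survivability, reverse=True)
    placement: Dict[str, Tuple[str, str]] = {}

    for comp in components:
        choice = None
        # Try the best implementation first, fall back to cheaper ones.
        for impl in sorted(comp.implementations, key=lambda i: i.quality, reverse=True):
            for node in by_survivability:
                cpu_free, mem_free = free[node.name]
                if impl.cpu <= cpu_free and impl.mem <= mem_free:
                    choice = (impl, node)
                    break
            if choice is not None:
                break
        if choice is None:
            return None  # one component does not fit, so the assembly is rejected
        impl, node = choice
        free[node.name][0] -= impl.cpu
        free[node.name][1] -= impl.mem
        placement[comp.name] = (impl.name, node.name)
    return placement
```

A greedy rule like this favors performance and survivability of primaries; a fuller treatment would also reserve capacity for replicas, which is exactly the tradeoff the slide points out.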

7 Addressing the Survivability Problem
A configurable approach to survivability including micro- (infrastructure) & macro- (assembly & operational string) level strategies
Micro-level strategies monitor infrastructure state to make proactive decisions at
  Component level (swapping & migration)
  Middleware level (configurations)
  Component server level (process resource allocations)
  Node level (multiple components)
Macro-level strategies monitor assembly health to make failover decisions
  Failover based on type of failover unit
    Affects service placement decisions
    May involve load balancing
  State synchronization issues
  Replication styles (hidden by FT strategies)
Initial prototype developed using Component-Integrated ACE ORB (CIAO) & Deployment & Configuration Engine (DAnCE) (www.dre.vanderbilt.edu)
Future work on Data Distribution Service (DDS) & Distributed Real-time Specification for Java (DRTSJ)
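The idea of "failover based on type of failover unit" can be illustrated with a minimal sketch. It is not taken from the CIAO/DAnCE prototype: the FailoverUnit enum, the function name, and the reading of "part of an operational string" as "the failed service plus everything downstream of it" are assumptions made for the example. The operational string is modeled as an ordered list of service names.

```python
from enum import Enum
from typing import List, Sequence


class FailoverUnit(Enum):
    SERVICE = "service"                # fail over one service at a time
    PARTIAL_STRING = "partial_string"  # fail over part of the operational string
    WHOLE_STRING = "whole_string"      # fail over the entire operational string


def services_to_fail_over(
    operational_string: Sequence[str], failed: str, unit: FailoverUnit
) -> List[str]:
    """Return the services that must move to replicas when `failed` goes down."""
    if failed not in operational_string:
        raise ValueError(f"{failed!r} is not part of the operational string")

    if unit is FailoverUnit.SERVICE:
        return [failed]
    if unit is FailoverUnit.PARTIAL_STRING:
        # Assumption: the partial unit is the failed service and its downstream services.
        start = operational_string.index(failed)
        return list(operational_string[start:])
    return list(operational_string)


if __name__ == "__main__":
    string = ["sense", "plan", "act"]
    for unit in FailoverUnit:
        print(unit.value, services_to_fail_over(string, "plan", unit))
```

The larger the failover unit, the more replica capacity must be held in reserve, which is why the choice of unit feeds back into service placement decisions.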

