Dissecting Self-* Properties Andrew Berns & Sukumar Ghosh University of Iowa.



Background
Autonomic systems are characterized by a number of properties that exhibit their ability to manage themselves. These are collectively known as self-* properties.

Goal
Self-organizing, self-stabilizing, self-optimizing, self-adaptive, self-healing, self-scaling, self-managing.
What do they precisely mean? How do they differ? Can we find some common framework to satisfy different characterizations of the various self-star properties?

The model of a system
Network of processes: topology G = (V, E).
Processes execute actions. Each action by a process changes its local state.
Global state S is the collection of local states.
A computation is a sequence of global states.
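The model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `neighbors`, `adopt_max`, `step`, and `run` are hypothetical, local states are plain integers, and the "adopt the maximum" action is a made-up example rather than anything from the talk.

```python
from typing import Callable, List, Set, Tuple

GlobalState = Tuple[int, ...]  # local state of process i sits at index i
Edge = Tuple[int, int]
Action = Callable[[int, GlobalState, Set[Edge]], int]


def neighbors(edges: Set[Edge], v: int) -> Set[int]:
    """Processes adjacent to v in the undirected topology G = (V, E)."""
    return {b for a, b in edges if a == v} | {a for a, b in edges if b == v}


def adopt_max(pid: int, state: GlobalState, edges: Set[Edge]) -> int:
    """Hypothetical action: take the largest value held by pid or its neighbors."""
    return max([state[pid]] + [state[q] for q in neighbors(edges, pid)])


def step(state: GlobalState, pid: int, edges: Set[Edge], action: Action) -> GlobalState:
    """One action by process pid; only pid's local state changes."""
    return state[:pid] + (action(pid, state, edges),) + state[pid + 1:]


def run(initial: GlobalState, schedule: List[int], edges: Set[Edge],
        action: Action) -> List[GlobalState]:
    """A computation: the sequence of global states produced by a schedule."""
    computation = [initial]
    for pid in schedule:
        computation.append(step(computation[-1], pid, edges, action))
    return computation


path = {(0, 1), (1, 2)}  # topology 0 - 1 - 2
computation = run((3, 0, 1), [1, 2, 0], path, adopt_max)
print(computation)  # [(3, 0, 1), (3, 3, 1), (3, 3, 3), (3, 3, 3)]
```

The schedule plays the role of a scheduler choosing which process acts next; each entry of `computation` is one global state.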

What is a “good” system?
Safety. Property P must always hold: bad things never happen. Examples: no deadlock, at least one token must always exist, etc.
Liveness. Property Q must eventually hold: good things eventually happen. Examples: termination, convergence, progress, etc.
A system configuration is legal when these properties hold.
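As a rough illustration, safety and liveness can be checked over a finite prefix of a computation. The helpers `always` and `eventually` and the token predicates are hypothetical names. A finite trace can refute a safety property but can never fully establish a liveness property, so this is a sketch, not a verifier.

```python
from typing import Callable, Sequence, Tuple

State = Tuple[bool, ...]  # state[i] is True when process i holds a token


def always(p: Callable[[State], bool], computation: Sequence[State]) -> bool:
    """Safety on a finite trace: p holds in every global state."""
    return all(p(s) for s in computation)


def eventually(q: Callable[[State], bool], computation: Sequence[State]) -> bool:
    """Liveness on a finite trace: q holds in at least one global state."""
    return any(q(s) for s in computation)


def at_least_one_token(s: State) -> bool:
    return any(s)


def exactly_one_token(s: State) -> bool:
    return sum(s) == 1


trace = [(True, True, False), (True, False, False), (False, True, False)]
safe = always(at_least_one_token, trace)     # a token exists in every state
live = eventually(exactly_one_token, trace)  # a single token is reached
```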

The Environment
The environment consists of variables that a process can read but not modify. A system is legal when it satisfies its safety properties. Legality is defined with respect to an environment, for example: time of day, network topology, user demands for service, output from another system, failures, etc.

System vs. adversary
The adversary disrupts or challenges the system. It causes failures and perturbations, allows processes to join or leave without notice, changes the global state, launches security attacks, changes the environment, etc. Ultimately the system must win.
[Figure: adversary, system, and environment]

Tolerance to adversarial actions
Masking. Safety and liveness properties are never violated. The system always remains in a legal configuration.
Non-masking. Safety properties (but not liveness) are temporarily violated, but eventually restored. The system state may temporarily become illegal.
Self-management = tolerance to adversarial actions.
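The distinction between the two kinds of tolerance can be sketched as a classification of a finite trace. This assumes a hypothetical `legal` predicate and treats "eventually restored" as "legal again in the last observed state", which is only an approximation of the definition for infinite computations.

```python
from typing import Callable, Sequence


def classify_tolerance(legal: Callable[[int], bool], computation: Sequence[int]) -> str:
    """Classify a finite trace as masking, non-masking, or not tolerated."""
    flags = [legal(s) for s in computation]
    if all(flags):
        return "masking"      # the legal predicate is never violated
    if flags[-1]:
        return "non-masking"  # violated for a while, restored by the end
    return "not tolerated"    # still illegal when the trace ends


def one_token(tokens: int) -> bool:
    """Hypothetical legality predicate: the state is the number of tokens."""
    return tokens == 1


print(classify_tolerance(one_token, [1, 1, 1]))        # masking
print(classify_tolerance(one_token, [1, 3, 2, 1, 1]))  # non-masking
print(classify_tolerance(one_token, [1, 0, 0]))        # not tolerated
```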

Masking vs. non-masking
[Figure: legal state space]

More on tolerance
Tolerance to adversarial actions also depends on their type and extent. A system may be masking tolerant to a single crash failure, but exhibit non-masking tolerance to multiple failures.

Graceful degradation
The system negotiates the adversarial action, but recovers to a configuration that satisfies a weaker predicate P’ ⊃ P (P = original safety predicate). To be graceful, P’ must be acceptable to the application.
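Reading P’ ⊃ P as "every configuration that satisfies P also satisfies P’, but not the other way round", the relation can be checked by enumeration on a small state space. The three-replica system and the predicates `P` and `P_prime` below are hypothetical, chosen only to make the containment concrete.

```python
from itertools import product
from typing import Callable, Set, Tuple

Config = Tuple[bool, ...]  # up/down status of three replicas (hypothetical system)
space: Set[Config] = set(product([True, False], repeat=3))


def P(c: Config) -> bool:
    """Original safety predicate: all replicas are up."""
    return all(c)


def P_prime(c: Config) -> bool:
    """Degraded but acceptable predicate: a majority of replicas is up."""
    return sum(c) >= 2


def satisfying(pred: Callable[[Config], bool], configs: Set[Config]) -> Set[Config]:
    return {c for c in configs if pred(c)}


def is_weakening(p, p_prime, configs: Set[Config]) -> bool:
    """True when p_prime strictly contains p, i.e. P' is a proper superset of P."""
    return satisfying(p, configs) < satisfying(p_prime, configs)


print(is_weakening(P, P_prime, space))  # True
```

A configuration such as `(True, True, False)` violates P but satisfies P’, which is the situation the slide calls graceful as long as the application accepts P’.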

Types of actions
Actions can be internal or external. External actions can change the environment. Processes execute internal actions only. The adversary executes both internal and external actions.

Self-management
Self-management is a vision that encompasses all self-* properties. The term is typically attributed to systems that exhibit at least one self-star property.

The framework
Generally, in defining a specific self-* property, the important issues are:
• Interpretation of the legal configuration
• Type of adversarial action
• Type of tolerance permitted: masking, non-masking, or graceful degradation

Self-stabilization
Starting from an arbitrary configuration, a self-stabilizing system eventually recovers to a legal configuration (one that satisfies a predefined predicate P) and remains in that configuration thereafter.
Adversarial action: transient failure corrupting the system state.
Tolerance: non-masking.
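The slide gives no algorithm, so the sketch below swaps in a classic example that is not part of the talk: Dijkstra's K-state token ring. A process is "privileged" when it may act, and the legal configurations are those with exactly one privileged process. It assumes K is at least the number of processes and a scheduler that always picks the lowest-numbered privileged process; the function names are hypothetical.

```python
from typing import List, Tuple

State = Tuple[int, ...]  # x[i] in 0..k-1 is the local state of process i on the ring


def privileged(x: State) -> List[int]:
    """Process 0 is privileged when x[0] == x[n-1]; process i > 0 when x[i] != x[i-1]."""
    n = len(x)
    holders = [0] if x[0] == x[n - 1] else []
    holders += [i for i in range(1, n) if x[i] != x[i - 1]]
    return holders


def move(x: State, i: int, k: int) -> State:
    """Process 0 increments modulo k; every other process copies its predecessor."""
    new = (x[0] + 1) % k if i == 0 else x[i - 1]
    return x[:i] + (new,) + x[i + 1:]


def legal(x: State) -> bool:
    """Legal configuration: exactly one privileged process (a single token)."""
    return len(privileged(x)) == 1


def steps_to_stabilize(x: State, k: int, limit: int = 1000) -> int:
    """Run from an arbitrary state until a legal configuration is reached."""
    for t in range(limit + 1):
        if legal(x):
            return t
        x = move(x, privileged(x)[0], k)
    raise RuntimeError("no legal configuration reached within the step limit")


print(steps_to_stabilize((0, 1, 2), k=3))  # 1
```

Starting from `(0, 1, 2)`, processes 1 and 2 are both privileged; after process 1 copies its predecessor the state is `(0, 0, 2)` and only process 2 is privileged, so the system is back in a legal configuration.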

Self-adaptation
Environment ∈ {R_1, R_2, …, R_m}. A system adapts to an environment R_k: if R_k holds, then P_k will hold.
Self-adaptation can be viewed as an extension of a self-stabilizing system, where the legal configuration satisfies the predicate P = ∨ (R_i ∧ P_i).
Example: process j crashing means that the adversary changes the environment variable crashed(j) from false to true.
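The predicate P = ∨ (R_i ∧ P_i) can be written down directly as a list of (R_i, P_i) pairs. The load-based environment and the server counts below are hypothetical, picked only to show how a change in the environment makes a previously legal configuration illegal.

```python
from typing import Callable, List, Tuple

Env = str     # hypothetical environment variable: current load level
Config = int  # hypothetical system configuration: number of active servers
Pair = Tuple[Callable[[Env], bool], Callable[[Config], bool]]

pairs: List[Pair] = [
    (lambda env: env == "low_load", lambda servers: servers == 1),   # (R_1, P_1)
    (lambda env: env == "high_load", lambda servers: servers >= 3),  # (R_2, P_2)
]


def legal(env: Env, config: Config, pairs: List[Pair]) -> bool:
    """P holds when some environment condition R_i and its predicate P_i both hold."""
    return any(r(env) and p(config) for r, p in pairs)


print(legal("low_load", 1, pairs))   # True
print(legal("high_load", 1, pairs))  # False: the environment changed, the system must adapt
print(legal("high_load", 3, pairs))  # True
```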

Self-healing
A system is self-healing with respect to a subset of external actions if the occurrence of those actions causes at most a temporary violation of the system’s legal configuration (safety property P).
Adversarial action: a subset of all possible external actions.
Tolerance: typically non-masking, but masking is not ruled out.

Comments on self-healing
A self-healing system may not be self-healing with respect to an enlarged set of adversarial actions. Skype is a self-healing system, but it crashed on August 16, 2007 and was down for nearly two days. Why?

Comments on self-healing
Also, self-healing frequently leads to graceful degradation.

Self-organization
A system is self-organizing with respect to a subset of external actions involving process join and leave if those actions cause at most a temporary violation of the system’s legal configuration (safety property P).
Adversarial action: join / leave actions (up to N processes may concurrently join, or N/2 processes may concurrently leave).
Tolerance: usually non-masking, but masking is not ruled out.

Self-organization
[Figure: an example of gathering]

Comments on self-organization
A self-organizing system is expected to recover in a reasonable time. [1] imposed a requirement of sub-linear recovery time per join or leave operation.
[1] Dolev & Tzachar: Empire of Colonies, Theoretical Computer Science 2009

Self-protection
A system is self-protecting with respect to a set of malicious external actions if it maintains its legal configuration (data integrity and continued functionality) in the presence of those actions.
Adversarial action: malicious actions.
Tolerance: masking.
Comment: It is hard to characterize what a malicious action is. It may be a direct security attack, or something very subtle.

Self-optimization
A system is self-optimizing if, starting from an initial configuration, it spontaneously improves or maximizes the value of an objective function (cost) relevant to the system’s performance.
Adversarial action: a bad initialization, or an interim action that makes the current configuration sub-optimal.
Tolerance: non-masking.
Comment: What if different nodes have different perceptions of cost? Selfishness adds a new dimension.

Self-optimization with selfish agents
Selfish actions used to optimize a system may never reach an equilibrium configuration. From a game theory perspective, no Nash equilibrium exists.
[Figure: shortest path tree with a root and two different types of processes; numeric labels not recoverable]
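The payoff figure from this slide is not recoverable, so the sketch below substitutes a textbook game, matching pennies, to show what "no Nash equilibrium" looks like when only pure strategies are considered (presumably the sense intended here, since selfish processes pick concrete actions). The function name and both payoff tables are illustrative assumptions, not data from the talk.

```python
from itertools import product
from typing import Dict, List, Tuple

Profile = Tuple[str, str]                # (row player's choice, column player's choice)
Payoffs = Dict[Profile, Tuple[int, int]]  # profile -> (row payoff, column payoff)


def pure_nash_equilibria(payoffs: Payoffs) -> List[Profile]:
    """Profiles in which neither player gains by deviating unilaterally."""
    rows = sorted({a for a, _ in payoffs})
    cols = sorted({b for _, b in payoffs})
    result = []
    for a, b in product(rows, cols):
        u_row, u_col = payoffs[(a, b)]
        row_ok = all(payoffs[(a2, b)][0] <= u_row for a2 in rows)
        col_ok = all(payoffs[(a, b2)][1] <= u_col for b2 in cols)
        if row_ok and col_ok:
            result.append((a, b))
    return result


matching_pennies: Payoffs = {
    ("H", "H"): (1, -1), ("H", "T"): (-1, 1),
    ("T", "H"): (-1, 1), ("T", "T"): (1, -1),
}
prisoners_dilemma: Payoffs = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

print(pure_nash_equilibria(matching_pennies))   # []
print(pure_nash_equilibria(prisoners_dilemma))  # [('D', 'D')]
```

In matching pennies some player always wants to switch, so selfish best-response moves cycle forever. That is the kind of non-convergence the slide describes.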

Self-configuration
The legal configuration is defined over the configuration space. There are various notions of configuration, such as a set of optimal choices of hardware or software modules and the connections among them, consistent with the environment.
Adversarial action: a subset of external actions.
Tolerance: non-masking.

Self-configuration 1 and 2
[Figures: two self-configuration examples involving a user]

Relationships among self-star properties
Self-stabilization implies self-healing with respect to any adversarial internal action.
Self-organization implies self-healing with respect to join and leave operations.

Relationships among self-star properties
Self-organization implies self-configuration, but the reverse is not true. For example, a self-configuring web server changes the connections between server components, processor cycles, and memory capacity to provide a stable response [2], but it is not self-organizing, since it cannot automatically integrate another server.
[2] Wildstrom et al., ICAC 2005

Relationships among self-star properties
[Figure: self-organization and self-stabilization]
A self-stabilizing system that allows a process to join or leave a system of N processes in O(N²) time is not self-organizing (some will disagree).

Relationships among self-star properties
The Chord P2P network is self-organizing, but not self-stabilizing: once an adversarial action splits the Chord ring into two rings, they do not rejoin.

New property: self-immunity
A system is self-immune with respect to an action C if (1) initially its tolerance to action C is non-masking, but (2) eventually the system is able to mask the effect of action C. The system learns from experience and becomes smarter with time.
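One way to read this definition operationally is to record, for each occurrence of action C, whether it caused a temporary violation of the legal configuration. The check below is a hypothetical sketch over such a finite record: it interprets "initially" as the first occurrence and "eventually" as a final run of masked occurrences.

```python
from typing import Sequence


def is_self_immune(violated: Sequence[bool]) -> bool:
    """violated[k] is True when the k-th occurrence of C caused a temporary violation.

    Self-immune (on this finite record): the first occurrence was not masked,
    and after the last violation at least one later occurrence was masked.
    """
    if True not in violated:
        return False  # masked from the very start: plain masking, nothing was learned
    last_violation = len(violated) - 1 - list(reversed(violated)).index(True)
    return violated[0] and last_violation < len(violated) - 1


print(is_self_immune([True, True, False, False]))  # True: learned to mask C
print(is_self_immune([True, False, True]))         # False: C still causes violations
print(is_self_immune([False, False]))              # False: masking from the start
```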

Self-immune behavior
[Figure: legal state space]

New property: self-containment
Self-containment is a variant of self-protection. It prevents the total system from being compromised by external malicious actions. At most a fraction of the system is compromised, and eventually the non-compromised processes are able to offer a meaningful level of service. (The system is capable of damage control: it saves a part of itself in spite of a security attack. It is a non-masking version of self-protection, similar in spirit to the fault-containment property of self-stabilizing systems.)

Self-containment

Conclusion
There is a need for a framework to define what we actually mean by a specific self-star property. Such a framework will not satisfy everyone’s vision, but as long as it satisfies the majority’s view, the chance of further divergence of views is minimized.