Deploying Analytical Redundancy for System Fault Tolerance V. Cortellessa, D. Del Gobbo, A. Mili, M. Shereshevsky, and Z. Zhuang CSEE Dept. West Virginia.


Deploying Analytical Redundancy for System Fault Tolerance V. Cortellessa, D. Del Gobbo, A. Mili, M. Shereshevsky, and Z. Zhuang CSEE Dept. West Virginia University - Morgantown FY2001 University Software Initiative for the NASA IV&V Facility - Fairmont WV

Outline
– Characterizing Redundancy
– Quantifying Redundancy
– Qualifying Redundancy

CHARACTERIZING REDUNDANCY

Objectives
To develop a classification of redundancy by identifying the orthogonal dimensions of redundancy
To analyze physical and analytical redundancy on the basis of the resulting classification
To answer general questions about redundancy:
– What is redundancy?
– Can we talk about redundancy outside the context of fault tolerance?
– Can we distinguish between intrinsic redundancy and redundancy-by-design?
– Is redundancy a representation issue or a design issue?
– Is physical redundancy an extreme case of redundancy?

Definition of Redundancy
From the IEEE Dictionary:
– duplication of elements for the purpose of enhancing system reliability
– presence of auxiliary components in a system for the purpose of preventing or recovering from failures
– the existence of more than one means for performing a given function
– pertaining to characters that do not contribute to the information content
– log(number of symbols) - average information content per symbol
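The last definition in the list can be made concrete with a small calculation. The sketch below is only an illustration: it assumes logarithms in base 2 and takes the alphabet to be the set of distinct symbols observed in the sample; the names `entropy_bits` and `symbol_redundancy` are hypothetical.

```python
import math
from collections import Counter

def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def symbol_redundancy(text):
    """log2(number of distinct symbols) minus average information per symbol.

    Assumption: the alphabet is exactly the set of symbols seen in `text`.
    """
    counts = Counter(text)
    total = sum(counts.values())
    probabilities = [c / total for c in counts.values()]
    return math.log2(len(counts)) - entropy_bits(probabilities)

# Equiprobable symbols leave no redundancy; a skewed source does.
uniform = symbol_redundancy("abcd" * 10)
skewed = symbol_redundancy("aaaaaaab")
```

With equiprobable symbols the two terms cancel, so the measure is zero; the more uneven the symbol frequencies, the larger the gap.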

Definition of Redundancy: Functional vs. State Redundancy
State redundancy
– system state [x_0, x_1, …, x_n] (implementation dependent)
Functional redundancy
– system-level requirements R = {(u, y) | …}
– subsystem/component-level requirements R = {(x_i, x_j) | …} (implementation dependent)

Content Redundancy: English-language sentence (Shannon)
No redundancy
– symbols are independent and equiprobable
First-level redundancy
– symbols are independent but occur with the frequencies of English text
– digram structure as in English text
– trigram structure as in English text
Word redundancy
– words are independent but occur with the frequencies of English text
– word transition probabilities are those of English text
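The step from independent symbols to digram structure can be illustrated by comparing the entropy of single symbols with the entropy of a symbol given its predecessor. This is a minimal sketch on a toy string, not on English text; the function names are hypothetical, and the conditional entropy is estimated as H(previous, next) - H(previous).

```python
import math
from collections import Counter

def entropy_bits(counts):
    """Shannon entropy, in bits, of the empirical distribution in `counts`."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def symbol_entropy(text):
    """Entropy per symbol when symbols are treated as independent."""
    return entropy_bits(Counter(text))

def digram_conditional_entropy(text):
    """H(next | previous) = H(previous, next) - H(previous)."""
    pairs = Counter(zip(text, text[1:]))
    previous = Counter(text[:-1])
    return entropy_bits(pairs) - entropy_bits(previous)

sample = "ab" * 20
independent_bits = symbol_entropy(sample)          # two equally frequent symbols
digram_bits = digram_conditional_entropy(sample)   # next symbol is fully determined
```

In the toy string each symbol carries one bit when taken alone, but no information once the preceding symbol is known, which is the kind of dependency the digram level captures.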

Content Redundancy: Physical system
Rigid body in free fall (p, v, a, F, M)
No redundancy
– quantities are independent and each is uniformly distributed
Local redundancy (quantities are still independent)
– each quantity is assigned a probability distribution
– relationship among the values of each quantity at different time instants
System redundancy
– instantaneous dependency between different quantities
– temporal dependency between different quantities

Representation Redundancy: Parity bit
To be processed, information must be represented in some suitable form.
The parity bit in serial communication allows non-admissible strings of bits to be detected.
The admissibility of a string of bits is independent of its information content.
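A minimal sketch of the parity-bit idea, assuming even parity and a list-of-bits representation (both are illustrative choices, as are the function names):

```python
def add_even_parity(bits):
    """Append one bit so that the total number of 1s is even."""
    return bits + [sum(bits) % 2]

def is_admissible(word):
    """A word is admissible if it contains an even number of 1s."""
    return sum(word) % 2 == 0

word = add_even_parity([1, 0, 1, 1])   # payload plus parity bit

corrupted = word.copy()
corrupted[2] ^= 1                      # flip a single bit in transit
```

The check never looks at what the payload means: `is_admissible` accepts `word` and rejects `corrupted` purely on the basis of the representation.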

Temporal/Sequential Redundancy
Some applications are characterized by a sequential introduction of data
Shannon's example
– first-order redundancy is a single-step redundancy
– higher orders of redundancy are multiple-step
Physical system example
– F(t_i) = M(t_i) a(t_i) is single-step (instantaneous) redundancy
– v(t_2) = [p(t_2) - p(t_1)] / (t_2 - t_1) is multiple-step (temporal) redundancy
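The two physical relations can be used as consistency checks on measured quantities. The sketch below is an assumption-laden illustration: the function names, the tolerance, and the numeric readings are all hypothetical, and the finite-difference relation is only exact when the velocity is constant over the interval.

```python
def instantaneous_residual(force, mass, acceleration):
    """Single-step check: F(t_i) - M(t_i) * a(t_i) should be near zero."""
    return force - mass * acceleration

def temporal_residual(v2, p1, p2, t1, t2):
    """Multiple-step check: v(t_2) - [p(t_2) - p(t_1)] / (t_2 - t_1)."""
    return v2 - (p2 - p1) / (t2 - t1)

def consistent(residual, tolerance=1e-6):
    """Readings agree with the relation if the residual is within tolerance."""
    return abs(residual) <= tolerance

# Hypothetical readings: the force/acceleration pair is consistent.
ok_instant = consistent(instantaneous_residual(-19.62, 2.0, -9.81))

# Position readings imply 2.0 units/s; a velocity reading of 3.0 is suspect.
ok_temporal = consistent(temporal_residual(2.0, 0.0, 1.0, 0.0, 0.5))
faulty_temporal = consistent(temporal_residual(3.0, 0.0, 1.0, 0.0, 0.5))
```

In practice the tolerance would have to absorb sensor noise and the approximation error of the finite difference, so its value is a design decision rather than a constant.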

Analytical Redundancy
– System/subsystem/component-level functional redundancy
– State redundancy
– Content redundancy
– Representation redundancy
– Single/multiple-step redundancy

Physical Redundancy
– Component-level functional redundancy
– State redundancy
– Content redundancy
– Representation redundancy
– Single-step redundancy (deterministic asset)

QUANTIFYING REDUNDANCY

Objectives
To quantify the amount of redundancy by means of a numeric function
To characterize analytical vs. physical redundancy by means of this function
To characterize fault tolerance capabilities (e.g., detection, identification, etc.) by means of this function
To use this function to support decision making in tradeoffs between redundancy and fault tolerance capability

Redundancy as the ability to choose among representations
X: system state
P: set of all the “possible” system states
C: set of all the “correct” system states
Prob(X ∈ C | X ∈ P)
The corresponding conditional entropy is a suitable metric of “how fully the potential domain is being exploited” (or, conversely, how sparsely populated it is), i.e., how much redundancy the system shows in terms of unused possible states.
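One way to read this slide, under the added assumption that all possible states are equally likely, is that the probability of a possible state being correct is |C| / |P| and that the unused portion of the domain amounts to log2|P| - log2|C| bits. The sketch below applies this reading to 5-bit words with even parity; the uniform assumption, the example, and the function names are illustrative, not taken from the slides.

```python
import math
from itertools import product

def correct_fraction(possible, is_correct):
    """Prob(X in C | X in P) when every possible state is equally likely."""
    states = list(possible)
    return sum(1 for s in states if is_correct(s)) / len(states)

def unused_state_redundancy_bits(possible, is_correct):
    """log2|P| - log2|C|, i.e. -log2 of the correct fraction."""
    return -math.log2(correct_fraction(possible, is_correct))

def all_words():
    """P: every 5-bit word (4 data bits plus 1 parity bit)."""
    return product((0, 1), repeat=5)

def even_parity(word):
    """C: words with an even number of 1s."""
    return sum(word) % 2 == 0

fraction = correct_fraction(all_words(), even_parity)
bits = unused_state_redundancy_bits(all_words(), even_parity)
```

Half of the representable words are correct, so the representation carries one bit that is not used to encode information.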

Redundancy as a logical relation among state variables
State made up of two (aggregates of) variables, say X and Y
P(X|Y): to what extent the value of Y determines the value of X
H(X|Y): amount of uncertainty that remains about X if we know Y
H(X|Y) = H(X,Y) - H(Y)
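The identity H(X|Y) = H(X,Y) - H(Y) can be evaluated directly from a joint distribution. The sketch below assumes the joint distribution is given as a dictionary from (x, y) pairs to probabilities; the two example distributions and the function names are hypothetical.

```python
import math
from collections import defaultdict

def entropy_bits(distribution):
    """Shannon entropy, in bits, of a {outcome: probability} mapping."""
    return -sum(p * math.log2(p) for p in distribution.values() if p > 0)

def conditional_entropy(joint):
    """H(X|Y) = H(X,Y) - H(Y) for a joint distribution {(x, y): probability}."""
    marginal_y = defaultdict(float)
    for (_, y), p in joint.items():
        marginal_y[y] += p
    return entropy_bits(joint) - entropy_bits(marginal_y)

# Y is a copy of X: knowing Y leaves no uncertainty about X.
copied = {(0, 0): 0.5, (1, 1): 0.5}

# X and Y are independent fair bits: knowing Y tells nothing about X.
independent = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}

h_copied = conditional_entropy(copied)
h_independent = conditional_entropy(independent)
```

The two cases bracket the range: zero remaining uncertainty when one variable determines the other, and the full entropy of X when the variables are unrelated.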

A simple example a : system variable SYSTEM  : vector of readings of a Hypothesis: there is redundancy only if  uniquely determines a H( a |  ) = 0 ( = H( a,  ) – H(  ) ) a  f  a : P(f -1 ( a )) = P( a )

This property holds: H( a )  H(  ) and the distance depends on the injectivity of f (e.g., one-to-one mapping gives H( a ) = H(  ) ) Again we may consider, as a measure of redundancy:  (  ) = H(  ) - H( a ) ( = H(  | a ) ) i.e., how fully the potential domain of values is being exploited.

 (  ) = H(  ) - H( a ) We voluntarily omit a as a parameter of  because: P( a ) comes from the intrinsic system operational profile (there is no control on it) while P(  ) is the result of design choices and fault hypotheses (its value can be controlled by design)

QUALIFYING REDUNDANCY

Objectives
Whereas the previous section quantifies redundancy, this section qualifies it.
The same amount of redundancy may or may not be useful, depending on functional properties.
Whereas in quantifying redundancy we need to distinguish between correct and representable (possible) states, in this section we will distinguish between:
– Correct states
– Maskable states
– Recoverable states
– Representable states

Notation
s_0: initial state of the system
milestone: breaking point between the past and future behavior of the system
π: relation that describes the past behavior
φ: relation that describes the future behavior
ρ: system requirements

[Diagram: s_0, past behavior π up to the milestone, future behavior φ, requirements ρ(s_0)]
s is a correct state: (s_0, s) ∈ π

[Diagram: as above, with the set of maskable states marked at the milestone]
s is a maskable state: (s_0, s) ∈ K(φ, ρ)

[Diagram: as above, with π′ and a recovery r leading back to the maskable states]
s is a recoverable state: ∃ r : π′ r ⊒ K(φ, ρ)

Question
For what π′ and K does this equation have a solution?
∃ r : π′ r ⊒ K(φ, ρ)
Analogy: for what a, b does the equation ax = b have a solution?
Answer: a ≠ 0

Answer: conditions for the existence of r
– C1: K L ⊆ π′ L
– C2: (K L ∩ π′)^ K must be a total relation
In practice, we look for the smallest π′ such that C1 and C2 hold (i.e., the relation that maps initial states to recoverable states only):
– C1: K L = π′ L
– C2: π′^ K must be a total relation

A sufficient condition for C2
If the domain partition determined by K is preserved by π′, then condition C2 holds:
π′ π′^ ⊆ K K^ ⇒ π′^ K is a total relation
A simple example
K = {(s, s′) | s′ = s mod 6}
π′_1 = {(s, s′) | s′ = s mod 12}: only produces recoverable states; recovery: s′ = s mod 6
π′_2 = {(s, s′) | s′ = (s+5) mod 18}: only produces recoverable states; recovery: s′ = (s+1) mod 6
π′_3 = {(s, s′) | s′ = s mod 10}: does not produce recoverable states
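The example with K = s mod 6 and the three candidate relations (s mod 12, (s+5) mod 18, s mod 10) can be checked by brute force. The sketch below treats each relation as a function on a finite range of integer states and reads "recoverable" as: there is a recovery map that sends the produced state to the state K would have produced. That reading, the state range, and the names are assumptions made for illustration.

```python
def recovery_table(past, k, states):
    """Build the map past(s) -> k(s); return None if no such map exists.

    A map exists exactly when states that `past` merges are also merged by `k`,
    i.e. when the partition determined by `k` is preserved by `past`.
    """
    table = {}
    for s in states:
        produced, wanted = past(s), k(s)
        if table.setdefault(produced, wanted) != wanted:
            return None  # one produced state would need two different recoveries
    return table

def k(s):
    return s % 6

states = range(360)  # finite sample of states (assumption)

table_12 = recovery_table(lambda s: s % 12, k, states)
table_18 = recovery_table(lambda s: (s + 5) % 18, k, states)
table_10 = recovery_table(lambda s: s % 10, k, states)
```

For the first two candidates the table reproduces the recoveries stated in the example, s′ = s mod 6 and s′ = (s+1) mod 6. For the third it fails as soon as s = 0 and s = 10 are compared: both produce 0, but K yields 0 for one and 4 for the other.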

Conclusions and Future Work
We have developed a framework for reasoning about redundancy
It includes: Classification/Quantification/Qualification
Future work
– Refining/reorganizing the classification
– Evaluating the quantification
– Validating the qualification