The Specification-Consistent Coordination Model (SCCM) and its applications to Byzantine Failures.


The Byzantine Failure Problem
In a large multiprocessor, internal breakdowns are expected to be common events. Most of these breakdowns will result in complex behavior in which the processor returns incorrect output. Given the source code of the program, we want to be able to detect and recover from such failures. SCCM may be a way to do this.

The Goals of SCCM
SCCM was originally intended as an aid to programmers. It divides the programming task into two stages: a Specification, which rigorously defines the algorithm, and a Coordination, which defines the actual imperative program and associates it with the Specification to ensure correctness. SCCM employs a runtime checker to ensure that the imperative program being executed matches the Specification.

SCCM – Specification
SCCM defines the algorithm via a functional language. For every piece of information that arises during the algorithm's lifetime, there is a function with a particular argument value to identify it. Example: the sum of all the numbers in array A:
    input A[]
    sum(i) = sum(i-1) + A[i]
    output sum(A.length)
Each intermediate value is identified by sum(i).
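As a rough illustration only, the specification above can be mimicked in Python as a recursive function whose argument identifies each intermediate value. The name `spec_sum`, the sample array, and the base case sum(0) = 0 are assumptions of this sketch; the slide does not state a base case.

```python
from functools import lru_cache

# Sample input (assumed for illustration); the specification indexes A from 1.
A = [1, 2, 3, 4, 5, 6]

@lru_cache(maxsize=None)
def spec_sum(i: int) -> int:
    """sum(i) = sum(i-1) + A[i], with an assumed base case sum(0) = 0."""
    if i == 0:
        return 0
    return spec_sum(i - 1) + A[i - 1]  # A[i-1] because Python lists start at 0

# Every intermediate value is identified by a function name plus an argument.
intermediates = {f"sum({i})": spec_sum(i) for i in range(len(A) + 1)}
output = spec_sum(len(A))
```

Here `intermediates` plays the role of the set of identifiers that a coordination would later attach to its storage.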

SCCM – Coordination
An imperative version of this program:
    var ImpSum = 0
    Array A[] = {1, 2, 3, 4, 5, 6}
    for i=1 to A.length
        ImpSum = ImpSum + A[i]
    Output ImpSum
We can ensure that this program matches the Specification by associating with every value of ImpSum a corresponding identifier sum(i).

Named Values
SCCM works by naming each piece of mutable storage with some f(x) from the specification. It maintains correctness by ensuring that when the imperative program overwrites values, it transforms their names in a way consistent with the specification. Because all values have names and names may only be transformed in consistent ways, SCCM ensures that the implementation's control flow is the same as in the specification.

Summation's Named Values
In the command ImpSum = ImpSum + A[i], we would use the definition of sum() to transform ImpSum's name from sum(i-1) to sum(i).
Sum(): sum(i) = sum(i-1) + A[i]
ImpSum's values and names:
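One way to picture how ImpSum's value and name evolve together is the Python sketch below. It is not SCCM itself: the `Named` record, the `checked_add` helper, and the base case sum(0) = 0 are assumptions made for illustration, with the array taken from the coordination slide.

```python
from dataclasses import dataclass

@dataclass
class Named:
    value: int
    name: tuple  # ("sum", 3) stands for the identifier sum(3)

def checked_add(acc: Named, a: list, i: int) -> Named:
    """Apply sum(i) = sum(i-1) + A[i], refusing if acc is not named sum(i-1)."""
    if acc.name != ("sum", i - 1):
        raise RuntimeError(f"expected sum({i - 1}), found {acc.name}")
    return Named(acc.value + a[i - 1], ("sum", i))  # a is 0-indexed in Python

A = [1, 2, 3, 4, 5, 6]
imp_sum = Named(0, ("sum", 0))  # assumed base case sum(0) = 0
trace = [imp_sum]
for i in range(1, len(A) + 1):
    imp_sum = checked_add(imp_sum, A, i)
    trace.append(imp_sum)

# Print each name next to the value it identifies.
for step in trace:
    print(f"{step.name[0]}({step.name[1]}) = {step.value}")
```

Each loop iteration changes the value and the name in lockstep, so a skipped or repeated iteration would be rejected by the name check.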

Fibonacci Sequence
The algorithm specification for the Fibonacci Sequence is simple:
    fib(0) = 1
    fib(1) = 1
    fib(i) = fib(i-1) + fib(i-2)
An implementation will have to name each of its values with some fib(i) and only use the above rule to transform values.

Fibonacci – Specification
The source code of the specification of the Fibonacci Sequence algorithm.

Fibonacci – Coordination

The Consistency Link
Full Application: “A:=fib(0)”. Here, fib() is actually called with 0 as the argument, and its return value, 1, is assigned to A. fib(0) is now A's name.
Fetch Application: “C:=fib(i) <- A”. The value and name of A are copied into C. SCCM makes sure that before the copy, A's name is fib(i).

The Consistency Link 2
Step Application: “B:=fib(i+2)|(fib.l1 <-A, fib.l2 <-C)”. fib() is executed to obtain the value of B. Rather than wastefully calling fib(i+1) and fib(i) recursively, SCCM pulls those values from A and C. It ensures that A's name is fib(i+1) and C's name is fib(i). Thus, B gets its value and SCCM ensures that proper control flow was maintained.
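The three kinds of application (full, fetch, step) can be approximated in Python as shown below. This is a sketch under assumptions, not SCCM syntax or its actual checker: `Cell`, `require`, `full`, `fetch`, `step`, and the loop in `fib_coordinated` are all hypothetical names and structure chosen for illustration.

```python
class ConsistencyError(Exception):
    """Raised when a cell does not carry the name the program claims."""

def fib_spec(i):
    # Specification from the slides: fib(0) = fib(1) = 1.
    return 1 if i < 2 else fib_spec(i - 1) + fib_spec(i - 2)

class Cell:
    """One piece of mutable storage: a value plus its specification name."""
    def __init__(self):
        self.value = None
        self.name = None

def require(cell, name):
    if cell.name != name:
        raise ConsistencyError(f"expected {name}, found {cell.name}")

def full(dst, i):
    """Full application, dst := fib(i): actually evaluate the specification."""
    dst.value, dst.name = fib_spec(i), ("fib", i)

def fetch(dst, i, src):
    """Fetch application, dst := fib(i) <- src: copy after checking the name."""
    require(src, ("fib", i))
    dst.value, dst.name = src.value, src.name

def step(dst, i, l1, l2):
    """Step application, dst := fib(i+2) | (fib.l1 <- l1, fib.l2 <- l2)."""
    require(l1, ("fib", i + 1))
    require(l2, ("fib", i))
    dst.value, dst.name = l1.value + l2.value, ("fib", i + 2)

def fib_coordinated(n):
    """Compute fib(n) for n >= 1 using only checked applications."""
    a, b, c = Cell(), Cell(), Cell()
    full(c, 0)
    full(a, 1)
    for i in range(n - 1):
        step(b, i, a, c)    # b becomes fib(i+2)
        fetch(c, i + 1, a)  # c becomes fib(i+1)
        fetch(a, i + 2, b)  # a becomes fib(i+2)
    require(a, ("fib", n))  # the output must be named fib(n)
    return a

result = fib_coordinated(10)
```

Only the two full applications ever call the recursive specification; every later value is produced by a step whose operands are checked by name, which is the saving the slide describes.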

Potential Coding Errors
Errors in the imperative program are caught. Example: setting the loop bounds to (0, n) rather than (0, n-1) results in fib(n+1) being output rather than fib(n). SCCM detects this error. In general, it is hard to make errors in both the specification and the coordination that match each other.
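The off-by-one case can be imitated with a small Python sketch in which every value carries the index of the fib(i) it is named after. The helper names and the exact iteration counts are assumptions for illustration; the point is only that one extra iteration changes the name of the output, which a check at the output can reject.

```python
def fib_loop(iterations):
    """Iterate the Fibonacci rule, carrying (value, index) pairs as names."""
    prev, cur = (1, 0), (1, 1)  # fib(0) = 1 and fib(1) = 1
    for _ in range(iterations):
        # fib(i+1) = fib(i) + fib(i-1); the name advances with the value.
        prev, cur = cur, (prev[0] + cur[0], cur[1] + 1)
    return cur

def checked_output(named, n):
    """Release the value only if it is named fib(n)."""
    value, index = named
    if index != n:
        raise ValueError(f"output is named fib({index}), expected fib({n})")
    return value

n = 10
good = checked_output(fib_loop(n - 1), n)  # correct number of iterations
try:
    checked_output(fib_loop(n), n)         # one iteration too many
    bug_detected = False
except ValueError:
    bug_detected = True
```

Without the names, the buggy run would silently return a plausible-looking number.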

SCCM Message Passing
SCCM allows us to create parallel programs via message passing. We can send and receive SCCM named values, with SCCM ensuring global adherence to the specification. Both the Send and the Receive are checked.

SCCM – Send
Sample Send:
    send n() endsend
The value of N, named n(), is sent out. We can send out single elements or lists of elements. SCCM makes sure that the values sent out actually have the names the Send command claims them to have.

SCCM – Receive
Sample Receive:
    recv match n() := N endrecv
The value of N that was sent in the prior slide is received by the target processor. N's value must be named n(), just like in the send. All receives (as far as I can tell) are Receive-Any's.

Another Send Example
Sample send command:
    send a(i, 2*i) <- A(i, i) for i in (1,3) to endsend
The contents of 3 diagonal elements of A[][], named a(1,2), a(2,4), a(3,6), are sent to the destination processor. SCCM checks that those are indeed the names in those diagonal elements.

Another Receive Example
Sample receive command:
    recv check a(i, 2*i) =: B(i) match for i:int in (s,t) endrecv
The diagonal elements of A[][] are now received. Their names must be the same, but they may be saved into some other structure at the target processor (like the array B[]).
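The checked send and receive of the diagonal elements can be sketched in Python as follows. Everything here is a stand-in: the `deque` replaces the network, and `send`, `recv`, `NameMismatch`, and the sample values are hypothetical, chosen only to show both sides verifying names.

```python
from collections import deque

class NameMismatch(Exception):
    """Raised when a value's name differs from what a command claims."""

def send(channel, store, claims):
    """Send each slot in `claims` under the name the command claims for it."""
    for name, slot in claims:
        value, actual = store[slot]
        if actual != name:
            raise NameMismatch(f"{slot} holds {actual}, not {name}")
        channel.append((name, value))

def recv(channel, expected):
    """Receive one message per expected name and file it under its target slot."""
    store = {}
    pending = dict(expected)
    while pending:
        name, value = channel.popleft()
        if name not in pending:
            raise NameMismatch(f"unexpected message named {name}")
        store[pending.pop(name)] = (value, name)
    return store

channel = deque()  # stand-in for the network between two processors

# Sender: diagonal element A(i, i) holds some value named a(i, 2*i).
sender = {("A", i, i): (10 * i, ("a", i, 2 * i)) for i in (1, 2, 3)}
send(channel, sender, [(("a", i, 2 * i), ("A", i, i)) for i in (1, 2, 3)])

# Receiver: the same names are required, but the values land in B(i).
receiver = recv(channel, {("a", i, 2 * i): ("B", i) for i in (1, 2, 3)})
```

The receiver keeps the name with the value, so later steps on that processor can continue to check it against the specification.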

SCCM Performance
When the same problem is implemented in C, SCCM and SML: SCCM is usually 6-9 times slower than C because of all the runtime checking overhead. SCCM is 50% faster than SML, because SCCM produces imperative programs that do not have SML's functional overheads.

Is SCCM useful for Programmers?
The amount of time one spends writing an SCCM program is much larger than for a normal program. Arguably, this is less than the amount of time spent on debugging, but writing a specification for a large system would be very hard. Most programmers would find it hard to express their algorithms in purely functional notation. Programs in SCCM are several times longer than their equivalents in C. Example: Bubble Sort.

SCCM for Byzantine Failures
SCCM effectively captures a program's control flow. The price for the programmer is having to write a more complex program that is several times longer. We are trying to design compiler techniques that can verify whether a processor has faithfully executed a program. Thus, the added difficulty for the programmer does not concern our purposes.

SCCM for Byzantine Failures
We may be able to annotate a program so that after execution it can prove to us that it transformed all of its data according to the original source code. SCCM can be thought of as a system for creating problem-specific type systems. Can we create a Linear-Algebra-specific type system? Can Model Checking help us determine a program's legal set of data transformations?

Related Fields I. Certification Trails
A Certification Trail is a trail of information a program leaves behind, describing its work. After the first program completes, a second program can use this trail to perform the same computation much more quickly. Thus, the certification trail for a program acts much like a checksum or parity bit for data. Little overhead is required. Problem: Currently this approach requires mostly manual work. No techniques exist for compilers to generate certification trails.
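As an illustration of the idea, sorting is a convenient example (chosen here, not taken from the slides): the first run leaves behind the permutation it applied, and the second run uses that trail to rebuild and validate the result in linear time instead of sorting again. The function names are hypothetical.

```python
def sort_with_trail(xs):
    """First execution: sort, and leave the applied permutation as the trail."""
    trail = sorted(range(len(xs)), key=xs.__getitem__)
    return [xs[i] for i in trail], trail

def rerun_with_trail(xs, trail):
    """Second execution: rebuild the output in O(n), distrusting the trail."""
    n = len(xs)
    if len(trail) != n:
        raise ValueError("trail has the wrong length")
    seen = [False] * n
    out = []
    for i in trail:
        # The trail must be a permutation of 0..n-1.
        if not isinstance(i, int) or not 0 <= i < n or seen[i]:
            raise ValueError("trail is not a permutation")
        seen[i] = True
        out.append(xs[i])
    # The permuted data must actually be in order.
    if any(out[k] > out[k + 1] for k in range(n - 1)):
        raise ValueError("trail does not produce sorted output")
    return out

data = [5, 2, 9, 1]
first, trail = sort_with_trail(data)
second = rerun_with_trail(data, trail)
agree = first == second  # disagreement would signal a fault in one of the runs
```

The second run never trusts the trail blindly: a corrupted trail is either rejected or yields an output that fails the ordering check.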

Related Fields II. Result Checking
A subfield of CS Theory dealing with ways to probabilistically verify the correctness of an algorithm's output. Related to Interactive Proofs. Problems: Though the focus is on checkers that are asymptotically faster than the actual algorithm, most solutions are too inefficient to be used in practice. There is no general methodology for generating checkers for problems and most checkers in existence are for obscure and specialized problems.
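A well-known example of such a checker, offered here as an illustration rather than something the slides mention, is Freivalds' test for matrix multiplication: instead of recomputing A·B, it compares A(Br) with Cr for random 0/1 vectors r, which costs O(n²) per round. A wrong product escapes a single round with probability at most 1/2. The sketch below uses plain Python lists and integer entries.

```python
import random

def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def freivalds(a, b, c, rounds=20, rng=random):
    """Probabilistically check that a @ b == c without forming a @ b."""
    width = len(c[0])  # number of columns of c (and of b)
    for _ in range(rounds):
        r = [rng.randint(0, 1) for _ in range(width)]
        if matvec(a, matvec(b, r)) != matvec(c, r):
            return False  # a witness vector proves that c is wrong
    return True  # c is correct, or wrong with probability <= 2**-rounds

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[19, 22], [43, 50]]  # the true product of a and b
accepted = freivalds(a, b, c)
```

A correct product is always accepted; only the rejection of incorrect products is probabilistic.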

Related Fields III. Replication
Run the same program on multiple computers. Compare their output to protect against corruption. The only available solution to Byzantine Failures. Very resource inefficient: most replication-based approaches require 3 times as many resources as unprotected systems. ED4I: run the same program twice with different data to detect permanent and transient faults. BFS: replicated services. Processors vote on results. Resilient to f faults by using 3f+1 replicas.
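Voting over replica outputs can be sketched in a few lines of Python. The acceptance threshold of f+1 matching replies is an assumption of this sketch (it guarantees that at least one non-faulty replica vouches for the value when at most f are faulty); the slides only state the 3f+1 replica count. The function name `vote` is hypothetical.

```python
from collections import Counter

def vote(replies, f):
    """Pick the result reported by at least f+1 of at least 3f+1 replicas."""
    if len(replies) < 3 * f + 1:
        raise ValueError("need at least 3f+1 replies to tolerate f faults")
    value, count = Counter(replies).most_common(1)[0]
    if count < f + 1:
        raise RuntimeError("no value has f+1 matching replies")
    return value

# Four replicas tolerate one faulty processor (f = 1).
winner = vote([42, 42, 7, 42], f=1)
```

The cost is visible in the call itself: four executions of the program are needed to protect against a single faulty one.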

Gaussian Elimination