Certification of Computational Results Greg Bronevetsky.



Background Technique proposed by Gregory F. Sullivan, Dwight S. Wilson, and Gerald B. Masson, all from the Johns Hopkins CS Department.

Overview The goal is fault detection without the severe overhead of replication. Certification Trails are a manual approach in which the programmer provides additional code so that the program can check itself. A program generates a certification trail that details its work. A checker program then uses this trail to verify that the output is correct, in asymptotically less time than the original computation. Several examples are provided. There is no automation.

Roadmap We will cover some algorithms to which the Certification Trails technique has been applied: Sorting, Convex Hull, and Heap Data Structures. In all cases, the programmer adds the Certification Trail and creates the Checker manually.

Trail for Sorting To verify the output of a sorting algorithm we must check that (1) the sorted items are a permutation of the original input items, and (2) the sorted items appear in non-decreasing order in the sorter's output. Thus, the trail should contain all the items in their original order, each labeled with its location in the sorted list.

Sorting Checker A Sorting Checker must (1) use the labels to place all elements into their sorted spots and verify that this results in a non-decreasing order, and (2) verify that no two elements are placed in the same location in the ordered list. The Sorter takes O(n^2) or O(n log n) time, depending on the algorithm. The Checker takes O(n) time, so the Checker is asymptotically faster than the Sorter.
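The trail and checker described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `sort_with_trail` and `check_sort` are hypothetical, and Python's built-in sort stands in for whatever sorting algorithm is being certified.

```python
def sort_with_trail(items):
    """Sort items and return (sorted_items, trail).

    trail[i] is the position that items[i] takes in the sorted output.
    """
    order = sorted(range(len(items)), key=lambda i: items[i])
    trail = [0] * len(items)
    for pos, i in enumerate(order):
        trail[i] = pos
    return [items[i] for i in order], trail


def check_sort(items, claimed, trail):
    """Verify in O(n) that `claimed` is a sorted permutation of `items`."""
    n = len(items)
    if len(claimed) != n or len(trail) != n:
        return False
    placed = [None] * n
    filled = [False] * n
    for item, pos in zip(items, trail):
        # Each label must be a valid slot that no other item has taken.
        if not isinstance(pos, int) or not 0 <= pos < n or filled[pos]:
            return False
        placed[pos] = item
        filled[pos] = True
    # The placement must reproduce the claimed output exactly...
    if placed != claimed:
        return False
    # ...and the output must be in non-decreasing order.
    return all(placed[k] <= placed[k + 1] for k in range(n - 1))
```

Because every slot is filled exactly once, the placed items are a permutation of the input, so the checker never needs to sort anything itself.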

Convex Hull Problem Given a set of points on a 2D plane, find a subset of points that forms a convex hull around all the points.

Convex Hull: Step 1 P1 is the point with the least x-coordinate. The points are sorted in order of increasing slope relative to P1. (Figure: eight points labeled P1 through P8.)

Convex Hull: Invariant All the points not on the Hull are inside a triangle formed by P1 and two successive points on the Hull. (Figure: eight points labeled P1 through P8.)

Convex Hull: Invariant We know that P3 is not a Hull point because the clockwise angle between the two line segments meeting at P3 is ≥ 180º. (Figure: eight points labeled P1 through P8, with the angle at P3 marked ≥ 180º.)

Convex Hull: Invariant Note that if the clockwise angle between the two line segments meeting at P3 is < 180º, then P3 is a Hull point. (Figure: eight points labeled P1 through P8, with the angle at P3 marked < 180º.)

Convex Hull Algorithm Add P1, P2 and P3 to the Hull. (Note: P1, P2 and Pn must be on the Hull.) For each Pk from P4 to Pn, try to add Pk to the Hull: let QA and QB be the two points most recently added to the Hull, with QB the later one. While the angle formed by QA, QB and Pk is ≥ 180º, remove QB from the Hull, since it is inside the triangle P1, QA, Pk, and update QA and QB. Then add Pk to the Hull.

Trail for Convex Hull Augment the program to output {q_1, q_2, ..., q_m}, the indexes of the points on the Hull. For each of {x_1, x_2, ..., x_r}, the points not on the Hull, output a proof of correctness in the form of the triangle that contains it.
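The hull construction and its trail can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the name `hull_with_trail` is hypothetical, points are assumed distinct with no three collinear and at least three in total, the slope ordering is done with `math.atan2`, and the "angle ≥ 180º" test is expressed as a non-positive cross product for a counterclockwise traversal.

```python
import math


def cross(o, a, b):
    """Twice the signed area of triangle o, a, b; > 0 means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def hull_with_trail(points):
    """Return (hull, trail), both expressed as indexes into `points`.

    hull lists the hull points in counterclockwise order starting at P1.
    trail maps each non-hull index to the triangle (three indexes)
    that contains that point.
    """
    # P1: least x-coordinate (ties broken by least y).
    p1 = min(range(len(points)), key=lambda i: points[i])

    def angle(i):
        # Orders the other points by increasing slope relative to P1.
        return math.atan2(points[i][1] - points[p1][1],
                          points[i][0] - points[p1][0])

    rest = sorted((i for i in range(len(points)) if i != p1), key=angle)
    hull = [p1, rest[0], rest[1]]
    trail = {}
    for k in rest[2:]:
        # QA = hull[-2], QB = hull[-1]; drop QB while it is not a left turn.
        while len(hull) > 2 and cross(points[hull[-2]],
                                      points[hull[-1]],
                                      points[k]) <= 0:
            removed = hull.pop()
            # QB lies inside the triangle P1, QA, Pk.
            trail[removed] = (p1, hull[-1], k)
        hull.append(k)
    return hull, trail
```

Each point is pushed once and popped at most once, so the loop is O(n) after the O(n log n) sort, and recording a triangle at each pop adds only constant work.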

Convex Hull Checker The Checker must check that: (1) there is a 1-1 correspondence between the input points and {q_1, q_2, ..., q_m} ∪ {x_1, x_2, ..., x_r}; (2) all points in the triangle proofs correspond to input points; (3) each non-Hull point actually lies in the triangle given for it; (4) every triple of successive supposed Hull points forms a convex angle; (5) there is a unique locally maximal point on the Hull.
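The five checks can be sketched as a single linear pass per check. This is a hedged sketch: `check_hull` and its argument layout (a list of hull indexes in counterclockwise order plus a dictionary from non-hull index to triangle indexes) are assumptions, "locally maximal" is interpreted here as locally highest in y with ties broken by x, and points are assumed distinct.

```python
def cross(o, a, b):
    """Twice the signed area of triangle o, a, b; > 0 means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def in_triangle(p, a, b, c):
    """True if p lies inside or on the boundary of triangle a, b, c."""
    d = (cross(a, b, p), cross(b, c, p), cross(c, a, p))
    return not (any(v < 0 for v in d) and any(v > 0 for v in d))


def check_hull(points, hull, trail):
    n, m = len(points), len(hull)

    def valid(i):
        return isinstance(i, int) and 0 <= i < n

    # (1) Hull indexes and non-hull indexes together cover each input once.
    seen = [False] * n
    for i in list(hull) + list(trail):
        if not valid(i) or seen[i]:
            return False
        seen[i] = True
    if not all(seen) or m < 3:
        return False
    for x, tri in trail.items():
        # (2) Triangle corners must be input points other than x itself.
        if len(tri) != 3 or not all(valid(t) for t in tri) or x in tri:
            return False
        a, b, c = (points[t] for t in tri)
        # (3) The triangle is non-degenerate and really contains x.
        if cross(a, b, c) == 0 or not in_triangle(points[x], a, b, c):
            return False
    # (4) Every successive triple on the hull turns left (convex angle).
    for j in range(m):
        if cross(points[hull[j - 1]], points[hull[j]],
                 points[hull[(j + 1) % m]]) <= 0:
            return False

    # (5) Exactly one hull point is higher than both of its neighbours,
    #     which rules out a polygon that winds around more than once.
    def higher(i, j):
        return (points[i][1], points[i][0]) > (points[j][1], points[j][0])

    peaks = sum(1 for j in range(m)
                if higher(hull[j], hull[j - 1])
                and higher(hull[j], hull[(j + 1) % m]))
    return peaks == 1
```

No sorting is involved, so the whole check runs in O(n) time.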

Asymptotic Runtimes The original Convex Hull algorithm takes O(n log n) time to sort, and its Hull construction loop takes only O(n) time, for O(n log n) time in total. The Convex Hull Checker runs through the set of points once for each check, for O(n) time in total. The Checker is asymptotically faster than the original.

Certification Trails for Data Structures Consider a data structure for storing (key, val) pairs, ordered lexicographically by value and then by key: (key, val) < (key', val') iff val < val' or (val = val' and key < key'). Operations: member(key) returns whether key is mapped to some val. insert(key, val) inserts the pair (key, val) into the data structure. delete(key) deletes the pair that contains key.

Data Structure Specs Further operations: changekey(key, newval) is executed when the pair (key, oldval) exists in the data structure; it removes this pair and inserts the pair (key, newval). deletemin() deletes the smallest pair (according to the ordering), or returns “empty” if the data structure contains no pairs. predecessor(key) returns the key of the pair that immediately precedes key's pair, or “smallest” if there is no such pair. empty() returns whether the data structure is empty.

Data Structure Implementation Such a data structure can be implemented via an AVL tree, a red-black tree or a B-tree. Most operations will take O(log n) time. We can augment the implementation to generate a certification trail: insert(key, val) outputs the key of the predecessor of the newly inserted pair (key, val), or “smallest” if there is no predecessor. changekey(key, newval) outputs the key of the predecessor of the new pair (key, newval), or “smallest” if there is no predecessor.
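A trail-generating version of this data structure might look like the sketch below. It is only an illustration: the class name `TrailingStore` is hypothetical, a sorted Python list with `bisect` stands in for the balanced tree (so insertion here costs O(n) rather than O(log n)), callers are assumed to respect the preconditions (no duplicate inserts, no deletes of absent keys), and returning the removed pair from `deletemin` is an assumption.

```python
import bisect

SMALLEST = "smallest"


class TrailingStore:
    def __init__(self):
        self.pairs = []    # sorted list of (val, key): the pair ordering
        self.val_of = {}   # key -> val
        self.trail = []    # certification trail, one entry per insertion

    def member(self, key):
        return key in self.val_of

    def empty(self):
        return not self.pairs

    def predecessor(self, key):
        idx = bisect.bisect_left(self.pairs, (self.val_of[key], key))
        return self.pairs[idx - 1][1] if idx > 0 else SMALLEST

    def insert(self, key, val):
        bisect.insort(self.pairs, (val, key))
        self.val_of[key] = val
        # Trail entry: key of the new pair's predecessor, or "smallest".
        self.trail.append(self.predecessor(key))

    def delete(self, key):
        val = self.val_of.pop(key)
        self.pairs.pop(bisect.bisect_left(self.pairs, (val, key)))

    def changekey(self, key, newval):
        # Remove the old pair, then insert the new one; the insert
        # records the predecessor of (key, newval) in the trail.
        self.delete(key)
        self.insert(key, newval)

    def deletemin(self):
        if not self.pairs:
            return "empty"
        val, key = self.pairs.pop(0)
        del self.val_of[key]
        return (key, val)
```

For example, after insert(3, 50), insert(5, 62), insert(1, 40) and changekey(3, 70), the recorded trail is `["smallest", 3, "smallest", 5]`.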

Data Structure Checker A Checker for any program using the above data structure can use the certification trail to implement a much faster data structure, in which all operations take O(1) time. The resulting program will be faster than the original program, possibly asymptotically faster.

Optimized Data Structure A doubly linked list of (key, val) pairs, sorted according to the pair ordering relation. An array indexed by keys, containing pointers to (key, val) pairs corresponding to the indexes. The first pair (with key=0) contains value=sm, which is defined to be smaller than any other possible value.

Optimized Data Structure Optimized data structure operations: insert(key, val): Check that array[key] is null before the insert. Read from the trail prec_key, the key of the pair preceding the new (key, val) pair, and check that it is a valid index. Look at the pair pointed to by array[prec_key] and verify that it is ≠ null. Place the (key, val) pair in the list immediately after the (prec_key, prec_val) pair and point array[key] to it. Ensure that (key, val) is greater than its predecessor and less than its successor.

Optimized Insert Example Result of the call insert(5, 62)

Optimized Data Structure Optimized data structure operations: delete(key): Ensure that array[key] ≠ null, then remove the pair pointed to by array[key] and set array[key] to null. changekey(key, newval): Call delete(key), followed by insert(key, newval). These calls will check all necessary conditions. deletemin(): Look at the pair that follows the pair (0, sm) (pointed to by array[0]). If there is no such pair, return “empty”. Otherwise, remove that pair (key, val) and set array[key] to null. empty(): Return whether there is no pair following the pair (0, sm).

Optimized Data Structure Optimized data structure operations: member(key): Return whether array[key] ≠ null. predecessor(key): Look at the pair pointed to by array[key]. Follow its backward link to its predecessor pair. If the predecessor pair is (0, sm), return “smallest”. Otherwise, return the key field of that pair. Note that all the operations can be done in O(1) time.
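The optimized structure can be sketched as a doubly linked list plus an array of node references. This is a sketch under assumptions, not the authors' code: the names `TrailChecker`, `Node` and `CheckFailed` are hypothetical, keys are assumed to be integers in 1..max_key, the trail is assumed to be a sequence of predecessor keys with the string "smallest" standing for the sentinel pair (0, sm), and `deletemin` is assumed to return the removed pair.

```python
class CheckFailed(Exception):
    """Raised when the trail or the operation sequence is inconsistent."""


class Node:
    def __init__(self, key, val):
        self.key, self.val = key, val
        self.prev = self.next = None


class TrailChecker:
    def __init__(self, max_key, trail):
        self.array = [None] * (max_key + 1)   # array[key] -> Node or None
        self.array[0] = Node(0, None)         # sentinel pair (0, sm)
        self.trail = iter(trail)

    def _require(self, ok, msg):
        if not ok:
            raise CheckFailed(msg)

    def _valid(self, key):
        return isinstance(key, int) and 0 <= key < len(self.array)

    @staticmethod
    def _less(a, b):
        # Pair ordering: by val, then key. The sentinel precedes everything.
        if a.key == 0 or b.key == 0:
            return a.key == 0 and b.key != 0
        return (a.val, a.key) < (b.val, b.key)

    def _unlink(self, node):
        node.prev.next = node.next
        if node.next is not None:
            node.next.prev = node.prev
        self.array[node.key] = None

    def member(self, key):
        return self._valid(key) and key != 0 and self.array[key] is not None

    def empty(self):
        return self.array[0].next is None

    def insert(self, key, val):
        self._require(self._valid(key) and key != 0, "key out of range")
        self._require(self.array[key] is None, "key already present")
        prec_key = next(self.trail, None)
        if prec_key == "smallest":            # assumed trail encoding
            prec_key = 0
        self._require(self._valid(prec_key), "invalid trail entry")
        prec = self.array[prec_key]
        self._require(prec is not None, "trail names an absent pair")
        node = Node(key, val)
        node.prev, node.next = prec, prec.next
        self._require(self._less(prec, node), "not above predecessor")
        self._require(node.next is None or self._less(node, node.next),
                      "not below successor")
        prec.next = node
        if node.next is not None:
            node.next.prev = node
        self.array[key] = node

    def delete(self, key):
        self._require(self.member(key), "key not present")
        self._unlink(self.array[key])

    def changekey(self, key, newval):
        self.delete(key)
        self.insert(key, newval)

    def deletemin(self):
        first = self.array[0].next
        if first is None:
            return "empty"
        self._unlink(first)
        return (first.key, first.val)

    def predecessor(self, key):
        self._require(self.member(key), "key not present")
        prev = self.array[key].prev
        return "smallest" if prev.key == 0 else prev.key
```

No operation searches the list: the trail supplies the insertion position, and the two ordering comparisons in `insert` are what make a wrong trail entry detectable rather than silently accepted.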

Shortest Path A Shortest Path algorithm was implemented using the above data structure. The original program used the original data structure, which produced a certification trail. The checker version was identical to the original except that its data structure was the optimized version that used the trail. Original runtime = O(m log n). Checker runtime = O(m). (m = number of edges, n = number of nodes.)

Performance: Sort Basic Algorithm: the sorting algorithm with no certification trails. 1st Execution: the sorter that produces the certification trail. 2nd Execution: the checking algorithm that uses the trail. Speedup: factor of improvement of the 2nd Execution over Basic. %Savings: savings of running the 1st and 2nd Executions over running Basic twice.
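One plausible reading of these two metrics is sketched below. The formulas are an interpretation of the definitions above, the function names are hypothetical, and the timings in the usage note are invented for illustration rather than taken from the experiments.

```python
def speedup(basic, second):
    """Factor by which the 2nd Execution (checker) beats Basic."""
    return basic / second


def percent_savings(basic, first, second):
    """Savings of (1st + 2nd Execution) relative to running Basic twice."""
    return 100.0 * (2 * basic - (first + second)) / (2 * basic)
```

With made-up timings of 10.0 s for Basic, 10.2 s for the 1st Execution and 2.0 s for the 2nd, `speedup(10.0, 2.0)` gives 5.0 and `percent_savings(10.0, 10.2, 2.0)` gives roughly 39%.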

Performance: Sort

Performance: Convex Hull

Performance: Shortest Path

Summary of Experiments The overhead of generating a certification trail is about 2%. The checker run is much faster than the original, so it can be run on much slower hardware or implemented in a formally verified language.

Application to Byzantine Failures The current technique is completely manual: there is no known way to automatically convert a program to generate a trail. We may develop libraries that use the Certification Trails technique, allowing us to catch errors in a large fraction of a program. This opens the door to Failure Recovery: when an error is detected, the checker goes back to using the original code to redo the work.