Non-Blocking Concurrent Data Objects With Abstract Concurrency. By Jack Pribble. Based on "A Methodology for Implementing Highly Concurrent Data Objects" by Maurice Herlihy.

Concurrent Object
A data structure shared by concurrent processes. Traditionally implemented using critical sections (locks). In asynchronous systems, slow or halted processes can impede the progress of fast processes. Alternatives to critical sections include non-blocking implementations and the stronger wait-free implementations.

Non-Blocking Concurrency
Non-blocking: after a finite number of steps, at least one process completes its operation. Wait-free: every process completes its operation after a finite number of steps. A system that is merely non-blocking allows individual processes to starve, so it is appropriate only when starvation is unlikely. A wait-free system rules out starvation, which matters when some processes run much slower than others.

Methodology for Constructing Concurrent Objects
Data objects are implemented sequentially, adhering to certain conventions but with no explicit synchronization. The sequential implementation must not modify memory outside the object itself, and it must always leave the object in a legal state. The sequential code is then transformed into concurrent code by adding synchronization and memory management; the transformation is simple enough for a compiler or preprocessor to perform.
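As one concrete illustration of a sequential implementation that follows these conventions, the sketch below shows a plausible pqueue_deq for a fixed-size binary min-heap. The field names and the heap size are assumptions for illustration, not the paper's exact declarations; the point is that the routine touches only the object it is passed and leaves it in a legal state on every return.

/* A minimal sketch, assuming a fixed-size binary min-heap; the paper's
 * pqueue_type may be declared differently. */
#define PQUEUE_SIZE 16

typedef struct {
    int size;                    /* number of elements currently stored      */
    int elements[PQUEUE_SIZE];   /* binary heap, smallest element at index 0 */
} pqueue_type;

int pqueue_deq(pqueue_type *p)
{
    if (p->size == 0)
        return -1;               /* sentinel for "queue empty"               */

    int result = p->elements[0];
    p->elements[0] = p->elements[--p->size];

    /* Sift the new root down so the object is left in a legal heap state. */
    int i = 0;
    for (;;) {
        int left = 2 * i + 1, right = 2 * i + 2, smallest = i;
        if (left < p->size && p->elements[left] < p->elements[smallest])
            smallest = left;
        if (right < p->size && p->elements[right] < p->elements[smallest])
            smallest = right;
        if (smallest == i)
            break;
        int tmp = p->elements[i];
        p->elements[i] = p->elements[smallest];
        p->elements[smallest] = tmp;
        i = smallest;
    }
    return result;
}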

Basic Concurrency Transformation
Each object holds a pointer to the current version of the object. Each process:
1) Reads the pointer using load-linked
2) Copies the indicated version of the object to a private block of memory
3) Applies the sequential operation to the copy
4) Uses store-conditional to swing the pointer from the old version to the new one
If step 4 fails, the process restarts at step 1.
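The paper assumes hardware load-linked/store-conditional. For readers on hardware that exposes only compare-and-swap, the sketch below shows one common stand-in using the GCC/Clang __atomic builtins; it is an approximation only. Unlike true LL/SC, a bare compare-and-swap cannot detect an A-to-B-to-A change of the pointer (the ABA problem), and the three-argument signature here differs from the two-argument store_conditional used in the paper's code later in these slides.

/* Hedged sketch only: emulating load_linked/store_conditional on the
 * version pointer with GCC/Clang __atomic builtins.  object_t is an
 * illustrative placeholder for the shared object's type. */
#include <stdbool.h>

typedef struct object object_t;

static inline object_t *load_linked(object_t **ptr)
{
    return __atomic_load_n(ptr, __ATOMIC_ACQUIRE);
}

/* Succeeds only if *ptr still holds 'expected'.  Real LL/SC would also
 * fail if the location had been written and restored in the meantime.  */
static inline bool store_conditional(object_t **ptr,
                                     object_t *expected, object_t *new_val)
{
    return __atomic_compare_exchange_n(ptr, &expected, new_val,
                                       false /* strong */,
                                       __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
}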

To ensure that a process does not read an incomplete state while another process updates the shared object, two version counters are used (check[0] and check[1]). When a process modifies an object it updates check[0], performs the modification, then updates check[1]. When a process copies an object it reads check[1], copies the version, then reads check[0]. The copy is accepted only if the two counter values match, i.e. only if the modifying process had completed all of its modifications, so the copier never acts on an incomplete state.
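A minimal sketch of this two-counter consistency check follows. The type and helper names (checked_object_t, modify_in_place, try_copy) are illustrative, not the paper's; in the real protocol the copy sits inside the load-linked/store-conditional loop shown next, and a production version would also need the appropriate memory barriers.

/* Sketch of the check[0]/check[1] protocol described above. */
#include <string.h>

#define DATA_WORDS 16

typedef struct {
    unsigned check[2];          /* check[0]: bumped before a modification,
                                   check[1]: set equal to it afterwards     */
    int data[DATA_WORDS];
} checked_object_t;

/* Writer side: bracket the in-place modification with the two counters.  */
void modify_in_place(checked_object_t *obj, int value)
{
    obj->check[0]++;                      /* "modification in progress"    */
    obj->data[0] = value;                 /* the actual sequential update  */
    obj->check[1] = obj->check[0];        /* "modification complete"       */
}

/* Reader side: the copy is usable only if no writer overlapped it.       */
int try_copy(const checked_object_t *src, checked_object_t *dst)
{
    unsigned first = src->check[1];       /* read the "after" counter first */
    memcpy(dst->data, src->data, sizeof dst->data);
    unsigned last = src->check[0];        /* read the "before" counter last */
    return first == last;                 /* equal => copy is consistent    */
}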

typedef struct {
    pqueue_type version;
    unsigned check[2];
} Pqueue_type;

static Pqueue_type *new_pqueue;

int Pqueue_deq(Pqueue_type **Q)
{
    Pqueue_type *old_pqueue;                 /* concurrent object */
    pqueue_type *old_version, *new_version;  /* sequential object */
    int result;
    unsigned first, last;
    while (1) {
        old_pqueue = load_linked(Q);
        old_version = &old_pqueue->version;
        new_version = &new_pqueue->version;
        first = old_pqueue->check[1];
        copy(old_version, new_version);
        last = old_pqueue->check[0];
        if (first == last) {
            result = pqueue_deq(new_version);
            /* Swing the pointer to the new version (version is the first
               field, so new_version and new_pqueue share an address). */
            if (store_conditional(Q, new_version))
                break;
        }
    }
    new_pqueue = old_pqueue;   /* recycle the displaced version as the next scratch copy */
    return result;
}

Performance of a Simple Non-Blocking Implementation vs. a Simple Spin-Lock

Performance with Backoff vs. Simple Spin-Lock and Spin- Lock with Backoff

Wait-Free Implementation
Each process has an invocation structure, updated when it begins an operation, and a response structure, updated when an operation completes. Invocation structure: operation name, argument value, and a toggle bit that indicates whether the invocation is old or new. Response structure: result value and toggle bit.

The concurrent object contains an array field called responses; responses[P] records the result of the most recently completed operation of process P. Processes also share an array called announce. When starting a new operation, process P records its operation name and argument at announce[P] and flips (complements) its toggle bit.
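A rough sketch of these records is below. The type and field names (invocation_t, response_t, MAX_PROCS, and the wait-free wrapper type) are assumptions for illustration, not the paper's declarations; the intent is only to show how announce and responses relate.

#define MAX_PROCS 64

typedef struct {
    int op_name;        /* which operation is requested (e.g. enq or deq)   */
    int arg;            /* argument value for the operation                 */
    unsigned toggle;    /* flipped each time this process starts a new op   */
} invocation_t;

typedef struct {
    int value;          /* result of the most recently applied operation    */
    unsigned toggle;    /* copied from the invocation when it is applied    */
} response_t;

/* Shared among all processes: P announces its pending operation here.     */
invocation_t announce[MAX_PROCS];

/* Stored inside each version of the concurrent object: responses[P] holds
 * P's latest result.  P knows its operation has been applied (possibly by
 * a helping process) once responses[P].toggle == announce[P].toggle.      */
typedef struct {
    pqueue_type version;
    unsigned check[2];
    response_t responses[MAX_PROCS];
} Wait_free_pqueue_type;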

Performance of Non-Blocking with Backoff vs. Wait-Free with Backoff

Large Concurrent Objects
Cannot be copied as a single block. Represented instead as a set of blocks linked by pointers. The programmer is responsible for determining which blocks of the object must be copied for a given operation; the less that is copied, the better the resulting code performs.
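As one illustration of copying only the blocks an operation actually changes, the sketch below inserts into a sorted singly linked list by copying the prefix up to the insertion point and sharing the untouched suffix with the old version. The names are hypothetical, error handling is omitted, and this shows only the copy-minimization idea, not the full protocol for large objects.

#include <stdlib.h>

typedef struct node {
    int value;
    struct node *next;
} node_t;

/* Build a new version of the list containing 'value'.  Only the nodes that
 * precede the insertion point are copied; the suffix is shared, not copied.
 * (malloc failure handling is omitted in this sketch.)                     */
node_t *insert_path_copy(node_t *old_head, int value)
{
    node_t *new_head = NULL;
    node_t **tail = &new_head;
    node_t *cur = old_head;

    /* Copy only the prefix whose nodes come before the new value. */
    while (cur != NULL && cur->value < value) {
        node_t *copy = malloc(sizeof *copy);
        copy->value = cur->value;
        copy->next = NULL;
        *tail = copy;
        tail = &copy->next;
        cur = cur->next;
    }

    /* Link in the new node and share the unchanged tail of the old version. */
    node_t *fresh = malloc(sizeof *fresh);
    fresh->value = value;
    fresh->next = cur;
    *tail = fresh;

    return new_head;
}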