Introduction to Lock-free Data-structures and algorithms Micah J Best May 14/09.


Introduction to the Introduction
Some problems with locks:
– Deadlock: the condition where there are at least two processes A and B such that A holds a lock on a resource required by B to complete, B holds a lock on a resource required by A to complete, and both wait indefinitely.
– In general, deadlock avoidance is hard, if not NP-complete.

(Some) Problems with Locks
– Priority inversion: a low-priority process holds a lock on a resource needed by a high-priority process.
– Coarse granularity: only a small percentage of the operations performed in the critical section may actually modify shared memory, yet the lock serializes all of them, hurting performance.

If not locks, then what?
– Without hardware support, lock-free algorithms are possible but not really practical (exceptions do exist: see Lamport's Bakery Algorithm).
– Hardware support can be generalized as atomic operations.

What is Lock-free? (Definitions)
– Atomic: from the Greek ἄτομος (atomos), meaning indivisible or uncuttable.
– Atomic operations: a set of operations that execute as if they were a single simple operation.
– Lock-free: algorithms and data structures that can be invoked or accessed in a parallel context, accessing shared memory, without a mechanism (such as a mutex) to protect their critical section.
– Critical section: a set of instructions necessary to complete a single operation on a shared memory resource (e.g. the pop operation on a stack).

Compare and Swap (CAS)
For integers n and n' and a memory location a:
    CAS( n, a, n' )
        if the value at address a is n
            write the value of n' to address a
            return true
        otherwise
            return false

Compare and Swap (CAS)
– Executes atomically. Often the values of n and n' represent memory addresses.
– Supported in some form as far back as the IBM System/370 and available on almost all modern general-purpose microprocessors (Pentium, PowerPC, etc.).
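As a rough illustration of the CAS definition, here is a minimal C++ sketch built on std::atomic. The function name cas and the demo function are made up for this example; the argument order (n, a, n') simply mirrors the slides, whereas the standard library call is a member function of the atomic object.

```cpp
#include <atomic>
#include <cassert>

// CAS(n, a, n') in the slides' argument order: if the value at `a` is `n`,
// write `n_new` to `a` and return true; otherwise return false.
bool cas(int n, std::atomic<int>& a, int n_new) {
    // compare_exchange_strong overwrites its first argument on failure;
    // `n` is a by-value copy, so the caller never sees that.
    return a.compare_exchange_strong(n, n_new);
}

// Single-threaded demonstration of both outcomes (hypothetical helper).
int cas_demo() {
    std::atomic<int> cell{5};
    bool first = cas(5, cell, 9);   // cell holds 5: succeeds, cell becomes 9
    bool second = cas(5, cell, 1);  // cell holds 9: fails, cell is unchanged
    assert(first && !second);
    return cell.load();
}
```

The whole compare-and-write happens as one indivisible step, which is what the retry loops in the later slides rely on.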

Compare and Swap (CAS)
Often used to implement 'busy waiting':
    ...
    Success <- false
    Do
        Success <- CAS( n, a, n' )
    Until Success
    ...
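The busy-waiting loop can be sketched in C++ in two flavours, assuming std::atomic as the CAS provider. The first function is a literal transliteration of the loop (it spins until the location holds n); the second re-reads the location on every attempt, which is the form the stack and queue examples use. All function names here are invented for illustration.

```cpp
#include <atomic>
#include <cassert>

// Literal form of the slide's loop: keep trying CAS(n, a, n') until it
// succeeds, i.e. until some thread has made `a` hold `n`.
void spin_until_swapped(std::atomic<int>& a, int n, int n_new) {
    bool success = false;
    do {
        int expected = n;  // fresh copy: a failed CAS overwrites `expected`
        success = a.compare_exchange_strong(expected, n_new);
    } while (!success);
}

// Re-reading variant: read the current value, compute the new one, and
// retry if another thread changed the location in between.
int add_with_cas(std::atomic<int>& a, int delta) {
    bool success = false;
    int seen = 0;
    do {
        seen = a.load();
        int expected = seen;
        success = a.compare_exchange_strong(expected, seen + delta);
    } while (!success);
    return seen + delta;
}

// Single-threaded usage (hypothetical helper).
int busy_wait_demo() {
    std::atomic<int> a{0};
    spin_until_swapped(a, 0, 40);  // a already holds 0, so the first CAS succeeds
    return add_with_cas(a, 2);     // 40 + 2
}
```

Neither loop has a bound on the number of retries under contention, which matters for the wait-free discussion later in the talk.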

Compare and Swap (CAS)
Example: Pushing an item onto a lock-free stack
– Let Top be the address of the top of the stack.
– Let Next(I), where I is a valid address of a stack item, be the address where the address of the next item after I is stored.
– Let New be the address of the new item for the stack.

Compare and Swap (CAS)
Example: Pushing an item onto a lock-free stack
    Push( New )
        Success <- false
        Do
            T' <- Top
            Next(New) <- T'
            Success <- CAS( T', Top, New )
        Until Success

Compare and Swap (CAS)
Example: Pushing an item onto a lock-free stack (step by step)
– We create a new item.
– Read the value of Top into T'.
– Set the new item's Next to T'.
– Suppose another thread adds an item between the assignment and the CAS.
– T' != Top, so the CAS will fail and we must start again.
– We read Top and set Next again. This time the CAS succeeds and we are done.
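The push operation maps almost line for line onto C++ atomics. The sketch below assumes a node type with an int payload and a plain next pointer; Node, Stack and push_demo are names chosen for this example, not taken from the talk.

```cpp
#include <atomic>
#include <cassert>

struct Node {
    int value;
    Node* next;
};

struct Stack {
    std::atomic<Node*> top{nullptr};

    void push(Node* fresh) {
        bool success = false;
        do {
            Node* t = top.load();                             // T' <- Top
            fresh->next = t;                                  // Next(New) <- T'
            success = top.compare_exchange_strong(t, fresh);  // CAS( T', Top, New )
        } while (!success);
    }
};

// Single-threaded usage (hypothetical helper): push 3, 5, 7 and read the
// stack back from the top as the decimal number 753.
int push_demo() {
    Node a{3, nullptr}, b{5, nullptr}, c{7, nullptr};
    Stack s;
    s.push(&a);
    s.push(&b);
    s.push(&c);
    int digits = 0;
    for (Node* n = s.top.load(); n != nullptr; n = n->next) {
        digits = digits * 10 + n->value;
    }
    return digits;
}
```

If another thread changes top between the load and the CAS, the CAS fails and the loop re-reads top, exactly as in the walkthrough.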

ABA problem
Occurs when the value read (and stored) by a process A is changed by another process B and then changed back to the value A read, while some other state has changed in the meantime, causing A to produce an incorrect result.

ABA problem
Consider the stack, two threads A and B, and the following (incorrect) pop operation.
Let Value(I), where I is a valid address of a stack item, be the value associated with I.
    Pop()
        Do
            T' <- Top
            N' <- Next(T')
            V <- Value(T')
            Success <- CAS( T', Top, N' )
        Until Success
        Return V

ABA problem (step by step)
– Thread A begins Pop() and executes T' <- Top, N' <- Next(T') and V <- Value(T'), so V = 7. It is interrupted before its CAS.
– Thread B calls Pop(): 7 is returned and Top moves to the next item.
– Thread B calls Push( 42 ): the new item is created in the old memory location, so Top once again holds the address T'.
– Thread A resumes with Success <- CAS( T', Top, N' ). The CAS will succeed because Top equals T', and Top is set to N', which unlinks the item holding 42.
– Thread A returns V: 7 will be returned - again.
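The interleaving above can be replayed deterministically in a single thread. This is only a sketch under a simplifying assumption: nodes live in a small fixed pool and are addressed by slot index rather than by pointer, so "created in the old memory location" becomes "reuses the same slot". IndexStack, AbaResult and aba_demo are hypothetical names.

```cpp
#include <atomic>
#include <cassert>

constexpr int NIL = -1;

struct Node {
    int value;
    int next;  // slot index of the next node, or NIL
};

struct IndexStack {
    Node pool[4];
    std::atomic<int> top{NIL};

    void push(int slot, int value) {
        pool[slot].value = value;
        int t = top.load();
        do {
            pool[slot].next = t;  // a failed CAS refreshes t, so relink and retry
        } while (!top.compare_exchange_weak(t, slot));
    }

    // The slides' (incorrect) Pop; assumes the stack is not empty.
    int pop() {
        int t, n, v;
        do {
            t = top.load();                            // T' <- Top
            n = pool[t].next;                          // N' <- Next(T')
            v = pool[t].value;                         // V  <- Value(T')
        } while (!top.compare_exchange_strong(t, n));  // CAS( T', Top, N' )
        return v;
    }
};

struct AbaResult {
    int b_popped;
    int a_popped;
    bool a_cas_succeeded;
    int top_value;
};

AbaResult aba_demo() {
    IndexStack s;
    s.push(0, 3);
    s.push(1, 5);
    s.push(2, 2);
    s.push(3, 7);  // from the top: 7, 2, 5, 3

    // Thread A starts Pop() and is interrupted just before its CAS.
    int t = s.top.load();     // slot 3
    int n = s.pool[t].next;   // slot 2
    int v = s.pool[t].value;  // 7

    // Thread B runs to completion in the meantime.
    int b = s.pop();  // B pops 7, freeing slot 3
    s.push(3, 42);    // B's new item reuses slot 3

    // Thread A resumes: Top holds slot 3 again, so the CAS succeeds.
    bool ok = s.top.compare_exchange_strong(t, n);
    return {b, v, ok, s.pool[s.top.load()].value};
}
```

Both threads end up returning 7, and the item holding 42 is no longer reachable from the top of the stack even though nobody popped it.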

What is Wait-free?
– Wait-free: algorithms and data structures that can be invoked or accessed in a parallel context, accessing shared memory, such that execution time is guaranteed and predictable.
– Wait-free constructions are almost always lock-free.
– Not all lock-free constructions are wait-free (most use some form of 'busy waiting', which is inherently unpredictable).
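To make the distinction concrete, the sketch below contrasts two ways of incrementing a shared counter in C++. The CAS loop is lock-free but may retry an unbounded number of times under contention. The fetch_add version has no retry loop in the source; whether it is wait-free in practice depends on how the hardware implements it (an assumption that should be checked per platform). Function names are invented for this example.

```cpp
#include <atomic>
#include <cassert>

// Lock-free but not wait-free: a thread can lose the CAS race repeatedly.
int increment_cas(std::atomic<int>& c) {
    int seen = c.load();
    while (!c.compare_exchange_weak(seen, seen + 1)) {
        // `seen` now holds the value that beat us; try again with it.
    }
    return seen + 1;
}

// A single atomic read-modify-write with no retry loop in the source.
int increment_faa(std::atomic<int>& c) {
    return c.fetch_add(1) + 1;  // fetch_add returns the previous value
}

// Single-threaded usage (hypothetical helper).
int counter_demo() {
    std::atomic<int> c{0};
    increment_cas(c);         // counter becomes 1
    return increment_faa(c);  // counter becomes 2
}
```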

Multi-Producer/Consumer Circular Queues (The Problem)
– Circular queue: an array of fixed size, to be accessed in a FIFO manner.
– Multi-producer: more than one process may write to the queue at once.
– Multi-consumer: more than one process may read from the queue at once.

Multi-Producer/Consumer Circular Queues (The Problem)
The advantages of lock-free:
– After a cell in the array has been 'reserved' for a particular process, writing or reading will cause no contention.
– Only maintenance of the queue parameters needs to be made safe.

Multi-Producer/Consumer Circular Queues (The Algorithm)
Sequential Version:
– Let A(i) be the element of the array at position i (starting at 0).
– Let Size be the number of cells in the array.
– Let Head be the index of the first free cell for writing.
– Let Tail be the index of the first occupied cell for reading.
– Let Write(i) represent the work to write the data in question to array position i.

Multi-Producer/Consumer Circular Queues (The Algorithm)
Sequential Version:
    Enqueue()
        If not Head = Tail
            Write( Head )
            Head <- ( Head + 1 ) mod Size
        else
            Fail
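A sequential circular queue along these lines might look as follows in C++. One detail is an assumption of this sketch rather than something the slide states: comparing Head with Tail cannot by itself tell a full queue from an empty one, so the sketch uses the common convention of leaving one cell unused. The dequeue operation is also added here for completeness. SeqQueue and seq_demo are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

template <std::size_t Size>
struct SeqQueue {
    std::array<int, Size> cells{};
    std::size_t head = 0;  // index of the first free cell for writing
    std::size_t tail = 0;  // index of the first occupied cell for reading

    bool enqueue(int value) {
        std::size_t next = (head + 1) % Size;
        if (next == tail) {
            return false;  // full: one cell is deliberately kept empty
        }
        cells[head] = value;  // Write( Head )
        head = next;          // Head <- ( Head + 1 ) mod Size
        return true;
    }

    bool dequeue(int& out) {
        if (head == tail) {
            return false;  // empty
        }
        out = cells[tail];
        tail = (tail + 1) % Size;
        return true;
    }
};

// Single-threaded usage (hypothetical helper): with Size = 4 only three of
// the five values fit, and the first value dequeued is 1.
int seq_demo() {
    SeqQueue<4> q;
    int accepted = 0;
    for (int v = 1; v <= 5; ++v) {
        if (q.enqueue(v)) {
            ++accepted;
        }
    }
    int first = 0;
    q.dequeue(first);
    return accepted * 10 + first;
}
```

Nothing here is safe for concurrent callers: the read of head, the write to the cell and the update of head are three separate steps.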

Multi-Producer/Consumer Circular Queues (The Algorithm)
Obviously not correct in a parallel context. Consider two threads A and B:
– A and B both call Enqueue().
– A and B both perform Write( Head ) before either has advanced Head.
– A and B both perform Head <- ( Head + 1 ) mod Size.
Values will be written to the same place.

Multi-Producer/Consumer Circular Queues (The Algorithm)
Need additional information:
– Let NumWriters and NumReaders be the maximum number of writers and readers allowed in the queue, respectively.
– Initially NumWriters = Size and NumReaders = 0.

Multi-Producer/Consumer Circular Queues (The Algorithm)
First solution: advance the head pointer atomically.
Let *Head be the memory address of the value Head.
Let *NR be the address of NumReaders.
    Enqueue()
        ( NumWriters is decremented )
        Success <- false
        Do
            H' <- Head
            Success <- CAS( H', *Head, (H' + 1) mod Size )
        Until Success
        Write( H' )
        Increase NumReaders
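A C++ transliteration of this first attempt is sketched below. The slide only says that NumWriters is decremented, so the way a writer slot is claimed here (a CAS loop that fails when the count is zero) is an assumption, as are the names FirstTryQueue and first_try_demo. The sketch keeps the weakness of the original on purpose: NumReaders counts finished writes but does not say which cells are finished, so a reader may be admitted to a cell whose writer is still busy.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

template <std::size_t Size>
struct FirstTryQueue {
    std::array<int, Size> cells{};
    std::atomic<std::size_t> head{0};
    std::atomic<int> num_writers{static_cast<int>(Size)};
    std::atomic<int> num_readers{0};

    bool enqueue(int value) {
        // "NumWriters is decremented": claim a writer slot, or fail if none is left.
        int w = num_writers.load();
        do {
            if (w == 0) {
                return false;
            }
        } while (!num_writers.compare_exchange_weak(w, w - 1));

        // Advance Head atomically; h is the cell reserved for this thread.
        std::size_t h = head.load();
        while (!head.compare_exchange_weak(h, (h + 1) % Size)) {
            // a failed CAS refreshes h; retry with the new value
        }

        cells[h] = value;          // Write( H' )
        num_readers.fetch_add(1);  // Increase NumReaders (too early in general)
        return true;
    }
};

// Single-threaded usage (hypothetical helper): four of five values are
// accepted and NumReaders ends at 4.
int first_try_demo() {
    FirstTryQueue<4> q;
    int accepted = 0;
    for (int v = 1; v <= 5; ++v) {
        if (q.enqueue(v)) {
            ++accepted;
        }
    }
    return accepted * 10 + q.num_readers.load();
}
```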

Multi-Producer/Consumer Circular Queues (The Algorithm)
Just when you thought it was (thread) safe, consider the following series of events:
– The queue is empty and NumReaders = 0.
– Thread A succeeds in advancing the Head pointer.
– Thread A begins writing.
– Thread B succeeds in advancing the Head pointer.
– Thread B begins writing.
– Thread B finishes writing.
– Thread B succeeds in increasing NumReaders.
The cell that thread A is still writing can now be read, before writing has finished!

Multi-Producer/Consumer Circular Queues (The Algorithm)
A 'busy waiting' solution:
Let WriteMarker be equal to Head when the queue is created.
    Enqueue()
        ( NumWriters is decremented )
        Success <- false
        Do
            H' <- Head
            Success <- CAS( H', *Head, (H' + 1) mod Size )
        Until Success
        Write( H' )
        Do
            W <- WriteMarker
        Until W = H'
        WriteMarker <- ( WriteMarker + 1 ) mod Size
        Increase NumReaders

Multi-Producer/Consumer Circular Queues (The Algorithm)
The 'busy waiting' solution, step by step:
– We begin as another thread is already writing to the queue.
– Copy the head index into H'.
– Advance Head (assume we succeed).
– Write to the buffer.
– Keep looping until the other thread advances WriteMarker.
– Now we advance WriteMarker ourselves. (Why don't we need to CAS this?)
– Finally we increase NumReaders, now that the data is safe to be read.
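The busy-waiting enqueue could be written in C++ roughly as follows. As before, the CAS loop that claims a writer slot is an assumption (the slides only say NumWriters is decremented), and BusyWaitQueue and busy_wait_queue_demo are hypothetical names. The comment on the WriteMarker store gives one possible answer to the question posed in the walkthrough; it is this sketch's reading, not a statement from the talk.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

template <std::size_t Size>
struct BusyWaitQueue {
    std::array<int, Size> cells{};
    std::atomic<std::size_t> head{0};
    std::atomic<std::size_t> write_marker{0};  // equal to Head at creation
    std::atomic<int> num_writers{static_cast<int>(Size)};
    std::atomic<int> num_readers{0};

    bool enqueue(int value) {
        // Claim a writer slot, or fail if none is left.
        int w = num_writers.load();
        do {
            if (w == 0) {
                return false;
            }
        } while (!num_writers.compare_exchange_weak(w, w - 1));

        // Reserve cell h by advancing Head atomically.
        std::size_t h = head.load();
        while (!head.compare_exchange_weak(h, (h + 1) % Size)) {
            // a failed CAS refreshes h; retry with the new value
        }

        cells[h] = value;  // Write( H' )

        // Busy wait until every earlier writer has published its cell.
        while (write_marker.load() != h) {
            // spin
        }

        // Only the thread whose h equals WriteMarker gets past the wait, so
        // at most one thread can be at this line: a plain store is enough.
        write_marker.store((h + 1) % Size);
        num_readers.fetch_add(1);  // the cell is now safe to read
        return true;
    }
};

// Single-threaded usage (hypothetical helper): after three enqueues both
// WriteMarker and NumReaders are 3.
int busy_wait_queue_demo() {
    BusyWaitQueue<4> q;
    q.enqueue(10);
    q.enqueue(20);
    q.enqueue(30);
    return static_cast<int>(q.write_marker.load()) * 10 + q.num_readers.load();
}
```

The cost of this design is that a fast writer is held up by any slower writer that reserved an earlier cell.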

Multi-Producer/Consumer Circular Queues (The Algorithm)
Everything works better when we all cooperate.
Let Done(i) be a boolean variable associated with each cell in the array, initially false.
Let *WM be the address of WriteMarker.
    Enqueue()
        ( NumWriters is decremented )
        Success <- false
        Do
            H' <- Head
            Success <- CAS( H', *Head, (H' + 1) mod Size )
        Until Success
        Write( H' )

Multi-Producer/Consumer Circular Queues (The Algorithm)
Everything works better when we all cooperate (cont'd)
        Done(H') <- true
        Do
            Success <- false
            If not WriteMarker = Head and Done( WriteMarker )
                W <- WriteMarker
                Success <- CAS( W, *WM, (W+1) mod Size )
                If Success
                    Done( W ) <- false
                    Increase NumReaders
        While Success

Multi-Producer/Consumer Circular Queues (The Algorithm)
The cooperative solution, step by step:
– The story so far: there are two cells currently being written, and one of them is 'us'.
– Writing continues, then finishes.
– Both threads update Done.
– Both threads try to move the write pointer.
– We succeed. This means that the other thread has failed. It will finish knowing that we will 'clean things up'.
– Update the Done flag.
– Let another reader in, now that we know it's safe.
– Repeat until everything is in a proper state or somebody else takes over.
– Exit with the knowledge of a job well done.

Multi-Producer/Consumer Circular Queues (Proof - Sketch)
Everything works better when we all cooperate.
Criteria for correctness:
– 1) At all times: ( WriteMarker - Tail ) mod Size >= NumReaders
– 2) Enqueue always terminates
– 3) At any point in execution, if no Enqueue operation is in progress then WriteMarker = Head
– 4) Once an item is enqueued in cell i, no other enqueue operation will write to i until it is dequeued (follows from 1 and case analysis)

Some Performance Results