Concurrent Computing Seminar Introductory Lecture Instructor: Danny Hendler

Requirements
- Select a paper and notify me by March 1st
- Study the paper and prepare a good presentation
- Give the seminar talk
- Participate in at least 80% of the seminar talks

Seminar Plan
- 21/2/11: Introductory lecture #1
- 28/2/11: Introductory lecture #2; papers list published, students send their 3 preferences
- 3/3/11: Paper assignment published
- 7/3/11: Student talks start
- Student talks continue until the semester ends

Talk outline
- Motivation
- Locks
- Nonblocking synchronization
- Shared objects and linearizability
- Transactional memory
- Consensus

From the New York Times…

Moore’s law: exponential growth in computing power

The Future of Computing
- Speeding up uni-processors is getting harder and harder
- Intel, Sun, AMD, and IBM are now focusing on “multi-core” architectures
- Already, most computers are multiprocessors

How can we write correct and scalable algorithms for multiprocessors?

Race conditions: a fundamental problem of thread-level parallelism

Thread A:
    Account[i] = Account[i] - X;
    Account[j] = Account[j] + X;

Thread B:
    Account[i] = Account[i] - X;
    Account[j] = Account[j] + X;

But what if execution is concurrent? We must avoid race conditions.
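
To make the race concrete, here is a minimal Java sketch (not from the slides; the class name, loop counts, and balances are made up for illustration). Each transfer is a read followed by a write on a shared array cell, so the two threads' steps can interleave and updates can be lost, which is exactly the race the slide warns about.

    public class RaceDemo {
        // Shared, unsynchronized balances (hypothetical example data).
        static int[] account = {1000, 1000};

        // The transfer from the slide: a read-modify-write on two array cells.
        static void transfer(int i, int j, int x) {
            account[i] = account[i] - x;   // read account[i], then write it back
            account[j] = account[j] + x;   // read account[j], then write it back
        }

        public static void main(String[] args) throws InterruptedException {
            Thread a = new Thread(() -> { for (int k = 0; k < 100_000; k++) transfer(0, 1, 1); });
            Thread b = new Thread(() -> { for (int k = 0; k < 100_000; k++) transfer(1, 0, 1); });
            a.start(); b.start();
            a.join();  b.join();
            // With no synchronization the interleaved reads and writes can lose
            // updates, so the printed total is often not the expected 2000.
            System.out.println("total = " + (account[0] + account[1]));
        }
    }

Running it a few times typically prints different totals, which is the observable symptom of the race.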

Key synchronization alternatives
- Mutual exclusion locks
  - Coarse-grained locks
  - Fine-grained locks
- Nonblocking synchronization
- Transactional memory

Talk outline
- Motivation
- Locks
- Nonblocking synchronization
- Shared objects and linearizability
- Transactional memory
- Consensus

The mutual exclusion problem (Dijkstra, 1965)

We need to devise a protocol that guarantees mutually exclusive access by processes to a shared resource (such as a file, a printer, etc.).

The problem model
- Shared-memory multiprocessor: multiple processes
- Processes can apply atomic reads, writes, or stronger read-modify-write operations to shared variables
- Completely asynchronous

Mutual exclusion: algorithm structure

loop forever
    Remainder code
    Entry code
    Critical section (CS)
    Exit code
end loop

Mutual exclusion: formal definitions
- Mutual exclusion: no two processes are in their CS at the same time.
- Deadlock-freedom: if a process is trying to enter its critical section, then some process eventually enters its critical section.
- Starvation-freedom (optional): if a process is trying to enter its critical section, then this process must eventually enter its critical section.

Assumption: processes do not fail-stop while performing the entry, CS, or exit code.

Candidate algorithm

initially: turn=0

Program for process 0
1. await turn=0
2. CS of process 0
3. turn:=1

Program for process 1
1. await turn=1
2. CS of process 1
3. turn:=0

Does the algorithm satisfy mutual exclusion? Yes.
Does it satisfy deadlock-freedom? No.
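
As a sketch only (the slides give pseudocode, not Java), the candidate algorithm can be written with a volatile turn field; the busy-wait loop plays the role of await. Mutual exclusion holds because only the thread whose id equals turn can pass the wait, but deadlock-freedom fails: if the thread holding the turn stays in its remainder code forever, the other thread spins forever even though nobody is in the critical section.

    // Turn-based alternation for two threads with ids 0 and 1 (hypothetical class name).
    public class Alternation {
        private volatile int turn = 0;          // initially: turn = 0

        public void lock(int id) {              // entry code for process id
            while (turn != id) { /* await turn = id (busy wait) */ }
        }

        public void unlock(int id) {            // exit code
            turn = 1 - id;                      // hand the turn to the other process
        }
    }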

Peterson’s 2-process algorithm (Peterson, 1981)

initially: b[0]=false, b[1]=false, turn=0 or 1

Program for process 0
1. b[0]:=true
2. turn:=0
3. await (b[1]=false or turn=1)
4. CS
5. b[0]:=false

Program for process 1
1. b[1]:=true
2. turn:=1
3. await (b[0]=false or turn=0)
4. CS
5. b[1]:=false
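
Here is a hedged Java rendering of the slide's algorithm (class and field names are mine); volatile and atomic fields stand in for the atomic registers of the model, thread ids are 0 and 1, and process i runs lock(i) as its entry code and unlock(i) as its exit code.

    import java.util.concurrent.atomic.AtomicIntegerArray;

    public class Peterson {
        private final AtomicIntegerArray b = new AtomicIntegerArray(2); // b[i]=1 means process i wants to enter
        private volatile int turn = 0;

        public void lock(int i) {
            int other = 1 - i;
            b.set(i, 1);                                   // 1. b[i] := true
            turn = i;                                      // 2. turn := i
            while (b.get(other) == 1 && turn == i) { }     // 3. await (b[other]=false or turn=other)
        }

        public void unlock(int i) {
            b.set(i, 0);                                   // 5. b[i] := false
        }
    }

The last process to write turn is the one that waits behind the other, which is what gives both mutual exclusion and freedom from deadlock.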

Mutual exclusion for n processes: tournament trees

(Diagram: a binary tree with levels 0, 1, and 2 and the processes at the leaves; a tree node is identified by [level, node#]. From Synchronization Algorithms and Concurrent Programming, Gadi Taubenfeld © 2006.)


Talk outline
- Motivation
- Locks
- Nonblocking synchronization
- Shared objects and linearizability
- Transactional memory
- Consensus

Synchronization alternatives: non-blocking synchronization
- Various progress guarantees: wait-freedom, lock-freedom, obstruction-freedom
- Generally requires strong synchronization primitives

Pros: potentially scalable; avoids lock hazards
Cons: typically complicated to program

Read-modify-write operations

The use of locks can sometimes be avoided if the hardware supports stronger read-modify-write operations, not just reads and writes:
- Test-and-set
- Fetch-and-add
- Compare-and-swap

Test-and-set(w)
  atomically
    v := read from w
    w := 1
    return v

Fetch-and-add(w, delta)
  atomically
    v := read from w
    w := v + delta
    return v

The compare-and-swap (CAS) operation

Compare&swap(w, expected, new)
  atomically
    v := read from w
    if (v = expected)
      w := new
      return success
    else
      return failure

Supported by many architectures: MIPS, PowerPC, DEC Alpha, Motorola 680x0, IBM 370, Sun SPARC, 80x86/Pentium.
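
In Java, the java.util.concurrent.atomic classes expose these read-modify-write primitives directly; a small sketch of the three operations listed above (variable names are mine):

    import java.util.concurrent.atomic.AtomicInteger;

    public class RmwDemo {
        public static void main(String[] args) {
            AtomicInteger w = new AtomicInteger(0);

            // Test-and-set: atomically write 1 and return the previous value.
            int old1 = w.getAndSet(1);             // old1 = 0, w = 1

            // Fetch-and-add: atomically add a delta and return the previous value.
            int old2 = w.getAndAdd(5);             // old2 = 1, w = 6

            // Compare-and-swap: write 10 only if the current value equals the
            // expected 6; returns true on success, false on failure.
            boolean ok = w.compareAndSet(6, 10);   // ok = true, w = 10

            System.out.println(old1 + " " + old2 + " " + ok + " " + w.get());
        }
    }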

An example CAS usage: Treiber’s stack algorithm

Push(int v, Stack S)
1. n := new NODE                          ; create node for new stack item
2. n.val := v                             ; write item value
3. do forever                             ; repeat until success
4.   node top := S.top
5.   n.next := top                        ; next points to current top (LIFO order)
6.   if compare&swap(S.top, top, n)       ; try to add new item
7.     return                             ; return if succeeded
8. od

(Diagram: Top points to a linked list of nodes, each holding val and next.)

Treiber’s stack algorithm (cont’d)

Pop(Stack S)
1. do forever
2.   top := S.top
3.   if top = null
4.     return empty
5.   if compare&swap(S.top, top, top.next)
6.     return-val := top.val
7.     free top
8.     return return-val
9. od

(Diagram: Top points to a linked list of nodes, each holding val and next.)
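
For reference, a compact Java rendering of the same algorithm using AtomicReference.compareAndSet (a sketch; the class and field names are mine). Because Java is garbage-collected, the explicit free of the popped node is unnecessary, which also sidesteps the memory-reclamation issues the pseudocode glosses over.

    import java.util.concurrent.atomic.AtomicReference;

    public class TreiberStack<T> {
        private static final class Node<T> {
            final T val;
            Node<T> next;
            Node(T val) { this.val = val; }
        }

        private final AtomicReference<Node<T>> top = new AtomicReference<>(null);

        public void push(T v) {
            Node<T> n = new Node<>(v);                 // create node for new stack item
            while (true) {                             // repeat until success
                Node<T> t = top.get();
                n.next = t;                            // next points to current top (LIFO order)
                if (top.compareAndSet(t, n)) return;   // try to add the new item
            }
        }

        public T pop() {
            while (true) {
                Node<T> t = top.get();
                if (t == null) return null;            // empty stack
                if (top.compareAndSet(t, t.next)) return t.val;
            }
        }
    }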

Nonblocking progress conditions
- Wait-freedom: each thread terminates its operation in a finite number of its own steps
- Lock-freedom: after a finite number of steps, some thread terminates its operation
- Obstruction-freedom: if a thread runs by itself long enough, it finishes its operation

Talk outline
- Motivation
- Locks
- Nonblocking synchronization
- Shared objects and linearizability
- Transactional memory
- Consensus

Shared objects

Shared objects and implementations
- Each object has a state
  - Usually given by a set of shared-memory fields
- Objects may be implemented from simpler base objects
- Each object supports a set of operations
  - The only way to manipulate the state
  - E.g., a shared stack supports the push and pop operations

Executions induce a partial order on operations

(Diagram: a timeline showing the intervals, from invocation to response, of q.enq(x), q.enq(y), q.deq(x), and q.deq(y); because intervals may overlap, there is only a partial order between operations.)

Correctness condition: linearizability

Linearizability (an intuitive definition): for each operation we can find a point within its time interval at which the operation appears to take place, such that the resulting sequential order of operations is legal.

Example: a queue

(Diagram: a history with the intervals of q.enq(x), q.enq(y), q.deq(x), and q.deq(y); this history is linearizable.)

Example

(Diagram: a history with q.enq(x), q.enq(y), and q.deq(y); this history is not linearizable.)

Example

(Diagram: a history with q.enq(x) and q.deq(x); this history is linearizable.)

Example

(Diagram: a history with q.enq(x), q.enq(y), q.deq(x), and q.deq(y); this history is linearizable, and multiple linearization orders are OK.)

Linearization points in Treiber’s algorithm

Push(int v, Stack S)
1. n := new NODE                          ; create node for new stack item
2. n.val := v                             ; write item value
3. do forever                             ; repeat until success
4.   node top := S.top
5.   n.next := top                        ; next points to current top (LIFO order)
6.   if compare&swap(S.top, top, n)       ; try to add new item
7.     return                             ; return if succeeded
8. od

Pop(Stack S)
1. do forever
2.   top := S.top
3.   if top = null
4.     return empty
5.   if compare&swap(S.top, top, top.next)
6.     return-val := top.val
7.     free top
8.     return return-val
9. od

A successful Push or Pop is linearized at its successful compare&swap; a Pop that returns empty is linearized at its read of S.top.

Talk outline
- Motivation
- Locks
- Nonblocking synchronization
- Shared objects and linearizability
- Transactional memory
- Consensus

Transactional Memory
- A transaction is a sequence of memory reads and writes, executed by a single thread, that either commits or aborts
- If a transaction commits, all of its reads and writes appear to have executed atomically
- If a transaction aborts, none of its stores take effect
- A transaction’s operations are not visible to other threads until it commits (if it does)

Transactional Memory Goals
- A new multiprocessor architecture
- The goal: implementing lock-free synchronization that is
  - efficient
  - easy to use compared with conventional techniques based on mutual exclusion
- Implemented by hardware support (such as straightforward extensions to multiprocessor cache-coherence protocols) and/or by software mechanisms

A Usage Example

Sequential code:
    Account[i] = Account[i] - X;
    Account[j] = Account[j] + X;

With locks:
    Lock(L[i]);
    Lock(L[j]);
    Account[i] = Account[i] - X;
    Account[j] = Account[j] + X;
    Unlock(L[j]);
    Unlock(L[i]);

With transactional memory:
    atomic {
        Account[i] = Account[i] - X;
        Account[j] = Account[j] + X;
    };
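
As a point of comparison, here is a Java sketch of the lock-based version (names are mine, not from the slides). Note that it must acquire the two locks in a fixed global order (here, by account index) so that two concurrent opposite transfers cannot deadlock; the atomic block relieves the programmer of exactly this kind of reasoning.

    import java.util.concurrent.locks.ReentrantLock;

    public class Bank {
        private final int[] account;
        private final ReentrantLock[] lock;

        public Bank(int n) {
            account = new int[n];
            lock = new ReentrantLock[n];
            for (int k = 0; k < n; k++) lock[k] = new ReentrantLock();
        }

        // Transfer x from account i to account j, taking the locks in index
        // order to avoid deadlock between concurrent opposite transfers.
        public void transfer(int i, int j, int x) {
            int first = Math.min(i, j), second = Math.max(i, j);
            lock[first].lock();
            lock[second].lock();
            try {
                account[i] = account[i] - x;
                account[j] = account[j] + x;
            } finally {
                lock[second].unlock();
                lock[first].unlock();
            }
        }
    }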

Transactions interaction

(Diagram: transactions execute in commit order. Transaction A loads 0xdddd, stores to 0xbeef, and commits; Transaction C, which loaded 0xbeef, detects the violation and re-executes with the new data; Transaction B, which loads 0xdddd and 0xbbbb, commits unaffected. Taken from a presentation by Royi Maimon & Merav Havuv, prepared for a seminar given by Prof. Yehuda Afek.)

Talk outline
- Motivation
- Locks
- Nonblocking synchronization
- Shared objects and linearizability
- Transactional memory
- Consensus


Formally: the consensus object
- Supports a single operation: decide
- Each process p_i calls decide with some input v_i from some domain; decide returns a value from the same domain
- The following requirements must be met:
  - Agreement: in any execution E, all decide operations must return the same value
  - Validity: the values returned by the operations must equal one of the inputs

Wait-free consensus can be solved easily by compare&swap

Compare&swap(b, old, new)
  atomically
    v := read from b
    if (v = old)
      b := new
      return success
    else
      return failure

Supported by many architectures: MIPS, PowerPC, DEC Alpha, Motorola 680x0, IBM 370, Sun SPARC, 80x86.
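
In Java this construction is a few lines with an AtomicReference (the class name is mine); the first compareAndSet to succeed fixes the decision, and every caller, including the losers, returns that value, so every decide finishes in a constant number of its own steps (wait-free).

    import java.util.concurrent.atomic.AtomicReference;

    public class CasConsensus<T> {
        // null means "no decision yet"; inputs are assumed to be non-null.
        private final AtomicReference<T> decision = new AtomicReference<>(null);

        public T decide(T v) {
            decision.compareAndSet(null, v);  // only the first CAS can succeed
            return decision.get();            // everyone returns the agreed value
        }
    }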

Would this consensus algorithm from reads/writes work?

Initially decision=null

Decide(v)                ; code for p_i, i=0,1
1. if (decision = null)
2.   decision := v
3.   return v
4. else
5.   return decision

(No: two processes can both read decision=null before either writes it, and then decide on different values.)

A proof that wait-free consensus for 2 or more processes cannot be solved by registers.

A FIFO queue

Supports 2 operations:
- q.enqueue(x): returns ack
- q.dequeue(): returns the first item in the queue, or empty if the queue is empty

A FIFO queue + registers can implement 2-process consensus

Initially Q contains the single item 0, and Prefer[i]=null, i=0,1

Decide(v)                ; code for p_i, i=0,1
1. Prefer[i] := v
2. qval := Q.deq()
3. if (qval = 0) then return v
4. else return Prefer[1-i]

It follows that there is no wait-free implementation of a FIFO queue shared by 2 or more processes from registers.
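
A Java sketch of this construction for two threads with ids 0 and 1 (the class name is mine; the concurrent queue and the atomic array stand in for the linearizable queue and the registers of the lecture's model). The winner is the thread that dequeues the single pre-loaded item; the loser finds the queue empty, and since the winner wrote its preference before dequeuing, the loser can safely return it.

    import java.util.Queue;
    import java.util.concurrent.ConcurrentLinkedQueue;
    import java.util.concurrent.atomic.AtomicReferenceArray;

    public class QueueConsensus<T> {
        private final Queue<Integer> q = new ConcurrentLinkedQueue<>();
        private final AtomicReferenceArray<T> prefer = new AtomicReferenceArray<>(2);

        public QueueConsensus() {
            q.add(0);                          // the queue initially holds the single item 0
        }

        // i is the calling thread's id (0 or 1), v is its input value.
        public T decide(int i, T v) {
            prefer.set(i, v);                  // announce my preference first
            Integer head = q.poll();           // atomic dequeue; null means the queue was empty
            if (head != null) return v;        // I dequeued 0: I win and decide my own value
            return prefer.get(1 - i);          // the other thread dequeued first: return its preference
        }
    }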

A proof that wait-free consensus for 3 or more processes cannot be solved by a FIFO queue (+ registers)

The wait-free hierarchy
- We say that object type X solves wait-free n-process consensus if there exists a wait-free consensus algorithm for n processes using only shared objects of type X and registers.
- The consensus number of object type X is n, denoted CN(X)=n, if n is the largest integer for which X solves wait-free n-process consensus. It is defined to be infinity if X solves consensus for every n.

Lemma: If CN(X)=m and CN(Y)=n>m, then there is no wait-free implementation of Y from instances of X and registers in a system with more than m processes.

The wait-free hierarchy (cont’d)

Consensus number    Object types
∞                   compare-and-swap, …
2                   FIFO queue, stack, test-and-set
1                   registers

The universality of consensus
- An object is universal if, together with registers, it can implement any other object in a wait-free manner.
- It can be shown that any object X with consensus number n is universal in a system with n or fewer processes.