Concurrent Computing Seminar Introductory Lecture Instructor: Danny Hendler


1 Concurrent Computing Seminar Introductory Lecture Instructor: Danny Hendler http://www.cs.bgu.ac.il/~satcc112

2 Requirements
 Select a paper and notify me by March 1st
 Study the paper and prepare a good presentation
 Give the seminar talk
 Participate in at least 80% of the seminar talks

3 Seminar plan
21/2/11 – Introductory lecture #1; papers list published
28/2/11 – Introductory lecture #2; students send their 3 preferences
3/3/11 – Paper assignment published
7/3/11 – Student talks start (talks continue until the semester ends)

4 Talk outline
 Motivation
 Locks
 Nonblocking synchronization
 Shared objects and linearizability
 Transactional memory
 Consensus

5 From the New York Times…

6 Moore’s law: exponential growth in computing power

7 The future of computing
 Speeding up uni-processors is harder and harder
 Intel, Sun, AMD, and IBM are now focusing on “multi-core” architectures
 Already, most computers are multiprocessors
How can we write correct and scalable algorithms for multiprocessors?

8 Race conditions – a fundamental problem of thread-level parallelism.
Thread A and Thread B each execute:
    Account[i] = Account[i] - X;
    Account[j] = Account[j] + X;
But what if execution is concurrent? Must avoid race conditions.
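The lost-update race the slide warns about can be replayed deterministically. This Python sketch interleaves the two threads' steps by hand (the balances and the amount X are made-up values for illustration):

```python
# A deterministic replay of the lost-update race: both threads read
# Account[i] before either writes it back, so one withdrawal is lost.
account = {0: 100, 1: 100}
X = 10

# Thread A and thread B both execute: Account[0] = Account[0] - X
read_a = account[0]          # A reads 100
read_b = account[0]          # B reads 100 (before A writes back)
account[0] = read_a - X      # A writes 90
account[0] = read_b - X      # B overwrites with 90 -- A's update is lost

print(account[0])            # 90, not the expected 80
```

Two withdrawals of 10 should leave 80; the interleaving leaves 90, which is exactly why the write-back must be made atomic.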

9 Key synchronization alternatives
 Mutual exclusion locks
 Coarse-grained locks
 Fine-grained locks
 Nonblocking synchronization
 Transactional memory

10 Talk outline
 Motivation
 Locks
 Nonblocking synchronization
 Shared objects and linearizability
 Transactional memory
 Consensus

11 The mutual exclusion problem (Dijkstra, 1965)
We need to devise a protocol that guarantees mutually exclusive access by processes to a shared resource (such as a file, printer, etc.)

12 The problem model
 Shared-memory multiprocessor: multiple processes
 Processes can apply atomic reads, writes, or stronger read-modify-write operations to shared variables
 Completely asynchronous

13 Mutual exclusion: algorithm structure
loop forever
    Remainder code
    Entry code
    Critical section (CS)
    Exit code
end loop
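The four-part loop structure maps directly onto a lock's acquire/release. A minimal Python sketch, with `threading.Lock` supplying the entry and exit code (the shared counter and iteration counts are illustration choices, not part of the slide):

```python
import threading

lock = threading.Lock()      # entry/exit code is provided by the lock
counter = 0

def worker():
    global counter
    for _ in range(10_000):  # "loop forever", bounded for the demo
        # remainder code would go here
        lock.acquire()       # entry code
        counter += 1         # critical section
        lock.release()       # exit code

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)  # 40000: mutual exclusion preserves every increment
```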

14 Mutual exclusion: formal definitions
Mutual exclusion: No two processes are at their CS at the same time.
Deadlock-freedom: If a process is trying to enter its critical section, then some process eventually enters its critical section.
Starvation-freedom (optional): If a process is trying to enter its critical section, then this process must eventually enter its critical section.
Assumption: processes do not fail-stop while performing the entry, CS, or exit code.

15 Candidate algorithm
initially: turn=0
Program for process 0          Program for process 1
1. await turn=0                1. await turn=1
2. CS of process 0             2. CS of process 1
3. turn:=1                     3. turn:=0
Does the algorithm satisfy mutual exclusion? Yes
Does it satisfy deadlock-freedom? No

16 Peterson’s 2-process algorithm (Peterson, 1981)
initially: b[0]=false, b[1]=false, turn=0 or 1
Program for process 0                Program for process 1
1. b[0]:=true                        1. b[1]:=true
2. turn:=0                           2. turn:=1
3. await (b[1]=false or turn=1)      3. await (b[0]=false or turn=0)
4. CS                                4. CS
5. b[0]:=false                       5. b[1]:=false
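Peterson's algorithm can be sketched almost line-for-line in Python, because CPython's global interpreter lock happens to provide the sequentially consistent interleaving the algorithm assumes; on real hardware you would additionally need memory fences. The shared counter and the iteration count are illustration choices:

```python
import threading

b = [False, False]   # b[i]: process i wants to enter
turn = 0
counter = 0          # shared; protected only by Peterson's lock

def peterson(i, iters=300):
    global turn, counter
    j = 1 - i
    for _ in range(iters):
        b[i] = True                   # 1. announce interest
        turn = i                      # 2. give priority to the other
        while b[j] and turn == i:     # 3. await (b[j]=false or turn=j)
            pass                      #    busy-wait
        counter += 1                  # 4. critical section
        b[i] = False                  # 5. exit code

t0 = threading.Thread(target=peterson, args=(0,))
t1 = threading.Thread(target=peterson, args=(1,))
t0.start(); t1.start()
t0.join(); t1.join()
print(counter)  # 600: mutual exclusion preserved every increment
```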

17 Mutual exclusion for n processes: tournament trees
[Diagram: a binary tree of 2-process mutex nodes; a tree node is identified by [level, node#]; processes 0–7 sit at the leaves below level 2, with levels 1 and 0 above.]
(From Synchronization Algorithms and Concurrent Programming, Gadi Taubenfeld © 2006)

20 Talk outline
 Motivation
 Locks
 Nonblocking synchronization
 Shared objects and linearizability
 Transactional memory
 Consensus

21 Synchronization alternatives: nonblocking synchronization
 Various progress guarantees:
   Wait-freedom
   Lock-freedom
   Obstruction-freedom
 Generally requires strong synchronization primitives
Pros: potentially scalable; avoids lock hazards
Cons: typically complicated to program

22 Read-modify-write operations
The use of locks can sometimes be avoided if the hardware supports stronger read-modify-write operations, and not just read and write:
 Test-and-set
 Fetch-and-add
 Compare-and-swap

Test-and-set(w)
atomically
    v := read from w
    w := 1
    return v

Fetch-and-add(w, delta)
atomically
    v := read from w
    w := v + delta
    return v

23 The compare-and-swap (CAS) operation
Compare&swap(w, expected, new)
atomically
    v := read from w
    if (v = expected) {
        w := new
        return success
    } else return failure
Supported (natively, or via load-linked/store-conditional) by: MIPS, PowerPC, DEC Alpha, Motorola 680x0, IBM 370, Sun SPARC, 80x86/Pentium

24 An example CAS usage: Treiber’s stack algorithm
Push(int v, Stack S)
1. n := new NODE                          ; create node for new stack item
2. n.val := v                             ; write item value
3. do forever                             ; repeat until success
4.     top := S.top
5.     n.next := top                      ; next points to current top (LIFO order)
6.     if compare&swap(S.top, top, n)     ; try to add new item
7.         return                         ; return if succeeded
8. od
[Diagram: Top → (val, next) → (val, next) → … → (val, next)]

25 Treiber’s stack algorithm (cont’d)
Pop(Stack S)
1. do forever
2.     top := S.top
3.     if top = null
4.         return empty
5.     if compare&swap(S.top, top, top.next)
6.         return-val := top.val
7.         free top
8.         return return-val
9. od
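Treiber's push and pop translate nearly line-for-line into Python. As before, the `_cas_top` helper simulates hardware CAS with a lock (an illustration-only assumption); note also that a garbage-collected language sidesteps the `free top` step and the reclamation hazards it hints at:

```python
import threading

class Node:
    def __init__(self, val, next=None):
        self.val = val
        self.next = next

class TreiberStack:
    """Treiber's lock-free stack, with CAS simulated by a lock
    (CPython has no user-level compare-and-swap primitive)."""
    def __init__(self):
        self._top = None
        self._guard = threading.Lock()

    def _cas_top(self, expected, new):
        with self._guard:             # stands in for hardware CAS on S.top
            if self._top is expected:
                self._top = new
                return True
            return False

    def push(self, v):
        n = Node(v)                   # create node for new stack item
        while True:                   # repeat until success
            top = self._top
            n.next = top              # next points to current top (LIFO)
            if self._cas_top(top, n): # try to add new item
                return

    def pop(self):
        while True:
            top = self._top
            if top is None:
                return None           # "empty"
            if self._cas_top(top, top.next):
                return top.val        # no explicit free: GC reclaims it

s = TreiberStack()
s.push(1); s.push(2); s.push(3)
print(s.pop(), s.pop(), s.pop(), s.pop())  # 3 2 1 None
```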

26 Nonblocking progress conditions
 Wait-freedom: each thread terminates its operation in a finite number of its own steps
 Lock-freedom: after a finite number of steps, some thread terminates its operation
 Obstruction-freedom: if a thread runs by itself for long enough, it finishes its operation

27 Talk outline
 Motivation
 Locks
 Nonblocking synchronization
 Shared objects and linearizability
 Transactional memory
 Consensus

28 Shared objects

29 Shared objects and implementations
 Each object has a state
   – Usually given by a set of shared memory fields
   – Objects may be implemented from simpler base objects
 Each object supports a set of operations
   – The only way to manipulate its state
   – E.g., a shared stack supports the push and pop operations

30 Executions induce only a partial order on operations
[Timeline diagram: each operation spans an interval from its invocation to its response; e.g. q.enq(x), q.enq(y), q.deq(x), q.deq(y), with overlapping intervals ordered only partially.]

31 Correctness condition: linearizability
Linearizability (an intuitive definition): we can find a point within the time interval of each operation at which the operation "took place", such that the resulting order of operations is legal.

32 Example: a queue
[Timeline diagram: a history with q.enq(x), q.enq(y), q.deq(x), q.deq(y) that is linearizable.]

33 Example
[Timeline diagram: a history with q.enq(x), q.enq(y), q.deq(y) that is not linearizable.]

34 Example
[Timeline diagram: a history with q.enq(x), q.deq(x) that is linearizable.]

35 Example
[Timeline diagram: a history with q.enq(x), q.enq(y), q.deq(y), q.deq(x) that is linearizable; multiple linearization orders are OK.]

36 Linearization points in Treiber’s algorithm
(Push and Pop pseudocode as on slides 24–25.)
The linearization point of a successful Push or Pop is its successful compare&swap; a Pop that returns empty linearizes at its read of S.top (line 2).

37 Talk outline
 Motivation
 Locks
 Nonblocking synchronization
 Shared objects and linearizability
 Transactional memory
 Consensus

38 Transactional memory
 A transaction is a sequence of memory reads and writes, executed by a single thread, that either commits or aborts
 If a transaction commits, all of its reads and writes appear to have executed atomically
 If a transaction aborts, none of its stores take effect
 A transaction’s operations aren’t visible to other threads until it commits (if it does)

39 Transactional memory: goals
 A new multiprocessor architecture
 The goal: implementing lock-free synchronization that is
   – efficient
   – easy to use compared with conventional techniques based on mutual exclusion
 Implemented by hardware support (such as straightforward extensions to multiprocessor cache-coherence protocols) and/or by software mechanisms

40 A usage example
With locks:
    Lock(L[i]); Lock(L[j]);
    Account[i] = Account[i] - X;
    Account[j] = Account[j] + X;
    Unlock(L[j]); Unlock(L[i]);
With transactional memory:
    atomic {
        Account[i] = Account[i] - X;
        Account[j] = Account[j] + X;
    };
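To make the `atomic { … }` interface concrete, here is a hypothetical, minimal redo-log software TM: writes are buffered, reads record a version, and commit validates the recorded versions under a lock, re-executing the transaction on conflict. This illustrates only the programming model, not the hardware mechanism the slides describe:

```python
import threading

_commit_lock = threading.Lock()
_version = 0                     # global version counter
mem = {}                         # addr -> (value, version)

class Abort(Exception):
    pass

class Txn:
    def __init__(self):
        self.reads = {}          # addr -> version observed
        self.writes = {}         # addr -> buffered new value

    def read(self, addr):
        if addr in self.writes:             # read-your-own-writes
            return self.writes[addr]
        val, ver = mem.get(addr, (0, 0))
        self.reads[addr] = ver
        return val

    def write(self, addr, val):
        self.writes[addr] = val  # buffered; invisible until commit

    def commit(self):
        global _version
        with _commit_lock:
            for addr, ver in self.reads.items():
                if mem.get(addr, (0, 0))[1] != ver:
                    raise Abort  # someone committed under us
            _version += 1
            for addr, val in self.writes.items():
                mem[addr] = (val, _version)

def atomic(body):
    while True:                  # re-execute the transaction on abort
        t = Txn()
        try:
            body(t)
            t.commit()
            return
        except Abort:
            pass

# The transfer from the slide, as a transaction:
mem["acct_i"] = (100, 0)
mem["acct_j"] = (50, 0)
def transfer(t, X=10):
    t.write("acct_i", t.read("acct_i") - X)
    t.write("acct_j", t.read("acct_j") + X)
atomic(transfer)
print(mem["acct_i"][0], mem["acct_j"][0])  # 90 60
```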

41 Transactions execute in commit order
[Diagram: transaction A (ld 0xdddd, st 0xbeef) commits; concurrent transaction C has loaded 0xbeef — a violation — so C re-executes with the new data; transaction B (ld 0xdddd, ld 0xbbbb) commits without conflict.]
Taken from a presentation by Royi Maimon & Merav Havuv, prepared for a seminar given by Prof. Yehuda Afek.

42 Talk outline
 Motivation
 Locks
 Nonblocking synchronization
 Shared objects and linearizability
 Transactional memory
 Consensus

46 Formally: the consensus object
 Supports a single operation: decide
 Each process p_i calls decide with some input v_i from some domain; decide returns a value from the same domain
 The following requirements must be met:
   – Agreement: in any execution E, all decide operations must return the same value
   – Validity: the values returned by the operations must equal one of the inputs

47 Wait-free consensus can be solved easily by compare&swap
Compare&swap(b, old, new)
atomically
    v := read from b
    if (v = old) {
        b := new
        return success
    } else return failure
Supported (natively, or via load-linked/store-conditional) by: MIPS, PowerPC, DEC Alpha, Motorola 680x0, IBM 370, Sun SPARC, 80x86
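The CAS-based consensus protocol is a one-liner per process: everyone tries to CAS the decision cell from null to its own input, the first CAS wins, and everyone returns whatever the cell then holds. A Python sketch, again with a lock simulating hardware CAS (an illustration-only assumption):

```python
import threading

_guard = threading.Lock()
_decision = [None]               # the shared decision cell, initially null

def cas_decision(expected, new):
    with _guard:                 # stands in for hardware compare&swap
        if _decision[0] is expected:
            _decision[0] = new
            return True
        return False

def decide(v):
    cas_decision(None, v)        # first CAS wins; later ones fail
    return _decision[0]          # everyone returns the winner's input

results = []
threads = [threading.Thread(target=lambda v=v: results.append(decide(v)))
           for v in ("a", "b", "c")]
for t in threads: t.start()
for t in threads: t.join()
print(results)  # all three entries equal the (nondeterministic) winner
```

Agreement holds because the cell is written exactly once; validity holds because the written value is some process's input. This is wait-free: each decide finishes in a constant number of its own steps.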

48 Would this consensus algorithm from reads/writes work?
Initially decision=null
Decide(v)                ; code for p_i, i=0,1
1. if (decision = null)
2.     decision := v
3.     return v
4. else
5.     return decision
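It would not: two processes can both pass the `decision = null` test on line 1 before either writes on line 2, and then return different values. The bad interleaving can be replayed deterministically:

```python
decision = None                     # initially decision = null

# Replay: p0 and p1 both execute line 1's test before either writes.
p0_sees_null = decision is None     # p0, line 1: True
p1_sees_null = decision is None     # p1, line 1: also True
decision = "a"                      # p0, line 2
ret0 = "a"                          # p0, line 3: returns its own input
decision = "b"                      # p1, line 2: overwrites p0's write
ret1 = "b"                          # p1, line 3: returns its own input
print(ret0, ret1)                   # a b -- agreement is violated
```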

49 A proof that wait-free consensus for 2 or more processes cannot be solved by registers.

50 A FIFO queue
Supports 2 operations:
 q.enqueue(x) – returns ack
 q.dequeue() – returns the first item in the queue, or empty if the queue is empty

51 FIFO queue + registers can implement 2-process consensus
Initially Q = ⟨0⟩ (the queue holds the single value 0) and Prefer[i]=null, i=0,1
Decide(v)                ; code for p_i, i=0,1
1. Prefer[i] := v
2. qval := Q.deq()
3. if (qval = 0) then return v
4. else return Prefer[1-i]
It follows that there is no wait-free implementation of a FIFO queue shared by 2 or more processes from registers.
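The queue-based protocol can be sketched with a lock-protected `deque` standing in for a linearizable FIFO queue (a simulation assumption of this sketch): the process that dequeues the marker 0 wins with its own value, and the loser returns the winner's announced preference:

```python
import threading
from collections import deque

Q = deque([0])                  # initially the queue holds the marker 0
_guard = threading.Lock()       # makes deq linearizable for the sketch
prefer = [None, None]

def deq():
    with _guard:
        return Q.popleft() if Q else None

def decide(i, v):               # code for process p_i, i = 0, 1
    prefer[i] = v               # 1. announce preference BEFORE dequeuing
    if deq() == 0:              # 2-3. winner took the marker: its value wins
        return v
    return prefer[1 - i]        # 4. loser returns the winner's preference

r = [None, None]
t0 = threading.Thread(target=lambda: r.__setitem__(0, decide(0, "x")))
t1 = threading.Thread(target=lambda: r.__setitem__(1, decide(1, "y")))
t0.start(); t1.start()
t0.join(); t1.join()
print(r)  # both entries agree: either ['x', 'x'] or ['y', 'y']
```

The order of lines 1 and 2 is what makes this work: by the time the loser dequeues, the winner has already written its preference, so the loser's read on line 4 is guaranteed to see it.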

52 A proof that wait-free consensus for 3 or more processes cannot be solved by a FIFO queue (+ registers).

53 The wait-free hierarchy
We say that object type X solves wait-free n-process consensus if there exists a wait-free consensus algorithm for n processes using only shared objects of type X and registers.
The consensus number of object type X is n, denoted CN(X)=n, if n is the largest integer for which X solves wait-free n-process consensus. It is defined to be infinity if X solves consensus for every n.
Lemma: If CN(X)=m and CN(Y)=n>m, then there is no wait-free implementation of Y from instances of X and registers in a system with more than m processes.

54 The wait-free hierarchy (cont’d)
∞   Compare-and-swap
…
2   FIFO queue, stack, test-and-set
1   Registers

55 The universality of consensus
An object is universal if, together with registers, it can implement any other object in a wait-free manner.
It can be shown that any object X with consensus number n is universal in a system with n or fewer processes.

