Synchronization II Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University P&H Chapter 2.11.

Synchronization II Hakim Weatherspoon CS 3410, Spring 2012 Computer Science Cornell University P&H Chapter 2.11

2 Administrivia
Pizza party: PA3 Games Night, Friday, April 27th, 5:00-7:00pm, Location: Upson B17
Prelim3 Review: Today, Tuesday, April 24th, 5:30-7:30pm, Location: Hollister 110
Prelim 3: Thursday, April 26th, 7:30pm, Location: Olin 155
PA4: Final project out next week; Demos: May; will not be able to use slip days

3 Goals for Today
Synchronization: threads and processes; critical sections, race conditions, and mutexes
Atomic instructions: HW support for synchronization
Using sync primitives to build concurrency-safe data structures
Cache coherency causes problems
Locks + barriers
Language-level synchronization

4 Synchronization
Two processors sharing an area of memory: P1 writes, then P2 reads
Data race if P1 and P2 don't synchronize: the result depends on the order of accesses
Hardware support required: an atomic read/write memory operation, with no other access to the location allowed between the read and the write
Could be a single instruction, e.g., atomic swap of register ↔ memory (e.g. ATS, BTS; x86), or an atomic pair of instructions (e.g. LL and SC; MIPS)

5 Synchronization in MIPS
Load linked: LL rt, offset(rs)
Store conditional: SC rt, offset(rs)
  Succeeds if the location has not changed since the LL: returns 1 in rt
  Fails if the location has changed: returns 0 in rt
Example: atomic swap (to test/set a lock variable)
  try: MOVE $t0, $s4     ; copy exchange value
       LL   $t1, 0($s1)  ; load linked
       SC   $t0, 0($s1)  ; store conditional
       BEQZ $t0, try     ; branch if store fails
       MOVE $s4, $t1     ; put loaded value in $s4

6 Programming with Threads
We need threads to exploit multiple processing units: to provide interactive applications, to parallelize for multicore, to write servers that handle many clients
Problem: hard even for experienced programmers; behavior can depend on subtle timing differences; bugs may be impossible to reproduce
Needed: synchronization of threads

7 Concurrency poses challenges for:
Correctness: threads accessing shared memory should not interfere with each other
Liveness: threads should not get stuck; they should make forward progress
Efficiency: the program should make good use of available computing resources (e.g., processors)
Fairness: resources should be apportioned fairly between threads

8 Two threads, one counter
Example: web servers use concurrency. Multiple threads handle client requests in parallel, with some shared state, e.g. hit counts: each thread increments a shared counter to track the number of hits
What happens when two threads execute this concurrently?
  …
  hits = hits + 1;
  …
which compiles to something like:
  LW   R0, hitsloc
  ADDI R0, R0, 1
  SW   R0, hitsloc

9 Shared counters
Possible result: lost update! One interleaving (time runs downward):
  T1: LW        (reads hits = 0)
  T2: LW        (reads hits = 0)
  T1: ADDI/SW   (writes hits = 1)
  T2: ADDI/SW   (writes hits = 1)
Final value is 1, not 2
Timing-dependent failure ⇒ race condition ⇒ hard to reproduce, difficult to debug
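To make the lost update concrete, here is a minimal sketch (not from the slides; file name and iteration counts are made up) of two pthreads incrementing an unprotected counter; the printed total is usually well below 2,000,000 because the LW/ADDI/SW sequences interleave as shown above.

  /* race.c -- hypothetical demo of the lost-update race; compile with -pthread */
  #include <pthread.h>
  #include <stdio.h>

  static volatile long hits = 0;          /* shared, unprotected counter */

  static void *worker(void *arg) {
      for (long i = 0; i < 1000000; i++)
          hits = hits + 1;                /* load / add / store -- not atomic */
      return NULL;
  }

  int main(void) {
      pthread_t t1, t2;
      pthread_create(&t1, NULL, worker, NULL);
      pthread_create(&t2, NULL, worker, NULL);
      pthread_join(t1, NULL);
      pthread_join(t2, NULL);
      printf("hits = %ld (expected 2000000)\n", hits);
      return 0;
  }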

10 Race conditions
Def: a timing-dependent error involving access to shared state
Whether it happens depends on how threads are scheduled: who wins the "race" to the instruction that updates state vs. the instruction that accesses state
Races are intermittent and may occur rarely; timing dependent = small changes can hide the bug
A program is correct only if all possible schedules are safe: the number of possible schedule permutations is huge, so you need to imagine an adversary who switches contexts at the worst possible time

11 Critical sections
To eliminate races: use critical sections that only one thread can be in at a time; contending threads must wait to enter
  CSEnter();
  /* critical section */
  CSExit();
(Figure: T1 and T2 each run CSEnter(); critical section; CSExit(); over time, but only one thread is inside the critical section at any moment.)

12 Mutexes
Critical sections are typically associated with mutual exclusion locks (mutexes): only one thread can hold a given mutex at a time
Acquire (lock) the mutex on entry to the critical section, or block if another thread already holds it
Release (unlock) the mutex on exit; allow one waiting thread (if any) to acquire it and proceed
  pthread_mutex_init(&m, NULL);
  // in each thread (T1 and T2):
  pthread_mutex_lock(&m);
  hits = hits + 1;
  pthread_mutex_unlock(&m);

13 Mutexes
Q: How to implement a critical section in code?
A: Lots of approaches… Mutual exclusion lock (mutex):
  lock(m): wait till it becomes free, then lock it
  unlock(m): unlock it
  void safe_increment() {
    pthread_mutex_lock(&m);
    hits = hits + 1;
    pthread_mutex_unlock(&m);
  }
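A minimal sketch of the fix for the hypothetical demo above: the same increment, now serialized by a pthread mutex, so the two threads' read-modify-write sequences can no longer interleave and the final count comes out right.

  /* hypothetical fix: same two worker threads, now calling safe_increment() */
  #include <pthread.h>

  static long hits = 0;
  static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

  void safe_increment(void) {
      pthread_mutex_lock(&m);
      hits = hits + 1;          /* critical section: at most one thread here */
      pthread_mutex_unlock(&m);
  }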

14 Hardware Support for Synchronization

15 Synchronization in MIPS
Load linked: LL rt, offset(rs)
Store conditional: SC rt, offset(rs)
  Succeeds if the location has not changed since the LL: returns 1 in rt
  Fails if the location has changed: returns 0 in rt
Example: atomic swap (to test/set a lock variable)
  try: MOVE $t0, $s4     ; copy exchange value
       LL   $t1, 0($s1)  ; load linked
       SC   $t0, 0($s1)  ; store conditional
       BEQZ $t0, try     ; branch if store fails
       MOVE $s4, $t1     ; put loaded value in $s4

16 Mutex from LL and SC
Linked load / store conditional
  mutex_lock(int *m) {
    while (test_and_set(m)) {}
  }
  int test_and_set(int *m) {
    int old = *m;
    *m = 1;
    return old;
  }

17 Mutex from LL and SC
Linked load / store conditional
  mutex_lock(int *m) {
    while (test_and_set(m)) {}
  }
  int test_and_set(int *m) {
    LI   $t0, 1
    LL   $t1, 0($a0)
    SC   $t0, 0($a0)
    MOVE $v0, $t1
  }

18 Mutex from LL and SC
Linked load / store conditional
  mutex_lock(int *m) {
    test_and_set:
      LI   $t0, 1
      LL   $t1, 0($a0)
      BNEZ $t1, test_and_set
      SC   $t0, 0($a0)
      BEQZ $t0, test_and_set
  }
  mutex_unlock(int *m) {
    *m = 0;
  }

19 Mutex from LL and SC
Linked load / store conditional
  mutex_lock(int *m) {
    test_and_set:
      LI   $t0, 1
      LL   $t1, 0($a0)
      BNEZ $t1, test_and_set
      SC   $t0, 0($a0)
      BEQZ $t0, test_and_set
  }
  mutex_unlock(int *m) {
    SW $zero, 0($a0)
  }
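For comparison, a hedged C sketch of the same spinlock, assuming GCC's __sync_lock_test_and_set and __sync_lock_release builtins stand in for the hand-written LL/SC loop above (the builtins are not part of the lecture; when targeting MIPS they typically compile to an LL/SC retry loop).

  /* hypothetical C spinlock equivalent of the MIPS test-and-set loop */
  void mutex_lock(int *m) {
      while (__sync_lock_test_and_set(m, 1)) {
          /* spin: the lock was already held */
      }
  }

  void mutex_unlock(int *m) {
      __sync_lock_release(m);   /* stores 0 with release semantics */
  }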

20 Alternative Atomic Instructions
Other atomic hardware primitives:
- test and set (x86)
- atomic increment (x86)
- bus lock prefix (x86)

21 Alternative Atomic Instructions
Other atomic hardware primitives:
- test and set (x86)
- atomic increment (x86)
- bus lock prefix (x86)
- compare and exchange (x86, ARM deprecated)
- linked load / store conditional (MIPS, ARM, PowerPC, DEC Alpha, …)
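As an illustration of the compare-and-exchange primitive, a hedged sketch of a lock-free increment built on GCC's __sync_bool_compare_and_swap builtin (an assumption; the slides only name the primitive, not this code):

  /* hypothetical lock-free increment via compare-and-exchange */
  void atomic_increment(int *p) {
      int old;
      do {
          old = *p;                                   /* read the current value        */
      } while (!__sync_bool_compare_and_swap(p, old, old + 1));  /* retry if it changed */
  }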

22 Synchronization
Synchronization techniques:
- clever code: must work despite an adversarial scheduler/interrupts; used by hackers (also: noobs)
- disable interrupts: used by exception handlers, the scheduler, device drivers, …
- disable preemption: dangerous for user code, but okay for some kernel code
- mutual exclusion locks (mutex): general purpose, except for some interrupt-related cases

23 Using synchronization primitives to build concurrency-safe data structures

24 Broken invariants
Access to shared data must be synchronized. Goal: enforce data structure invariants
  // invariant:
  // data is in A[h … t-1]
  char A[100];
  int h = 0, t = 0;

  // producer: add to list tail
  void put(char c) {
    A[t] = c;
    t++;
  }

  // consumer: take from list head
  char get() {
    while (h == t) { };
    char c = A[h];
    h++;
    return c;
  }
(Figure: a ring buffer holding items 1, 2, 3 between head and tail.)

25 Protecting an invariant
Rule of thumb: all updates that can affect the invariant become critical sections
  // invariant: (protected by m)
  // data is in A[h … t-1]
  pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
  char A[100];
  int h = 0, t = 0;

  // producer: add to list tail
  void put(char c) {
    pthread_mutex_lock(&m);
    A[t] = c;
    t++;
    pthread_mutex_unlock(&m);
  }

  // consumer: take from list head
  char get() {
    pthread_mutex_lock(&m);
    while (h == t) { }   // note: spins while holding the lock; addressed on the "Beyond mutexes" slides below
    char c = A[h];
    h++;
    pthread_mutex_unlock(&m);
    return c;
  }

26 Guidelines for successful mutexing
Insufficient locking can cause races: skimping on mutexes? Just say no!
Poorly designed locking can cause deadlock:
  P1: lock(m1); lock(m2);
  P2: lock(m2); lock(m1);
Know why you are using mutexes!
Acquire locks in a consistent order to avoid cycles
Use lock/unlock like braces (match them lexically): lock(&m); …; unlock(&m);
  watch out for return, goto, and function calls!
  watch out for exception/error conditions!

27 Cache Coherency causes yet more trouble

28 Remember: Cache Coherence
Recall: cache coherence defined…
Informal: reads return the most recently written value
Formal: for concurrent processors P1 and P2:
  P writes X before P reads X (with no intervening writes) ⇒ the read returns the written value
  P1 writes X before P2 reads X ⇒ the read returns the written value
  P1 writes X and P2 writes X ⇒ all processors see the writes in the same order (all see the same final value for X)

29 Relaxed consistency implications
Ideal case: sequential consistency
  Globally: writes appear in interleaved order
  Locally: other cores' writes show up in program order
In practice: not so much…
  write-back caches ⇒ sequential consistency is tricky
  writes appear in semi-random order
  locks alone don't help
* MIPS has sequential consistency; Intel does not

30 Acquire/release
Memory barriers and release consistency: less strict than sequential consistency; easier to build
One protocol:
  Acquire: lock, and force subsequent accesses to happen after it
  Release: unlock, and force previous accesses to happen before it
  P1: ... acquire(m); A[t] = c; t++; release(m);
  P2: ... acquire(m); A[t] = c; t++; release(m);
Moral: can't rely on sequential consistency (so use synchronization libraries)
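A small sketch of the acquire/release idea, assuming GCC's __atomic builtins (not part of the slides): the release store on the flag forces the data write to become visible before the flag does, and the acquire load forces the data read to happen after the flag is seen.

  /* hypothetical message-passing example with release/acquire ordering */
  int data;
  int ready = 0;

  void producer(void) {
      data = 42;                                          /* ordinary write                  */
      __atomic_store_n(&ready, 1, __ATOMIC_RELEASE);      /* release: data is visible first  */
  }

  int consumer(void) {
      while (!__atomic_load_n(&ready, __ATOMIC_ACQUIRE))  /* acquire: later reads see data   */
          ;
      return data;                                        /* guaranteed to observe 42        */
  }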

31 Are Locks + Barriers enough?

32 Beyond mutexes
Writers must check for a full buffer and readers must check for an empty buffer
Ideal: don't busy wait… go to sleep instead
  char get() {
    acquire(L);
    char c = A[h];
    h++;
    release(L);
    return c;
  }
(Figure: the buffer is empty when last == head, so the reader somehow has to wait: while(empty) {})

33 Beyond mutexes
Writers must check for a full buffer and readers must check for an empty buffer
Ideal: don't busy wait… go to sleep instead
  // no check at all (returns garbage if empty):
  char get() {
    acquire(L);
    char c = A[h];
    h++;
    release(L);
    return c;
  }
  // check while holding the lock:
  char get() {
    acquire(L);
    while (h == t) { };
    char c = A[h];
    h++;
    release(L);
    return c;
  }
  // check before taking the lock:
  char get() {
    while (h == t) { };
    acquire(L);
    char c = A[h];
    h++;
    release(L);
    return c;
  }
Dilemma: have to check while holding the lock…
(Figure: the buffer is empty when last == head)

34 Beyond mutexes
Writers must check for a full buffer and readers must check for an empty buffer
Ideal: don't busy wait… go to sleep instead
  char get() {
    acquire(L);
    char c = A[h];
    h++;
    release(L);
    return c;
  }
  char get() {
    acquire(L);
    while (h == t) { };
    char c = A[h];
    h++;
    release(L);
    return c;
  }
Dilemma: have to check while holding the lock, but cannot wait while holding the lock

35 Beyond mutexes
Writers must check for a full buffer and readers must check for an empty buffer
Ideal: don't busy wait… go to sleep instead
  char get() {
    do {
      acquire(L);
      empty = (h == t);
      if (!empty) {
        c = A[h];
        h++;
      }
      release(L);
    } while (empty);
    return c;
  }

36 Language-level Synchronization

37 Condition variables
Use a [Hoare] condition variable to wait for a condition to become true (without holding the lock!)
  wait(m, c): atomically release m and sleep, waiting for condition c; wake up holding m sometime after c was signaled
  signal(c): wake up one thread waiting on c
  broadcast(c): wake up all threads waiting on c
POSIX (e.g., Linux): pthread_cond_wait, pthread_cond_signal, pthread_cond_broadcast

38 Using a condition variable
wait(m, c): release m, sleep until c, wake up holding m
signal(c): wake up one thread waiting on c
  cond_t *not_full = ...;
  cond_t *not_empty = ...;
  mutex_t *m = ...;

  void put(char c) {
    lock(m);
    while ((t+1) % n == h)   // full
      wait(m, not_full);
    A[t] = c;
    t = (t+1) % n;
    unlock(m);
    signal(not_empty);
  }

  char get() {
    lock(m);
    while (t == h)           // empty
      wait(m, not_empty);
    char c = A[h];
    h = (h+1) % n;
    unlock(m);
    signal(not_full);
    return c;
  }
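A pthread rendering of this slide's pseudocode, as a hedged sketch: the buffer names (A, h, t, N) are carried over from the earlier slides, and the POSIX calls are the ones named on the condition-variable slide above.

  /* hypothetical pthread version of the bounded ring buffer */
  #include <pthread.h>

  #define N 100
  static char A[N];
  static int h = 0, t = 0;
  static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
  static pthread_cond_t not_full  = PTHREAD_COND_INITIALIZER;
  static pthread_cond_t not_empty = PTHREAD_COND_INITIALIZER;

  void put(char c) {
      pthread_mutex_lock(&m);
      while ((t + 1) % N == h)                 /* full: sleep, releasing m  */
          pthread_cond_wait(&not_full, &m);
      A[t] = c;
      t = (t + 1) % N;
      pthread_cond_signal(&not_empty);
      pthread_mutex_unlock(&m);
  }

  char get(void) {
      pthread_mutex_lock(&m);
      while (h == t)                           /* empty: sleep, releasing m */
          pthread_cond_wait(&not_empty, &m);
      char c = A[h];
      h = (h + 1) % N;
      pthread_cond_signal(&not_full);
      pthread_mutex_unlock(&m);
      return c;
  }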

39 Monitors
A monitor is a concurrency-safe data structure, with:
  one mutex
  some condition variables
  some operations
All operations on the monitor acquire/release the mutex: one thread in the monitor at a time
The ring buffer was a monitor
Java, C#, etc., have built-in support for monitors

40 Java concurrency
Java objects can be monitors:
  the "synchronized" keyword locks/releases the mutex
  each object has one (!) built-in condition variable:
    o.wait() = wait(o, o)
    o.notify() = signal(o)
    o.notifyAll() = broadcast(o)
Java wait() can be called even when the mutex is not held. The mutex is not held when awoken by signal(). Useful?

41 More synchronization mechanisms
Lots of synchronization variations… (all can be implemented with a mutex and condition variables)
  Reader/writer locks: any number of threads can hold a read lock; only one thread can hold the writer lock
  Semaphores: N threads can hold the lock at the same time
  Message-passing, sockets, queues, ring buffers, …: transfer data and synchronize
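As one example, a counting semaphore can be sketched from a mutex and a condition variable, as the slide suggests; the struct and function names below are illustrative, not a real API (the _demo suffix avoids clashing with POSIX sem_t).

  /* hypothetical counting semaphore built from a pthread mutex + condition variable */
  #include <pthread.h>

  typedef struct {
      int count;                 /* number of available permits */
      pthread_mutex_t m;
      pthread_cond_t  nonzero;
  } sem_t_demo;
  /* initialize with: s->count = N; pthread_mutex_init(&s->m, NULL); pthread_cond_init(&s->nonzero, NULL); */

  void sem_wait_demo(sem_t_demo *s) {      /* P(): acquire a permit */
      pthread_mutex_lock(&s->m);
      while (s->count == 0)
          pthread_cond_wait(&s->nonzero, &s->m);
      s->count--;
      pthread_mutex_unlock(&s->m);
  }

  void sem_post_demo(sem_t_demo *s) {      /* V(): release a permit */
      pthread_mutex_lock(&s->m);
      s->count++;
      pthread_cond_signal(&s->nonzero);
      pthread_mutex_unlock(&s->m);
  }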

42 Summary Hardware Primitives: test-and-set, LL/SC, barrier,... … used to build … Synchronization primitives: mutex, semaphore,... … used to build … Language Constructs: monitors, signals,...