Presentation is loading. Please wait.

Presentation is loading. Please wait.

CS194-24 Advanced Operating Systems Structures and Implementation Lecture 8 Synchronization Continued February 25 th, 2013 Prof. John Kubiatowicz

Similar presentations


Presentation on theme: "CS194-24 Advanced Operating Systems Structures and Implementation Lecture 8 Synchronization Continued February 25 th, 2013 Prof. John Kubiatowicz"— Presentation transcript:

1 CS194-24 Advanced Operating Systems Structures and Implementation Lecture 8 Synchronization Continued February 25 th, 2013 Prof. John Kubiatowicz http://inst.eecs.berkeley.edu/~cs194-24

2 Lec 8.2 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Goals for Today Synchronization Scheduling Interactive is important! Ask Questions! Note: Some slides and/or pictures in the following are adapted from slides ©2013

3 Lec 8.3 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Recall: Implementing Locks with test&set: Spin Lock The Test&Test&Set lock: int value = 0; // Free Acquire() { while (true) { while(value); // Locked, spin with reads if (!test&set(value)) break;// Success!} Release() { value = 0; } Significant problems with Test&Test&Set? –Multiple processors spinning on same memory location »Release/Reacquire causes lots of cache invalidation traffic »No guarantees of fairness – potential livelock –Scales poorly with number of processors »Because of bus traffic, average time until some processor acquires lock grows with number of processors Busy-Waiting: thread consumes cycles while waiting

4 Lec 8.4 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Better option: Exponential backoff The Test&Test&Set lock with backoff: int value = 0; // Free Acquire() { int nfail = 0 while (true) { while(value); // Locked, spin with reads if (!test&set(value)) break;// Success! else exponentialBackoff(nfail++)} Release() { value = 0; } If someone grabs lock between when we see it free and attempt to grab it, must be contention Backoff exponentially: –Choose backoff time randomly with some distribution –Double mean time with each iteration Why does this work?

5 Lec 8.5 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 compare&swap (&address, reg1, reg2) { /* 68000 */ if (reg1 == M[address]) { M[address] = reg2; return success; } else { return failure; } } Here is an atomic add to linked-list function: addToQueue(&object) { do {// repeat until no conflict ld r1, M[root]// Get ptr to current head st r1, M[object] // Save link in new object } until (compare&swap(&root,r1,object)); } Recall: Using of Compare&Swap for queues root next New Object

6 Lec 8.6 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Mellor-Crummey-Scott Lock Another lock-free option – Mellor-Crummey Scott (MCS): public void lock() { QNode qnode = myNode.get(); qnode.next = null; QNode pred = swap(tail,qnode); if (pred != null) { qnode.locked = true; pred.next = qnode; while (qnode.locked); // wait for predecessor } } tail next 1 unlocked pred 1 next 2 unlocked pred 2 locked next 3 unlocked pred 3 locked

7 Lec 8.7 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Mellor-Crummey-Scott Lock (con’t) public void unlock() { QNode qnode = myNode.get(); if (qnode.next == null) { if (compare&swap(tail,qnode,null)) return; // wait until predecessor fills in my next field while (qnode.next == null); } qnode.next.locked = false; } tail next 1 unlocked pred 1 next 2 unlocked pred 2 locked next 3 unlocked pred 3 locked

8 Lec 8.8 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Mellor-Crummey-Scott Lock (Con’t) Nice properties of MCS Lock –Never more than 2 processors spinning on one address –Completely fair – once on queue, are guaranteed to get your turn in FIFO order »Alternate release procedure doesn’t use compare&swap but doesn’t guarantee FIFO order Bad properties of MCS Lock –Takes longer (more instructions) than T&T&S if no contention –Releaser may be forced to spin in rare circumstances Hardware support? –Some proposed hardware queueing primitives such as QOLB (Queue on Lock Bit) –Not broadly available

9 Lec 8.9 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Comparison between options Lots of threads trying to: acquire(); wait for bit(); release(); Metric: Average time in critical section –Total # critical sections/total time Lesson: need something to handle high contention situations –Exponential backoff –Queue lock

10 Lec 8.10 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Reactive Locks Try to determine if you are in a high-contention or low- contention domain –Choose a protocol accordingly –Dynamic choice Why is this useful? –Because algorithms go through phases –Because different machines have different characteristics

11 Lec 8.11 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Higher-level Primitives than Locks What is the right abstraction for synchronizing threads that share memory? –Want as high a level primitive as possible Good primitives and practices important! –Since execution is not entirely sequential, really hard to find bugs, since they happen rarely –UNIX is pretty stable now, but up until about mid-80s (10 years after started), systems running UNIX would crash every week or so – concurrency bugs Synchronization is a way of coordinating multiple concurrent activities that are using shared state –This lecture and the next presents a couple of ways of structuring the sharing

12 Lec 8.12 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Recall: Semaphores Semaphores are a kind of generalized lock –First defined by Dijkstra in late 60s –Main synchronization primitive used in original UNIX Definition: a Semaphore has a non-negative integer value and supports the following two operations: –P(): an atomic operation that waits for semaphore to become positive, then decrements it by 1 »Think of this as the wait() operation –V(): an atomic operation that increments the semaphore by 1, waking up a waiting P, if any »This of this as the signal() operation –Note that P() stands for “proberen” (to test) and V() stands for “verhogen” (to increment) in Dutch

13 Lec 8.13 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Value=2Value=1Value=0 Semaphores Like Integers Except Semaphores are like integers, except –No negative values –Only operations allowed are P and V – can’t read or write value, except to set it initially –Operations must be atomic »Two P’s together can’t decrement value below zero »Similarly, thread going to sleep in P won’t miss wakeup from V – even if they both happen at same time Semaphore from railway analogy –Here is a semaphore initialized to 2 for resource control: Value=1Value=0Value=2

14 Lec 8.14 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Two Uses of Semaphores Mutual Exclusion (initial value = 1) –Also called “Binary Semaphore”. –Can be used for mutual exclusion: semaphore.P(); // Critical section goes here semaphore.V(); Scheduling Constraints (initial value = 0) –Locks are fine for mutual exclusion, but what if you want a thread to wait for something? –Example: suppose you had to implement ThreadJoin which must wait for thread to terminiate: Initial value of semaphore = 0 ThreadJoin { semaphore.P(); } ThreadFinish { semaphore.V(); }

15 Lec 8.15 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Producer-Consumer with a Bounded Buffer Problem Definition –Producer puts things into a shared buffer (wait if full) –Consumer takes them out (wait if empty) –Use a fixed-size buffer between them to avoid lockstep »Need to synchronize access to this buffer Correctness Constraints: –Consumer must wait for producer to fill buffers, if none full (scheduling constraint) –Producer must wait for consumer to empty buffers, if all full (scheduling constraint) –Only one thread can manipulate buffer queue at a time (mutual exclusion) Remember why we need mutual exclusion –Because computers are stupid General rule of thumb: Use a separate semaphore for each constraint –Semaphore fullBuffers; // consumer’s constraint –Semaphore emptyBuffers;// producer’s constraint –Semaphore mutex; // mutual exclusion

16 Lec 8.16 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Full Solution to Bounded Buffer Semaphore fullBuffer = 0; // Initially, no coke Semaphore emptyBuffers = numBuffers; // Initially, num empty slots Semaphore mutex = 1;// No one using machine Producer(item) { emptyBuffers.P();// Wait until space mutex.P();// Wait until buffer free Enqueue(item); mutex.V(); fullBuffers.V();// Tell consumers there is // more coke } Consumer() { fullBuffers.P();// Check if there’s a coke mutex.P();// Wait until machine free item = Dequeue(); mutex.V(); emptyBuffers.V();// tell producer need more return item; }

17 Lec 8.17 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Administrivia How did the first design document go? How did design reviews go? Recall: Midterm I: Two ½ weeks from today! –Wednesday 3/13 –Intention is a 1.5 hour exam over 3 hours –No class on day of exam! Midterm Timing: –Listed as 6:00-9:00PM –Could also be: 4:00-7:00pm –Preferences? (I need to get a room for the exam) Topics: everything up to the previous Monday –OS Structure, BDD, Process support, Synchronization, Scheduling, Memory Management, I/O

18 Lec 8.18 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Review: Definition of Monitor Semaphores are confusing because dual purpose: –Both mutual exclusion and scheduling constraints –Cleaner idea: Use locks for mutual exclusion and condition variables for scheduling constraints Monitor: a lock and zero or more condition variables for managing concurrent access to shared data –Use of Monitors is a programming paradigm Lock: provides mutual exclusion to shared data: –Always acquire before accessing shared data structure –Always release after finishing with shared data Condition Variable: a queue of threads waiting for something inside a critical section –Key idea: allow sleeping inside critical section by atomically releasing lock at time we go to sleep –Contrast to semaphores: Can’t wait inside critical section

19 Lec 8.19 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Review: Programming with Monitors Monitors represent the logic of the program –Wait if necessary –Signal when change something so any waiting threads can proceed Basic structure of monitor-based program: lock while (need to wait) { condvar.wait(); } unlock do something so no need to wait lock condvar.signal(); unlock Check and/or update state variables Wait if necessary Check and/or update state variables

20 Lec 8.20 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Readers/Writers Problem Motivation: Consider a shared database –Two classes of users: »Readers – never modify database »Writers – read and modify database –Is using a single lock on the whole database sufficient? »Like to have many readers at the same time »Only one writer at a time R R R W

21 Lec 8.21 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Basic Readers/Writers Solution Correctness Constraints: –Readers can access database when no writers –Writers can access database when no readers or writers –Only one thread manipulates state variables at a time Basic structure of a solution: –Reader() Wait until no writers Access data base Check out – wake up a waiting writer –Writer() Wait until no active readers or writers Access database Check out – wake up waiting readers or writer –State variables (Protected by a lock called “lock”): »int AR: Number of active readers; initially = 0 »int WR: Number of waiting readers; initially = 0 »int AW: Number of active writers; initially = 0 »int WW: Number of waiting writers; initially = 0 »Condition okToRead = NIL »Conditioin okToWrite = NIL

22 Lec 8.22 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Code for a Reader Reader() { // First check self into system lock.Acquire(); while ((AW + WW) > 0) {// Is it safe to read? WR++;// No. Writers exist okToRead.wait(&lock);// Sleep on cond var WR--;// No longer waiting } AR++;// Now we are active! lock.release(); // Perform actual read-only access AccessDatabase(ReadOnly); // Now, check out of system lock.Acquire(); AR--;// No longer active if (AR == 0 && WW > 0)// No other active readers okToWrite.signal();// Wake up one writer lock.Release(); }

23 Lec 8.23 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Writer() { // First check self into system lock.Acquire(); while ((AW + AR) > 0) {// Is it safe to write? WW++;// No. Active users exist okToWrite.wait(&lock);// Sleep on cond var WW--;// No longer waiting } AW++;// Now we are active! lock.release(); // Perform actual read/write access AccessDatabase(ReadWrite); // Now, check out of system lock.Acquire(); AW--;// No longer active if (WW > 0){// Give priority to writers okToWrite.signal();// Wake up one writer } else if (WR > 0) {// Otherwise, wake reader okToRead.broadcast();// Wake all readers } lock.Release(); } Code for a Writer

24 Lec 8.24 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 C-Language Support for Synchronization C language: Pretty straightforward synchronization –Just make sure you know all the code paths out of a critical section int Rtn() { lock.acquire(); … if (exception) { lock.release(); return errReturnCode; } … lock.release(); return OK; } –Watch out for setjmp / longjmp ! »Can cause a non-local jump out of procedure »In example, procedure E calls longjmp, poping stack back to procedure B »If Procedure C had lock.acquire, problem! Proc A Proc B Calls setjmp Proc C lock.acquire Proc D Proc E Calls longjmp Stack growth

25 Lec 8.25 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 C++ Language Support for Synchronization (con’t) Must catch all exceptions in critical sections –Catch exceptions, release lock, and re-throw exception: void Rtn() { lock.acquire(); try { … DoFoo(); … } catch (…) {// catch exception lock.release();// release lock throw; // re-throw the exception } lock.release(); } void DoFoo() { … if (exception) throw errException; … } –Even Better: auto_ptr facility. See C++ Spec. »Can deallocate/free lock regardless of exit method

26 Lec 8.26 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Problem: Busy-Waiting for Lock Positives for this solution –Machine can receive interrupts –User code can use this lock –Works on a multiprocessor Negatives –This is very inefficient because the busy-waiting thread will consume cycles waiting –Waiting thread may take cycles away from thread holding lock (no one wins!) –Priority Inversion: If busy-waiting thread has higher priority than thread holding lock  no progress! Priority Inversion problem with original Martian rover For semaphores and monitors, waiting thread may wait for an arbitrary length of time! –Thus even if busy-waiting was OK for locks, definitely not ok for other primitives –Exam solutions should not have busy-waiting!

27 Lec 8.27 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Busy-wait vs Blocking Busy-wait: I.e. spin lock –Keep trying to acquire lock until read –Very low latency/processor overhead! –Very high system overhead! »Causing stress on network while spinning »Processor is not doing anything else useful Blocking: –If can’t acquire lock, deschedule process (I.e. unload state) –Higher latency/processor overhead (1000s of cycles?) »Takes time to unload/restart task »Notification mechanism needed –Low system overheadd »No stress on network »Processor does something useful Hybrid: –Spin for a while, then block –2-competitive: spin until have waited blocking time

28 Lec 8.28 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 What about barriers? Barrier – global (/coordinated) synchronization –simple use of barriers -- all threads hit the same one work_on_my_subgrid(); barrier(); read_neighboring_values(); barrier(); –barriers are not provided in all thread libraries How to implement barrier? –Global counter representing number of threads still waiting to arrive and parity representing phase »Initialize counter to zero, set parity variable to “even” »Each thread that enters saves parity variable and Atomically increments counter if even Atomically decrements counter if odd »If counter not at extreme value spin until parity changes i.e. Num threads if “even” or zero if “odd” »Else, flip parity, exit barrier –Better for large numbers of processors – implement atomic counter via combining tree

29 Lec 8.29 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Review: CPU Scheduling Earlier, we talked about the life-cycle of a thread –Active threads work their way from Ready queue to Running to various waiting queues. Question: How is the OS to decide which of several tasks to take off a queue? –Obvious queue to worry about is ready queue –Others can be scheduled as well, however Scheduling: deciding which threads are given access to resources from moment to moment

30 Lec 8.30 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Scheduling Assumptions CPU scheduling big area of research in early 70’s Many implicit assumptions for CPU scheduling: –One program per user –One thread per program –Programs are independent Clearly, these are unrealistic but they simplify the problem so it can be solved –For instance: is “fair” about fairness among users or programs? »If I run one compilation job and you run five, you get five times as much CPU on many operating systems The high-level goal: Dole out CPU time to optimize some desired parameters of system USER1USER2USER3USER1USER2 Time

31 Lec 8.31 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Assumption: CPU Bursts Execution model: programs alternate between bursts of CPU and I/O –Program typically uses the CPU for some period of time, then does I/O, then uses CPU again –Each scheduling decision is about which job to give to the CPU for use by its next CPU burst –With timeslicing, thread may be forced to give up CPU before finishing current CPU burst Weighted toward small bursts

32 Lec 8.32 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Scheduling Policy Goals/Criteria Minimize Response Time –Minimize elapsed time to do an operation (or job) –Response time is what the user sees: »Time to echo a keystroke in editor »Time to compile a program »Real-time Tasks: Must meet deadlines imposed by World Maximize Throughput –Maximize operations (or jobs) per second –Throughput related to response time, but not identical: »Minimizing response time will lead to more context switching than if you only maximized throughput –Two parts to maximizing throughput »Minimize overhead (for example, context-switching) »Efficient use of resources (CPU, disk, memory, etc) Fairness –Share CPU among users in some equitable way –Fairness is not minimizing average response time: »Better average response time by making system less fair

33 Lec 8.33 2/25/13Kubiatowicz CS194-24 ©UCB Fall 2013 Summary (Synchronization) Important concept: Atomic Operations –An operation that runs to completion or not at all –These are the primitives on which to construct various synchronization primitives Talked about hardware atomicity primitives: –Disabling of Interrupts, test&set, swap, comp&swap, load-linked/store conditional Showed several constructions of Locks –Must be very careful not to waste/tie up machine resources »Shouldn’t disable interrupts for long »Shouldn’t spin wait for long –Key idea: Separate lock variable, use hardware mechanisms to protect modifications of that variable Talked about Semaphores, Monitors, and Condition Variables –Higher level constructs that are harder to “screw up”


Download ppt "CS194-24 Advanced Operating Systems Structures and Implementation Lecture 8 Synchronization Continued February 25 th, 2013 Prof. John Kubiatowicz"

Similar presentations


Ads by Google