
1 Mutual Exclusion

2 Overview
- Concurrent programming and race conditions
- Mutual exclusion
- Implementing mutual exclusion
- Deadlocks, starvation, livelock

3 Concurrent Programming
- Programming with two or more threads that cooperate to perform a common task.
- Threads cooperate by sharing data via a shared address space. What types of data/variables are shared? Globals and heap variables.
- Problems:
  - Race conditions. E.g., two threads T1 and T2 read and update the same variable, so accesses by the threads must be exclusive (i.e., one at a time).
  - Synchronization. E.g., T1 initializes a variable and T2 runs after the variable is initialized, so an ordering between T1 and T2 must be enforced.

4 Race Condition Example
What thread interleaving would lead to problems? Thread 1 and Thread 2 both run the same code:

worker() { ...; counter = counter + 1; ... }

Dump of assembler code for function worker:
   0x00401398 <+15>:  mov  0x406018,%eax   ; 1. read counter from memory
   0x0040139d <+20>:  add  $0x1,%eax       ; 2. increment register
   0x004013a0 <+23>:  mov  %eax,0x406018   ; 3. write counter back to memory

Say Thread 1 starts executing and completes step 1. Before it can complete step 3, a thread switch occurs and Thread 2 starts running and executes its step 1. Now both threads have read the old value of counter, so only one of the increments will be recorded, i.e., one of the increments is lost.
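To see the lost update in practice, here is a minimal sketch, assuming POSIX threads and an unoptimized build (e.g., gcc -O0 race.c -pthread) so the increment compiles to the three-instruction sequence above; the final count typically falls short of the expected total:

#include <pthread.h>
#include <stdio.h>

#define N 1000000

int counter = 0;                      /* shared global variable */

void *worker(void *arg) {
    for (int i = 0; i < N; i++)
        counter = counter + 1;        /* read-modify-write, not atomic */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %d (expected %d)\n", counter, 2 * N);  /* usually < 2*N */
    return 0;
}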

5 Why do Races Occur?
- The result depends on the timing of thread execution: some execution sequences lead to unexpected results. (Ri is a read of the counter value, Wi is a write of the counter value, CS is a context switch.)
- How can we avoid this problem? We need to ensure that the operation appears atomic; see the next slide.
- A special kind of race condition is a data race: two threads access a shared variable concurrently, at least one access is a write, and the threads do not use any synchronization to control access to the variable. The example on this slide is a data race, but race conditions can be of other types as well.

6 Atomicity and Mutual Exclusion
- We need to ensure that reading and updating the counter is an atomic operation.
- An operation is atomic if it appears to occur instantaneously to the rest of the system: the operation appears indivisible, so the rest of the system observes either none of the operation's effects or all of them.
- One way to ensure atomicity is to ensure that only one thread can read and update the counter at a time. This is called mutual exclusion.
- The code region on which mutual exclusion is enforced is called a critical section. Accesses to shared variables must be done in critical sections.
- We say the operation appears to occur instantaneously because it doesn't actually do so: other threads may run while the operation is in progress, but they must not observe any intermediate state of the operation.

7 Mutex Lock Abstraction
A mutex lock helps ensure mutual exclusion:
- mutex = lock_create(): create a free lock, called mutex
- lock_destroy(mutex): destroy the mutex lock
- lock(mutex): acquire the lock if it is free; otherwise wait (or sleep) until it can be acquired. The lock is now acquired.
- unlock(mutex): release the lock; if there are waiting threads, wake up one of them. The lock is now free.
The critical section is the code executed between lock and unlock. A toilet is a critical section! You don't want any races there... You acquire the lock by closing the door, and release the lock by opening the door.

8 Mutex Locks
[Figure: timeline of two threads; each performs Lock, runs with the lock Acquired, then Unlock, and the second thread's Acquired period begins only after the first thread's Unlock.]

9 Using a Mutex Lock
Thread 1 and Thread 2 run the same code:

// counter and lock are located in shared address space
int counter;
struct lock *l;   // same lock used by both threads

while (1) {
    lock(l);
    counter++;     // critical section
    unlock(l);
    // remainder section
}

It is the programmer's responsibility to add locks around critical sections. The code would not work if we used two different locks, e.g., if T1 used lock L1 and T2 used lock L2. What if T2 performs "counter--" instead? Should we still use the same lock l for both threads? Yes, because we want to protect both sections of code. In general, think about the shared data structures (globals and heap) you are trying to protect, create a single lock for each data structure (or component of a data structure, e.g., a linked-list node) that can be read and modified by multiple threads concurrently, and use that lock in all code that accesses the data structure.
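For comparison, a minimal sketch of the same pattern using POSIX mutexes (the slides' lock_create/lock/unlock API is course-specific; pthread_mutex_t plays the same role here):

#include <pthread.h>

int counter;                                    /* shared */
pthread_mutex_t l = PTHREAD_MUTEX_INITIALIZER;  /* same lock used by both threads */

void *worker(void *arg) {
    for (int i = 0; i < 1000000; i++) {
        pthread_mutex_lock(&l);
        counter++;                              /* critical section */
        pthread_mutex_unlock(&l);
        /* remainder section */
    }
    return NULL;
}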

10 Mutual Exclusion Conditions
- No two threads may be simultaneously in the critical section.
- No assumption may be made about the speed of thread execution.
- No thread running outside its critical section may block another thread. Why? It is bad for performance: a thread outside the critical section is presumably not doing anything critical, so it should not stop a thread in the critical section. Also, if a thread in a critical section is blocked, then all threads waiting to enter the critical section also get blocked.
- No thread must wait forever to enter its critical section. Why? Otherwise starvation occurs and no progress is made.

11 Implementing Mutex Locks
Naive implementation: use a global variable (int l) to track whether a thread is in the critical section.

lock(l) {
    while (l == TRUE)
        ;  // no-op
    l = TRUE;
}

unlock(l) {
    l = FALSE;
}

Is there a problem with this implementation? lock() and unlock() access a shared variable, so they themselves need to be atomic! Problem: one thread invokes lock(), checks (l == TRUE), and finds it false. Before it can set l to TRUE, there is a context switch, and another thread invokes lock(). It also finds l false, sets it to TRUE, and acquires the lock. Then the first thread resumes, also sets l to TRUE, and also acquires the lock. So there is no mutual exclusion.

12 Implementing Mutex Locks
We can't seem to implement locks by just reading and writing the lock variable, so let's try another option: make lock() atomic by disabling interrupts. Disabling interrupts ensures that preemption doesn't occur in the lock() code, so it runs atomically.

lock(l) {
    disable interrupts;
    while (l == TRUE)
        ;  // no-op
    l = TRUE;
    enable interrupts;
}

unlock(l) {
    l = FALSE;
}

Is there a problem with this implementation? Problem: one thread acquires the lock, but before it performs unlock(), another thread runs (note that interrupts are enabled between the lock() and unlock() calls). The second thread disables interrupts and gets stuck in the while loop, because the first thread has set the lock variable to TRUE. Since interrupts are now disabled, the first thread never gets to run, so both threads are stuck forever. This is called a deadlock.

13 Implementation 1: Interrupt Disabling
What about this implementation?

lock() {
    disable_interrupts;
}

unlock() {
    enable_interrupts;
}

This implementation works on a single CPU because the critical section executes atomically, without preemption. In practice, unlock() sets the interrupt level back to its value before the call to lock(): if interrupts were enabled and lock() is called twice, then two unlocks are needed to re-enable interrupts. Problem: it doesn't work on multiprocessors, as we see next.

14 Atomic Instructions
- The previous implementation only works on a single CPU: interrupts are disabled only on the local CPU, so threads could still run on another CPU, causing a race.
- Hardware support for locking: interrupts provide h/w support for locking on a single CPU; we need h/w support for locking on multiprocessors.
- Multiprocessor h/w provides atomic instructions: atomic increment, atomic test-and-set, atomic compare-and-swap. These instructions operate on a memory word; notice that they perform two operations on the word indivisibly.
- How does h/w perform these operations indivisibly? The CPU asks the memory controller to lock the memory location, so the two operations on that location are performed without other CPUs interfering. In essence, to build a lock primitive in software, we again rely on a lock/atomic primitive provided by h/w: previously interrupt disabling for the uniprocessor, now atomic instructions for multiprocessors.
- (Demo: the assembly of the worker function in threads-atomic.exe uses an atomic increment instruction; threads_sync.exe uses pthread locks, which are blocking locks that we will discuss later.)
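As a sketch of what an atomic increment looks like from C, assuming a C11 compiler (stdatomic.h maps to instructions such as lock xadd on x86):

#include <stdatomic.h>

atomic_int counter;                 /* accesses are atomic */

void *worker(void *arg) {
    atomic_fetch_add(&counter, 1);  /* one indivisible read-modify-write */
    return NULL;
}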

15 Test-and-Set Lock Instruction
The tset instruction operates on an integer: it reads and returns the old value of the integer, and it updates the value of the integer to 1. These two operations are performed atomically:

int tset(int *lock) {  // atomic in hardware
    int old = *lock;
    *lock = 1;
    return old;
}

While atomic instructions such as atomic increment perform one specific operation (incrementing a counter) atomically, tset is a general atomic instruction for implementing spin locks, as described on the next slide.
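Real compilers expose test-and-set through builtins; a sketch of tset using the GCC/Clang __atomic builtins (an assumption about the toolchain, not part of the slides):

int tset(int *lock) {
    /* atomically store 1 and return the previous value */
    return __atomic_exchange_n(lock, 1, __ATOMIC_ACQUIRE);
}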

16 Implementation 2: Spin Locks
lock() uses tset in a loop; *l is initialized to 0.
- If the returned value is 0, the lock has been acquired.
- If the returned value is 1, someone else holds the lock; try again.

lock(int *l) {
    while (tset(l))
        ;  // no-op
}

unlock(int *l) {
    *l = FALSE;
}

This mutex lock is called a spin lock because threads wait in a tight loop. Spin locks allow implementing a critical section of arbitrary size by surrounding the critical-section code with lock and unlock. Problem: while a thread waits, the CPU performs no useful work.
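One subtlety the pseudocode glosses over: on a real multiprocessor, the store in unlock() needs release semantics so that writes made inside the critical section become visible before the lock appears free. A sketch using the same __atomic builtins as above:

void lock(int *l) {
    while (__atomic_exchange_n(l, 1, __ATOMIC_ACQUIRE))
        ;  /* spin: an old value of 1 means someone else holds the lock */
}

void unlock(int *l) {
    __atomic_store_n(l, 0, __ATOMIC_RELEASE);  /* release barrier, then free */
}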

17 Implementation 3: Yielding Locks
Yield the CPU voluntarily while waiting for the lock. Recall that thread_yield() runs another thread, so the CPU can perform useful work. This mutex is a yielding lock.

lock_s(int *l) {
    while (tset(l))
        thread_yield();
}

unlock_s(int *l) {
    *l = FALSE;
}

Problem: the scheduler determines when thread_yield() returns. A thread might get unlucky and wait a long time inside thread_yield(), even if the other thread releases the lock soon after the call to yield.
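A sketch of the yielding lock with POSIX sched_yield() standing in for the course's thread_yield() (an assumption; the slides target the course's own thread library):

#include <sched.h>

void lock_s(int *l) {
    while (__atomic_exchange_n(l, 1, __ATOMIC_ACQUIRE))
        sched_yield();   /* give up the CPU instead of spinning */
}

void unlock_s(int *l) {
    __atomic_store_n(l, 0, __ATOMIC_RELEASE);
}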

18 Implementation 4: Blocking Locks
- Both spin and yielding locks essentially poll for the lock to become available. Choosing the right polling frequency is not simple: spin locks waste CPU, while yielding locks can delay lock acquisition.
- Ideally, lock() would block until unlock() is called: lock() invokes thread_sleep() when the lock is not available, and unlock() invokes thread_wakeup(). This is similar to how interrupts let a device notify the CPU when it has completed work; while lock() is blocked, the CPU can do work on behalf of other ready threads.
- These functions access the shared ready list, so they themselves need to be critical sections, i.e., we need locking while trying to implement blocking locks! How can we solve this problem? (A sketch of the design appears below; the next slide gives the answer.)
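A sketch of the blocking-lock design the slide describes. Here spinlock_acquire/spinlock_release, thread_sleep, and thread_wakeup are hypothetical names for course-provided primitives, not a standard API; the key idea is that a low-level lock protects the lock state and wait queue:

struct blocking_lock {
    int held;               /* is the lock currently held? */
    int guard;              /* low-level spin lock protecting held and wq */
    struct wait_queue *wq;  /* threads sleeping until the lock is free */
};

void lock(struct blocking_lock *m) {
    spinlock_acquire(&m->guard);
    while (m->held) {
        /* thread_sleep must release the guard and block atomically
         * (and reacquire it on wakeup), or a wakeup could be lost */
        thread_sleep(m->wq, &m->guard);
    }
    m->held = 1;
    spinlock_release(&m->guard);
}

void unlock(struct blocking_lock *m) {
    spinlock_acquire(&m->guard);
    m->held = 0;
    thread_wakeup(m->wq);   /* wake one waiting thread */
    spinlock_release(&m->guard);
}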

19 Using a Previous Solution
- The previous solutions work correctly but don't block: interrupt disabling works correctly on a single CPU, and spin locks work correctly on a multiprocessor.
- We can use these solutions to access the shared data structures in the thread scheduler.
- The scheduler implements blocking, so it can't itself use a blocking lock!
- Lab 3 requires you to implement blocking locks.

20 Locking Solutions
Notice how locking solutions depend on lower-level locking:

    Uniprocessor:    blocking lock -> interrupt disabling
    Multiprocessor:  blocking lock -> spin lock -> atomic instruction

The implementation of the lower-level locks is more efficient, so why use the higher-level locks? For example, an atomic instruction is a single instruction and spin locks just loop on an atomic instruction, while blocking locks require manipulating the ready list, wait list, etc. The next slide answers this.

21 Which Lock to Use?

    Lock                             When to use
    Atomic instruction               Most efficient; use when available
    Interrupt disabling, spin locks  Use when critical sections are short and, in particular,
                                     will not block (i.e., will not call thread_sleep or
                                     thread_yield)
    Blocking locks                   Use when critical sections are long, especially if the
                                     critical section may block

"Use when available": say all we want to do is increment a counter atomically; since an atomic instruction is available, we can use it directly. Why would you block in a critical section? We will see later that we may need to block in a critical section in order to synchronize with other threads.

22 Using Locks
Note that to protect shared variables, we need to create lock variables that are themselves shared variables: locks must be global variables or allocated on the heap. We have talked about how locks are implemented; now let's see how they are used.

23 Using Locks
When using locks, make sure to use the same lock in all critical sections that access the same shared data. If Thread 1 and Thread 2 used different locks while accessing the counter variable, they could run concurrently, possibly corrupting counter.

// counter and lock are located in shared address space
int counter;
struct lock *l;

Thread 1:
while (1) { lock(l); counter++; unlock(l); /* remainder section */ }

Thread 2:
while (1) { lock(l); counter--; unlock(l); /* remainder section */ }

24 Using Locks
Say multiple threads access a linked list: one thread adds elements to the list, and another thread deletes elements from the list.
- Should the add and delete code use the same lock or different locks? The same lock, because they access the same data structure (see the sketch below).
- How many lock variables should be created? We could create one lock for the entire list, or one lock per list node. More locks allow more concurrency but more potential for bugs: for example, with a separate lock per node, two threads can update two different nodes concurrently, but adding and removing elements becomes trickier because a thread must acquire multiple locks to perform the list update correctly. Let's see one kind of bug that arises when using multiple locks.
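A sketch of the single-lock (coarse-grained) choice, assuming POSIX mutexes: add and delete share one lock, so the list is never observed in an inconsistent state:

#include <pthread.h>
#include <stdlib.h>

struct node { int value; struct node *next; };

static struct node *head;                                      /* shared list */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;  /* one lock for the whole list */

void list_add(int value) {
    struct node *n = malloc(sizeof *n);
    n->value = value;
    pthread_mutex_lock(&list_lock);    /* same lock as list_delete */
    n->next = head;
    head = n;
    pthread_mutex_unlock(&list_lock);
}

void list_delete(void) {
    pthread_mutex_lock(&list_lock);
    struct node *n = head;
    if (n)
        head = n->next;
    pthread_mutex_unlock(&list_lock);
    free(n);                           /* free(NULL) is a no-op */
}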

25 Deadlocks
A set of threads is deadlocked if each thread is waiting for a resource (an event) that another thread in the set holds (can perform), so no thread can run. Breaking deadlocks generally requires killing threads.

Thread_A() {
    lock(resource_2);
    lock(resource_1);
    use resource 1 and 2;
    unlock(resource_1);
    unlock(resource_2);
}

Thread_B() {
    lock(resource_1);
    lock(resource_2);
    use resource 1 and 2;
    unlock(resource_2);
    unlock(resource_1);
}

Databases also have deadlocks; they need heavy machinery to deal with them.

26 Deadlock Conditions
A deadlock can occur if and only if the following conditions hold simultaneously:
- Mutual exclusion: each resource is assigned to exactly one thread
- Hold and wait: threads can hold one resource while waiting for another
- No preemption: acquired resources cannot be preempted (forcibly taken away)
- Circular wait: threads form a circular chain, each waiting for a resource held by the next thread in the chain

27 Examples of Deadlock
[Figures: a blocked Mahjong tile layout and traffic gridlock at an intersection]

28 Detecting Deadlocks
Deadlocks can be detected using wait-for graphs: deadlock <=> a cycle in the wait-for graph.
[Figure: thread P1 holds resource R1 and requests resource R2, while thread P2 holds R2 and requests R1; the holds/requests edges form a cycle, so P1 and P2 are deadlocked.]

29 Preventing Deadlocks
- Avoid hold and wait: if a lock is unavailable, release the previously acquired locks and try to reacquire all the locks again. What are the problems with this approach? 1. It can cause livelock (no progress), since threads may keep retrying to acquire the locks. 2. Any data-structure changes made while holding the previous locks must be undone before releasing them; in practice, this is trickier than it sounds.
- Prevent circular wait: number each of the resources and require each thread to acquire lower-numbered resources before higher-numbered resources (a sketch follows below). Problems? It is hard to use third-party software, since it is hard to number all resources when the software comes from different sources.
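Applied to the slide-25 example, circular-wait prevention just means both threads acquire in the same numbered order (a sketch in the slides' own pseudocode):

Thread_A() {
    lock(resource_1);     /* lower-numbered resource first */
    lock(resource_2);
    use resource 1 and 2;
    unlock(resource_2);
    unlock(resource_1);
}

Thread_B() {
    lock(resource_1);     /* same order as Thread_A: no cycle can form */
    lock(resource_2);
    use resource 1 and 2;
    unlock(resource_2);
    unlock(resource_1);
}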

30 Deadlock, Starvation, Livelock
- Deadlock: a particular set of threads performs no work because of a circular wait condition. Once a deadlock occurs, it does not go away.
- Starvation: a particular set of threads performs no work because the resources they need are constantly being used by other threads. Starvation can be a temporary condition.
- Livelock: a set of threads continues to run but makes no progress! Examples include interrupt livelock. How can we solve interrupt livelock? Disable interrupts and switch to polling when interrupts arrive too often; think about how often you check email, text messages, etc.

31 Summary
- Concurrent programming model: threads enable concurrent execution, and threads cooperate by accessing shared variables.
- Races: concurrent accesses to shared variables can lead to races, i.e., incorrect execution under some thread interleavings.
- Critical sections and mutual exclusion: avoiding races requires defining critical code sections that are run atomically (indivisibly) using mutual exclusion, i.e., only one thread accesses the critical section at a time.
- Mutual exclusion is implemented using locks; locking requires h/w support (interrupt disabling, atomic instructions).

32 Think Time
- What is a race condition? A race condition occurs when some thread interleavings are possible that lead to incorrect results.
- How can we protect against race conditions? Certain code needs to be executed atomically/indivisibly.
- Can locks be implemented by reading and writing a binary variable? No: the read and the write need to be performed atomically.
- Why is it better to block rather than spin on a uniprocessor? With spinning, no progress is possible, since there is only one processor.
- Why is a blocking lock better than interrupt disabling or using spin locks? Interrupt disabling and spin locks are held for the entire critical section; with blocking locks, interrupts are disabled or spin locks are held only inside the blocking-lock implementation, not for the entire critical section. E.g., if the critical section accesses the disk, it is best to put the thread to sleep and let other threads run on the same processor.
- Is the blocking lock always better? No: blocking locks have overhead because they call the thread scheduler to perform a thread switch. When the critical section is short (say, a few instructions), it is better to use interrupt disabling (on a single CPU) or spin locks (on a multiprocessor), because the blocking lock has high overhead for short critical sections.

33 Think Time
- How can one avoid starvation? Use FIFO queuing whenever a thread needs to wait on a condition (e.g., while trying to acquire a lock), so the first thread to wait on the condition gets served first (e.g., the lock implementation can use a wait queue from which threads acquire the lock in FIFO order).
- How can one avoid livelock? Ensure that threads run for a while before switching to running another thread.

