1 OMSE 510: Computing Foundations 6: Multithreading Chris Gilmore Portland State University/OMSE Material Borrowed from Jon Walpole’s lectures.

2 Today Threads Critical Sections Mutual Exclusion

3 Threads Processes have the following components: an address space a collection of operating system state a CPU context … or thread of control On multiprocessor systems, with several CPUs, it would make sense for a process to have several CPU contexts (threads of control) Multiple threads of control could run in the same address space on a single CPU system too! “thread of control” and “address space” are orthogonal concepts

4 Threads Threads share a process address space with zero or more other threads Threads have their own PC, SP, register state, etc., and their own stack What other OS state should be private to threads? A traditional process can be viewed as an address space with a single thread

5 Single thread state within a process

6 Multiple threads in an address space

7 What is a thread? A thread executes a stream of instructions an abstraction for control-flow Practically, it is a processor context and stack Allocated a CPU by a scheduler Executes in the context of a memory address space

8 Summary of private per-thread state Stack (local variables) Stack pointer Registers Scheduling properties (i.e., priority) Set of pending and blocked signals Other thread-specific data

9 Shared state among threads Open files, sockets, locks User ID, group ID, process/task ID Address space Text Data (off-stack global variables) Heap (dynamic data) Changes made to shared state by one thread will be visible to the others Reading and writing memory locations requires synchronization! … a major topic for later …

10 Independent execution of threads Each thread has its own stack

11 How do you program using threads? Split program into routines to execute in parallel True or pseudo (interleaved) parallelism

12 Why program using threads? Utilize multiple CPUs concurrently Low-cost communication via shared memory Overlap computation and blocking on a single CPU Blocking due to I/O Computation and communication Handle asynchronous events

13 Thread usage A word processor with three threads

14 Processes versus threads - example GET / HTTP/1.0 HTTPD disk A WWW process

15 Processes versus threads - example GET / HTTP/1.0 HTTPD disk Why is this not a good web server design? A WWW process

16 HTTPD Processes versus threads - example GET / HTTP/1.0 HTTPD disk A WWW process

17 Processes versus threads - example GET / HTTP/1.0 HTTPD disk GET / HTTP/1.0 A WWW process

18 Processes versus threads - example GET / HTTP/1.0 HTTPD disk GET / HTTP/1.0 A WWW process

19 Threads in a web server A multithreaded web server

20 Thread usage Rough outline of code for previous slide (a) Dispatcher thread (b) Worker thread

21 System structuring options Three ways to construct a server

22 Common thread programming models Manager/worker Manager thread handles I/O and assigns work to worker threads Worker threads may be created dynamically, or allocated from a thread-pool Peer Like manager/worker, but the manager participates in the work Pipeline Each thread handles a different stage of an assembly line Threads hand work off to each other in a producer-consumer relationship

23 What does a typical thread API look like? POSIX standard threads (Pthreads) First thread exists in main(), typically creates the others pthread_create (thread,attr,start_routine,arg) Returns new thread ID in “thread” Executes routine specified by “start_routine” with argument specified by “arg” Exits on return from routine or when told explicitly

24 Thread API (continued) pthread_exit (status) Terminates the thread and returns “status” to any joining thread pthread_join (threadid,status) Blocks the calling thread until thread specified by “threadid” terminates Return status from pthread_exit is passed in “status” One way of synchronizing between threads pthread_yield () Thread gives up the CPU and enters the run queue

25 Using create, join and exit primitives

26 An example Pthreads program

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_THREADS 5

void *PrintHello(void *threadid)
{
    printf("\n%ld: Hello World!\n", (long)threadid);
    pthread_exit(NULL);
}

int main(int argc, char *argv[])
{
    pthread_t threads[NUM_THREADS];
    int rc;
    long t;
    for (t = 0; t < NUM_THREADS; t++) {
        printf("Creating thread %ld\n", t);
        rc = pthread_create(&threads[t], NULL, PrintHello, (void *)t);
        if (rc) {
            printf("ERROR; return code from pthread_create() is %d\n", rc);
            exit(-1);
        }
    }
    pthread_exit(NULL);
}

Program Output:
Creating thread 0
Creating thread 1
0: Hello World!
1: Hello World!
Creating thread 2
Creating thread 3
2: Hello World!
3: Hello World!
Creating thread 4
4: Hello World!

For more examples see:

27 Pros & cons of threads Pros Overlap I/O with computation! Cheaper context switches Better mapping to shared memory multiprocessors Cons Potential thread interactions Complexity of debugging Complexity of multi-threaded programming Backwards compatibility with existing code

28 Making single-threaded code multithreaded Conflicts between threads over the use of a global variable

29 Making single-threaded code multithreaded Threads can have private global variables

30 User-level threads Threads can be implemented in the OS or at user level User level thread implementations thread scheduler runs as user code manages thread contexts in user space OS sees only a traditional process

31 Kernel-level threads The thread-switching code is in the kernel

32 User-level threads package The thread-switching code is in user space

33 User-level threads Advantages cheap context switch costs! User-programmable scheduling policy Disadvantages How to deal with blocking system calls! How to overlap I/O and computation!

34 Hybrid thread implementations Multiplexing user-level threads onto kernel-level threads

35 Scheduler activations Goal: mimic the functionality of kernel threads while gaining the performance of user-space threads The idea: the kernel upcalls to user-level thread scheduling code when it handles a blocking system call or page fault the user-level thread scheduler can choose to run a different thread rather than blocking the kernel upcalls again when the system call or page fault returns The kernel assigns virtual processors to each process (which contains a user-level thread scheduler) and lets the user-level thread scheduler allocate threads to processors Problem: relies on the kernel (lower layer) calling procedures in user space (higher layer)

36 Concurrent programming Assumptions: Two or more threads (or processes) Each executes in (pseudo) parallel and can’t predict exact running speeds The threads can interact via access to a shared variable Example: One thread writes a variable The other thread reads from the same variable Problem: The order of READs and WRITEs can make a difference!!!

37 Race conditions What is a race condition? two or more processes have an inconsistent view of a shared memory region (i.e., a variable) Why do race conditions occur? values of memory locations are replicated in registers during execution context switches occur at arbitrary times during execution processes can see “stale” memory values in registers

38 Counter increment race condition Incrementing a counter (load, increment, store) Context switch can occur after load and before increment!

39 Race Conditions Race condition: whenever the output depends on the precise execution order of the processes!!! What solutions can we apply? prevent context switches by preventing interrupts make threads coordinate with each other to ensure mutual exclusion in accessing critical sections of code

40 Mutual exclusion conditions No two processes simultaneously in critical section No assumptions made about speeds or numbers of CPUs No process running outside its critical section may block another process No process must wait forever to enter its critical section

41 Critical sections with mutual exclusion

42 How can we enforce mutual exclusion? What about using a binary “lock” variable in memory and having threads check it and set it before entry to critical regions? Solves the problem of exclusive access to shared data. Expresses intention to enter critical section Acquiring a lock prevents concurrent access Assumption: Every thread sets the lock before accessing shared data! Every thread releases the lock after it is done!

43 Acquiring and releasing locks Free Lock Thread A Thread D Thread C Thread B

44 Acquiring and releasing locks Free Lock Thread A Thread D Thread C Thread B Lock

45 Acquiring and releasing locks Set Lock Thread A Thread D Thread C Thread B Lock

46 Acquiring and releasing locks Set Lock Thread A Thread D Thread C Thread B Lock

47 Acquiring and releasing locks Set Lock Thread A Thread D Thread C Thread B

48 Acquiring and releasing locks Set Lock Thread A Thread D Thread C Thread B Lock

49 Acquiring and releasing locks Set Lock Thread A Thread D Thread C Thread B Lock

50 Acquiring and releasing locks Set Lock Thread A Thread D Thread C Thread B Lock

51 Acquiring and releasing locks Set Lock Thread A Thread D Thread C Thread B Lock

52 Acquiring and releasing locks Set Lock Thread A Thread D Thread C Thread B Lock Unlock

53 Acquiring and releasing locks Set Lock Thread A Thread D Thread C Thread B Lock Unlock

54 Acquiring and releasing locks Free Lock Thread A Thread D Thread C Thread B Lock

55 Acquiring and releasing locks Free Lock Thread A Thread D Thread C Thread B Lock

56 Acquiring and releasing locks Set Lock Thread A Thread D Thread C Thread B Lock

57 Acquiring and releasing locks Set Lock Thread A Thread D Thread C Thread B Lock

58 Acquiring and releasing locks Set Lock Thread A Thread D Thread C Thread B Lock

59 Mutex locks An abstract data type Used for synchronization and mutual exclusion The “mutex” is either: Locked (“the lock is held”) Unlocked (“the lock is free”)

60 Mutex lock operations Lock (mutex) Acquire the lock, if it is free If the lock is not free, then wait until it can be acquired Unlock (mutex) Release the lock If there are waiting threads, then wake up one of them Both Lock and Unlock are assumed to be atomic!!! A kernel implementation can ensure atomicity

61 An Example using a Mutex 1 repeat 2 Lock(myLock); 3 critical section 4 Unlock(myLock); 5 remainder section 6 until FALSE 1 repeat 2 Lock(myLock); 3 critical section 4 Unlock(myLock); 5 remainder section 6 until FALSE Shared data: Mutex myLock;

62 But how can we implement a mutex lock? Does a binary “lock” variable in memory work? Many computers have some limited hardware support for setting locks “Atomic” Test and Set Lock instruction “Atomic” compare and swap operation Can be used to implement “Mutex” locks

63 Test-and-set-lock instruction (TSL, tset) A lock is a single word variable with two values 0 = FALSE = not locked 1 = TRUE = locked Test-and-set does the following atomically: Get the (old) value Set the lock to TRUE Return the old value If the returned value was FALSE... Then you got the lock!!! If the returned value was TRUE... Then someone else has the lock (so try again later)

64 Test and set lock FALSE Lock

65 Test and set lock P1 FALSE Lock

66 Test and set lock P1 Lock FALSE FALSE = Lock Available!!

67 Test and set lock TRUE Lock FALSE P1

68 Test and set lock TRUE Lock FALSE P1

69 Test and set lock TRUE Lock P1P2 P3 P4 TRUE

70 Test and set lock TRUE Lock P1P2 P3 P4 TRUE

71 Test and set lock TRUE Lock P1P2 P3 P4 TRUE

72 Test and set lock FALSE Lock P1P2 P3 P4 TRUE

73 Test and set lock FALSE Lock P1P2 P3 P4 TRUE FALSE

74 Test and set lock FALSE Lock P1P2 P3 P4 TRUE FALSE

75 Test and set lock TRUE Lock P1P2 P3 P4 TRUE

76 Critical section entry code with TSL

1 repeat
2   while(TSL(lock))
3     no-op;
4   critical section
5   lock = FALSE;
6   remainder section
7 until FALSE

Guarantees that only one thread at a time will enter its critical section Note that processes are busy while waiting Spin locks

77 Busy waiting Also called polling or spinning The thread consumes CPU cycles to evaluate when lock becomes free!!! Shortcoming on a single CPU system... A busy-waiting thread can prevent the lock holder from running & completing its critical section & releasing the lock! Better: Block instead of busy wait!

78 Quiz What is the difference between a program and a process? Is the Operating System a program? Is the Operating System a process? What is the difference between processes and threads? What tasks are involved in switching the CPU from one process to another? Why is it called a context switch? What tasks are involved in switching the CPU from one thread to another? Why are threads “lightweight”?

79 Synchronization primitives Sleep Put a thread to sleep Thread becomes BLOCKed Wakeup Move a BLOCKed thread back onto “Ready List” Thread becomes READY (or RUNNING) Yield Move to another thread Does not BLOCK thread Just gives up the current time-slice

80 But how can these be implemented? In User Programs: System calls to the kernel In Kernel: Calls to the thread scheduler routines

81 Concurrency control in the kernel Different threads call Yield, Sleep,... Scheduler routines manipulate the “ready list” The ready list is shared data ! Problem: How can scheduler routines be programmed correctly? Solution: Scheduler can disable interrupts, or Scheduler can use the TSL instruction

82 Concurrency in the kernel The kernel can avoid performing context switches while manipulating the ready list prevents concurrent execution of system call code … but what about interrupts? … what if interrupt handlers touch the ready list? Disabling interrupts during critical sections Ensures that interrupt handling code will not run Using TSL for critical sections Ensures mutual exclusion for all code that follows that convention

83 Disabling interrupts Disabling interrupts in the OS vs disabling interrupts in user processes why not allow user processes to disable interrupts? is it ok to disable interrupts in the OS? what precautions should you take?

84 Disabling interrupts in the kernel Scenario 1: A thread is running; wants to access shared data Disable interrupts Access shared data (“critical section”) Enable interrupts

85 Disabling interrupts in the kernel Scenario 2: Interrupts are already disabled and a second thread wants to access the critical section...using the above sequence...

86 Disabling interrupts in the kernel Scenario 2: Interrupts are already disabled. Thread wants to access critical section using the previous sequence... Save previous interrupt status (enabled/disabled) Disable interrupts Access shared data (“critical section”) Restore interrupt status to what it was before

87 Classical Synchronization Problems Producer-Consumer One thread produces data items Another thread consumes them Use a bounded buffer / queue between the threads The buffer is a shared resource Must control access to it!!! Must suspend the producer thread if buffer is full Must suspend the consumer thread if buffer is empty

88 Producer/Consumer with Busy Waiting

Global variables:
char buf[n]
int InP = 0  // place to add
int OutP = 0 // place to get
int count

thread producer {
  while(1){
    // Produce char c
    while (count==n) { no_op }
    buf[InP] = c
    InP = InP + 1 mod n
    count++
  }
}

thread consumer {
  while(1){
    while (count==0) { no_op }
    c = buf[OutP]
    OutP = OutP + 1 mod n
    count--
    // Consume char
  }
}

89 Problems with this code Count variable can be corrupted if a context switch occurs at the wrong time A race condition exists! Race bugs are very difficult to track down What if the buffer is full? Producer will busy-wait Consumer will not be able to empty the buffer What if the buffer is empty? Consumer will busy-wait Producer will not be able to fill the buffer

90 Producer/Consumer with Blocking

Global variables:
char buf[n]
int InP = 0  // place to add
int OutP = 0 // place to get
int count

0 thread producer {
1   while(1) {
2     // Produce char c
3     if (count==n) {
4       sleep(full)
5     }
6     buf[InP] = c;
7     InP = InP + 1 mod n
8     count++
9     if (count == 1)
10      wakeup(empty)
11  }
12 }

0 thread consumer {
1   while(1) {
2     while (count==0) {
3       sleep(empty)
4     }
5     c = buf[OutP]
6     OutP = OutP + 1 mod n
7     count--;
8     if (count == n-1)
9       wakeup(full)
10    // Consume char
11  }
12 }

91 This code is still incorrect! The “count” variable can be corrupted: Increments or decrements may be lost! Possible Consequences: Both threads may sleep forever Buffer contents may be over-written What is this problem called?

92 This code is still incorrect! The “count” variable can be corrupted: Increments or decrements may be lost! Possible Consequences: Both threads may sleep forever Buffer contents may be over-written What is this problem called? Race Condition Code that manipulates count must be made into a ??? and protected using ???

93 This code is still incorrect! The “count” variable can be corrupted: Increments or decrements may be lost! Possible Consequences: Both threads may sleep forever Buffer contents may be over-written What is this problem called? Race Condition Code that manipulates count must be made into a critical section and protected using mutual exclusion!

94 Semaphores An abstract data type that can be used for condition synchronization and mutual exclusion What is the difference between mutual exclusion and condition synchronization?

95 Semaphores An abstract data type that can be used for condition synchronization and mutual exclusion Condition synchronization wait until invariant holds before proceeding signal when invariant holds so others may proceed Mutual exclusion only one at a time in a critical section

96 Semaphores An abstract data type containing an integer variable (S) Two operations: Down (S) and Up (S) Alternative names for the two operations Down(S) = Wait(S) = P(S) Up(S) = Signal(S) = V(S)

97 Semaphores Down (S) … also called “Wait” decrement S by 1 if S would go negative, wait/sleep until signaled Up (S) … also called “Signal” increment S by 1 signal/wakeup a waiting thread S will always be >= 0. Both Up () and Down () are assumed to be atomic!!! A kernel implementation must ensure atomicity

98 Variation: Binary Semaphores Counting Semaphores same as just “semaphore” Binary Semaphores a specialized use of semaphores the semaphore is used to implement a Mutex Lock

99 Variation: Binary Semaphores Counting Semaphores same as just “semaphore” Binary Semaphores a specialized use of semaphores the semaphore is used to implement a Mutex Lock the count will always be either 0 = locked 1 = unlocked

100 Using Semaphores for Mutex 1 repeat 2 down(mutex); 3 critical section 4 up(mutex); 5 remainder section 6 until FALSE 1 repeat 2 down(mutex); 3 critical section 4 up(mutex); 5 remainder section 6 until FALSE semaphore mutex = 1-- unlocked Thread A Thread B

101 Using Semaphores for Mutex semaphore mutex = 0-- locked 1 repeat 2 down(mutex); 3 critical section 4 up(mutex); 5 remainder section 6 until FALSE 1 repeat 2 down(mutex); 3 critical section 4 up(mutex); 5 remainder section 6 until FALSE Thread A Thread B

102 Using Semaphores for Mutex semaphore mutex = 0--locked 1 repeat 2 down(mutex); 3 critical section 4 up(mutex); 5 remainder section 6 until FALSE 1 repeat 2 down(mutex); 3 critical section 4 up(mutex); 5 remainder section 6 until FALSE Thread A Thread B

103 Using Semaphores for Mutex semaphore mutex = 0-- locked 1 repeat 2 down(mutex); 3 critical section 4 up(mutex); 5 remainder section 6 until FALSE 1 repeat 2 down(mutex); 3 critical section 4 up(mutex); 5 remainder section 6 until FALSE Thread A Thread B

104 Using Semaphores for Mutex semaphore mutex = 0-- locked 1 repeat 2 down(mutex); 3 critical section 4 up(mutex); 5 remainder section 6 until FALSE 1 repeat 2 down(mutex); 3 critical section 4 up(mutex); 5 remainder section 6 until FALSE Thread A Thread B

105 Using Semaphores for Mutex semaphore mutex = 1-- unlocked This thread can now be released! 1 repeat 2 down(mutex); 3 critical section 4 up(mutex); 5 remainder section 6 until FALSE 1 repeat 2 down(mutex); 3 critical section 4 up(mutex); 5 remainder section 6 until FALSE Thread A Thread B

106 Using Semaphores for Mutex semaphore mutex = 0-- locked 1 repeat 2 down(mutex); 3 critical section 4 up(mutex); 5 remainder section 6 until FALSE 1 repeat 2 down(mutex); 3 critical section 4 up(mutex); 5 remainder section 6 until FALSE Thread A Thread B

107 Exercise: Implement producer/consumer

Global variables
semaphore full_buffs = ?;
semaphore empty_buffs = ?;
char buf[n];
int InP, OutP;

0 thread producer {
1   while(1){
2     // Produce char c...
3     buf[InP] = c
4     InP = InP + 1 mod n
5   }
6 }

0 thread consumer {
1   while(1){
2     c = buf[OutP]
3     OutP = OutP + 1 mod n
4     // Consume char...
5   }
6 }

108 Counting semaphores in producer/consumer

Global variables
semaphore full_buffs = 0;
semaphore empty_buffs = n;
char buf[n];
int InP, OutP;

0 thread producer {
1   while(1){
2     // Produce char c...
3     down(empty_buffs)
4     buf[InP] = c
5     InP = InP + 1 mod n
6     up(full_buffs)
7   }
8 }

0 thread consumer {
1   while(1){
2     down(full_buffs)
3     c = buf[OutP]
4     OutP = OutP + 1 mod n
5     up(empty_buffs)
6     // Consume char...
7   }
8 }

109 Implementing semaphores Up() and Down() are assumed to be atomic How can we ensure that they are atomic? Implement Up() and Down() as system calls? how can the kernel ensure Up() and Down() are completed atomically? avoid scheduling another thread when they are in progress? … but how exactly would you do that? … and what about semaphores for use in the kernel?

110 Semaphores with interrupt disabling

struct semaphore {
  int val;
  list L;
}

Down(semaphore sem)
  DISABLE_INTS
  sem.val--
  if (sem.val < 0){
    add thread to sem.L
    block(thread)
  }
  ENABLE_INTS

Up(semaphore sem)
  DISABLE_INTS
  sem.val++
  if (sem.val <= 0) {
    th = remove next thread from sem.L
    wakeup(th)
  }
  ENABLE_INTS


112 But what are block() and wakeup()? If block stops a thread from executing, how, where, and when does it return? which thread enables interrupts following Down()? the thread that called block() shouldn’t return until another thread has called wakeup() ! … but how does that other thread get to run? … where exactly does the thread switch occur? Scheduler routines such as block() contain calls to switch() which is called in one thread but returns in a different one!!

113 Semaphores using atomic instructions As we saw earlier, hardware provides special atomic instructions for synchronization test and set lock (TSL) compare and swap (CAS) etc Semaphore can be built using atomic instructions 1. build mutex locks from atomic instructions 2. build semaphores from mutex locks

114 Building blocking mutex locks using TSL

Mutex_lock:
  TSL REGISTER,MUTEX  | copy mutex to register and set mutex to 1
  CMP REGISTER,#0     | was mutex zero?
  JZE ok              | if it was zero, mutex is unlocked, so return
  CALL thread_yield   | mutex is busy, so schedule another thread
  JMP Mutex_lock      | try again later
ok: RET               | return to caller; enter critical section

Mutex_unlock:
  MOVE MUTEX,#0       | store a 0 in mutex
  RET                 | return to caller

115 Building spinning mutex locks using TSL

Mutex_lock:
  TSL REGISTER,MUTEX  | copy mutex to register and set mutex to 1
  CMP REGISTER,#0     | was mutex zero?
  JZE ok              | if it was zero, mutex is unlocked, so return
  JMP Mutex_lock      | mutex is busy, so spin: try again immediately
ok: RET               | return to caller; enter critical section

Mutex_unlock:
  MOVE MUTEX,#0       | store a 0 in mutex
  RET                 | return to caller

116 To block or not to block? Spin-locks do busy waiting wastes CPU cycles on uni-processors Why? Blocking locks put the thread to sleep may waste CPU cycles on multiprocessors Why?

117 Building semaphores using mutex locks Problem: Implement a counting semaphore Up () Down ()...using just Mutex locks

118 How about two “blocking” mutex locks?

var cnt: int = 0         -- Signal count
var m1: Mutex = unlocked -- Protects access to “cnt”
    m2: Mutex = locked   -- Locked when waiting

Down ():
  Lock(m1)
  cnt = cnt - 1
  if cnt<0
    Unlock(m1)
    Lock(m2)
  else
    Unlock(m1)
  endIf

Up():
  Lock(m1)
  cnt = cnt + 1
  if cnt<=0
    Unlock(m2)
  endIf
  Unlock(m1)

119 How about two “blocking” mutex locks?

var cnt: int = 0         -- Signal count
var m1: Mutex = unlocked -- Protects access to “cnt”
    m2: Mutex = locked   -- Locked when waiting

Down ():
  Lock(m1)
  cnt = cnt - 1
  if cnt<0
    Unlock(m1)
    Lock(m2)
  else
    Unlock(m1)
  endIf

Up():
  Lock(m1)
  cnt = cnt + 1
  if cnt<=0
    Unlock(m2)
  endIf
  Unlock(m1)

Contains a Race Condition!

120 Oops! How about this then?

var cnt: int = 0         -- Signal count
var m1: Mutex = unlocked -- Protects access to “cnt”
    m2: Mutex = locked   -- Locked when waiting

Down ():
  Lock(m1)
  cnt = cnt - 1
  if cnt<0
    Lock(m2)
    Unlock(m1)
  else
    Unlock(m1)
  endIf

Up():
  Lock(m1)
  cnt = cnt + 1
  if cnt<=0
    Unlock(m2)
  endIf
  Unlock(m1)

121 Oops! How about this then?

var cnt: int = 0         -- Signal count
var m1: Mutex = unlocked -- Protects access to “cnt”
    m2: Mutex = locked   -- Locked when waiting

Down ():
  Lock(m1)
  cnt = cnt - 1
  if cnt<0
    Lock(m2)
    Unlock(m1)
  else
    Unlock(m1)
  endIf

Up():
  Lock(m1)
  cnt = cnt + 1
  if cnt<=0
    Unlock(m2)
  endIf
  Unlock(m1)

Contains a Deadlock!

122 Ok! Let's have another try!

var cnt: int = 0         -- Signal count
var m1: Mutex = unlocked -- Protects access to “cnt”
    m2: Mutex = locked   -- Locked when waiting

Down ():
  Lock(m2)
  Lock(m1)
  cnt = cnt - 1
  if cnt>0
    Unlock(m2)
  endIf
  Unlock(m1)

Up():
  Lock(m1)
  cnt = cnt + 1
  if cnt=1
    Unlock(m2)
  endIf
  Unlock(m1)

… is this solution valid?

123 What about this solution?

Mutex m1, m2; // binary semaphores
int C = N;    // N is # locks
int W = 0;    // W is # wakeups

Down():
  Lock(m1);
  C = C - 1;
  if (C<0)
    Unlock(m1);
    Lock(m2);
    Lock(m1);
    W = W - 1;
    if (W>0)
      Unlock(m2);
    endif;
  else
    Unlock(m1);
  endif;

Up():
  Lock(m1);
  C = C + 1;
  if (C<=0)
    W = W + 1;
    Unlock(m2);
  endif;
  Unlock(m1);

124 Implementation possibilities Can also implement using Test-And-Set Calls to Sleep, Wake-Up Implement Mutex Locks... using Semaphores Implement Counting Semaphores... using Binary Semaphores... using Mutex Locks Implement Binary Semaphores... etc