1 Chapter 2: Process/Thread Instructor: Hengming Zou, Ph.D. In Pursuit of Absolute Simplicity 求于至简,归于永恒 (Seek the utmost simplicity; return to the eternal)
2 Content Processes Threads Inter-process communication Classical IPC problems Scheduling
3 Definition of A Process Informal –A program in execution –A running piece of code along with all the things the program can read/write Formal –One or more threads in their own address space Note that process != program
4 The Need for Process What is the principal motivation for inventing the process? –To support multiprogramming
5 The Process Model A conceptual view of processes Concurrency –Multiple processes seem to run concurrently –But in reality only one is active at any instant Progress –Every process makes progress
6 Multiprogramming of 4 programs (figure: programs in memory, conceptual view, and time-line view)
7 Process Creation Principal events that cause process creation 1. System initialization 2. Execution of a process-creation system call 3. User request to create a new process
8 Process Termination Conditions that terminate processes 1. Normal exit (voluntary) 2. Error exit (voluntary) 3. Fatal error (involuntary) 4. Killed by another process (involuntary)
9 Process Hierarchies Parent creates a child process Child processes can create their own processes Process creation forms a hierarchy –UNIX calls this a "process group" –Windows has no concept of such a hierarchy, i.e. all processes are created equal
10 Process States Possible process states –running, blocked, ready Transitions –Process blocks for input (running → blocked) –Scheduler picks another process (running → ready) –Scheduler picks this process (ready → running) –Input becomes available (blocked → ready)
11 Process Space Also called Address Space All the data the process uses as it runs Passive (acted upon by the process) Play analogy: –all the objects on the stage in a play
12 Process Space Is the unit of state partitioning –Each process occupies a different state of the computer Main topic: –How multiple process spaces can share a single physical memory efficiently and safely
13 Manage Process & Space Who manages processes & spaces? –The operating system How does the OS achieve it? –By maintaining information about processes –i.e. using process tables
14 Fields of A Process Table Registers, Program counter, Status word Stack pointer, Priority, Process ID Parent group, Process group, Signals Time when process started CPU time used, Children’s CPU time, etc.
15 Problems with Process While supporting multiprogramming on shared hardware, the process itself is single-threaded! –i.e. a process can do only one thing at a time –a blocking call renders the entire process unrunnable
16 Threads Invented to support multiprogramming at the process level Manage OS complexity –Multiple users, programs, I/O devices, etc. Each thread is dedicated to one task
17 Thread Sequence of executing instructions from a program –i.e. the running computation Play analogy: one actor on stage in a play
18 Threads Processes decompose a mix of activities into several parallel tasks Each job can work independently of the others (figure: job1, job2, job3 as columns, handled by thread 1, thread 2, thread 3)
19 The Thread Model (a) 3 processes, each with 1 thread (b) 1 process with 3 threads (figure: user space vs. kernel space; the kernel sees processes and threads)
20 The Thread Model Some items shared by all threads in a process Some items private to each thread
21 Shared and Private Items Per-process items: address space, global variables, open files, child processes, pending alarms, signals and signal handlers, accounting information Per-thread items: program counter, registers, stack, state
22 Shared and Private Items Each thread has its own stack
23 A Word Processor w/3 Threads (figure: input thread, backup thread, display thread)
24 Implementation of Thread How many options to implement thread? Implement in kernel space Implement in user space Hybrid implementation
25 Kernel-Level Implementation Completely implemented in kernel space OS acts as a scheduler OS maintains information about threads –In addition to processes
26 Kernel-Level Implementation
27 Kernel-Level Implementation Advantage: –Easier to program –Blocking a thread does not block the process Problems: –Costly: need to trap into the OS to switch threads –OS space is limited (for maintaining info) –Need to modify the OS!
28 User-Level Implementation Completely implemented in user space A run-time system acts as the scheduler Threads voluntarily cooperate –i.e. yield control to other threads The OS need not know of the existence of threads
29 User-Level Implementation
30 User-Level Implementation Advantage: –Flexible, can be implemented on any OS –Faster: no need to trap into the OS Problems: –Programming is tricky –a blocking thread blocks the whole process!
31 User-Level Implementation How do we solve the problem of a blocking thread blocking the whole process? Modify system calls to be non-blocking Or write a wrapper around blocking calls –i.e. call only when it is safe to do so
32 Scheduler Activations A technique that solves the problem of blocking calls in user-level threads Method: use upcalls Goal –mimic the functionality of kernel threads –gain the performance of user-space threads
33 Scheduler Activations Kernel assigns virtual processors to each process The runtime system allocates threads to processors Blocking threads are handled by OS upcalls –i.e. the OS notifies the runtime system about blocking calls
34 Scheduler Activations Problem: Reliance on the kernel (lower layer) calling procedures in user space (higher layer) This violates the layered structure of OS design
35 Hybrid Implementation Can we have the best of both worlds –i.e. kernel-level and user-level implementations While avoiding the problems of either? Hybrid implementation
36 Hybrid Implementation User-level threads are managed by the runtime system Kernel-level threads are managed by the OS Multiplexing user-level threads onto kernel-level threads
37 Hybrid Implementation
38 Multiple Threads Can have several threads in a single address space –That is what threads were invented for Play analogy: several actors on a single set –Sometimes they interact (e.g. dance together) –Sometimes they do independent tasks
39 Multiple Threads Private state for a thread vs. global state shared between threads –What private state must a thread have? –Other state is shared between all threads in a process
40 Multiple Threads Per-process items: address space, global variables, open files, child processes, pending alarms, signals and signal handlers, accounting information Per-thread items: program counter, registers, stack, state
41 Multiple Threads Many programs are written as single-threaded processes Making them multithreaded is very tricky –Can cause unexpected problems
42 Multiple Threads Conflicts between threads over the use of a global variable
43 Multiple Threads Many solutions: Prohibit global variables Or assign each thread its own private "global" variables
44 Multiple Threads Threads can have private global variables
45 Cooperating Threads Often we create threads to cooperate Each thread handles one request Each thread can issue a blocking disk I/O –wait for the I/O to finish –then continue with the next part of its request
46 Cooperating Threads Even though thread blocks, other threads can make progress and new threads can start to handle new requests
47 Cooperating Threads Issues with cooperating threads? Where is the state of each request stored? –In process space shared by all threads? –In private space of the thread? How to communicate with each other?
48 Cooperating Threads Ordering of events from different threads is non-deterministic –e.g. after 10 seconds, different threads may have gotten differing amounts of work done
49 Cooperating Threads Example –thread A: x=1 –thread B: x=2 Possible results? Is 3 a possible output? –yes
50 Cooperating Threads 3 is possible because assignment is not atomic and threads can interleave If assignment to x is atomic –then the only possible results are 1 and 2
51 Atomic Operations Atomic: indivisible –Either happens in its entirety without interruption or has yet to happen at all No events from other threads can happen in between the start and end of an atomic event
52 Atomic Operations On most machines, memory load & store are atomic But many instructions are not atomic –e.g. a double-precision floating-point store on a 32-bit machine –is two separate memory operations
53 Atomic Operations If you don’t have any atomic operations –you can’t make one Fortunately, the hardware folks give us atomic operations –and we can build up higher-level atomic primitives from there
54 Atomic Operations Another example:
thread A:                      thread B:
i=0                            i=0
while (i < 10) {               while (i > -10) {
    i++                            i--
}                              }
print "A finished"             print "B finished"
Who will win?
55 Atomic Operations Is it guaranteed that someone will win? What if threads run at exactly the same speed and start close together? Is it guaranteed that it goes on forever?
56 Atomic Operations Arithmetic example –(initially y=10) –thread A: x = y + 1; –thread B: y = y * 2; Possible results?
57 Thread Synchronization Must control the inter-leavings between threads –Or the results can be non-deterministic All possible inter-leavings must yield a correct answer
58 Thread Synchronization Try to constrain the thread executions as little as possible Controlling the execution and order of threads is called “synchronization”
59 Gold Fish Problem Problem definition: Tracy and Peter want to keep a gold fish alive –By feeding the fish properly If either sees that the fish has not been fed, –she/he goes to feed the fish The fish must be fed once and only once each day
60 Correctness Properties Someone will feed the fish if needed But never more than one person feeds the fish
61 Solution #0 No synchronization (Peter and Tracy run the same code)
if (noFeed) {
    feed fish
}
62 Problem with Solution #0
Peter                              Tracy
3:00 look at the fish (not fed)
                                   3:05 look at the fish (not fed)
3:10 feed fish
                                   3:25 feed fish
The fish is overfed!
63 Problem with Solution #0 Overfeeding occurred because: –They execute the same code at the same time –i.e. if (noFeed) feed fish This is called a race condition –Two or more threads try to access the same shared resource at the same time
64 Critical Section The shared resource or code is called a critical section –the region where the race condition occurs Access to it must be coordinated –i.e. only one process can access it at a time
65 Critical Section In solution #0, the critical section is –"if (noFeed) feed fish" Peter and Tracy must NOT be inside the critical section at the same time
66 1st Type of Synchronization Ensure only one person is in the critical section –i.e. only 1 person feeds the fish at a time This is called mutual exclusion: –only 1 thread is doing a certain thing at one time –others are excluded
67 Mutual Exclusion 4 conditions for mutual exclusion No 2 processes may be simultaneously in the critical region No assumptions may be made about speeds or numbers of CPUs No process running outside its critical region may block another process No process waits forever to enter its critical region
68 Mutual Exclusion (strict alternation)
Process 0:                          Process 1:
while (TRUE) {                      while (TRUE) {
    while (turn != 0) /* loop */ ;      while (turn != 1) /* loop */ ;
    critical_region();                  critical_region();
    turn = 1;                           turn = 0;
    noncritical_region();               noncritical_region();
}                                   }
69 Peterson's Solution
#define FALSE 0
#define TRUE 1
#define N 2                  /* # of processes */
int turn;                    /* whose turn is it? */
int interested[N];           /* all values initially 0 (FALSE) */
70 Peterson's Solution
void enter_region(int process)        /* process is 0 or 1 */
{
    int other;                        /* # of the other process */
    other = 1 - process;              /* the opposite of process */
    interested[process] = TRUE;       /* show that you are interested */
    turn = process;                   /* set flag */
    while (turn == process && interested[other] == TRUE) ;  /* null statement */
}
71 Peterson's Solution
void leave_region(int process)        /* process: who is leaving */
{
    interested[process] = FALSE;      /* indicate departure from critical region */
}
72 Assembly Implementation
enter_region:
    TSL REGISTER,LOCK    | copy lock to register and set lock to 1
    CMP REGISTER,#0      | was lock zero?
    JNE enter_region     | if it was nonzero, lock was set, so loop
    RET                  | return to caller; critical region entered
leave_region:
    MOVE LOCK,#0         | store a 0 in lock
    RET                  | return to caller
73 Gold Fish Solution #1 Idea: –leave a note that you're going to check on the fish, so the other person doesn't also feed
74 Solution #1 (Peter and Tracy run the same code)
if (noNote) {
    leave note
    if (noFeed) {
        feed fish
    }
    remove note
}
75 Solution #1 Does this work? If not, when could it fail? Is solution #1 better than solution #0?
76 Solution #2 Idea: change the order of "leave note" and "check note" This requires labeled notes –otherwise you'd see your own note –and think it was the other person's note
77 Solution #2
Peter:                          Tracy:
leave notePeter                 leave noteTracy
if (no noteTracy) {             if (no notePeter) {
    if (noFeed) {                   if (noFeed) {
        feed fish                       feed fish
    }                               }
}                               }
remove notePeter                remove noteTracy
78 Solution #2 Does this work? –Yes, it solves the overfeeding problem If not, when could it fail? –It introduces a starvation problem: both may leave notes at the same time and neither feeds the fish
79 Solution #3 Idea: –have a way to decide who will feed fish when both leave notes at the same time. Approach: –Have Peter hang around to make sure job is done
80 Solution #3
Peter:
leave notePeter
while (noteTracy) {
    do nothing
}
if (noFeed) {
    feed fish
}
remove notePeter
Tracy:
leave noteTracy
if (no notePeter) {
    if (noFeed) {
        feed fish
    }
}
remove noteTracy
81 Solution #3 Peter's "while (noteTracy)" prevents him from running his critical section at the same time as Tracy It also prevents Peter from running off without making sure that someone feeds the fish
82 Proof of Correctness (Tracy) if no notePeter, then it’s safe to feed because Peter hasn’t started yet. –Peter will wait for Tracy to be done before checking fish status
83 Proof of Correctness (Tracy) if notePeter, then Peter is in the body of the code and will eventually feed the fish (if needed) –Note: Peter may be waiting for Tracy to quit
84 Proof of Correctness (Peter) if no noteTracy, it’s safe to feed –because Peter has already left notePeter, and Tracy will check notePeter in the future (Peter) if noteTracy, Peter hangs around and waits to see if Tracy feeds fish. –If Tracy feeds, we’re done –If Tracy doesn’t feed, Peter will feed
85 Solution #3 Correct, but ugly –Complicated (non-intuitive) to prove correct Asymmetric –Peter and Tracy run different code –This makes coding difficult
86 Solution #3 Wasteful –Peter consumes CPU time while waiting for Tracy to remove her note Constantly checking some status while waiting on something is called busy-waiting –Very, very bad
87 Higher-Level Synchronization What is the solution? –Raise the level of abstraction Make life easier for programmers
88 Higher-Level Synchronization Low-level atomic operations are provided by hardware (i.e. load/store, interrupt enable/disable, test & set) High-level synchronization operations are provided by software (i.e. semaphore, lock, monitor) Concurrent programs are built on top
89 Lock (Mutex) A lock prevents another thread from entering a critical section –e.g. lock the fish bowl while you're checking and feeding so that Peter and Tracy don't both feed the fish
90 Lock (Mutex) Two operations
lock(): wait until the lock is free, then acquire it
do {
    if (lock is free) {
        acquire lock; break
    }
} while (1)
unlock(): release the lock
91 Lock (Mutex) Why was the "note" in Gold Fish solutions #1 and #2 not a good lock? Does it meet the 4 conditions?
92 Four Elements of Locking Lock is initialized to be free Acquire lock before entering critical section Release lock when exiting critical section Wait to acquire lock if another thread already holds it
93 Solve "Too much fish" with lock (Peter and Tracy run the same code)
lock()
if (noFeed) {
    feed fish
}
unlock()
94 Solve "Too much fish" with lock But this prevents Tracy from doing anything while Peter is checking and feeding –i.e. the critical section includes the whole check-and-feed time How do we minimize the time the lock is held?
95 Producer-Consumer Problem Problem: –producer puts things into a shared buffer –Consumer takes them out –Need synchronization for coordinating producer and consumer producerconsumer
96 Producer-Consumer Problem Buffer between producer and consumer allows them to operate somewhat independently
97 Producer-Consumer Problem Otherwise must operate in lockstep –producer puts 1 thing in buffer, then consumer takes it out –then producer adds another, then consumer takes it out, etc
98 Producer-Consumer Problem Coke machine example –delivery person (producer) fills the machine with cokes –students (consumers) buy cokes and drink them –the coke machine has finite space (buffer)
99 Solution for PC Problem What does the producer do when the buffer is full? What does the consumer do when the buffer is empty? The busy-waiting approach of Solution #3 is unacceptable
100 Solution for PC Problem Use Sleep and Wakeup primitives The consumer sleeps if the buffer is empty –and wakes a sleeping producer otherwise The producer sleeps if the buffer is full –and wakes a sleeping consumer otherwise
101 Sleep and Wakeup
#define N 100                 /* # of slots in the buffer */
int count = 0;                /* # of items in the buffer */
void producer(void)
{
    int item;
    while (TRUE) {
        item = produce_item();
        if (count == N) sleep();
        insert_item(item);
        count = count + 1;
        if (count == 1) wakeup(consumer);
    }
}
102 Sleep and Wakeup
void consumer(void)
{
    int item;
    while (TRUE) {
        if (count == 0) sleep();
        item = remove_item();
        count = count - 1;
        if (count == N - 1) wakeup(producer);
        consume_item(item);
    }
}
103 What is the problem? This producer-consumer solution has a fatal race condition –Access to "count" is unrestricted –A wakeup call can get lost The solution is to use semaphores
104 Semaphore Semaphores are like a generalized lock It has a non-negative integer value Semaphore supports the following ops: –down, up
105 Down Operation Wait for semaphore to become positive Then decrement semaphore by 1 –Originally called “P” operation for the Dutch proberen
106 Up Operation Increment semaphore by 1 –originally called “V”, for the Dutch verhogen This wakes up a thread waiting in down –if there are any
107 Semaphores Can also set the initial value of semaphore The key parts in down() and up() are atomic Two down() calls at the same time can’t decrement the value below 0
108 Binary Semaphores Value is either 0 or 1 down() waits for value to become 1 –then sets it to 0 up() sets value to 1 –waking up waiting down (if any)
109 Mutual Exclusion w/ Semaphore Initial value is 1 (or more generally, N)
down()
    <critical section>
up()
Like lock/unlock, but more general
110 Semaphores for Ordering Usually (not always) the initial value is 0 Example: thread A wants to wait for thread B to finish before continuing –Semaphore initialized to 0
A:                       B:
down()                   do task
continue execution       up()
111 PC Problem with Semaphores Let’s solve the producer-consumer problem in semaphores
112 Semaphore Assignments mutex: –ensures mutual exclusion around code that manipulates buffer queue (initialized to 1) full: –counts the # of full buffers (initialized to 0)
113 Semaphore Assignments empty: –counts the number of empty buffers (initialized to N) Why do we need different semaphores for full and empty buffers?
114 Solve PC with Semaphores
#define N 100                 /* # of slots in the buffer */
typedef int semaphore;        /* a special kind of int */
semaphore mutex = 1;          /* controls access to critical region */
semaphore empty = N;          /* counts empty buffer slots */
semaphore full = 0;           /* counts full buffer slots */
115 Solve PC with Semaphores
void producer(void)
{
    int item;
    while (TRUE) {
        item = produce_item();
        down(&empty);
        down(&mutex);
        insert_item(item);
        up(&mutex);
        up(&full);
    }
}
116 Solve PC with Semaphores
void consumer(void)
{
    int item;
    while (TRUE) {
        down(&full);
        down(&mutex);
        item = remove_item();
        up(&mutex);
        up(&empty);
        consume_item(item);
    }
}
117 Semaphore Assignments Does the order of the down() function calls matter in the consumer (or the producer)? Does the order of the up() function calls matter in the consumer (or the producer)?
118 Solve PC with Semaphores What (if anything) must change to allow multiple producers and/or multiple consumers? What if there’s 1 full buffer, and multiple consumers call down(full) at the same time?
119 Problem with Semaphore Is there any problem with semaphores? The order of down and up operations is critical –Improper ordering can cause deadlock –i.e. programming with semaphores is tricky How do we make programming easier?
120 Monitor A programming-language construct A collection of procedures, variables, and data structures Access to it is guaranteed to be mutually exclusive –By the compiler (not the programmer)
121 Monitor Monitors use separate mechanisms for the two types of synchronization use locks for mutual exclusion use condition variables for ordering constraints
122 Monitor A monitor = a lock + the condition variables associated with that lock
123 Condition Variables Main idea: –make it possible for a thread to sleep inside a critical section Approach: –atomically release the lock, put the thread on the wait queue, and go to sleep
124 Condition Variables Each variable has a queue of waiting threads –threads that are sleeping, waiting for a condition Each variable is associated with one lock
125 Monitors Monitor example
monitor example
    integer i;
    condition c;
    procedure producer();
    …
    end;
    procedure consumer();
    …
    end;
end monitor
126 Ops on Condition Variables wait(): –atomically release lock –put thread on condition wait queue, go to sleep –i.e. start to wait for wakeup
127 Ops on Condition Variables signal(): –wake up a thread waiting on this condition variable if any broadcast(): –wake up all threads waiting on this condition variable if any
128 Ops on Condition Variables Note that a thread must be holding the lock when it calls wait() or signal() To avoid problems when both threads are inside the monitor –the signal() must be the last statement
129 Ops on Condition Variables What happens when a thread wakes up? Two options: Let the woken-up thread run –i.e. the signaler releases the lock Or let the signaler and the woken-up thread compete for the lock
130 Mesa vs. Hoare Monitors The first option is the Hoare monitor –It gives special priority to the woken-up waiter –the signaling thread gives up the lock –the woken-up waiter acquires the lock –the signaling thread re-acquires the lock after the waiter unlocks
131 Mesa vs. Hoare Monitors The second option is the Mesa monitor –when the waiter is woken, it must contend for the lock with other threads –hence it must re-check the condition –Whoever wins gets to run
132 Mesa vs. Hoare Monitors We'll stick with Mesa monitors –as do most operating systems –Because they are more flexible
133 Programming with Monitors List shared data needed to solve problem Decide how many locks are needed Decide which locks will protect which data
134 Programming with Monitors More locks allow different data to be accessed simultaneously –i.e. protecting finer-grained data –but are more complicated One lock is usually enough in this class
135 Programming with Monitors Put lock...unlock calls around the code that uses shared data List before-after conditions –one condition variable per condition
136 Programming with Monitors A condition variable's lock should be the lock that protects the shared data that is used to evaluate the condition
137 Programming with Monitors Call wait() when a thread needs to wait for a condition to be true Use a while loop to re-check the condition after wait() returns Call signal() when a condition changes that another thread might be interested in
138 Producer-Consumer in Monitor Variables: numCokes (number of cokes in machine) One lock to protect this shared data –cokeLock fewer locks make the programming simpler –but allow less concurrency
139 Producer-Consumer in Monitor Ordering constraints: The consumer must wait for the producer to fill a buffer if all buffers are empty The producer must wait for the consumer to empty a buffer if all buffers are full
140 Producer-Consumer in Monitor
monitor ProducerConsumer
    condition full, empty;
    integer count;
    procedure insert(item: integer);
    begin
        if count == N then wait(full);
        insert_item(item);
        count := count + 1;
        if count == 1 then signal(empty);
    end;
    function remove: integer;
    begin
        if count == 0 then wait(empty);
        remove := remove_item;
        count := count - 1;
        if count == N - 1 then signal(full);
    end;
    count := 0;
end monitor;
141 Producer-Consumer in Monitor
procedure producer;
begin
    while true do
    begin
        item := produce_item;
        ProducerConsumer.insert(item);
    end
end;
procedure consumer;
begin
    while true do
    begin
        item := ProducerConsumer.remove;
        consume_item(item);
    end
end;
142 Compare Monitors & Semaphores Semaphores are used for both mutual exclusion and ordering constraints –elegant (one mechanism for both purposes) –but the code can be hard to read and hard to get right
143 Compare Monitors & Semaphores A monitor lock is just like a binary semaphore that is initialized to 1 –lock() = down() –unlock() = up()
144 Condition Variable vs. Semaphore
Condition variable: while (cond) { wait(); }        Semaphore: down()
Conditional code in user program                    Conditional code in semaphore definition
User writes a customized condition                  Condition specified by semaphore definition (wait if value == 0)
User provides shared variables, protected by lock   Semaphore provides the shared integer and thread-safe operations on it
No memory of past signals                           "Remembers" past up() calls
145 Condition Variable vs. Semaphore Condition variables are more flexible than semaphores for ordering constraints Condition variables: –can use an arbitrary condition to wait on Semaphores: –wait only if the semaphore value == 0
146 Condition Variable vs. Semaphore Semaphores work best if –the shared integer and the waiting condition (value == 0) map naturally to the problem domain
147 Problems with Monitor Many languages do not support monitors Monitors don't solve the problem when multiple CPUs or computers are involved –This applies to semaphores too Solution? –Message passing
148 Message Passing A high-level primitive for IPC It uses two primitives: send and receive –send(destination, &message) –receive(source, &message) They are system calls (not language constructs) Each can either block or return immediately
149 Message Passing
#define N 100                 /* # of slots in the buffer */
void producer(void)
{
    int item;
    message m;                /* message buffer */
    while (TRUE) {
        item = produce_item();
        receive(consumer, &m);       /* wait for an empty to arrive */
        build_message(&m, item);     /* construct a message to send */
        send(consumer, &m);
    }
}
150 Message Passing
void consumer(void)
{
    int item, i;
    message m;
    for (i = 0; i < N; i++) send(producer, &m);   /* send N empties */
    while (TRUE) {
        receive(producer, &m);
        item = extract_item(&m);
        send(producer, &m);          /* send back an empty reply */
        consume_item(item);
    }
}
151 Problems with Message Passing Message passing has many challenging problems Message loss –What happens when a message is lost? Authentication –How to determine the identity of the sender? Performance
152 Barriers Another synchronization primitive Intended for a group of processes All processes must reach the barrier before the application proceeds to the next phase
153 Barriers (figure) –(a) processes approaching a barrier –(b) all processes but one blocked at the barrier –(c) the last process arrives, and all are let through
154 Implementing Locks So far we have used locks extensively We assumed that lock operations are atomic But how is the atomicity of locks implemented?
155 Implementing Locks Low-level atomic operations are provided by hardware (i.e. load/store, interrupt enable/disable, test & set) High-level synchronization operations are provided by software (i.e. semaphore, lock, monitor) Locks must be implemented on top of the hardware operations
156 Use Interrupt Disable/Enable On a uniprocessor, an operation is atomic as long as –a context switch doesn't occur in the middle of the operation How does a thread get context-switched out? –by an interrupt Prevent context switches at the wrong time by preventing these events
157 Use Interrupt Disable/Enable With interrupt disable/enable to ensure atomicity, –why do we need locks? A user program could call interrupt disable –before entering the critical section –call interrupt enable after leaving the critical section –and make sure not to call yield in the critical section
158 Lock Implementation #1 Disable interrupts with busy waiting
159 Lock Implementation #1
lock() {
    disable interrupts
    while (value != FREE) {
        enable interrupts
        disable interrupts
    }
    value = BUSY
    enable interrupts
}
Why does lock() disable interrupts at the beginning of the function? Why is it ok to disable interrupts in lock()'s critical section when it wasn't ok to disable interrupts while user code was running?
160 Lock Implementation #1
unlock() {
    disable interrupts
    value = FREE
    enable interrupts
}
Do we need to disable interrupts in unlock()?
161 Problems with Interrupt Approach Interrupt disable works on a uniprocessor –by preventing the current thread from being switched out But this doesn't work on a multiprocessor Disabling interrupts on one processor doesn't prevent other processors from running It is not acceptable (or provided) to modify interrupt disable to stop other processors from running
162 Read-modify-write Instructions Another atomic primitive Could use atomic load / atomic store instructions –remember Gold Fish solution #3
163 Read-modify-write Instructions Modern processors provide an easier way –atomic read-modify-write instructions A read-modify-write instruction atomically –reads a value from memory into a register –then writes a new value to that memory location
164 Read-modify-write Instructions test_and_set –atomically writes 1 to a memory location (set) –and returns the value that used to be there (test)
test_and_set(X) {
    tmp = X
    X = 1
    return(tmp)
}
165 Lock Implementation #2 Test & set with busy waiting (value is initially 0)
lock() {
    while (test_and_set(value) == 1) {}
}
unlock() {
    value = 0
}
166 Lock Implementation #2 If the lock is free (value == 0) –test_and_set sets value to 1 and returns 0, –so the while loop finishes If the lock is busy (value == 1) –test_and_set doesn't change the value and returns 1, –so the loop continues
167 Strategy for Reducing Busy-Waiting In implementations #1 & #2, the waiting thread uses lots of CPU time just checking for the lock to become free Better for it to sleep and let other threads run
168 Lock Implementation #3 Interrupt disable, no busy-waiting The waiting thread gives up the processor so that other threads (e.g. the thread with the lock) can run more quickly Someone wakes up the thread when the lock is free
169 Lock Implementation #3
lock() {
    disable interrupts
    if (value == FREE) {
        value = BUSY
    } else {
        add thread to queue of threads waiting for this lock
        switch to next runnable thread
    }
    enable interrupts
}
Should lock() re-enable interrupts before calling switch?
170 Lock Implementation #3
unlock() {
    disable interrupts
    value = FREE
    if (any thread is waiting for this lock) {
        move waiting thread from waiting queue to ready queue
        value = BUSY
    }
    enable interrupts
}
171 Assembly Implementation
mutex_lock:
    TSL REGISTER,MUTEX   | copy mutex to register and set mutex to 1
    CMP REGISTER,#0      | was mutex zero?
    JZE ok               | if it was zero, mutex was unlocked, so return
    CALL thread_yield    | mutex is busy; schedule another thread
    JMP mutex_lock       | try again later
ok: RET                  | return to caller; critical region entered
mutex_unlock:
    MOVE MUTEX,#0        | store a 0 in mutex
    RET                  | return to caller
172 Interrupt Disable/Enable Pattern Enable interrupts before adding the thread to the wait queue?
lock() {
    disable interrupts
    ...
    if (lock is busy) {
        enable interrupts
        add thread to lock wait queue
        switch to next runnable thread
    }
}
When could this fail?
173 Interrupt Disable/Enable Pattern Enable interrupts after adding the thread to the wait queue, but before switching to the next thread?
174 Interrupt Disable/Enable Pattern
lock() {
    disable interrupts
    ...
    if (lock is busy) {
        add thread to lock wait queue
        enable interrupts
        switch to next runnable thread
    }
}
175 Interrupt Disable/Enable Pattern But this fails if an interrupt happens after the thread enables interrupts –lock() adds the thread to the wait queue –lock() enables interrupts –an interrupt causes preemption, –i.e. a switch to another thread
176 Interrupt Disable/Enable Pattern Preemption moves thread to ready queue –Now thread is on two queues (wait and ready)! Also, switch is likely to be a critical section –Adding thread to wait queue and switching to next thread must be atomic
177 Solution The waiting thread leaves interrupts disabled –when it calls switch The next thread to run has the responsibility of –re-enabling interrupts before returning to user code When the waiting thread wakes up –it returns from switch with interrupts disabled (left that way by the last thread)
178 Invariant
All threads promise to have interrupts disabled when they call switch
All threads promise to re-enable interrupts after switch returns to them
179 Thread A / Thread B
An interleaving that maintains the invariant (each thread calls switch with interrupts disabled; the next thread to run re-enables them):
Thread A: yield() { disable interrupts; switch }
Thread B: back from switch; enable interrupts; } ... lock() { disable interrupts; ... switch }
Thread A: back from switch; enable interrupts; } ... unlock() (move thread B to ready queue) ... yield() { disable interrupts; switch }
Thread B: back from switch; enable interrupts; }
180 Lock Implementation #4 Test & set, minimal busy-waiting Can’t implement locks using test & set without some amount of busy-waiting –but can minimize it Idea: –use busy waiting only to atomically execute lock code –Give up CPU if busy
181 Lock Implementation #4
lock() {
    while (test&set(guard)) {
    }
    if (value == FREE) {
        value = BUSY
    } else {
        add thread to queue of threads waiting for this lock
        switch to next runnable thread
    }
    guard = 0
}
182 Lock Implementation #4
unlock() {
    while (test&set(guard)) {
    }
    value = FREE
    if (any thread is waiting for this lock) {
        move waiting thread from waiting queue to ready queue
        value = BUSY
    }
    guard = 0
}
183 CPU Scheduling How should one choose next thread to run? What are the goals of the CPU scheduler? Bursts of CPU usage alternate with periods of I/O wait
184 CPU Scheduling (a) CPU-bound process (b) I/O bound process
185 CPU Scheduling Minimize average response time –average elapsed time to do each job Maximize throughput of entire system –rate at which jobs complete in the system Fairness –share CPU among threads in some “equitable” manner
186 Scheduling Algorithm Goals All systems Fairness: –giving each process a fair share of the CPU Policy enforcement: –seeing that stated policy is carried out Balance: –keeping all parts of the system busy
187 Scheduling Algorithm Goals Batch systems Throughput: –maximize jobs per hour Turnaround time: –minimize time between submission and termination CPU utilization: –keep the CPU busy all the time
188 Scheduling Algorithm Goals Interactive systems –Response time – respond to requests quickly –Proportionality – meet users’ expectations Real-time systems –Meeting deadlines – avoid losing data –Predictability – avoid quality degradation in multimedia systems
189 FCFS First-Come, First-Served FIFO ordering between jobs No preemption (run until done) –thread runs until it calls yield() or blocks on I/O –no timer interrupts
190 FCFS Pros and cons + simple - short jobs get stuck behind long jobs - what about the user’s interactive experience?
191 FCFS Example
–job A takes 100 seconds
–job B takes 1 second
–time 0: job A arrives and starts
–time 0+: job B arrives
–time 100: job A ends (response time = 100); job B starts
–time 101: job B ends (response time = 101)
average response time = 100.5
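The arithmetic above can be checked with a short Python sketch (an illustrative helper, not part of any real scheduler):

```python
def fcfs_response_times(bursts):
    # Jobs arrive (essentially) together and run to completion in order;
    # each job's response time is the elapsed time until it finishes.
    t, times = 0, []
    for burst in bursts:
        t += burst
        times.append(t)
    return times

times = fcfs_response_times([100, 1])     # job A, then job B
assert times == [100, 101]
assert sum(times) / len(times) == 100.5   # average response time
```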
192 Round Robin
Goal:
–improve average response time for short jobs
Solution:
–periodically preempt all jobs (in particular, long-running ones)
Is FCFS or round robin more “fair”?
193 Round Robin Example job A takes 100 seconds job B takes 1 second time slice of 1 second –a job is preempted after running for 1 second
194 Round Robin –time 0: job A arrives and starts –time 0+: job B arrives –time 1: job A is preempted; job B starts –time 2: job B ends (response time = 2) –time 101: job A ends (response time = 101) average response time = 51.5
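The same example under round robin, as a minimal Python simulation (illustrative; context-switch cost is ignored):

```python
from collections import deque

def rr_completion_times(bursts, quantum=1):
    # bursts: {job: seconds of work}; all jobs arrive at time 0 and are
    # preempted after running for one quantum.
    queue = deque(bursts)
    remaining = dict(bursts)
    t, done = 0, {}
    while queue:
        job = queue.popleft()
        run = min(quantum, remaining[job])
        t += run
        remaining[job] -= run
        if remaining[job] == 0:
            done[job] = t
        else:
            queue.append(job)   # go to the back of the line
    return done

done = rr_completion_times({"A": 100, "B": 1})
assert done == {"B": 2, "A": 101}
assert (done["A"] + done["B"]) / 2 == 51.5
```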
195 Round Robin Does round-robin always achieve lower response time than FCFS?
196 Round Robin Pros and cons + good for interactive computing - round robin has more overhead due to context switches
197 Round Robin How to choose time slice? big time slice: –degrades to FCFS small time slice: –each context switch wastes some time
198 Round Robin
Typically a compromise, e.g. 10 milliseconds (ms)
–if a context switch takes 0.1 ms
–then round robin with a 10 ms slice wastes about 1% of the CPU
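The 1% figure follows directly from the numbers on the slide:

```python
quantum_ms, switch_ms = 10, 0.1
# fraction of each cycle spent on the context switch rather than on work
overhead = switch_ms / (quantum_ms + switch_ms)
assert round(overhead, 2) == 0.01   # about 1% of the CPU is wasted
```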
199 STCF Shortest Time to Completion First STCF: run whatever job has the least amount of work to do before it finishes or blocks for an I/O
200 STCF STCF-P: preemptive version of STCF if a new job arrives that has less work than the current job has remaining –then preempt the current job in favor of the new one
201 STCF Idea is to finish the short jobs first Improves response time of shorter jobs by a lot –Doesn’t hurt response time of longer jobs by too much
202 STCF
STCF gives optimal average response time among non-preemptive policies
STCF-P gives optimal average response time among preemptive policies (and non-preemptive policies)
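A minimal preemptive simulation, one time unit at a time (illustrative; it assumes each job's remaining work is known, which slide 204 notes is the policy's weakness):

```python
def stcf_p(jobs):
    # jobs: list of (name, arrival_time, work); at each time unit, run
    # the arrived job with the least remaining work, preempting others.
    remaining = {name: work for name, arrival, work in jobs}
    arrival = {name: a for name, a, work in jobs}
    t, done = 0, {}
    while remaining:
        ready = [n for n in remaining if arrival[n] <= t]
        if not ready:
            t += 1
            continue
        job = min(ready, key=lambda n: remaining[n])
        remaining[job] -= 1
        t += 1
        if remaining[job] == 0:
            done[job] = t
            del remaining[job]
    return done

# The long job A is preempted the moment the shorter job B arrives.
done = stcf_p([("A", 0, 100), ("B", 1, 1)])
assert done == {"B": 2, "A": 101}
```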
203 STCF
Is the following job a “short” or “long” job?
–while(1) {
–    use CPU for 1 ms
–    use I/O for 10 ms
–}
204 STCF Pros and cons + optimal average response time - unfair. Short jobs can prevent long jobs from ever getting any CPU time (starvation) - needs knowledge of future
205 STCF
STCF and STCF-P need knowledge of the future
–it’s often very handy to know the future :-)
–how can we find out how much time a job will need?
206 Example job A compute for 1000 seconds job B compute for 1000 seconds job C while(1) { use CPU for 1 ms use I/O for 10 ms }
207 Example C can use 91% of the disk by itself A or B can each use 100% of the CPU What happens when we run them together? Goal: keep both CPU and disk busy
208 FCFS if A or B run before C, they prevent C from issuing its disk I/O for –up to 2000 seconds
209 Round Robin
with 100 ms time slice
–timeline: C A B C A B ... (C issues its disk I/O only once per cycle)
Disk is idle most of the time that A & B are running
–about 10 ms disk time every 200 ms
210 Round Robin
with 1 ms time slice
–timeline: C A B A B A B A B A B C A B A B A B A B A B C ... (C issues its disk I/O at each C slot)
C runs more often, so it can issue its disk I/O almost as soon as its last disk I/O is done
211 Round Robin with 1ms Disk is utilized almost 90% of the time Little effect on A or B’s performance General principle: –first start the things that can run in parallel problem: –lots of context switches (and context switch overhead)
212 STCF-P
Runs C as soon as its disk I/O is done
–because it has the shortest next CPU burst
–timeline: C A ... C A ... C A ... (C's disk I/O overlaps with A's CPU time)
213 Real-Time Scheduling
So far, we’ve focused on average-case analysis
–average response time, throughput
Sometimes, the right goal is to get each job done before its deadline
–it is irrelevant how far before its deadline a job completes
214 Real-Time Scheduling Video or audio output. –E.g. NTSC outputs 1 TV frame every 33 ms Control of physical systems –e.g. auto assembly, nuclear power plants
215 Real-Time Scheduling This requires worst-case analysis How do we do this in real life?
216 EDF Earliest-deadline first Always run the job that has the earliest deadline –i.e. the deadline coming up next If a new job arrives with an earlier deadline than the currently running job –preempt the running job and start the new one
217 EDF EDF is optimal –it will meet all deadlines if it’s possible to do so
218 EDF Example job A: takes 15 seconds, deadline is 20 seconds after entering system job B: takes 10 seconds, deadline is 30 seconds after entering system job C: takes 5 seconds, deadline is 10 seconds after entering system
219 EDF
Timeline (all jobs arrive at time 0): C runs 0–5 (deadline 10), A runs 5–20 (deadline 20), B runs 20–30 (deadline 30)
All three deadlines are met
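The example schedule can be checked with a short sketch (all jobs arrive at time 0, so no preemption occurs in this run):

```python
def edf(jobs):
    # jobs: list of (name, work, deadline); repeatedly run the
    # unfinished job whose deadline comes up next.
    remaining = {n: w for n, w, d in jobs}
    deadline = {n: d for n, w, d in jobs}
    t, finish = 0, {}
    while remaining:
        job = min(remaining, key=lambda n: deadline[n])
        t += remaining.pop(job)   # run it to completion
        finish[job] = t
    return finish

jobs = [("A", 15, 20), ("B", 10, 30), ("C", 5, 10)]
finish = edf(jobs)
assert finish == {"C": 5, "A": 20, "B": 30}
assert all(finish[n] <= d for n, w, d in jobs)   # every deadline met
```

Note that A finishes at exactly its deadline of 20: running jobs in any other order here would miss a deadline, illustrating EDF's optimality.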
220 Schedulability in Real-Time Systems
Not all sets of tasks are schedulable in RT systems
Given
–m periodic events
–event i occurs with period P_i and requires C_i seconds of CPU time
Then the load can only be handled if
–Σ (i = 1..m) C_i / P_i ≤ 1
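The condition translates directly into code (the task parameters below are made up for illustration):

```python
def schedulable(tasks):
    # tasks: list of (C_i, P_i) pairs; the CPU can keep up only if the
    # total utilization sum(C_i / P_i) does not exceed 1.
    return sum(c / p for c, p in tasks) <= 1

# 50/100 + 30/200 + 100/500 = 0.5 + 0.15 + 0.2 = 0.85 <= 1: feasible
assert schedulable([(50, 100), (30, 200), (100, 500)])
# 60/100 + 60/100 = 1.2 > 1: cannot possibly meet all deadlines
assert not schedulable([(60, 100), (60, 100)])
```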
221 Scheduling in Batch Systems First come first serve Shortest job first Shortest remaining time next Three-level scheduling
222 Scheduling in Batch Systems An example of shortest job first scheduling
223 Scheduling in Batch Systems Three level scheduling
224 Scheduling in Interactive Systems Round-robin scheduling Priority scheduling –Run highest priority process until it blocks or exits –Alternative, decrease priority at each clock tick Multiple queue –Dividing priority into classes –Dynamically adjust a process’s priority class
225 Scheduling in Interactive Systems Shortest process next –Estimate the running time of processes Guaranteed scheduling –Each process runs 1/n fraction of time Lottery scheduling –Issue lottery tickets to process & schedule accordingly Fair share scheduling per user
226 Scheduling in Interactive Systems Round Robin Scheduling –list of runnable processes –list of runnable processes after B uses up its quantum
227 A scheduling algorithm with four priority classes Scheduling in Interactive Systems
228 Policy versus Mechanism Separate what is allowed to be done –with how it is done Important in thread scheduling –a process knows which of its children threads are important and need priority
229 Policy versus Mechanism Scheduling algorithm parameterized –mechanism in the kernel Parameters filled in by user processes –policy set by user process
230 Thread Scheduling Possible scheduling of user-level threads
231 Thread Scheduling Possible scheduling of kernel-level threads
Thoughts Change Life 意念改变生活