
1 Review for Exam II Chapter 5: Process Synchronization Chapter 6: CPU Scheduling
Rev. by Kyungeun Park, 2015

2 Chapter 5: Process Synchronization
Rev. by Kyungeun Park, 2015

3 Chapter 5 : Process Synchronization
Background The Critical-Section Problem Peterson’s Solution Synchronization Hardware Mutex Locks Semaphores Classic Problems of Synchronization Monitors Synchronization Examples Alternative Approaches

4 Objectives To introduce the critical-section problem, whose solutions can be used to ensure the consistency of shared data To present both software and hardware solutions of the critical-section problem To examine several critical process-synchronization problems To explore several tools that are used to solve process synchronization problems

5 Background Processes can execute concurrently
May be interrupted at any time, partially completing execution. Concurrent access of cooperating processes or threads to shared data may result in data inconsistency. Maintaining data consistency requires mechanisms to ensure the orderly execution of cooperating processes. Illustration of the problem: suppose that we wanted to provide a solution to the producer-consumer problem that fills all the buffers. We can do so by having an integer counter that keeps track of the number of full buffers. Initially, counter is set to 0. It is incremented by the producer after it produces a new buffer and is decremented by the consumer after it consumes a buffer.

6 Bounded-Buffer – Shared-Memory Solution
How does concurrent or parallel execution affect the integrity of data shared by several processes? Shared buffer: a circular array with two logical pointers, in and out:

#define BUFFER_SIZE 10
typedef struct {
    . . .
} item;

item buffer[BUFFER_SIZE];
int in = 0;   /* the next free position in the buffer */
int out = 0;  /* the first full position in the buffer */

[Figure: initial empty buffer of size 10 (positions 0, 1, 2, ..., 9), with in and out both at position 0]

7 Producer

while (true) {
    /* produce an item and put in nextProduced */
    while (counter == BUFFER_SIZE)
        ; /* do nothing */
    buffer[in] = nextProduced;
    in = (in + 1) % BUFFER_SIZE;
    counter++;
}

* counter++ could be implemented in machine language as
register1 = counter
register1 = register1 + 1
counter = register1

8 Consumer

while (true) {
    while (counter == 0)
        ; /* do nothing */
    nextConsumed = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    counter--;
    /* consume the item in nextConsumed */
}

* counter-- could be implemented in machine language as
register2 = counter
register2 = register2 - 1
counter = register2

9 Race Condition Both the producer and consumer routines may not function correctly when executed concurrently. Consider this execution interleaving with counter = 5 initially:
S0: producer executes register1 = counter {register1 = 5}
S1: producer executes register1 = register1 + 1 {register1 = 6}
S2: consumer executes register2 = counter {register2 = 5}
S3: consumer executes register2 = register2 - 1 {register2 = 4}
S4: producer executes counter = register1 {counter = 6}
S5: consumer executes counter = register2 {counter = 4}
Race condition: a situation where several processes access and manipulate the same data concurrently and the outcome of the execution depends on the particular order in which the accesses take place. We must ensure that only one process at a time can manipulate the variable counter: process synchronization and coordination among cooperating processes.
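As an aside, this lost update is easy to reproduce with two threads. A minimal sketch, assuming a POSIX system with Pthreads (the thread bodies and the NITER count are illustrative, not from the slides):

#include <pthread.h>
#include <stdio.h>

#define NITER 1000000          /* illustrative iteration count */

static int counter = 0;        /* shared, deliberately unprotected */

static void *producer_like(void *arg) {
    for (int i = 0; i < NITER; i++)
        counter++;             /* load-modify-store: not atomic */
    return NULL;
}

static void *consumer_like(void *arg) {
    for (int i = 0; i < NITER; i++)
        counter--;
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, producer_like, NULL);
    pthread_create(&t2, NULL, consumer_like, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %d (0 if no interleaving occurred)\n", counter);
    return 0;
}

Compiled with gcc -pthread, the final value is rarely 0, because counter++ and counter-- interleave exactly as in steps S0-S5 above.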

10 Critical Section Problem
Critical Section Problem Consider a system of n processes {P0, P1, ..., Pn-1}. Each process has a segment of code, its critical section, in which the process may be changing common variables, updating a table, writing a file, etc. When one process is executing in its critical section, no other process is allowed to execute in its critical section. The critical-section problem is to design a protocol that the processes can use to cooperate. Each process must ask permission to enter its critical section in the entry section, followed by an exit section, then the remainder section. Kernel code can also cause race conditions. Ex) a list of all open files maintained by an operating system; data structures for maintaining memory allocation and process lists or for interrupt handling.

11 Critical Section

General structure of process Pi is:

do {
    entry section
        critical section
    exit section
        remainder section
} while (TRUE);

12 Solution to Critical-Section Problem (1)
Solution to Critical-Section Problem (1) Three requirements: 1. Mutual Exclusion – If process Pi is executing in its critical section, then no other processes can be executing in their critical sections. 2. Progress – If no process is executing in its critical section and there exist some processes that wish to enter their critical sections, then only those processes that are not executing in their remainder sections can participate in deciding which will enter its critical section next, and this selection cannot be postponed indefinitely. 3. Bounded Waiting – There exists a bound, or limit, on the number of times that other processes are allowed to enter their critical sections after a process has made a request to enter its critical section and before that request is granted. Note: assume that each process executes at a nonzero speed; no assumption is made concerning the relative speed of the n processes.

13 Solution to Critical-Section Problem (2)
Solution to Critical-Section Problem (2) Two general approaches are used to handle critical sections in operating systems: Preemptive kernel – allows preemption of a process running in kernel mode. Ex) two kernel-mode processes running simultaneously on different processors in an SMP environment. Non-preemptive kernel – a process runs until it exits kernel mode, blocks, or voluntarily yields the CPU; essentially free from race conditions on kernel data structures. Then why choose a preemptive kernel over a non-preemptive one? A preemptive kernel is more complicated, but more responsive and more suitable for real-time programming, allowing a real-time process to preempt a process currently running in the kernel.

14 Peterson’s Solution Peterson’s solution: a classic software-based solution to the critical-section problem Worth reviewing the underlying algorithm to solve the critical section problem and the way of reasoning how to satisfy three requirements Restricted to two processes, P0 and P1, that alternate execution between their critical sections and remainder sections

15 Algorithm for Process Pi
Algorithm for Process Pi

do {
    flag[i] = true;
    turn = j;
    while (flag[j] && turn == j);
        /* critical section */
    flag[i] = false;
        /* remainder section */
} while (TRUE);          /* at process Pi */

do {
    flag[j] = true;
    turn = i;
    while (flag[i] && turn == i);
        /* critical section */
    flag[j] = false;
        /* remainder section */
} while (TRUE);          /* at process Pj */

The two processes share two data items:
int turn;
boolean flag[2];
The flag array is used to indicate if a process is ready to enter the critical section: flag[i] = true implies that process Pi is ready. The variable turn indicates whose turn it is to enter the critical section. If both processes try to enter at the same time, turn will be set to both i and j at roughly the same time; only one of these assignments will last, and the value of turn determines which of the two processes is allowed to enter its critical section first.

16 Algorithm for Process Pi
Algorithm for Process Pi

do {
    flag[i] = true;
    turn = j;
    while (flag[j] && turn == j);
        /* critical section */
    flag[i] = false;
        /* remainder section */
} while (TRUE);          /* at process Pi */

do {
    flag[j] = true;
    turn = i;
    while (flag[i] && turn == i);
        /* critical section */
    flag[j] = false;
        /* remainder section */
} while (TRUE);          /* at process Pj */

Proof (by contradiction, assuming both processes execute in their critical sections at the same time):

Mutual exclusion is preserved: Pi enters its critical section only when flag[j] == false or turn == i. For both processes to exit their while loops and enter their critical sections at the same time with flag[i] == true and flag[j] == true (the flags alone are insufficient to decide), the value of turn would have to be 0 for P0 and 1 for P1 at the same time: impossible, since turn can only be 0 or 1.

The progress requirement is satisfied and the bounded-waiting requirement is met: Pi can be prevented from entering the critical section only if it is stuck in the while loop. If Pj is not ready to enter the critical section, then flag[j] == false and Pi can enter its critical section. If Pj has set flag[j] to true and is also executing in its while loop, then either turn == i or turn == j. If turn == i, then Pi will enter the critical section; if turn == j, then Pj will enter the critical section. When Pj exits its critical section, it resets flag[j] to false, allowing Pi to enter. If Pj immediately sets flag[j] back to true, it must also set turn to i; since Pi does not change turn while waiting, Pi will enter its critical section (progress) after at most one entry by Pj (bounded waiting).
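Note that Peterson's solution assumes loads and stores are not reordered, which modern processors and compilers do not guarantee for plain variables. A hedged C11 sketch using sequentially consistent atomics (enter/leave are hypothetical helper names, not from the slides):

#include <stdatomic.h>
#include <stdbool.h>

atomic_bool flag[2];      /* flag[i]: Pi wants to enter */
atomic_int  turn;         /* whose turn it is to defer to */

void enter(int i) {                      /* i is 0 or 1 */
    int j = 1 - i;                       /* the other process */
    atomic_store(&flag[i], true);
    atomic_store(&turn, j);
    while (atomic_load(&flag[j]) && atomic_load(&turn) == j)
        ;                                /* busy wait */
}

void leave(int i) {
    atomic_store(&flag[i], false);
}

The default memory order of these operations is memory_order_seq_cst, which restores the sequential consistency the textbook algorithm takes for granted.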

17 Synchronization Hardware
Synchronization Hardware Many systems provide hardware support for critical-section code. All solutions (from hardware to software-based APIs) are based on the idea of locking: protecting critical regions through the use of locks. Uniprocessors – could disable interrupts while modifying a shared variable: currently running code would execute without preemption, and no other instructions would run, so no unexpected modifications to the shared variables. Generally too inefficient on multiprocessor systems: disabling interrupts on all processors requires passing a message to every processor, which is time consuming and delays entry into critical sections. Modern machines therefore provide special atomic hardware instructions (atomic = uninterruptible) that allow us either to test a memory word and set its value, the TestAndSet() instruction, or to swap the contents of two memory words, the CompareAndSwap() instruction.

18 TestAndSet() Instruction
TestAndSet() Instruction Definition: runs atomically, as one uninterruptible instruction. If two TestAndSet() instructions are executed simultaneously, each on a different CPU, they will be executed sequentially in some arbitrary order.

boolean TestAndSet(boolean *target) {
    boolean rv = *target;
    *target = TRUE;
    return rv;
}

19 Solution using TestAndSet()
Solution using TestAndSet() Shared boolean variable lock, initialized to FALSE. Solution:

do {
    while (TestAndSet(&lock))   /* runs as an uninterruptible unit */
        ; // do nothing
    // critical section
    lock = FALSE;
    // remainder section
} while (TRUE);
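C11 exposes test-and-set directly as atomic_flag_test_and_set(), so the spinlock above can be written portably. A minimal sketch (spin_lock/spin_unlock are illustrative names):

#include <stdatomic.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;   /* clear == unlocked */

void spin_lock(void) {
    while (atomic_flag_test_and_set(&lock))   /* atomic TestAndSet() */
        ;                                     /* spin until it was clear */
}

void spin_unlock(void) {
    atomic_flag_clear(&lock);                 /* lock = FALSE */
}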

20 CompareAndSwap() Instruction
Definition: operates on three operands.

int CompareAndSwap(int *value, int expected, int new_value) {
    int temp = *value;
    if (*value == expected)
        *value = new_value;
    return temp;
}

21 Solution using CompareAndSwap()
Global variable lock, initialized to 0. The first process will set lock to 1 and enter its critical section, because lock == 0 inside the CompareAndSwap() instruction. When the process exits the critical section, it sets lock back to 0, allowing another process to enter. Solution:

do {
    while (CompareAndSwap(&lock, 0, 1) != 0)
        ; /* do nothing */
    // critical section
    lock = 0;
    // remainder section
} while (TRUE);

Mutual exclusion is met, but bounded waiting is not satisfied.
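The same lock can be written with C11's compare-and-exchange, which behaves like CompareAndSwap() above except that on failure it also rewrites expected with the value it observed. A sketch (cas_lock/cas_unlock are illustrative names):

#include <stdatomic.h>

static atomic_int lock = 0;               /* 0 == free, 1 == held */

void cas_lock(void) {
    int expected = 0;
    /* loop until we atomically swap 0 -> 1 */
    while (!atomic_compare_exchange_strong(&lock, &expected, 1))
        expected = 0;                     /* reset: failure overwrote it */
}

void cas_unlock(void) {
    atomic_store(&lock, 0);
}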

22 Bounded-waiting Mutual Exclusion with TestAndSet()
Bounded-waiting Mutual Exclusion with TestAndSet() The processes share two data items:
boolean waiting[n];  // initialized to false
boolean lock;        // initialized to false

Mutual exclusion: Pi can enter its critical section only if either waiting[i] == false or key == false. key becomes false only when TestAndSet() is executed, e.g., by the first process to execute TestAndSet(); waiting[i] becomes false only when another process leaves its critical section. Progress: a process exiting the critical section either sets lock to false or sets waiting[j] to false, allowing a waiting process to proceed. Bounded waiting: a leaving process scans the array waiting in cyclic order, so any waiting process will enter its critical section within n - 1 turns.

do {  // lock: false, waiting[n]: all false
    waiting[i] = true;
    key = true;
    while (waiting[i] && key)
        key = TestAndSet(&lock);
    waiting[i] = false;
    // critical section
    j = (i + 1) % n;
    while ((j != i) && !waiting[j])
        j = (j + 1) % n;
    if (j == i)              // full cycle: no one is waiting
        lock = false;        // back to the initial state
    else                     // within n-1 turns
        waiting[j] = false;  // take turns
    // remainder section
} while (true);

23 Bounded-waiting Mutual Exclusion with TestAndSet()

do {
    waiting[i] = TRUE;
    key = TRUE;
    while (waiting[i] && key)
        key = TestAndSet(&lock);
    waiting[i] = FALSE;
    // critical section
    j = (i + 1) % n;
    while ((j != i) && !waiting[j])
        j = (j + 1) % n;
    if (j == i)
        lock = FALSE;
    else
        waiting[j] = FALSE;
    // remainder section
} while (TRUE);          /* at process Pi */

do {
    waiting[j] = TRUE;
    key = TRUE;
    while (waiting[j] && key)
        key = TestAndSet(&lock);
    waiting[j] = FALSE;
    // critical section
    i = (j + 1) % n;
    while ((i != j) && !waiting[i])
        i = (i + 1) % n;
    if (i == j)
        lock = FALSE;
    else
        waiting[i] = FALSE;
    // remainder section
} while (TRUE);          /* at process Pj */

24 Mutex Locks Previous solutions are complicated and generally inaccessible to application programmers, so OS designers build software tools to solve the critical-section problem. The simplest is the mutex lock. A process should acquire the lock before entering a critical section; it releases the lock when it exits the critical section. Protect critical regions and prevent race conditions by first calling acquire() on a lock, then release(). A boolean variable indicates whether the lock is available or not. Calls to acquire() and release() must be atomic, usually implemented via hardware atomic instructions such as TestAndSet() or CompareAndSwap(). Disadvantage: this solution requires busy waiting, so this lock is also called a spinlock; good for short-term locking (no context switch), but busy waiting wastes CPU cycles that some other process might use productively.

do {
    acquire lock
        critical section
    release lock
        remainder section
} while (true);

25 acquire() and release()
acquire() {
    while (!available)
        ; /* busy wait */
    available = false;
}

release() {
    available = true;
}

Solution to the critical-section problem using a mutex lock:

do {
    acquire lock
        critical section
    release lock
        remainder section
} while (true);
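In application code the equivalent of acquire()/release() is a Pthreads mutex, which typically blocks rather than spins. A minimal usage sketch (update and shared_data are illustrative names):

#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static int shared_data;              /* protected by m */

void update(void) {
    pthread_mutex_lock(&m);          /* acquire() */
    shared_data++;                   /* critical section */
    pthread_mutex_unlock(&m);        /* release() */
}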

26 Semaphore Previous hardware-based solutions are hard for application programmers to use, so a synchronization tool similar to a mutex lock, the semaphore, is introduced. Semaphore S – an integer variable; generally a counter used to provide access to a shared data object for multiple processes. Accessed only through two standard atomic operations: wait() and signal(). wait() was originally termed P (proberen, "to test"); signal() was originally called V (verhogen, "to increment"). Less complicated; can only be accessed via these two indivisible (atomic) operations:

wait(S) {
    while (S <= 0)
        ; // no-op, busy wait
    S--;
}

signal(S) {
    S++;
}

27 Semaphore as General Synchronization Tool
Semaphore as General Synchronization Tool Counting semaphore – integer value can range over an unrestricted domain. Used to control access to a given resource consisting of a finite number of instances; the semaphore is initialized to the number of resources available: wait(S) when wishing to use a resource, signal(S) when releasing a resource. Binary semaphore – integer value can range only between 0 and 1; also known as a mutex lock, initialized to 1. A counting semaphore S can be implemented using binary semaphores. Provides mutual exclusion:

Semaphore mutex;    // initialized to 1
do {
    wait(mutex);
    // critical section
    signal(mutex);
    // remainder section
} while (TRUE);

28 Semaphore as General Synchronization Tool
Synchronization problems: consider two concurrently running processes, P1 with a statement S1 and P2 with a statement S2, where S2 is to be executed only after S1 has completed. Use a common semaphore synch, initialized to 0, shared by P1 and P2:

S1;
signal(synch);    // in process P1

wait(synch);      // in process P2
S2;

Note: because synch is initialized to 0, P2 will execute S2 only after P1 has invoked signal(synch).
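With POSIX semaphores the same S1-before-S2 ordering looks as follows; a sketch for Linux threads (unnamed sem_init() semaphores are deprecated on macOS):

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

static sem_t synch;                  /* initialized to 0 below */

static void *p1(void *arg) {
    printf("S1\n");                  /* statement S1 */
    sem_post(&synch);                /* signal(synch) */
    return NULL;
}

static void *p2(void *arg) {
    sem_wait(&synch);                /* wait(synch): blocks until S1 done */
    printf("S2\n");                  /* statement S2 */
    return NULL;
}

int main(void) {
    pthread_t a, b;
    sem_init(&synch, 0, 0);          /* shared between threads, value 0 */
    pthread_create(&b, NULL, p2, NULL);
    pthread_create(&a, NULL, p1, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    sem_destroy(&synch);
    return 0;
}

S1 always prints before S2, no matter which thread is scheduled first.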

29 Characteristics of Semaphore
Characteristics of Semaphore Resource sharing by a process: test the semaphore that controls the resource. Positive semaphore value → use the resource, decrementing the value by 1. Zero semaphore value → go to sleep until the semaphore is > 0. Done with a shared resource → increment the value by 1, awakening sleeping processes. The main disadvantage of this semaphore definition is that it requires busy waiting: while one process is in its critical section, any other process that tries to enter its critical section must loop continuously in the entry code. Continual looping is a problem in a multiprogramming system with a single CPU, since busy waiting wastes CPU cycles that some other process might use productively. This type of semaphore is called a spinlock (the process spins while waiting for the lock). Spinlocks are useful in multiprocessor systems as an alternative to a context switch when locks are expected to be held for a short time.

30 Semaphore Implementation with no Busy waiting
Semaphore Implementation with no Busy waiting To overcome the need for busy waiting, each semaphore has an associated waiting queue, and the definitions of the wait() and signal() semaphore operations are modified. When a process executes wait() and finds that the semaphore is not positive, it must wait; rather than busy waiting, it blocks itself, placing itself into the waiting queue associated with the semaphore: the process state switches to waiting, and control transfers to the CPU scheduler, which selects another process. A process waiting on a semaphore is restarted when some other process executes a signal(). Two operations are provided by the OS as basic system calls: block() – places the process invoking the operation on the appropriate waiting queue, suspending it. wakeup() – removes one of the processes in the waiting queue and places it in the ready queue to restart it, changing it from the waiting state to the ready state.

31 Semaphore Implementation with no Busy waiting
Semaphore Implementation with no Busy waiting Each semaphore has two data items: value (of type integer) and list, a pointer to a list of processes (its waiting queue):

typedef struct {
    int value;
    struct process *list;
} semaphore;

32 Revised Semaphore Implementation
Revised Semaphore Implementation Implementation of wait():

wait(semaphore *S) {
    S->value--;          // the semaphore value may be negative; its magnitude
    if (S->value < 0) {  // then identifies the # of processes waiting on it
        add this process to S->list;
        block();         // suspends the process that invokes it
    }
}

Implementation of signal():

signal(semaphore *S) {
    S->value++;
    if (S->value <= 0) {
        remove a process P from S->list;
        wakeup(P);       // resumes the execution of a blocked process P
    }
}
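A blocking semaphore in this spirit can be sketched with a mutex and a condition variable standing in for the waiting list. Unlike the book's version, this variant keeps value non-negative rather than letting it record the number of waiters (sem_wait_op/sem_signal_op are illustrative names):

#include <pthread.h>

typedef struct {
    int value;                       /* must be initialized to k >= 0 */
    pthread_mutex_t m;               /* PTHREAD_MUTEX_INITIALIZER */
    pthread_cond_t  cv;              /* PTHREAD_COND_INITIALIZER */
} semaphore;

void sem_wait_op(semaphore *S) {
    pthread_mutex_lock(&S->m);
    while (S->value <= 0)            /* block() instead of busy waiting */
        pthread_cond_wait(&S->cv, &S->m);
    S->value--;
    pthread_mutex_unlock(&S->m);
}

void sem_signal_op(semaphore *S) {
    pthread_mutex_lock(&S->m);
    S->value++;
    pthread_cond_signal(&S->cv);     /* wakeup(): move one waiter to ready */
    pthread_mutex_unlock(&S->m);
}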

33 Semaphore Implementation
Must guarantee that no two processes can execute wait() and signal() on the same semaphore at the same time; thus the implementation itself becomes a critical-section problem, with the wait() and signal() code placed in a critical section. In a single-processor environment, this is solved by disabling interrupts while the wait() and signal() operations are executing. In a multiprocessor environment, interrupts would have to be disabled on every processor, which is a difficult task and causes performance degradation, so an SMP system must provide alternative locking techniques such as spinlocks. Note that we still have busy waiting with the wait() and signal() operations in this implementation, but the problem has moved from the entry section to the critical sections of wait() and signal(), which are short and rarely occupied. Applications, in contrast, may have long critical sections and spend a lot of time in them → there, busy waiting is not a good solution.

34 Deadlock and Starvation
Deadlock and Starvation Deadlock – two or more processes are waiting indefinitely for an event that can be caused by only one of the waiting processes. Definition (from Wiki): a deadlock is the situation where a group of processes blocks forever because each of the processes is waiting for resources which are held by another process in the group. Let S and Q be two semaphores initialized to 1:

P0            P1
wait(S);      wait(Q);
wait(Q);      wait(S);
...           ...
signal(S);    signal(Q);
signal(Q);    signal(S);

Starvation – indefinite blocking (e.g., with a LIFO queue): a process may never be removed from the semaphore queue in which it is suspended. Priority inversion – a scheduling problem where a lower-priority process holds a lock needed by a higher-priority process; solved via a priority-inheritance protocol (the lower-priority process inherits the higher priority until it finishes using the resource R needed by the higher-priority process).

35 Classical Problems of Synchronization
Classical problems are used to test newly proposed synchronization schemes; the solutions below use semaphores (or mutex locks). Bounded-Buffer Problem Readers and Writers Problem Dining-Philosophers Problem

36 Bounded-Buffer Problem
Bounded-Buffer Problem N buffers, each able to hold one item. Semaphore mutex, initialized to 1, provides mutual exclusion for accesses to the buffer pool. Semaphore full, initialized to 0, counts the # of full buffers. Semaphore empty, initialized to N, counts the # of empty buffers.

37 Bounded Buffer Problem with Semaphore
Bounded Buffer Problem with Semaphore The structure of the producer process:

do {
    // produce an item in next_produced
    wait(empty);
    wait(mutex);
    // add the item to the buffer
    signal(mutex);
    signal(full);
} while (TRUE);

The structure of the consumer process:

do {
    wait(full);
    wait(mutex);
    // remove an item from buffer to next_consumed
    signal(mutex);
    signal(empty);
    // consume the item in next_consumed
} while (TRUE);
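Filled in with POSIX primitives, the skeleton above might look like this sketch (a Pthreads mutex stands in for the binary semaphore mutex; empty_slots, full_slots, produce, and consume are illustrative names):

#include <pthread.h>
#include <semaphore.h>

#define BUFFER_SIZE 10

static int buffer[BUFFER_SIZE];
static int in = 0, out = 0;

static sem_t empty_slots;            /* counts empty buffers, init N */
static sem_t full_slots;             /* counts full buffers, init 0  */
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

void produce(int item) {
    sem_wait(&empty_slots);          /* wait(empty) */
    pthread_mutex_lock(&mutex);      /* wait(mutex) */
    buffer[in] = item;               /* add the item to the buffer */
    in = (in + 1) % BUFFER_SIZE;
    pthread_mutex_unlock(&mutex);    /* signal(mutex) */
    sem_post(&full_slots);           /* signal(full) */
}

int consume(void) {
    sem_wait(&full_slots);           /* wait(full) */
    pthread_mutex_lock(&mutex);      /* wait(mutex) */
    int item = buffer[out];          /* remove an item from the buffer */
    out = (out + 1) % BUFFER_SIZE;
    pthread_mutex_unlock(&mutex);    /* signal(mutex) */
    sem_post(&empty_slots);          /* signal(empty) */
    return item;
}

/* during setup:
   sem_init(&empty_slots, 0, BUFFER_SIZE);
   sem_init(&full_slots, 0, 0);      */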

38 Readers-Writers Problem
A database is shared among several concurrent processes. Readers – only read the database; they do not perform any updates. Writers – can update (both read and write) the database. Simultaneous access by a writer and any other process causes chaos, so writers must have exclusive access to the shared database while writing to it. This synchronization problem is the readers-writers problem: allow multiple readers to read at the same time, but only a single writer may access the shared data, and exclusively. Shared data: semaphore mutex, initialized to 1, ensures mutual exclusion when read_count is updated. Semaphore rw_mutex, initialized to 1, common to both reader and writer processes, functions as a mutual-exclusion semaphore for the writers; it is used by the first reader entering and the last reader leaving the critical section, and is not used by readers while other readers are in their critical sections. Integer read_count, initialized to 0, keeps track of how many processes are currently reading the object.

39 Readers-Writers Problem (Cont.)
Readers-Writers Problem (Cont.) The structure of a writer process:

do {
    wait(rw_mutex);
    // writing is performed
    signal(rw_mutex);
} while (TRUE);

The structure of a reader process:

do {
    wait(mutex);             // down, to access read_count
    read_count++;
    if (read_count == 1)     // if first process to read
        wait(rw_mutex);      // wait until no writer is active
    signal(mutex);           // up, releasing read_count
    // reading is performed
    wait(mutex);             // down mutex
    read_count--;
    if (read_count == 0)     // if no more readers
        signal(rw_mutex);    // allow a writer
    signal(mutex);           // up mutex
} while (TRUE);

40 Readers-Writers Problem Variations
Several variations of how readers and writers are treated – all involve priorities. First variation – no reader is kept waiting unless a writer has already obtained permission to use the shared object. Second variation – once a writer is ready, it performs its write as soon as possible. Both may cause starvation, leading to even more variations: the first → writer starvation; the second → reader starvation. The problem is solved on some systems by the kernel providing reader-writer locks, which specify the mode of the lock: either read or write access. Only reading? → acquire the reader-writer lock in read mode. Wishing to modify? → request the lock in write mode. Most useful for applications that have more readers than writers, since reader-writer locks cost more to establish than semaphores or mutual-exclusion locks and the increased reader concurrency compensates for that overhead.
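Pthreads exposes exactly such a kernel-backed reader-writer lock. A minimal sketch of the two roles:

#include <pthread.h>

static pthread_rwlock_t db_lock = PTHREAD_RWLOCK_INITIALIZER;

void reader(void) {
    pthread_rwlock_rdlock(&db_lock);   /* many readers may hold this */
    /* reading is performed */
    pthread_rwlock_unlock(&db_lock);
}

void writer(void) {
    pthread_rwlock_wrlock(&db_lock);   /* exclusive access for writing */
    /* writing is performed */
    pthread_rwlock_unlock(&db_lock);
}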

41 Dining-Philosophers Problem
Dining-Philosophers Problem Each philosopher must alternately think and eat. Originally formulated in 1965 by E. W. Dijkstra as a problem of competing access to tape-drive peripherals (Wiki). The philosophers don't interact with their neighbors; they occasionally try to pick up the 2 (left and right) chopsticks, one at a time, to eat from the bowl. A philosopher needs both chopsticks to eat and releases both when done. In the case of 5 philosophers, the shared data are: the bowl of rice (the data set, an infinite supply) and semaphore chopstick[5], with each element initialized to 1.

42 Dining-Philosophers Problem Algorithm
Dining-Philosophers Problem Algorithm The structure of philosopher i:

do {
    wait(chopstick[i]);              // wait for the left chopstick
    wait(chopstick[(i + 1) % 5]);    // wait for the right chopstick
    // eat for a while
    signal(chopstick[i]);
    signal(chopstick[(i + 1) % 5]);
    // think for a while
} while (TRUE);

What is the problem with this algorithm? Deadlock: all five philosophers may simultaneously grab their left chopsticks and be delayed forever. Starvation is also possible.

43 Problems with Semaphores
Incorrect use of semaphore operations causes various types of errors. The typical critical-section solution with a semaphore (mutex, initialized to 1) is wait(mutex) ... signal(mutex). What about signal(mutex) ... wait(mutex)? → a mutual-exclusion violation, since several processes can be in their critical sections at once, discovered only if it actually happens. wait(mutex) ... wait(mutex) → deadlock. Omitting wait(mutex) or signal(mutex) (or both) → mutual-exclusion violation or deadlock. As a solution to this incorrect use of semaphores, a high-level language construct, the monitor type, has been developed.

44 Monitors A high-level abstraction that provides a convenient and effective mechanism for process synchronization. A monitor type is an abstract data type: a collection of shared variables and operations on those variables, accessible only by code within the monitor's procedures. Only one process (thread) may be active within the monitor at a time → implicit mutual exclusion; there is no need to code the synchronization constraint explicitly.

monitor monitor-name {
    // shared variable declarations
    procedure P1 (...) { .... }
    ...
    procedure Pn (...) { ...... }
    initialization code (...) { ... }
}

But monitors are not powerful enough to model some synchronization schemes; a tailor-made synchronization scheme needs an additional explicit mechanism, provided by the condition construct.

45 Schematic view of a Monitor
Schematic view of a Monitor

46 Condition Variables

Monitor condition variables:
condition x, y;    // condition-type variables
Only two operations are allowed on a condition variable:
x.wait() – the process invoking the operation is suspended until x.signal() is invoked; it is placed at the end of the condition variable's queue and releases access to the monitor.
x.signal() – resumes one of the processes (if any) that invoked x.wait(); if no process is suspended in x.wait() on the variable, it has no effect.
c.f. signal() on a semaphore always affects the state of the semaphore.
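Pthreads condition variables follow the signal-and-continue choice discussed on the Condition Variables Choices slide below, which is why a wait is conventionally wrapped in a while loop that re-tests the predicate. A monitor-style sketch (condition_holds, waiter, and signaler are illustrative names):

#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t monitor = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  x       = PTHREAD_COND_INITIALIZER;
static bool condition_holds = false;      /* illustrative predicate */

void waiter(void) {
    pthread_mutex_lock(&monitor);         /* enter the monitor */
    while (!condition_holds)              /* re-test after resumption */
        pthread_cond_wait(&x, &monitor);  /* x.wait(): releases monitor */
    /* ... proceed while holding the monitor lock ... */
    pthread_mutex_unlock(&monitor);
}

void signaler(void) {
    pthread_mutex_lock(&monitor);
    condition_holds = true;
    pthread_cond_signal(&x);              /* x.signal(): resume one waiter */
    pthread_mutex_unlock(&monitor);       /* signal-and-continue */
}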

47 Monitor with Condition Variables

48 Condition Variables Choices
Condition Variables Choices If process P invokes x.signal() while Q is suspended in x.wait(), what should happen next? If Q is resumed, then P must wait within the monitor; otherwise, both P and Q would be active simultaneously within the monitor. Two possibilities exist: Signal and wait – P waits until Q leaves the monitor, or waits for another condition. Signal and continue – Q waits until P leaves the monitor, or waits for another condition. Both have pros and cons – the language implementer can decide. Monitors implemented in Concurrent Pascal adopt a compromise: P, upon executing signal(), immediately leaves the monitor, and hence Q is resumed (signal-and-leave). This compromise is also implemented in other languages, including Java and C#.

49 Dining-Philosophers Solution Using Monitors
Dining-Philosophers Solution Using Monitors Deadlock-free solution to the dining-philosophers problem Restriction: A philosopher may pick up her chopsticks only if both of them are available. Additional data structure to distinguish the philosophers’ state enum {THINKING, HUNGRY, EATING} state[5]; //Three states of a philosopher condition self[5]; Each philosopher i invokes the operations pickup() and putdown() in the following sequence: DiningPhilosophers.pickup(i); EAT DiningPhilosophers.putdown(i); No deadlock, but starvation is possible!

50 Solution to DP using Monitor(Cont.)
Solution to DP using Monitor (Cont.)

monitor DiningPhilosophers {
    enum { THINKING, HUNGRY, EATING } state[5];
    condition self[5];
    void pickup(int i);
    void putdown(int i);
    void test(int i);
    void initialization_code();
}

51 Solution to DP using Monitor(Cont.)
Solution to DP using Monitor (Cont.)

monitor DiningPhilosophers {
    enum { THINKING, HUNGRY, EATING } state[5];
    condition self[5];

    void pickup(int i) {
        state[i] = HUNGRY;
        test(i);
        if (state[i] != EATING)
            self[i].wait();
    }

    void putdown(int i) {
        state[i] = THINKING;
        // test left and right neighbors
        test((i + 4) % 5);
        test((i + 1) % 5);
    }

    void test(int i) {
        if ((state[(i + 4) % 5] != EATING) &&
            (state[i] == HUNGRY) &&
            (state[(i + 1) % 5] != EATING)) {
            state[i] = EATING;
            self[i].signal();
        }
    }

    void initialization_code() {
        for (int i = 0; i < 5; i++)
            state[i] = THINKING;
    }
}

52 Monitor Implementation Using Semaphores
Variables:

semaphore mutex;     // (initially = 1), provided for each monitor
semaphore next;      // (initially = 0), used by signaling processes to
                     // suspend themselves and wait until the resumed
                     // process either leaves or waits (signal-and-wait)
int next_count = 0;  // counts the # of processes suspended on next

Each external function F will be replaced by:

wait(mutex);         // before entering the monitor
    body of F;
if (next_count > 0)  // if there is a process suspended on next
    signal(next);
else
    signal(mutex);   // after leaving the monitor

Mutual exclusion within a monitor is implicitly ensured.

53 Condition Variables Implementation in Monitor
For each condition variable x, we have:

semaphore x_sem;     // (initially = 0)
int x_count = 0;

The operation x.wait() can be implemented as:

x_count++;
if (next_count > 0)      // unlock the monitor for a process
    signal(next);        // suspended on next, or else for
else                     // processes waiting outside the monitor
    signal(mutex);
wait(x_sem);             // wait on the CV queue
x_count--;               // taken out of the CV queue

The operation x.signal() can be implemented as:

if (x_count > 0) {       // only if processes wait on the CV
    next_count++;
    signal(x_sem);
    wait(next);          // signal-and-wait: causes the signaling
                         // process to block (Hoare style)
    next_count--;
}

54 Resuming Processes within a Monitor
If several processes are queued on condition x and x.signal() is executed, which should be resumed? Process-resumption order: first-come, first-served (FCFS) is used frequently, but is not adequate in some cases. The conditional-wait construct of the form x.wait(c), where c is a priority number (an integer expression), can be used instead: c is stored with the name of the waiting process and used as its priority, and the process with the lowest number (highest priority) is resumed next.

55 A Monitor to Allocate Single Resource
A Monitor to Allocate Single Resource The ResourceAllocator monitor controls the allocation of a single resource among competing processes. Each requesting process specifies the maximum time it plans to use the resource; the monitor allocates the resource to the process with the shortest time-allocation request.

monitor ResourceAllocator {
    boolean busy;
    condition x;

    void acquire(int time) {
        if (busy)
            x.wait(time);    // priority = requested time
        busy = TRUE;
    }

    void release() {
        busy = FALSE;
        x.signal();
    }

    initialization_code() {
        busy = FALSE;
    }
}

56 Synchronization Examples
Windows XP Linux Solaris Pthreads

57 Windows Synchronization
Windows OS is a multithreaded kernel supporting real-time applications and multiple processors. When the kernel accesses a global resource: on a single-processor system, it temporarily masks interrupts for all interrupt handlers that may also access the global resource; on a multiprocessor system, it protects access to global resources using spinlocks (for short code segments), and a thread is never preempted while holding a spinlock. Windows also provides dispatcher objects for thread synchronization outside the kernel, which may act as mutexes, semaphores, events, and timers. Events: an event acts much like a condition variable. Timers notify one or more threads when the set time expires. Dispatcher objects are in either the signaled state (object available) or the non-signaled state (object not available; a thread acquiring it will block).

58 Linux Synchronization
Prior to kernel version 2.6, Linux disabled interrupts (a non-preemptive kernel) to implement short critical sections; from version 2.6 on, the kernel is fully preemptive. Atomic integer operations are the simplest synchronization technique: the atomic_t type with atomic_set(), atomic_add(), atomic_sub(), atomic_inc(), and atomic_read(). Linux provides synchronization mechanisms in the kernel: mutex locks, semaphores, spinlocks, and reader-writer versions of the latter two. On SMP machines the fundamental locking mechanism is the spinlock; on single-processor systems spinlocks are inappropriate and are replaced by kernel-preemption control: preempt_disable() instead of acquiring a spinlock, preempt_enable() instead of releasing one.

59 Solaris Synchronization
Solaris Synchronization Implements a variety of locks to support multitasking, multithreading (including real-time threads), and multiprocessing. Uses an adaptive mutex for efficiency when protecting critical data items from short code segments. On an SMP system, an adaptive mutex starts as a standard semaphore implemented as a spinlock; if the data are locked and in use, there are two choices: spin or block. If the lock is held by a thread running on another CPU, the waiting thread spins; if the lock is held by a thread not in the run state, the thread blocks and sleeps, waiting for the signal that the lock is being released. Uses condition variables and semaphores for longer code segments. Uses readers-writers locks when protecting data accessed frequently but in a read-only manner (multiple readers may proceed concurrently); useful for long sections of code. Uses turnstiles, a queue structure containing threads blocked on a lock, to order the list of threads waiting to acquire either an adaptive mutex or a reader-writer lock. Turnstiles are per-lock-holding-thread, not per-object. Priority inheritance per turnstile gives the running thread the highest of the priorities of the threads in its turnstile.

60 Pthreads Synchronization
The Pthreads API is OS-independent. It provides: mutex locks, the fundamental synchronization technique used with Pthreads, and condition variables. Non-portable extensions include semaphores and spinlocks.

61 The Deadlock Problem In a multiprogramming environment, a deadlock is a set of blocked processes, each holding a resource and waiting to acquire a resource held by another process in the set. Example: a system has 2 disk drives; P1 and P2 each hold one disk drive and each needs the other one. Example with semaphores A and B, initialized to 1:

P0          P1
wait(A);    wait(B);
wait(B);    wait(A);

62 Bridge Crossing Example
Traffic flows only in one direction. Each section of a bridge can be viewed as a resource. If a deadlock occurs, it can be resolved if one car backs up (preempt resources and roll back); several cars may have to be backed up if a deadlock occurs. Starvation is possible. Note – most OSes do not prevent or deal with deadlocks.

63 System Model A system consists of a finite number of resources to be distributed among a number of competing processes. Resource types R1, R2, ..., Rm (CPU cycles, memory space, files, I/O devices). Each resource type Ri has Wi instances. Each process utilizes a resource only in the following sequence: request, use, release.

64 Deadlock Characterization
Deadlock can arise if the following four conditions hold simultaneously. Mutual exclusion: only one process at a time can use a resource (the resource is nonsharable). Hold and wait: a process holding at least one resource is waiting to acquire additional resources held by other processes. No preemption: a resource can be released only voluntarily by the process holding it, after that process has completed its task. Circular wait: there exists a set {P0, P1, ..., Pn} of waiting processes such that P0 is waiting for a resource that is held by P1, P1 is waiting for a resource that is held by P2, ..., Pn-1 is waiting for a resource that is held by Pn, and Pn is waiting for a resource that is held by P0.

65 Simple Resource Deadlock

66 Related: Indefinite postponement
Related: Indefinite postponement Also called indefinite blocking or starvation. Occurs due to biases in a system's resource-scheduling policies. Aging: a technique that prevents indefinite postponement by increasing a process's priority as it waits for a resource.

67 Resource-Allocation Graph
Resource-Allocation Graph A set of vertices V and a set of edges E. V is partitioned into two types: P = {P1, P2, ..., Pn}, the set consisting of all the processes in the system, and R = {R1, R2, ..., Rm}, the set consisting of all resource types in the system. Each edge in E is directed between a process and a resource type: request edge – directed edge Pi → Rj; assignment edge – directed edge Rj → Pi.

68 Resource-Allocation Graph (Cont.)
Resource-Allocation Graph (Cont.) [Figure: notation. A circle represents a process Pi; a rectangle represents a resource type Rj, with one dot per instance (here 4 instances). Request edge – directed edge Pi → Rj: Pi requests an instance of Rj. Assignment edge – directed edge Rj → Pi: Pi is holding an instance of Rj.]

69 Example of a Resource Allocation Graph

70 Resource Allocation Graph With A Deadlock
Resource Allocation Graph With A Deadlock

71 Graph With A Cycle But No Deadlock
Graph With A Cycle But No Deadlock

72 Basic Facts If the graph contains no cycles → no deadlock. If the graph contains a cycle → with only one instance per resource type, deadlock; with several instances per resource type, the possibility of deadlock.

73 Methods for Handling Deadlocks
Prevent or avoid deadlocks by ensuring that the system will never enter a deadlock state. Allow the system to enter a deadlock state and then recover. Ignore the problem and pretend that deadlocks never occur in the system; this approach is used by most operating systems, including UNIX.

74 Dealing with Deadlock
Deadlock prevention
Deadlock avoidance
Deadlock detection
Deadlock recovery

75 Deadlock Prevention Includes a set of methods for ensuring that at least one of the necessary conditions cannot hold; prevent deadlocks by constraining how requests for resources can be made. To prevent Mutual Exclusion – not required for sharable resources; must hold for nonsharable resources. To prevent Hold and Wait – must guarantee that whenever a process requests a resource, it does not hold any other resources: require each process to request and be allocated all its resources before it begins execution, or allow a process to request resources only when it has none. Low resource utilization; starvation possible.

76 Deadlock Prevention (Cont.)
To prevent No Preemption – if a process that is holding some resources requests another resource that cannot be immediately allocated to it, then all resources it currently holds are released. Preempted resources are added to the list of resources for which the process is waiting; the process will be restarted only when it can regain its old resources as well as the new ones it is requesting. To prevent Circular Wait – impose a total ordering of all resource types, and require that each process requests resources in an increasing order of enumeration.

77 Deadlock Avoidance Requires that the system have some additional information concerning which resources a process will request and use during its lifetime. The simplest and most useful model requires that each process declare the maximum number of resources of each type that it may need. The deadlock-avoidance algorithm dynamically examines the resource-allocation state to ensure that there can never be a circular-wait condition. The resource-allocation state is defined by the number of available and allocated resources and the maximum demands of the processes.

78 Chapter 6: CPU Scheduling
Rev. by Kyungeun Park, 2015

79 Chapter 6: CPU Scheduling
Basic Concepts Scheduling Criteria Scheduling Algorithms Thread Scheduling Multiple-Processor Scheduling Real-Time CPU Scheduling Operating Systems Examples Algorithm Evaluation

80 Objectives To introduce CPU scheduling, which is the basis for multiprogrammed operating systems To describe various CPU-scheduling algorithms To discuss evaluation criteria for selecting a CPU-scheduling algorithm for a particular system To examine the scheduling algorithm of several operating systems

81 Basic Concepts Maximum CPU utilization obtained with multiprogramming
CPU–I/O Burst Cycle – process execution consists of a cycle of CPU execution and I/O wait. CPU burst: the amount of time a process uses the CPU; long bursts mean the process is CPU-bound, short bursts that it is I/O-bound. Each CPU burst is followed by an I/O burst. The CPU-burst distribution is the key factor in enhancing CPU utilization.

82 Histogram of CPU-burst Times
Histogram of CPU-burst Times [Figure: histogram of burst durations; curves skewed toward many short bursts indicate an I/O-bound program, toward a few long bursts a CPU-bound program.] Generally, there are a large number of short CPU bursts and a small number of long CPU bursts.

83 CPU Scheduler (short-term scheduler)
The short-term scheduler selects a process from the processes in the ready queue and allocates the CPU to that process. The queue may be ordered in various ways: FIFO queue, priority queue, tree, etc. CPU-scheduling decisions are needed when a process:
1. Switches from running to waiting state (I/O request or invocation of wait()): no choice
2. Switches from running to ready state (interrupt request): choice*
3. Switches from waiting to ready (completion of I/O): choice*
4. Terminates: no choice
Non-preemptive scheduling scheme: scheduling only under 1 and 4. Preemptive scheduling scheme: all other cases, e.g., by allowing interrupts; the OS may then need to accept interrupts at almost all times. Careful consideration is needed for: access to shared data, preemption while in kernel mode, and interrupts occurring during crucial OS activities.

84 Process States (from Chapter 3)
[Figure: process state diagram. Transitions 1 (running → waiting) and 4 (termination) permit only nonpreemptive scheduling; transitions 2 (running → ready) and 3 (waiting → ready) give preemptive scheduling choices.] Many processes are in the ready or waiting states in multiprogrammed operating systems.

85 Handling Preemption in Designing OS
A preemptive scheduling scheme causes the following: a cost associated with access to shared data by two processes (an inconsistent data view), and it affects the design of the OS kernel: a process preempted in the middle of changes to important kernel data (e.g., I/O queues) leaves an inconsistent view to a kernel trying to read or modify the same data. Handled by waiting either for a system call to complete or for an I/O block to take place before a context switch, and, when the OS needs to accept interrupts during crucial activities, by disabling interrupts at entry and re-enabling them at exit.

86 Dispatcher The dispatcher module gives control of the CPU to the process selected by the short-term scheduler; this involves: switching context, switching to user mode, and jumping to the proper location in the user program to restart that program. Dispatch latency – the time it takes for the dispatcher to stop one process and start another running.

87 Context Switch (from Chapter 3)
Dispatch latency

88 Scheduling Criteria Different CPU-scheduling algorithms have different properties. Scheduling criteria: CPU utilization – keep the CPU as busy as possible. Throughput – # of processes that complete their execution per time unit. Turnaround time – amount of time to execute a particular process (generally limited by the speed of the output device). Waiting time – amount of time a process has been waiting in the ready queue. Response time – amount of time from when a request was submitted until the first response is produced, not until the output is complete (for time-sharing environments / interactive systems).

89 Scheduling Algorithm Optimization Criteria
Max CPU utilization Max throughput Min turnaround time Min waiting time Min response time In most cases, the best policy is to optimize the average measure. For interactive systems, it is more important to minimize the variance in the response time than to minimize the average response time.

90 Scheduling Algorithms
Scheduling Algorithms First-come, First-Served Scheduling Shortest-Job-First Scheduling Priority Scheduling Round-Robin Scheduling Multilevel Queue Scheduling

91 First-Come, First-Served (FCFS) Scheduling
First-Come, First-Served (FCFS) Scheduling

Process    Burst Time
P1         24
P2         3
P3         3

Suppose that the processes arrive in the order: P1, P2, P3. The Gantt chart for the schedule is:

| P1 (0-24) | P2 (24-27) | P3 (27-30) |

Waiting time for P1 = 0; P2 = 24; P3 = 27. Average waiting time: (0 + 24 + 27)/3 = 17. The FCFS policy is easily implemented with a FIFO queue; simple to write and understand. The FCFS scheduling algorithm is non-preemptive: the CPU is released only when a process terminates or requests I/O. This can cause trouble for time-sharing systems.

92 FCFS Scheduling (Cont.)
FCFS Scheduling (Cont.) Suppose that the processes arrive in the order: P2, P3, P1. The Gantt chart for the schedule is:

| P2 (0-3) | P3 (3-6) | P1 (6-30) |

Waiting time for P1 = 6; P2 = 0; P3 = 3. Average waiting time: (6 + 0 + 3)/3 = 3, much better than the previous case. Convoy effect – short processes stuck behind a long process: consider one CPU-bound and many I/O-bound processes; all the I/O-bound processes wait for the one big (CPU-bound) process to get off the CPU, resulting in lower CPU and device utilization.
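The two averages can be checked mechanically. A small sketch that replays FCFS over a given service order (fcfs_avg_wait is an illustrative helper):

#include <stdio.h>

/* Average FCFS waiting time for bursts served in array order. */
static double fcfs_avg_wait(const int burst[], int n) {
    int wait = 0, total = 0;
    for (int i = 0; i < n; i++) {
        total += wait;         /* process i waits for all earlier bursts */
        wait  += burst[i];
    }
    return (double)total / n;
}

int main(void) {
    int order1[] = {24, 3, 3}; /* P1, P2, P3 */
    int order2[] = {3, 3, 24}; /* P2, P3, P1 */
    printf("%.1f\n", fcfs_avg_wait(order1, 3));   /* prints 17.0 */
    printf("%.1f\n", fcfs_avg_wait(order2, 3));   /* prints 3.0  */
    return 0;
}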

93 Shortest-Job-First (SJF) Scheduling
Shortest-Job-First (SJF) Scheduling Associates each process with the length of its next CPU burst Use these lengths to schedule the process with the shortest time SJF is optimal – gives minimum average waiting time for a given set of processes The difficulty is to know the length of the next CPU request Could ask the user

94 Example of SJF

Process    Burst Time
P1         6
P2         8
P3         7
P4         3

SJF scheduling chart:

| P4 (0-3) | P1 (3-9) | P3 (9-16) | P2 (16-24) |

Average waiting time = (3 + 16 + 9 + 0) / 4 = 7

95 Determining Length of Next CPU Burst
Determining Length of Next CPU Burst Can only estimate the length – it should be similar to the previous ones; then pick the process with the shortest predicted next CPU burst. The estimate can be computed from the lengths of previous CPU bursts, using exponential averaging:

τn+1 = α · tn + (1 − α) · τn

tn : the length of the nth CPU burst (contains the most recent information)
τn : the predicted value (stores the past history)
α : controls the relative weight of recent and past history in the prediction; commonly set to ½

The preemptive SJF version is sometimes called shortest-remaining-time-first scheduling.

96 Prediction of the Length of the Next CPU Burst
Prediction of the Length of the Next CPU Burst [Figure: with α set to ½ and τ0 = 10, the predictions track the observed bursts: τ1 = ½·6 + ½·10 = 8, τ2 = ½·4 + ½·8 = 6, ...]

97 Examples of Exponential Averaging
Examples of Exponential Averaging

α = 0: τn+1 = τn = τn-1 = ... = τ0; recent history does not count (has no effect), and current conditions are assumed to be transient.
α = 1: τn+1 = tn; only the actual last CPU burst counts, and history is assumed to be old and irrelevant.

If we expand the formula, we get:

τn+1 = α tn + (1 − α) α tn-1 + (1 − α)^2 α tn-2 + ... + (1 − α)^j α tn-j + ... + (1 − α)^n α t0 + (1 − α)^(n+1) τ0

Since both α and (1 − α) are less than or equal to 1, each successive term has less weight than its predecessor: recent history is more influential in estimating the next CPU burst.
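The recurrence is one line of C; replaying the figure's burst sequence reproduces its predictions (τ0 = 10 and α = ½ give τ1 = 8, τ2 = 6, ...):

#include <stdio.h>

/* tau_next = alpha * t_n + (1 - alpha) * tau_n */
static double predict(double alpha, double t_n, double tau_n) {
    return alpha * t_n + (1.0 - alpha) * tau_n;
}

int main(void) {
    double alpha = 0.5;
    double tau = 10.0;                           /* initial guess tau_0 */
    double bursts[] = {6, 4, 6, 4, 13, 13, 13};  /* observed t_n values */
    for (int i = 0; i < 7; i++) {
        tau = predict(alpha, bursts[i], tau);
        printf("tau_%d = %.2f\n", i + 1, tau);
    }
    return 0;
}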

98 Example of Shortest-remaining-time-first
Example of Shortest-remaining-time-first Now we add the concepts of varying arrival times and preemption to the analysis.

Process    Arrival Time    Burst Time
P1         0               8
P2         1               4
P3         2               9
P4         3               5

Preemptive SJF Gantt chart:

| P1 (0-1) | P2 (1-5) | P4 (5-10) | P1 (10-17) | P3 (17-26) |

Average waiting time = [(10-1) + (1-1) + (17-2) + (5-3)]/4 = 26/4 = 6.5 msec

99 Priority Scheduling A priority number (integer) is associated with each process. The CPU is allocated to the process with the highest priority (smallest integer → highest priority). Can be preemptive or nonpreemptive. SJF is priority scheduling where priority is the inverse of the predicted next CPU burst time: the larger the CPU burst, the lower the priority. Problem → starvation: low-priority processes may never execute. Solution → aging: as time progresses, increase the priority of the process.

100 Example of Priority Scheduling
Example of Priority Scheduling

Process    Burst Time    Priority
P1         10            3
P2         1             1
P3         2             4
P4         1             5
P5         5             2

Priority scheduling Gantt chart:

| P2 (0-1) | P5 (1-6) | P1 (6-16) | P3 (16-18) | P4 (18-19) |

Average waiting time = [6 + 0 + 16 + 18 + 1]/5 = 41/5 = 8.2 msec

101 Round Robin (RR) Designed especially for time-sharing systems. Each process gets a small unit of CPU time (a time quantum q), usually 10-100 milliseconds. After this time has elapsed, the process is preempted and added to the end of the ready queue. If there are n processes in the ready queue and the time quantum is q, then each process gets 1/n of the CPU time in chunks of at most q time units at once; no process waits more than (n-1)q time units until its next time quantum. A timer interrupts every quantum to schedule the next process.

102 Example of RR with Time Quantum = 4
Process    Burst Time
P1         24
P2         3
P3         3

The Gantt chart is:

| P1 (0-4) | P2 (4-7) | P3 (7-10) | P1 (10-14) | P1 (14-18) | P1 (18-22) | P1 (22-26) | P1 (26-30) |

103 Time Quantum and Context Switch Time
Time Quantum and Context Switch Time Performance depends highly on the size of the time quantum q. q large → behaves the same as the FCFS policy. q small → processor sharing, resulting in a large number of context switches. q must be large with respect to the context-switch time, otherwise the overhead is too high. RR typically gives higher average turnaround than SJF, but better response. In practice, q is usually 10 to 100 milliseconds, while a context switch typically takes less than 10 microseconds.

104 Turnaround Time Varies With The Time Quantum
Turnaround Time Varies With The Time Quantum Turnaround time depends on the size of the time quantum, but does not necessarily improve as the time-quantum size increases. In general, the average turnaround time can be improved if most processes finish their next CPU burst in a single time quantum. A rule of thumb: 80 percent of the CPU bursts should be shorter than the time quantum q.

105 Multilevel Queue The ready queue is partitioned into separate queues, used when processes are easily classified into different groups with different response-time requirements, e.g.: foreground (interactive) processes, which have priority over background processes, and background (batch) processes. Each process is permanently assigned to a given queue (multilevel queue scheduling); processes do not move from one queue to the other: low scheduling overhead, but inflexible. Each queue has its own scheduling algorithm: foreground – RR; background – FCFS. Scheduling must also be done between the queues. Fixed-priority scheduling (i.e., serve all from foreground, then from background) gives each queue absolute priority over lower-priority queues, with the possibility of starvation. Alternative solution: time-slice scheduling – each queue gets a certain amount of CPU time which it can schedule among its processes, e.g., 80% to foreground with RR scheduling and 20% to background on an FCFS basis.

106 Multilevel Queue Scheduling
Multilevel Queue Scheduling Multilevel queue scheduling algorithm partitions the ready queue into several separate queues.

107 Multilevel Feedback Queue
Multilevel Feedback Queue A process can move between the various queues; aging can be implemented this way Idea: separate processes according to the characteristics of their CPU bursts Process using too much CPU time : lower-priority queue I/O-bound and interactive processes : higher-priority queue Aging: process that waits too long in a lower-priority queue may be moved to a higher-priority queue. Multilevel-feedback-queue scheduler defined by the following parameters: number of queues scheduling algorithms for each queue method used to determine when to upgrade a process method used to determine when to downgrade a process method used to determine which queue a process will enter when that process needs service

108 Example of Multilevel Feedback Queue
Three queues: Q0 – RR with time quantum 8 milliseconds; Q1 – RR with time quantum 16 milliseconds; Q2 – FCFS. Scheduling: a new job enters queue Q0, where jobs are served in FCFS order. When it gains the CPU, the job receives 8 milliseconds; if it does not finish in 8 milliseconds, it is moved to queue Q1. At Q1 the job is again served in FCFS order and receives 16 additional milliseconds; if it still does not complete, it is preempted and moved to queue Q2.

109 Thread Scheduling In Chapter 4 we distinguished between user-level and kernel-level threads. When kernel-level threads are supported, threads (not processes) are scheduled by the OS. In the many-to-one and many-to-many models, the thread library schedules user-level threads to run on an available LWP; scheduling by the thread library does not mean the threads are actually running on a CPU. This scheme is known as process-contention scope (PCS), since the scheduling competition is within the same process; priority is typically set by the programmer. Scheduling kernel threads onto an available CPU is system-contention scope (SCS): competition among all threads in the system. Systems using the one-to-one model (Windows, Linux, and Solaris) schedule threads using only SCS. Thus one distinction between user-level and kernel-level threads lies in how they are scheduled.

110 Pthread Scheduling The API allows specifying either PCS or SCS during thread creation: PTHREAD_SCOPE_PROCESS schedules threads using PCS scheduling; PTHREAD_SCOPE_SYSTEM schedules threads using SCS scheduling. On systems implementing the many-to-many model, the PTHREAD_SCOPE_PROCESS policy schedules user-level threads onto available LWPs, while the PTHREAD_SCOPE_SYSTEM policy will create and bind an LWP for each user-level thread, effectively mapping threads with the one-to-one policy. The choice can be limited by the OS: Linux and Mac OS X allow only PTHREAD_SCOPE_SYSTEM.

111 Pthread Scheduling API
#include <pthread.h>
#include <stdio.h>
#define NUMB_THREADS 5

void *runner(void *param);    /* forward declaration; defined below */

int main(int argc, char *argv[])
{
    int i, scope;
    pthread_t tid[NUMB_THREADS];
    pthread_attr_t attr;

    /* get the default attributes */
    pthread_attr_init(&attr);

    /* first inquire on the current scope */
    if (pthread_attr_getscope(&attr, &scope) != 0)
        fprintf(stderr, "Unable to get scheduling scope\n");
    else {
        if (scope == PTHREAD_SCOPE_PROCESS)
            printf("PTHREAD_SCOPE_PROCESS");
        else if (scope == PTHREAD_SCOPE_SYSTEM)
            printf("PTHREAD_SCOPE_SYSTEM");
        else
            fprintf(stderr, "Illegal scope value.\n");
    }

112 Pthread Scheduling API
    /* set the scheduling algorithm to PCS or SCS */
    pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);

    /* create the threads */
    for (i = 0; i < NUMB_THREADS; i++)
        pthread_create(&tid[i], &attr, runner, NULL);

    /* now join on each thread */
    for (i = 0; i < NUMB_THREADS; i++)
        pthread_join(tid[i], NULL);
}

/* Each thread will begin control in this function */
void *runner(void *param)
{
    /* do some work ... */
    pthread_exit(0);
}

113 Multiple-Processor Scheduling
Load sharing becomes possible, but CPU scheduling is more complex when multiple CPUs are available. Even with homogeneous processors there are limitations on scheduling, e.g., a specific processor with an I/O device attached to its private bus. Asymmetric multiprocessing – only one processor (the master server) handles all scheduling decisions, I/O processing, and other system activities → simple, with less need for data sharing. Symmetric multiprocessing (SMP) – each processor is self-scheduling; either all processes are in a common ready queue or each processor has its own private queue of ready processes, and the scheduler of each processor examines the queue and selects a process. Currently the most common approach. Processor affinity – a process has an affinity for the processor on which it is currently running, due to the migration cost of repopulating cache memory. Soft affinity: the OS attempts to keep a process on one processor but does not guarantee it. Hard affinity: system calls allow a process to specify that it is not to migrate to other processors (Linux: the sched_setaffinity() system call). Variations exist, such as processor sets limiting which processes can run on which CPUs (Solaris).
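On Linux, hard affinity is requested exactly as the slide says, via the sched_setaffinity() system call. A minimal sketch pinning the calling process to CPU 0:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);                 /* allow CPU 0 only */
    /* pid 0 == the calling process */
    if (sched_setaffinity(0, sizeof(set), &set) != 0)
        perror("sched_setaffinity");
    return 0;
}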

114 NUMA and CPU Scheduling
Note that the main-memory architecture of a system (e.g., NUMA) can also affect processor affinity: the CPU scheduler and the memory-placement algorithm must work together.

115 Multiple-Processor Scheduling – Load Balancing
Need to keep the workload balanced across all processors in an SMP system. Load balancing attempts to keep the workload evenly distributed. Push migration – a specific task periodically checks the load on each processor and, if it finds an imbalance, pushes tasks from the overloaded CPU to idle or less-busy processors. Pull migration – an idle processor pulls a waiting task from a busy processor. Both are often implemented in parallel on load-balancing systems.

116 Multicore Processors Recent trend: place multiple processor cores on the same physical chip → a multicore processor. Each core has its own register set to maintain its architectural state. SMP systems with multicore processors are faster and consume less power. Multicore processors complicate scheduling: a core can spend a significant amount of time waiting for data to become available from memory (a memory stall), e.g., due to a cache miss (accessing data that is not in cache memory). One remedy is multithreaded processor cores, in which two (or more) hardware threads are assigned to each core; the core takes advantage of a memory stall to make progress on another thread while the memory retrieval happens (interleaving multiple threads on a core).

117 Multithreaded Multicore System
Figure: a multithreaded multicore system with dual-threaded processor cores; by interleaving the two hardware threads, each core stays busy during memory stalls.

118 Virtualization and Scheduling
Virtualization software (the virtual machine monitor) schedules multiple guests onto the CPU(s), with each guest doing its own scheduling, not knowing that it doesn't own the CPUs. This can result in poor response time and incorrect time-of-day clocks in guests, and it can undo the good scheduling-algorithm efforts of the guests.

119 Real-Time CPU Scheduling
Real-time CPU scheduling can present obvious challenges. Soft real-time systems give no guarantee as to when a critical real-time process will be scheduled; hard real-time systems require that a task be serviced by its deadline. Event latency is the amount of time from when an event occurs to when it is serviced. Two types of latency affect the performance of real-time systems: interrupt latency, the time from the arrival of an interrupt to the start of the routine (ISR) that services it, and dispatch latency, the time for the scheduler to take the current process off the CPU (stop one process) and switch to another (start another).

120 Real-Time CPU Scheduling (Cont.)
The conflict phase of dispatch latency has two components: (1) preemption of any process running in kernel mode, and (2) release, by low-priority processes, of resources needed by high-priority processes. The dispatch phase then stops one process and starts another.

121 Priority-Based Scheduling
For real-time scheduling, the scheduler must support a priority-based scheduling algorithm with preemption, but this only guarantees soft real-time behavior; for hard real-time, it must also provide the ability to meet deadlines. Real-time processes have new characteristics: a periodic process requires the CPU at constant intervals (periods); it has a fixed processing time t, a deadline d by which it must be serviced, and a period p, with 0 ≤ t ≤ d ≤ p; the rate of a periodic task is 1/p. Figure: the execution of a periodic process over time (a worked instance follows below).
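A worked instance of the periodic-task model, with values assumed purely for illustration: a process with processing time $t = 20$, deadline $d = 50$, and period $p = 50$ satisfies

$0 \le t \le d \le p \;\Rightarrow\; 0 \le 20 \le 50 \le 50,$

and its rate is $1/p = 1/50$, i.e., one release every 50 time units.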

122 Rate Monotonic Scheduling
The rate-monotonic scheduling algorithm schedules periodic tasks using a static priority policy with preemption. Each periodic task is assigned a priority inversely proportional to its period: shorter periods get higher priority, longer periods get lower priority. Thus P1 is assigned a higher priority than P2 whenever P1 has the shorter period. A schedulability check follows below.
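A standard schedulability test for rate-monotonic scheduling, stated here as the known Liu-Layland bound: $N$ periodic tasks are guaranteed schedulable if

$U = \sum_{i=1}^{N} \frac{t_i}{p_i} \;\le\; N\,(2^{1/N} - 1).$

For two tasks the bound is $2(\sqrt{2} - 1) \approx 0.83$. With illustrative values $P_1: t_1 = 20, p_1 = 50$ and $P_2: t_2 = 35, p_2 = 100$, we get $U = 0.40 + 0.35 = 0.75 \le 0.83$, so rate-monotonic scheduling meets all deadlines.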

123 Missing Deadlines with Rate Monotonic Scheduling
Rate-monotonic scheduling can miss deadlines even when the CPU is not fully utilized, due to its static priority assignment.

124 Earliest Deadline First Scheduling (EDF)
Earliest-deadline-first scheduling assigns priorities dynamically according to deadlines: the earlier the deadline, the higher the priority; the later the deadline, the lower the priority. Unlike rate-monotonic scheduling, processes need not be periodic, nor need they require a constant amount of CPU time per burst; however, a process must announce its deadline to the scheduler when it becomes runnable (a utilization check follows below).
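For periodic tasks whose deadlines equal their periods, EDF is optimal: the task set is schedulable exactly when total utilization does not exceed 1. With illustrative values $P_1: t_1 = 25, p_1 = 50$ and $P_2: t_2 = 35, p_2 = 80$,

$U = \frac{25}{50} + \frac{35}{80} = 0.5 + 0.4375 = 0.9375 \le 1,$

so EDF meets all deadlines even though $U$ exceeds the two-task rate-monotonic bound of about 0.83.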

125 Proportional Share Scheduling
T shares of processor time are allocated among all processes in the system. An application receives N shares, where N < T, ensuring that it receives N/T of the total processor time. An admission controller denies a new process entry into the system when sufficient shares are not available (worked numbers below).
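Illustrative numbers, assumed rather than taken from the slide: with $T = 100$ shares, suppose process $A$ holds 50 shares, $B$ holds 15, and $C$ holds 20, so $A$ is guaranteed $50/100 = 50\%$ of the processor time. The $50 + 15 + 20 = 85$ allocated shares leave only 15 free, so the admission controller would deny a new process requesting 30 shares.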

126 POSIX Real-Time Scheduling
The POSIX.1b standard provides an API for managing real-time threads. It defines two scheduling classes for real-time threads: SCHED_FIFO, in which threads are scheduled using a FCFS strategy with a FIFO queue and there is no time slicing among threads of equal priority, and SCHED_RR, which is similar to SCHED_FIFO except that time slicing occurs among threads of equal priority. It also defines two functions for getting and setting the scheduling policy: pthread_attr_getschedpolicy(pthread_attr_t *attr, int *policy) and pthread_attr_setschedpolicy(pthread_attr_t *attr, int policy).

127 POSIX Real-Time Scheduling API
#include <pthread.h>
#include <stdio.h>
#define NUMB_THREADS 5

void *runner(void *param); /* forward declaration */

int main(int argc, char *argv[])
{
    int i, policy;
    pthread_t tid[NUMB_THREADS];
    pthread_attr_t attr;
    /* get the default attributes */
    pthread_attr_init(&attr);
    /* get the current scheduling policy */
    if (pthread_attr_getschedpolicy(&attr, &policy) != 0)
        fprintf(stderr, "Unable to get policy.\n");
    else {
        if (policy == SCHED_OTHER)
            printf("SCHED_OTHER\n");
        else if (policy == SCHED_RR)
            printf("SCHED_RR\n");
        else if (policy == SCHED_FIFO)
            printf("SCHED_FIFO\n");
    }

128 POSIX Real-Time Scheduling API (Cont.)
    /* set the scheduling policy - FIFO, RR, or OTHER */
    if (pthread_attr_setschedpolicy(&attr, SCHED_FIFO) != 0)
        fprintf(stderr, "Unable to set policy.\n");
    /* create the threads */
    for (i = 0; i < NUMB_THREADS; i++)
        pthread_create(&tid[i], &attr, runner, NULL);
    /* now join on each thread */
    for (i = 0; i < NUMB_THREADS; i++)
        pthread_join(tid[i], NULL);
    return 0;
}

/* Each thread will begin control in this function */
void *runner(void *param)
{
    /* do some work ... */
    pthread_exit(0);
}

129 Operating System Examples
Linux scheduling Windows XP scheduling Solaris scheduling

130 Linux Scheduling Through Ver. 2.5
Prior to kernel version 2.5, Linux ran a variation of the standard UNIX scheduling algorithm; from version 2.5 on, it used a scheduler with constant-order O(1) scheduling time. That scheduler was preemptive and priority-based, with two priority ranges: real-time (0 to 99) and nice values (100 to 140), mapped into a global priority in which numerically lower values indicate higher priority. Unlike schedulers for many other systems, higher priority got a larger quantum q. A task remained runnable as long as time was left in its time slice (active); once no time was left (expired), it was not runnable until all other tasks had used their slices. All runnable tasks were tracked in a per-CPU runqueue data structure holding two priority arrays (active and expired), with tasks indexed by priority; when the active array emptied, the two arrays were exchanged. This worked well, but gave poor response times for interactive processes.

131 Linux Scheduling in Version 2.6.23 +
Since version 2.6.23, Linux has used the Completely Fair Scheduler (CFS), based on scheduling classes. Each class has a specific priority and its own scheduling algorithm; the scheduler picks the highest-priority task in the highest-priority scheduling class. Two scheduling classes are included (others can be added): a default class using the CFS scheduling algorithm and a real-time class. Rather than a quantum based on fixed time allotments, CFS allocates a proportion of CPU time, calculated from the nice value (from -20 to +19; a lower value means higher priority). CFS calculates a target latency, the interval of time during which every runnable task should run at least once; the target latency can increase if the number of active tasks increases. Rather than directly assigning priorities, the CFS scheduler maintains a per-task virtual run time in the variable vruntime, with a decay factor based on the priority of the task: the lower the priority, the higher the decay rate. Normal-priority tasks (nice value 0) are assigned virtual run time equal to actual run time. To decide which task runs next, the scheduler picks the task with the lowest virtual run time (a minimal sketch of this rule follows below).
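A minimal sketch of that rule; this is not kernel code, and the struct is invented for illustration, though NICE_0_LOAD = 1024 matches the kernel's weight for a nice-0 task:

#define NICE_0_LOAD 1024ULL

struct task {
    unsigned long long weight;    /* load weight derived from the nice value */
    unsigned long long vruntime;  /* virtual run time, in nanoseconds */
};

/* Charge delta_exec nanoseconds of real CPU time to a task: vruntime
 * advances at a rate inversely proportional to the task's weight, so
 * heavier (higher-priority) tasks accumulate vruntime more slowly. */
void account(struct task *t, unsigned long long delta_exec)
{
    t->vruntime += delta_exec * NICE_0_LOAD / t->weight;
}

For a nice-0 task (weight 1024), vruntime equals actual run time; picking the runnable task with the lowest vruntime then realizes the fairness rule above.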

132 CFS Performance Only runnable tasks are added to a balanced binary search tree (a red-black tree) whose key is the value of vruntime; the leftmost node has the smallest vruntime and is the next task to run.

133 Linux Scheduling (Cont.)
Real-time scheduling according to the POSIX standard: SCHED_FIFO and SCHED_RR. Linux has two separate priority ranges: real-time tasks have static priorities (0-99), and normal tasks occupy the range 100-139. The two ranges map into a global priority scheme in which lower values indicate higher relative priority. Normal tasks are placed based on their nice values: a nice value of -20 maps to global priority 100, and +19 maps to priority 139 (the illustrative helper below encodes this mapping).
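A tiny helper that only encodes the mapping the slide states; it is illustrative, not a kernel interface:

/* Map a normal task's nice value (-20..+19) to its Linux global
 * priority (100..139); real-time priorities occupy 0..99. */
int normal_global_priority(int nice)
{
    return 120 + nice;   /* -20 -> 100, 0 -> 120, +19 -> 139 */
}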

134 Windows Scheduling Windows uses priority-based preemptive scheduling
The highest-priority thread runs next; the dispatcher is the scheduler. A thread runs until it (1) blocks, (2) uses up its time slice, or (3) is preempted by a higher-priority thread; real-time threads can preempt non-real-time threads. Windows uses a 32-level priority scheme: the variable class covers priorities 1-15, the real-time class covers 16-31, and priority 0 is reserved for the memory-management thread. There is a queue for each priority; if no ready thread is found, a special thread called the idle thread runs.

135 Windows Priority Classes
The Win32 API identifies several priority classes to which a process can belong: REALTIME_PRIORITY_CLASS, HIGH_PRIORITY_CLASS, ABOVE_NORMAL_PRIORITY_CLASS, NORMAL_PRIORITY_CLASS, BELOW_NORMAL_PRIORITY_CLASS, and IDLE_PRIORITY_CLASS. All are variable-priority classes except REALTIME. A thread within a given priority class also has a relative priority: TIME_CRITICAL, HIGHEST, ABOVE_NORMAL, NORMAL, BELOW_NORMAL, LOWEST, or IDLE.

136 Windows Threads Priorities
Figure: the Windows thread-priority table, combining priority classes (columns) with relative priorities (rows); the base priority is the NORMAL relative priority within each class.

137 Windows Priority Classes (cont.)
The priority class and the relative priority combine to give a numeric priority; the base priority is the NORMAL relative priority within the class. If a thread's quantum expires, it is interrupted and its priority is lowered, though never below the base priority. If a wait occurs, the priority is boosted by an amount depending on what was waited for; interactive threads get larger boosts than threads waiting on disk operations. The process owning the foreground window is given a 3x priority boost (a minimal Win32 sketch follows below).
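A minimal Win32 sketch of adjusting both halves of the numeric priority; the API calls are real, the chosen values are just for illustration:

#include <windows.h>
#include <stdio.h>

int main(void)
{
    /* Set the process's priority class... */
    if (!SetPriorityClass(GetCurrentProcess(), HIGH_PRIORITY_CLASS))
        fprintf(stderr, "SetPriorityClass failed\n");
    /* ...and the current thread's relative priority within it. */
    if (!SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_ABOVE_NORMAL))
        fprintf(stderr, "SetThreadPriority failed\n");
    return 0;
}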

138 Solaris Priority-based thread scheduling in which each thread belongs to one of six scheduling classes: Time sharing (default, TS), Interactive (IA), Real time (RT), System (SYS), Fair share (FSS), and Fixed priority (FP). A given thread can be in only one class at a time, and each class has its own scheduling algorithm. The default scheduling class for a process is time sharing: varying priorities and time slices of different lengths, managed with a multilevel feedback queue. The higher the priority, the smaller the time slice: an inverse relationship between priorities and time slices.

139 Solaris Dispatch Table for time-sharing and interactive threads
In the dispatch table, the lowest priority carries the highest time quantum and the highest priority the lowest time quantum; a thread's priority is lowered to a new value after its quantum expires and boosted to a higher value after it returns from sleep, to provide good response time for interactive processes.

140 Solaris Scheduling Threads in the real-time class are given the highest priority and will run before a thread in any other class. Threads in the fixed-priority class have the same priority range as those in the time-sharing class; however, their priorities are not dynamically adjusted. The fair-share scheduling class uses CPU shares instead of priorities to make scheduling decisions. The scheduler converts the class-specific priorities into global priorities and selects the thread with the highest global priority to run; the ten highest global priorities are reserved for threads servicing interrupts.

141 Solaris Scheduling (Cont.)
The scheduler converts class-specific priorities into a per-thread global priority, and the thread with the highest global priority runs next. It runs until it (1) blocks, (2) uses its time slice, or (3) is preempted by a higher-priority thread. Multiple threads at the same priority are selected via a round-robin scheme.

142 Algorithm Evaluation How to select CPU-scheduling algorithm for an OS?
First determine the criteria, then evaluate the algorithms. Deterministic modeling is one type of analytic evaluation: it takes a particular predetermined workload and defines the performance of each algorithm for that workload. Consider 5 processes arriving at time 0.

143 Deterministic Evaluation
For each algorithm, calculate the minimum average waiting time. Deterministic modeling is simple and fast, but it requires exact numbers for input and its results apply only to those inputs. For the five-process workload: FCFS gives 28 ms, nonpreemptive SJF gives 13 ms, and RR (q = 10) gives 23 ms; the arithmetic is checked below.
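A check of those figures, assuming the five CPU bursts are 10, 29, 3, 7, and 12 ms (the workload those averages imply): under FCFS the waits are $0, 10, 39, 42, 49$, so the average is $(0 + 10 + 39 + 42 + 49)/5 = 140/5 = 28$ ms. Nonpreemptive SJF runs the jobs in order $3, 7, 10, 12, 29$, giving waits $0, 3, 10, 20, 32$ and an average of $65/5 = 13$ ms. RR with $q = 10$ works out to $115/5 = 23$ ms.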

144 Queueing Models There is no static set of processes(or times) to use for deterministic modeling. In most cases, the distribution of CPU and I/O burst is available. Describes the arrival of processes, and CPU and I/O bursts probabilistically Commonly exponential, and described by mean Computes average throughput, utilization, waiting time, etc. Computer system described as network of servers, each with queue of waiting processes Knowing arrival rates and service rates  Estimate utilization, average queue length, average wait time, etc. Queueing-network analysis

145 Little's Formula
Little's formula: n = λ × W, where n = average queue length, W = average waiting time in queue, and λ = average arrival rate into the queue. Little's law states that in steady state, the rate at which processes leave the queue must equal the rate at which they arrive; the formula is valid for any scheduling algorithm and any arrival distribution. For example, if on average 7 processes arrive per second and there are normally 14 processes in the queue, then the average wait time per process is 2 seconds (worked below).
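The arithmetic, spelled out: $n = \lambda \times W$ gives $14 = 7 \times W$, so $W = 14/7 = 2$ seconds.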

146 Simulations Queueing models are limited; simulations are more accurate.
A simulation is a programmed model of the computer system in which the clock is a variable; as the simulation runs, statistics indicating algorithm performance are gathered. The data to drive the simulation comes either from a random-number generator according to probabilities, with distributions defined mathematically or empirically, or from trace tapes, which record sequences of real events in real systems.

147 Evaluation of CPU Schedulers by Simulation

148 Implementation Even simulations have limited accuracy
The only completely accurate approach is to implement the new scheduler and test it in real systems: high cost, high risk, and environments vary. The most flexible schedulers can be modified per site or per system, or APIs can be provided to modify priorities; but again, environments vary.

