Unit 3: Concurrency Instructor: Hengming Zou, Ph.D. In Pursuit of Absolute Simplicity 求于至简,归于永恒.


1 Unit 3: Concurrency Instructor: Hengming Zou, Ph.D. In Pursuit of Absolute Simplicity 求于至简,归于永恒

2 22 Outline of Content  3.1. Critical Sections, Semaphores, and Monitors  3.2. Windows Trap Dispatching, Interrupts, Synchronization  3.3. Advanced Windows Synchronization  3.4. Windows APIs for Synchronization and IPC

3 33 Critical Sections, Semaphores, and Monitors  The Critical-Section Problem  Software Solutions  Synchronization Hardware  Semaphores  Synchronization in Windows & Linux

4 44 The Critical-Section Problem  n threads all competing to use a shared resource  Each thread has a code segment, called critical section, in which the shared data is accessed  Problem: Ensure that: –when one thread is executing in its critical section, no other thread is allowed to execute in its critical section

5 55 Solution to Critical-Section Problem  Mutual Exclusion –Only one thread at a time is allowed into its CS, among all threads that have CS for the same resource or shared data –A thread halted in its non-critical section must not interfere with other threads  Progress –A thread remains inside CS for a finite time only –No assumptions concerning relative speed of the threads

6 Solution to Critical-Section Problem  Bounded Waiting –It must not be possible for a thread requiring access to a critical section to be delayed indefinitely –When no thread is in a critical section, any thread that requests entry must be permitted to enter without delay

7 Initial Attempts to Solve Problem
• Only 2 threads, T0 and T1
• General structure of thread Ti (other thread Tj):
    do {
        enter section
        critical section
        exit section
        remainder section
    } while (1);
• Threads may share some common variables to synchronize their actions

8 First Attempt: Algorithm 1
• Shared variables
  – Initialization: int turn = 0;
  – turn == i ⇒ Ti can enter its critical section
• Thread Ti
    do {
        while (turn != i)
            ;    /* busy wait */
        critical section
        turn = j;
        remainder section
    } while (1);
• Satisfies mutual exclusion, but not progress: the threads must strictly alternate, so Ti cannot re-enter while Tj stays in its remainder section

9 Second Attempt: Algorithm 2
• Shared variables
  – Initialization: int flag[2]; flag[0] = flag[1] = 0;
  – flag[i] == 1 ⇒ Ti is ready to enter its critical section
• Thread Ti
    do {
        flag[i] = 1;
        while (flag[j] == 1)
            ;    /* busy wait */
        critical section
        flag[i] = 0;
        remainder section
    } while (1);
• Satisfies mutual exclusion, but not the progress requirement: if both threads set their flags at the same time, each waits for the other forever

10 10 Algorithm 3 (Peterson’s Algorithm - 1981)  Shared variables of algorithms 1 and 2 - initialization: int flag[2]; flag[0] = flag[1] = 0; int turn = 0;  Thread T i do { flag[i] = 1; turn = j; while ((flag[j] == 1) && turn == j) ; critical section flag[i] = 0; remainder section } while (1);  Solves the critical-section problem for two threads
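The pseudocode for Algorithm 3 assumes that loads and stores become visible in program order, which plain C variables do not guarantee on modern compilers and processors. The following is a minimal sketch, assuming C11 atomics with their default sequentially consistent ordering and POSIX threads; the names peterson_enter, peterson_exit, peterson_demo and PETERSON_ITERS are illustrative and not part of the slides.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define PETERSON_ITERS 20000

static atomic_int flag[2];      /* flag[i] == 1: thread i wants to enter */
static atomic_int turn;         /* which thread gives way on a conflict */
static long shared_counter;     /* protected by the Peterson lock */

static void peterson_enter(int i) {
    int j = 1 - i;
    atomic_store(&flag[i], 1);
    atomic_store(&turn, j);
    while (atomic_load(&flag[j]) == 1 && atomic_load(&turn) == j)
        sched_yield();          /* still a busy wait, but yields the CPU */
}

static void peterson_exit(int i) {
    atomic_store(&flag[i], 0);
}

static void *peterson_worker(void *arg) {
    int i = *(int *)arg;
    for (int k = 0; k < PETERSON_ITERS; k++) {
        peterson_enter(i);
        shared_counter++;       /* critical section */
        peterson_exit(i);
    }
    return NULL;
}

/* Runs two competing threads and returns the final counter value. */
long peterson_demo(void) {
    pthread_t t[2];
    int ids[2] = {0, 1};
    shared_counter = 0;
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, peterson_worker, &ids[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return shared_counter;
}
```

If mutual exclusion holds, no increment is lost and peterson_demo() returns 2 * PETERSON_ITERS. Replacing the atomics with plain int variables may appear to work but is not guaranteed to.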

11 11 Dekker’s Algorithm (1965)  This is the first correct solution proposed for the two-thread (two-process) case  Originally developed by Dekker in a different context, it was applied to the critical section problem by Dijkstra.  Dekker adds the idea of a favored thread and allows access to either thread when the request is uncontested.  When there is a conflict, one thread is favored, and the priority reverses after successful execution of the critical section

12 Dekker’s Algorithm (contd.)
• Shared variables - initialization: int flag[2]; flag[0] = flag[1] = 0; int turn = 0;
• Thread Ti
    do {
        flag[i] = 1;
        while (flag[j])
            if (turn == j) {
                flag[i] = 0;          /* back off: Tj is favored */
                while (turn == j)
                    ;
                flag[i] = 1;
            }
        critical section
        turn = j;                     /* priority passes to the other thread */
        flag[i] = 0;
        remainder section
    } while (1);

13 Bakery Algorithm (Lamport 1974)  A Solution to the Critical Section problem for n threads  Before entering its CS, a thread receives a number  Holder of the smallest number enters the CS.  If threads T i and T j receive the same number, if i < j, then T i is served first; else T j is served first.  The numbering scheme generates numbers in monotonically non-decreasing order: –i.e., 1,1,1,2,3,3,3,4,4,5...

14 Bakery Algorithm
• Notation: "<" establishes lexicographical order among 2-tuples (ticket #, thread id #)
  – (a,b) < (c,d) if a < c, or if a == c and b < d
  – max(a0, …, an-1) is a number k such that k ≥ ai for i = 0, …, n – 1
• Shared data
    int choosing[n];
    int number[n];    /* the ticket */
• Data structures are initialized to 0

15 Bakery Algorithm
    do {
        choosing[i] = 1;
        number[i] = max(number[0], number[1], ..., number[n-1]) + 1;
        choosing[i] = 0;
        for (j = 0; j < n; j++) {
            while (choosing[j] == 1)
                ;    /* wait until Tj has taken its ticket */
            while ((number[j] != 0) && ((number[j], j) < (number[i], i)))
                ;    /* wait while Tj holds a smaller (ticket, id) pair */
        }
        critical section
        number[i] = 0;
        remainder section
    } while (1);
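The bakery pseudocode can be turned into running code along the following lines. This is a sketch under stated assumptions: C11 sequentially consistent atomics stand in for the idealized shared memory of the slides, the thread count is fixed at compile time, and the names bakery_lock, bakery_unlock, bakery_demo, BAKERY_THREADS and BAKERY_ITERS are made up for the example.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define BAKERY_THREADS 3
#define BAKERY_ITERS   5000

static atomic_int choosing[BAKERY_THREADS];
static atomic_int number[BAKERY_THREADS];   /* the ticket; 0 = not waiting */
static long bakery_total;                   /* protected by the bakery lock */

static void bakery_lock(int i) {
    atomic_store(&choosing[i], 1);
    int max = 0;
    for (int j = 0; j < BAKERY_THREADS; j++) {
        int n = atomic_load(&number[j]);
        if (n > max) max = n;
    }
    atomic_store(&number[i], max + 1);
    atomic_store(&choosing[i], 0);

    for (int j = 0; j < BAKERY_THREADS; j++) {
        while (atomic_load(&choosing[j]))
            sched_yield();                  /* Tj is still taking a ticket */
        for (;;) {
            int nj = atomic_load(&number[j]);
            int ni = atomic_load(&number[i]);
            /* proceed unless (number[j], j) < (number[i], i) */
            if (nj == 0 || !(nj < ni || (nj == ni && j < i)))
                break;
            sched_yield();
        }
    }
}

static void bakery_unlock(int i) {
    atomic_store(&number[i], 0);
}

static void *bakery_worker(void *arg) {
    int i = *(int *)arg;
    for (int k = 0; k < BAKERY_ITERS; k++) {
        bakery_lock(i);
        bakery_total++;                     /* critical section */
        bakery_unlock(i);
    }
    return NULL;
}

/* Runs BAKERY_THREADS competing threads; returns the final total. */
long bakery_demo(void) {
    pthread_t t[BAKERY_THREADS];
    int ids[BAKERY_THREADS];
    bakery_total = 0;
    for (int i = 0; i < BAKERY_THREADS; i++) {
        ids[i] = i;
        pthread_create(&t[i], NULL, bakery_worker, &ids[i]);
    }
    for (int i = 0; i < BAKERY_THREADS; i++)
        pthread_join(t[i], NULL);
    return bakery_total;
}
```

Ticket numbers grow without bound while the lock stays contended, so a long-running program would need a wider ticket type than int; the sketch ignores that.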

16 16 Mutual Exclusion - Hardware Support  Interrupt Disabling –Concurrent threads cannot overlap on a uniprocessor –Thread will run until performing a system call or interrupt happens  Special Atomic Machine Instructions –Test and Set Instruction - read & write a memory location –Exchange Instruction - swap register and memory location  Problems with Machine-Instruction Approach –Busy waiting –Starvation is possible –Deadlock is possible

17 17 Synchronization Hardware  Test and modify the content of a word atomically boolean TestAndSet(boolean &target) { boolean rv = target; target = true; return rv; }

18 Mutual Exclusion with Test-and-Set
• Shared data:
  – boolean lock = false;
• Thread Ti
    do {
        while (TestAndSet(lock))
            ;    /* spin until the old value was false */
        critical section
        lock = false;
        remainder section
    } while (1);
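In portable C the TestAndSet primitive is available as atomic_flag_test_and_set from <stdatomic.h>. The sketch below builds the same spinlock on it; the function and variable names (tas_lock, tas_unlock, tas_demo and the constants) are illustrative only.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define TAS_THREADS 4
#define TAS_ITERS   10000

static atomic_flag lock_word = ATOMIC_FLAG_INIT;   /* clear = unlocked */
static long tas_counter;                           /* protected by lock_word */

static void tas_lock(void) {
    /* test_and_set returns the previous value: true means someone holds it */
    while (atomic_flag_test_and_set(&lock_word))
        sched_yield();
}

static void tas_unlock(void) {
    atomic_flag_clear(&lock_word);
}

static void *tas_worker(void *arg) {
    (void)arg;
    for (int k = 0; k < TAS_ITERS; k++) {
        tas_lock();
        tas_counter++;          /* critical section */
        tas_unlock();
    }
    return NULL;
}

/* Runs TAS_THREADS threads against one spinlock; returns the final count. */
long tas_demo(void) {
    pthread_t t[TAS_THREADS];
    tas_counter = 0;
    for (int i = 0; i < TAS_THREADS; i++)
        pthread_create(&t[i], NULL, tas_worker, NULL);
    for (int i = 0; i < TAS_THREADS; i++)
        pthread_join(t[i], NULL);
    return tas_counter;
}
```

Like the slide version, this lock gives no fairness guarantee: a thread can lose the race for the flag indefinitely, which is the starvation problem listed under hardware support.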

19 19 Synchronization Hardware  Atomically swap two variables void Swap(boolean &a, boolean &b) { boolean temp = a; a = b; b = temp; }

20 Mutual Exclusion with Swap  Shared data (initialized to 0): int lock = 0;  Thread T i int key; do { key = 1; while (key == 1) Swap(lock,key); critical section lock = 0; remainder section } while (1);

21 21 Semaphores  Semaphore S – integer variable  can only be accessed via two atomic operations wait (S): while (S <= 0); S--; signal (S): S++;

22 22 Critical Section of n Threads  Shared data: semaphore mutex; //initially mutex = 1  Thread T i : do { wait(mutex); critical section signal(mutex); remainder section } while (1);

23 23 Semaphore Implementation  Semaphores may suspend/resume threads –Avoid busy waiting  Define a semaphore as a record typedef struct { int value; struct thread *L; } semaphore;  Assume two simple operations: –suspend() suspends the thread that invokes it –resume(T) resumes the execution of a blocked thread T

24 24 Implementation  Semaphore operations now defined as wait(S): S.value--; if (S.value < 0) { add this thread to S.L; suspend(); } signal(S): S.value++; if (S.value <= 0) { remove a thread T from S.L; resume(T); }
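A user-level analogue of the blocking semaphore can be sketched with a POSIX mutex and condition variable. Note that this is a different technique from the slide: the value never goes negative, and the condition variable's internal queue plays the role of the list S.L. The type blocking_sem and the bsem_* functions are hypothetical names. The demo uses a semaphore initialized to 0 to force statement A in one thread to run before statement B in another.

```c
#include <pthread.h>
#include <string.h>

typedef struct {
    int value;
    pthread_mutex_t mu;
    pthread_cond_t nonzero;     /* waiters sleep here instead of spinning */
} blocking_sem;

static void bsem_init(blocking_sem *s, int initial) {
    s->value = initial;
    pthread_mutex_init(&s->mu, NULL);
    pthread_cond_init(&s->nonzero, NULL);
}

static void bsem_wait(blocking_sem *s) {
    pthread_mutex_lock(&s->mu);
    while (s->value <= 0)                   /* suspend, no busy waiting */
        pthread_cond_wait(&s->nonzero, &s->mu);
    s->value--;
    pthread_mutex_unlock(&s->mu);
}

static void bsem_signal(blocking_sem *s) {
    pthread_mutex_lock(&s->mu);
    s->value++;
    pthread_cond_signal(&s->nonzero);       /* resume one waiter, if any */
    pthread_mutex_unlock(&s->mu);
}

static blocking_sem flag_sem;
static char order_log[3];
static int order_len;

static void *thread_i(void *arg) {
    (void)arg;
    order_log[order_len++] = 'A';           /* statement A */
    bsem_signal(&flag_sem);
    return NULL;
}

static void *thread_j(void *arg) {
    (void)arg;
    bsem_wait(&flag_sem);
    order_log[order_len++] = 'B';           /* statement B, only after A */
    return NULL;
}

/* Returns 1 if A was recorded before B, 0 otherwise. */
int ordering_demo(void) {
    pthread_t ti, tj;
    bsem_init(&flag_sem, 0);
    order_len = 0;
    memset(order_log, 0, sizeof order_log);
    pthread_create(&tj, NULL, thread_j, NULL);   /* start the waiter first */
    pthread_create(&ti, NULL, thread_i, NULL);
    pthread_join(ti, NULL);
    pthread_join(tj, NULL);
    return strcmp(order_log, "AB") == 0;
}
```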

25 Semaphore as a General Synchronization Tool
• Execute B in Tj only after A executed in Ti
• Use semaphore flag initialized to 0
• Code:
    Ti              Tj
    ...             ...
    A               wait(flag)
    signal(flag)    B

26 Two Types of Semaphores  Counting semaphore –integer value can range over an unrestricted domain.  Binary semaphore –integer value can range only between 0 and 1; –can be simpler to implement.  A counting semaphore S can be implemented using binary semaphores

27 Deadlock and Starvation
• Deadlock – two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
• Let S and Q be two semaphores initialized to 1
    T0              T1
    wait(S);        wait(Q);
    wait(Q);        wait(S);
    ...             ...
    signal(S);      signal(Q);
    signal(Q);      signal(S);

28 28 Deadlock and Starvation  Starvation – indefinite blocking –A thread may never be removed from the semaphore queue in which it is suspended.  Solution – –all code should acquire/release semaphores in same order
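The "same order" rule can be enforced mechanically. The sketch below assumes POSIX mutexes in place of the semaphores S and Q and orders acquisitions by address, so both threads end up taking the locks in the same order even though they name them in opposite order. The helpers lock_both and unlock_both are illustrative names, not a library API.

```c
#include <pthread.h>
#include <stdint.h>

#define ORDER_ITERS 20000

static pthread_mutex_t S = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t Q = PTHREAD_MUTEX_INITIALIZER;
static long updates;            /* protected by holding both S and Q */

/* Always lock the mutex with the lower address first. */
static void lock_both(pthread_mutex_t *a, pthread_mutex_t *b) {
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

static void unlock_both(pthread_mutex_t *a, pthread_mutex_t *b) {
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}

static void *order_t0(void *arg) {
    (void)arg;
    for (int k = 0; k < ORDER_ITERS; k++) {
        lock_both(&S, &Q);      /* asks for S then Q */
        updates++;
        unlock_both(&S, &Q);
    }
    return NULL;
}

static void *order_t1(void *arg) {
    (void)arg;
    for (int k = 0; k < ORDER_ITERS; k++) {
        lock_both(&Q, &S);      /* asks for Q then S, yet cannot deadlock */
        updates++;
        unlock_both(&Q, &S);
    }
    return NULL;
}

/* Returns the number of completed updates; it returns at all only
   because the two threads never hold the locks in conflicting order. */
long lock_order_demo(void) {
    pthread_t a, b;
    updates = 0;
    pthread_create(&a, NULL, order_t0, NULL);
    pthread_create(&b, NULL, order_t1, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return updates;
}
```

With plain pthread_mutex_lock(&Q); pthread_mutex_lock(&S); in the second thread, the program could hang exactly as in the T0/T1 example.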

29 29 Windows Synchronization  Uses interrupt masks to protect access to global resources on uniprocessor systems.  Uses spinlocks on multiprocessor systems.  Provides dispatcher objects which may act as mutexes and semaphores.  Dispatcher objects may also provide events. An event acts much like a condition variable

30 30 Linux Synchronization  Kernel disables interrupts for synchronizing access to global data on uniprocessor systems.  Uses spinlocks for multiprocessor synchronization.  Uses semaphores and readers-writers locks when longer sections of code need access to data.  Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing

31 31 Further Reading  Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982  Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986  Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003 ; –Chapter 7 - Process Synchronization –Chapter 8 - Deadlocks

32 32 3.2. Trap Dispatching, Interrupts, Synchronization  Trap and Interrupt dispatching  IRQL levels & Interrupt Precedence  Spinlocks and Kernel Synchronization  Executive Synchronization

33 33 Kernel Mode Versus User Mode  A processor state  Controls access to memory  Each memory page is tagged to show the required mode for reading and for writing –Protects the system from the users –Protects the user (process) from themselves –System is not protected from system  Code regions are tagged “no write in any mode”  Controls ability to execute privileged instructions  A Windows abstraction –Intel: Ring 0, Ring 3

34 34 Kernel Mode Versus User Mode  Control flow (a thread) can change from user to kernel mode and back –Does not affect scheduling –Thread context includes info about execution mode (along with registers, etc)  PerfMon counters: –“Privileged Time” and “User Time” –4 levels of granularity: thread, process, processor, system

35 35 Getting Into Kernel Mode  Code is run in kernel mode for one of three reasons:  1. Requests from user mode –Via the system service dispatch mechanism –Kernel-mode code runs in the context of the requesting thread  2. Dedicated kernel-mode system threads –Some threads in the system stay in kernel mode at all times mostly in the “System” process –Scheduled, preempted, etc., like any other threads

36 36 Getting Into Kernel Mode 3. Interrupts from external devices –interrupt dispatcher invokes the interrupt service routine –ISR runs in the context of the interrupted thread so-called “arbitrary thread context” –ISR often requests the execution of a “DPC routine,” which also runs in kernel mode –Time not charged to interrupted thread

37 37 Trap dispatching Interrupt dispatcher System service dispatcher Interrupt service routines System services Exception dispatcher Exception handlers Virtual memory manager‘s pager Interrupt System service call HW exceptions SW exceptions Virtual address exceptions  Trap: processor‘s mechanism to capture executing thread –Switch from user to kernel mode –Interrupts – asynchronous –Exceptions - synchronous

38 Interrupt Dispatching
• An interrupt arrives while user or kernel mode code is running; it is handled in kernel mode
• Interrupt dispatch routine
  – Disable interrupts
  – Record machine state (trap frame) to allow resume
  – Mask equal- and lower-IRQL interrupts
  – Find and call appropriate ISR
  – Dismiss interrupt
  – Restore machine state (including mode and enabled interrupts)
• Interrupt service routine
  – Tell the device to stop interrupting
  – Interrogate device state, start next operation on device, etc.
  – Request a DPC
  – Return to caller
• Note, no thread or process context switch!

39 39  IRQL = Interrupt Request Level –Precedence of the interrupt with respect to other interrupts –Different interrupt sources have different IRQLs –not the same as IRQ  IRQL is also a state of the processor –Servicing an interrupt raises processor IRQL to that interrupt’s IRQL –this masks subsequent interrupts at equal and lower IRQLs  User mode is limited to IRQL 0  No waits or page faults at IRQL >= DISPATCH_LEVEL Interrupt Precedence via IRQLs (x86)

40 Interrupt Precedence via IRQLs (x86)
[Figure: x86 IRQL scale, highest to lowest]
    31    High
    30    Power fail
    29    Interprocessor Interrupt
    28    Clock
          Profile & Synch (Srv 2003)
    ...   Device levels, down to Device 1 (hardware interrupts)
    2     Dispatch/DPC (deferrable software interrupt)
    1     APC (deferrable software interrupt)
    0     Passive/Low (normal thread execution)

41 41 Interrupt processing  Interrupt dispatch table (IDT) –Links to interrupt service routines  x86: –Interrupt controller interrupts processor (single line) –Processor queries for interrupt vector; uses vector as index to IDT  Alpha: –PAL code (Privileged Architecture Library – Alpha BIOS) determines interrupt vector, calls kernel –Kernel uses vector to index IDT  After ISR execution, IRQL is lowered to initial level

42 42 Interrupt object  Allows device drivers to register ISRs for their devices –Contains dispatch code (initial handler) –Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)  Connecting/disconnecting interrupt objects: –Dynamic association between ISR and IDT entry –Loadable device drivers (kernel modules) –Turn on/off ISR  Interrupt objects can synchronize access to ISR data –Multiple instances of ISR may be active simultaneously (MP machine) –Multiple ISR may be connected with IRQL

43 43 Predefined IRQLs  High –used when halting the system (via KeBugCheck())  Power fail –originated in the NT design document, but has never been used  Inter-processor interrupt –used to request action from other processor (dispatching a thread, updating a processors TLB, system shutdown, system crash)  Clock –Used to update system‘s clock, allocation of CPU time to threads  Profile –Used for kernel profiling (see Kernel profiler – Kernprof.exe, Res Kit)

44 44 Predefined IRQLs (contd.)  Device –Used to prioritize device interrupts  DPC/dispatch and APC –Software interrupts that kernel and device drivers generate  Passive –No interrupt level at all, normal thread execution

45 IRQLs on 64-bit Systems
[Figure: IRQL scales running from 0 (lowest) to 15 (highest); levels listed lowest first]
• x64: Passive/Low, APC, Dispatch/DPC, Device 1 ... Device n, Synch (Srv 2003), Clock, Interprocessor Interrupt/Power, High/Profile
• IA64: Passive/Low, APC, Dispatch/DPC & Synch (UP only), Correctable Machine Check, Device 1 ... Device n, Synch (MP only), Clock, Interprocessor Interrupt, High/Profile/Power

46 46 Interrupt Prioritization & Delivery  IRQLs are determined as follows: –x86 UP systems: IRQL = 27 - IRQ –x86 MP systems: bucketized (random) –x64 & IA64 systems: IRQL = IDT vector number / 16  On MP systems, which processor is chosen to deliver an interrupt? –By default, any processor can receive an interrupt from any device Can be configured with IntFilter utility in Resource Kit –On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL –On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source Processors are assigned round robin for each interrupt vector
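The two deterministic mappings above are simple arithmetic. As a worked example, with the hypothetical helper names irql_x86_up and irql_from_vector:

```c
/* x86 uniprocessor systems: IRQL = 27 - IRQ */
int irql_x86_up(int irq) {
    return 27 - irq;
}

/* x64 and IA64 systems: IRQL = IDT vector number / 16 (integer division) */
int irql_from_vector(int vector) {
    return vector / 16;
}
```

For instance, IRQ 1 on an x86 UP system maps to IRQL 26, and IDT vector 0xD1 (209) on x64 maps to IRQL 13. Lower IRQ numbers therefore receive higher precedence on x86 UP systems.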

47 47 Software interrupts  Initiating thread dispatching –DPC allow for scheduling actions when kernel is deep within many layers of code –Delayed scheduling decision, one DPC queue per processor  Handling timer expiration  Asynchronous execution of a procedure in context of a particular thread  Support for asynchronous I/O operations

48 48 Flow of Interrupts

49 Synchronization on SMP Systems  Synchronization on MP systems uses spinlocks to coordinate among processors  Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm –Spinlock is either free or is considered to be owned by a CPU –Analogous to using Windows API mutexes from user mode  A spinlock is just a data cell in memory –Accessed with a test-and-modify operation that is atomic across all processors –KSPIN_LOCK is an opaque data type, typedef’d as a ULONG –To implement synchronization, a single bit is sufficient

50 Kernel Synchronization
• A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
• Processor A and Processor B each execute:
    do
        acquire_spinlock(DPC)
    until (SUCCESS)
    begin                         /* critical section */
        remove DPC from queue
    end
    release_spinlock(DPC)

51 51 Queued Spinlocks  Problem: Checking status of spinlock via test-and-set operation creates bus contention  Queued spinlocks maintain queue of waiting processors  First processor acquires lock; other processors wait on processor-local flag –Thus, busy-wait loop requires no access to the memory bus  When releasing lock, the 1st processor’s flag is modified –Exactly one processor is being signaled –Pre-determined wait order
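The idea of spinning on a processor-local flag can be illustrated with an MCS-style queue lock. This is a user-mode sketch in C11, not the Windows kernel implementation; the types qnode and qlock and the functions qlock_acquire, qlock_release and qlock_demo are invented for the example, and threads stand in for processors.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define QL_THREADS 4
#define QL_ITERS   5000

typedef struct qnode {
    _Atomic(struct qnode *) next;   /* successor in the wait queue */
    atomic_bool locked;             /* the local flag this waiter spins on */
} qnode;

typedef struct {
    _Atomic(qnode *) tail;          /* last waiter, NULL when lock is free */
} qlock;

static void qlock_acquire(qlock *l, qnode *me) {
    atomic_store(&me->next, NULL);
    atomic_store(&me->locked, true);
    qnode *prev = atomic_exchange(&l->tail, me);    /* join the queue */
    if (prev != NULL) {
        atomic_store(&prev->next, me);
        while (atomic_load(&me->locked))            /* spin on own flag only */
            sched_yield();
    }
}

static void qlock_release(qlock *l, qnode *me) {
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected, NULL))
            return;                                 /* nobody was waiting */
        while ((succ = atomic_load(&me->next)) == NULL)
            sched_yield();                          /* successor is linking in */
    }
    atomic_store(&succ->locked, false);             /* signal exactly one waiter */
}

static qlock dpc_lock;
static long ql_counter;             /* protected by dpc_lock */

static void *ql_worker(void *arg) {
    (void)arg;
    qnode me;                       /* one queue node per waiter */
    for (int k = 0; k < QL_ITERS; k++) {
        qlock_acquire(&dpc_lock, &me);
        ql_counter++;
        qlock_release(&dpc_lock, &me);
    }
    return NULL;
}

/* Runs QL_THREADS threads against one queue lock; returns the final count. */
long qlock_demo(void) {
    pthread_t t[QL_THREADS];
    atomic_store(&dpc_lock.tail, NULL);
    ql_counter = 0;
    for (int i = 0; i < QL_THREADS; i++)
        pthread_create(&t[i], NULL, ql_worker, NULL);
    for (int i = 0; i < QL_THREADS; i++)
        pthread_join(t[i], NULL);
    return ql_counter;
}
```

Each waiter touches the shared tail pointer once on entry and then spins only on its own node, and release hands the lock to the next node in queue order, which matches the "exactly one processor is being signaled" and "pre-determined wait order" properties.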

52 52 SMP Scalability Improvements  Windows 2000: queued spinlocks –!qlocks in Kernel Debugger  Server 2003: –More spinlocks eliminated (context swap, system space, commit) –Further reduction of use of spinlocks & length they are held –Scheduling database now per-CPU Allows thread state transitions in parallel

53 53 SMP Scalability Improvements  XP/2003: –Minimized lock contention for hot locks PFN or Page Frame Database lock –Some locks completely eliminated Charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions –New, more efficient locking mechanism (pushlocks) Doesn’t use spinlocks when no contention Used for object manager and address windowing extensions (AWE) related locks

54 54 Waiting  Flexible wait calls –Wait for one or multiple objects in one call –Wait for multiple can wait for “any” one or “all” at once “All”: all objects must be in the signalled state concurrently to resolve the wait –All wait calls include optional timeout argument –Waiting threads consume no CPU time

55 55 Waiting  Waitable objects include: –Events (may be auto-reset or manual reset; may be set or “pulsed”) –Mutexes (“mutual exclusion”, one-at-a-time) –Semaphores (n-at-a-time) –Timers –Processes and Threads (signalled upon exit or terminate) –Directories (change notification)

56 56 Waiting  No guaranteed ordering of wait resolution –If multiple threads are waiting for an object, and only one thread is released (e.g. it’s a mutex or auto-reset event), which thread gets released is unpredictable

57 Executive Synchronization
• Waiting on Dispatcher Objects – outside the kernel
• Interaction with thread scheduling
[Figure: thread state diagram with states Initialized, Ready, Standby, Running, Waiting, Transition, Terminated; labeled transitions: "Create and initialize thread object", "Thread waits on an object handle", "Wait is complete; Set object to signaled state"]

58 Interaction between Synchronization & Dispatching  User mode thread waits on an event object‘s handle  Kernel changes thread‘s scheduling state from ready to waiting and adds thread to wait-list  Another thread sets the event  Kernel wakes up waiting threads; variable priority threads get priority boost

59 Interaction between Synchronization & Dispatching  Dispatcher re-schedules new thread – it may preempt the running thread if that thread has lower priority, and issues a software interrupt to initiate the context switch  If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later

60 What signals an object?
For each dispatcher object: the system event that sets it to signaled, the event that returns it to nonsignaled, and the effect of the signaled state on waiting threads.
• Mutex (kernel mode)
  – Signaled when: owning thread releases mutex
  – Nonsignaled when: resumed thread acquires mutex
  – Effect: kernel resumes one waiting thread
• Mutex (exported to user mode)
  – Signaled when: owning thread or other thread releases mutex
  – Nonsignaled when: resumed thread acquires mutex
  – Effect: kernel resumes one waiting thread
• Semaphore
  – Signaled when: one thread releases the semaphore, freeing a resource
  – Nonsignaled when: a thread acquires the semaphore and no more resources are available
  – Effect: kernel resumes one or more waiting threads

61 What signals an object? (contd.)
• Event
  – Signaled when: a thread sets the event
  – Nonsignaled when: kernel resumes one or more threads
  – Effect: kernel resumes one or more waiting threads
• Event pair
  – Signaled when: dedicated thread sets one event in the event pair
  – Nonsignaled when: kernel resumes the other dedicated thread
  – Effect: kernel resumes waiting dedicated thread
• Timer
  – Signaled when: timer expires
  – Nonsignaled when: a thread (re)initializes the timer
  – Effect: kernel resumes all waiting threads
• Thread
  – Signaled when: thread terminates
  – Nonsignaled when: a thread reinitializes the thread object
  – Effect: kernel resumes all waiting threads

62 62 Further Reading  Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004.  Chapter 3 - System Mechanisms –Trap Dispatching (pp. 85 ff.) –Synchronization (pp. 149 ff.) –Kernel Event Tracing (pp. 175 ff.)

63 63 3.3. Advanced Windows Synchronization  Deferred and Asynchronous Procedure Calls  IRQLs and CPU Time Accounting  Wait Queues & Dispatcher Objects

64 Deferred Procedure Calls (DPCs)  Used to defer processing from higher (device) interrupt level to a lower (dispatch) level –Also used for quantum end and timer expiration  Driver (usually ISR) queues request –One queue per CPU. DPCs are normally queued to the current processor, but can be targeted to other CPUs –Executes specified procedure at dispatch IRQL (or “dispatch level”, also “DPC level”) when all higher-IRQL work (interrupts) completed –Maximum times recommended: ISR: 10 usec, DPC: 25 usec. See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx

65 65 queue head DPC object Deferred Procedure Calls (DPCs)

66 Delivering a DPC
1. Timer expires; kernel queues a DPC that will release all waiting threads, and requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
• DPC routines can call kernel functions but can‘t call system services, generate page faults, or create or wait on objects
• DPC routines can‘t assume what process address space is currently mapped
[Figure: interrupt dispatch table with levels High, Power failure, Dispatch/DPC, APC, Low, feeding the DPC queue and the dispatcher]

67 67 Asynchronous Procedure Calls (APCs)  Execute code in context of a particular user thread –APC routines can acquire resources (objects), incur page faults, call system services  APC queue is thread-specific  User mode & kernel mode APCs –Permission required for user mode APCs

68 Asynchronous Procedure Calls (APCs)  Executive uses APCs to complete work in thread space –Wait for asynchronous I/O operation –Emulate delivery of POSIX signals –Make threads suspend/terminate themselves (env. subsystems)  APCs are delivered when thread is in alertable wait state –WaitForMultipleObjectsEx(), SleepEx()

69 69  Special kernel APCs –Run in kernel mode, at IRQL 1 –Always deliverable unless thread is already at IRQL 1 or above –Used for I/O completion reporting from “arbitrary thread context” –Kernel-mode interface is linkable, but not documented Asynchronous Procedure Calls (APCs)

70 70  “Ordinary” kernel APCs –Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion) User mode APCs –Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also, QueueUserApc –Only deliverable when thread is in “alertable wait” Asynchronous Procedure Calls (APCs)

71 71 Thread Object K U APC objects Asynchronous Procedure Calls (APCs)

72 72 IRQLs and CPU Time Accounting  Interval clock timer ISR keeps track of time  Clock ISR time accounting: –If IRQL<2, charge to thread’s user or kernel time –If IRQL=2 and processing a DPC, charge to DPC time –If IRQL=2 & not processing a DPC, charge to thread kernel time –If IRQL>2, charge to interrupt time
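The four accounting rules can be read as a small decision function evaluated on each clock tick. The sketch below is only an illustration of the rules as stated on the slide; the enum values and the name clock_tick_charge are hypothetical and do not correspond to kernel symbols.

```c
typedef enum {
    CHARGE_THREAD_USER,     /* thread's user time */
    CHARGE_THREAD_KERNEL,   /* thread's kernel time */
    CHARGE_DPC,             /* DPC time */
    CHARGE_INTERRUPT        /* interrupt time */
} charge_t;

/* irql: IRQL that was interrupted by the clock tick
   in_dpc: nonzero if a DPC routine was executing
   user_mode: nonzero if the thread was running user-mode code */
charge_t clock_tick_charge(int irql, int in_dpc, int user_mode) {
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    /* IRQL < 2: ordinary thread execution */
    return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

For example, a tick that lands while a device ISR runs at a device IRQL is counted as interrupt time and is not attributed to whichever thread happened to be interrupted.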

73 IRQLs and CPU Time Accounting  Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, the load must be due to interrupt-related activity –Note: time at IRQL 2 or more is charged to the current thread’s quantum (to be described)

74 Interrupt Time Accounting  Task Manager includes interrupt and DPC time with the Idle process time  Interrupt activity is not charged to any thread/process –Process Explorer shows these as separate processes (they are not really processes) –Context switches for these are really # of interrupts & DPCs

75 75 Time Accounting Quirks  Looking at total CPU time for each process may not reveal where system has spent its time  CPU time accounting is driven by programmable interrupt timer –Normally 10 msec (15 msec on some MP Pentiums)  Thread execution and context switches between clock intervals NOT accounted –E.g., one or more threads run and enter a wait state before clock fires –Thus threads may run but never get charged  View context switch activity with Process Explorer –Add Context Switch Delta column

76 76  For waiting threads, user-mode utilities only display the wait reason  Example: pstat Looking at Waiting Threads

77 Wait Internals 1: Dispatcher Objects
• Any kernel object you can wait for is a “dispatcher object”
  – some exclusively for synchronization, e.g. events, mutexes (“mutants”), semaphores, queues, timers
  – others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
  – non-waitable kernel objects are called “control objects”
• All dispatcher objects have a common header
• All dispatcher objects are in one of two states
  – “signaled” vs. “nonsignaled”
  – when signaled, a wait on the object is satisfied
  – different object types differ in terms of what changes their state
  – wait and unwait implementation is common to all types of dispatcher objects
[Figure: Dispatcher Object (see \ntddk\inc\ddk\ntddk.h) with header fields Size, Type, State, Wait listhead, followed by object-type-specific data]

78 Wait Internals 2: Wait Blocks
• Represent a thread’s reference to something it’s waiting for (one per handle passed to WaitFor…)
• All wait blocks from a given wait call are chained to the waiting thread
• Type indicates wait for “any” or “all”
• Key denotes argument list position for WaitForMultipleObjects
[Figure: Dispatcher Objects (Size, Type, State, Wait listhead, object-type-specific data) and Thread Objects (WaitBlockList) linked through wait blocks; each wait block holds List entry, Object, Thread, Key, Type, Next link]

79 79 3.4. Windows APIs for Synchronization and IPC  Windows API constructs for synchronization and interprocess communication  Synchronization –Critical sections –Mutexes –Semaphores –Event objects  Synchronization through interprocess communication –Anonymous pipes –Named pipes –Mailslots

80 80 Critical Sections Only usable from within the same process  Critical sections are initialized and deleted but do not have handles  Only one thread at a time can be in a critical section  A thread can enter a critical section multiple times - however, the number of Enter- and Leave-operations must match  Leaving a critical section before entering it may cause deadlocks  No way to test whether another thread is in a critical section VOID InitializeCriticalSection( LPCRITICAL_SECTION sec ); VOID DeleteCriticalSection( LPCRITICAL_SECTION sec ); VOID EnterCriticalSection( LPCRITICAL_SECTION sec ); VOID LeaveCriticalSection( LPCRITICAL_SECTION sec ); BOOL TryEnterCriticalSection ( LPCRITICAL_SECTION sec );

81 Critical Section Example
    /* counter is global, shared by all threads */
    volatile int counter = 0;
    CRITICAL_SECTION crit;
    InitializeCriticalSection( &crit );

    /* … main loop in any of the threads */
    while (!done) {
        EnterCriticalSection( &crit );
        __try {
            counter += local_value;
        }
        __finally {
            /* runs on every exit path, so the section is left exactly once */
            LeaveCriticalSection( &crit );
        }
    }
    DeleteCriticalSection( &crit );

82 Synchronizing Threads with Kernel Objects  The following kernel objects can be used to synchronize threads: –Processes –Threads –Files –Console input –File change notifications –Mutexes –Semaphores –Events (auto-reset + manual-reset) –Waitable timers  DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout ); DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );

83 83 Wait Functions - Details  WaitForSingleObject(): –hObject specifies kernel object –dwTimeout specifies wait time in msec dwTimeout == 0 - no wait, check whether object is signaled dwTimeout == INFINITE - wait forever  WaitForMultipleObjects(): –cObjects <= MAXIMUM_WAIT_OBJECTS (64) –lpHandles - pointer to array identifying these objects –bWaitAll - whether to wait for first signaled object or all objects Function returns index of first signaled object  Side effects: –Mutexes, auto-reset events and waitable timers will be reset to non- signaled state after completing wait functions

84 84 Mutexes Mutexes work across processes  First thread has to call CreateMutex()  When sharing a mutex, second thread (process) calls CreateMutex() or OpenMutex()  fInitialOwner == TRUE gives creator immediate ownership  Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()  ReleaseMutex() gives up ownership  CloseHandle() will free mutex object

85 Mutexes
    HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
    HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
    BOOL ReleaseMutex( HANDLE hMutex );

86 Mutex Example
    /* done and counter are global, shared by all threads */
    volatile int done = 0, counter = 0;
    HANDLE mutex = CreateMutex( NULL, FALSE, NULL );

    /* main loop in any of the threads, ret is local */
    DWORD ret;
    while (!done) {
        ret = WaitForSingleObject( mutex, INFINITE );
        if (ret != WAIT_OBJECT_0) {
            /* WAIT_ABANDONED still grants ownership, so release before leaving */
            if (ret == WAIT_ABANDONED)
                ReleaseMutex( mutex );
            break;    /* exit the loop */
        }
        counter += local_value;
        ReleaseMutex( mutex );
    }
    CloseHandle( mutex );

87 Comparison - POSIX mutexes  POSIX pthreads specification supports mutexes –Synchronization among threads in same process  Five basic functions: –pthread_mutex_init() –pthread_mutex_destroy() –pthread_mutex_lock() –pthread_mutex_unlock() –pthread_mutex_trylock()  Comparison: –pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex, INFINITE ); –pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
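The difference between the blocking and the polling call can be seen in a few lines. In this sketch a helper thread probes a default-type mutex with pthread_mutex_trylock(), once while the main thread holds it and once after it has been released; the names prober and trylock_demo are illustrative.

```c
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Polls the mutex once and stores the return code in *out. */
static void *prober(void *out) {
    int rc = pthread_mutex_trylock(&demo_mutex);
    if (rc == 0)
        pthread_mutex_unlock(&demo_mutex);   /* we got it, give it back */
    *(int *)out = rc;
    return NULL;
}

/* Returns 1 if trylock reported EBUSY while the mutex was held
   and 0 (success) once it was free. */
int trylock_demo(void) {
    int while_held = -1, while_free = -1;
    pthread_t t;

    pthread_mutex_lock(&demo_mutex);         /* like a wait with INFINITE */
    pthread_create(&t, NULL, prober, &while_held);
    pthread_join(t, NULL);
    pthread_mutex_unlock(&demo_mutex);

    pthread_create(&t, NULL, prober, &while_free);
    pthread_join(t, NULL);

    return while_held == EBUSY && while_free == 0;
}
```

The EBUSY result plays the role that WAIT_TIMEOUT plays for WaitForSingleObject() with a zero timeout.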

88 Semaphores  Semaphore objects are used for resource counting –A semaphore is signaled when count > 0  Threads/processes use wait functions –Each wait function decreases semaphore count by 1 –ReleaseSemaphore() may increment count by any value –ReleaseSemaphore() can report the old semaphore count through lpPreviousCount

89 Semaphores
    HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
    HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
    BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );

90 Events  Multiple threads can be released when a single event is signaled (barrier synchronization) –Manual-reset event can signal several threads simultaneously; must be reset manually –PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event –Auto-reset event signals a single thread; event is reset automatically –fInitialState == TRUE - create event in signaled state

91 Events
    HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
    BOOL SetEvent( HANDLE hEvent );
    BOOL ResetEvent( HANDLE hEvent );
    BOOL PulseEvent( HANDLE hEvent );

92 Comparison - POSIX condition variables
• pthreads condition variables are comparable to events
  – pthread_cond_init()
  – pthread_cond_destroy()
• Wait functions:
  – pthread_cond_wait()
  – pthread_cond_timedwait()
• Signaling:
  – pthread_cond_signal(): one thread
  – pthread_cond_broadcast(): all waiting threads
• No exact equivalent to manual-reset events
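
Since there is no exact equivalent to a manual-reset event, a common approximation pairs a condition variable with a flag protected by a mutex. The sketch below assumes that approach; gate_open, released, waiter() and run_gate_demo() are hypothetical names, not part of any API.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

#define NUM_WAITERS 3   /* hypothetical number of waiting threads */

static pthread_mutex_t gate_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  gate_cond = PTHREAD_COND_INITIALIZER;
static int gate_open = 0;   /* the "event state", protected by gate_lock */
static int released  = 0;   /* how many waiters have passed the gate */

/* Waits until the gate is open, then records that it got through. */
static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&gate_lock);
    while (!gate_open)                       /* loop guards against spurious wakeups */
        pthread_cond_wait(&gate_cond, &gate_lock);
    released++;
    pthread_mutex_unlock(&gate_lock);
    return NULL;
}

/* Starts the waiters, opens the gate once, and returns how many threads passed. */
int run_gate_demo(void)
{
    pthread_t tid[NUM_WAITERS];

    gate_open = 0;
    released = 0;
    for (int i = 0; i < NUM_WAITERS; i++) {
        int rc = pthread_create(&tid[i], NULL, waiter, NULL);
        assert(rc == 0);                     /* sketch only: no real error handling */
    }

    pthread_mutex_lock(&gate_lock);
    gate_open = 1;                           /* roughly SetEvent() on a manual-reset event */
    pthread_cond_broadcast(&gate_cond);      /* wake all waiting threads */
    pthread_mutex_unlock(&gate_lock);

    for (int i = 0; i < NUM_WAITERS; i++)
        pthread_join(tid[i], NULL);
    return released;
}
```

Because the flag stays set, threads that arrive after the broadcast pass straight through, which is the behavior a set manual-reset event would give. Replacing pthread_cond_broadcast() with pthread_cond_signal() and clearing the flag in the waiter would be closer to an auto-reset event. Compile with -pthread.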

93 Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main, with prog1 and prog2 connected by a pipe]
• Half-duplex, character-based IPC
• cbPipe: pipe size in bytes; zero == default
• A read on the pipe handle blocks if the pipe is empty
• A write to a full pipe blocks
• Anonymous pipes are one-way

94 I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable. */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n");
    exit(1);
}
/* Set output handle to pipe handle, create first process. */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE,
        0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n");
    exit(2);
}
CloseHandle (hWritePipe);

95 Pipe example (contd.)
/* Repeat (symmetrically) for the second process. */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL,
        TRUE, /* Inherit handles. */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n");
    exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete. */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread);
CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread);
CloseHandle (ProcInfo2.hProcess);
return 0;

96 Named Pipes
• Message oriented:
  – The reading process can read varying-length messages precisely as sent by the writing process
• Bi-directional
  – Two processes can exchange messages over the same pipe
• Multiple, independent instances of a named pipe:
  – Several clients can communicate with a single server using the same pipe name
  – The server can respond to a client using the same instance that client connected to
• Pipe can be accessed over the network
  – Location transparency
• Convenience and connection functions

97 Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
• Use the same flag settings for all instances of a named pipe
• lpszPipeName: \\.\pipe\[path]pipename
  – Not possible to create a pipe on a remote machine (. means local machine)
• fdwOpenMode:
  – PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
• fdwPipeMode:
  – PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
  – PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
  – PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)

98 Named Pipes (contd.)
• nMaxInstances:
  – Number of instances
  – PIPE_UNLIMITED_INSTANCES: OS choice based on resources
• dwTimeOut
  – Default time-out period (in msec) for WaitNamedPipe()
• The first CreateNamedPipe call creates the named pipe
  – Closing the handle to the last instance deletes the named pipe
• Polling a pipe:
  – Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe (HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage);

99 Named Pipe Client Connections
• CreateFile with the named pipe name:
  – \\.\pipe\[path]pipename
  – \\servername\pipe\[path]pipename
  – The first method gives better performance (local server)
• Status functions:
  – GetNamedPipeHandleState
  – SetNamedPipeHandleState
  – GetNamedPipeInfo

100 Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa);
• Replaces a WriteFile / ReadFile sequence with a single call

101 Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut);
• Replaces a CreateFile / WriteFile / ReadFile / CloseHandle sequence with a single call
  – dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT

102 Server: eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipe, LPOVERLAPPED lpo);
• lpo == NULL:
  – The call returns as soon as there is a client connection
  – Returns FALSE if a client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
• Use DisconnectNamedPipe to free the handle for a connection from another client
• WaitNamedPipe():
  – A client may wait for the server's ConnectNamedPipe()
• Security rights for named pipes:
  – GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE

103 Comparison with UNIX
• UNIX FIFOs are similar to a named pipe
  – FIFOs are half-duplex
  – FIFOs are limited to a single machine
  – FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
  – Individual reads/writes are atomic (for writes of at most PIPE_BUF bytes)
• A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
• mkfifo() is the UNIX counterpart to CreateNamedPipe()
• Use sockets for networked client/server scenarios
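
The fixed-size-record advice can be illustrated with a small FIFO round trip. This is a sketch under stated assumptions: struct request, read_record() and fifo_roundtrip() are hypothetical names, the FIFO path under /tmp is made up for the demo, and a forked child stands in for the client.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical fixed-size request record. */
struct request {
    int  client_id;
    char text[60];
};

/* Reads exactly one record; returns 0 on success, -1 on error or early EOF. */
static int read_record(int fd, struct request *rq)
{
    char *p = (char *)rq;
    size_t got = 0;

    while (got < sizeof *rq) {
        ssize_t n = read(fd, p + got, sizeof *rq - got);
        if (n <= 0)
            return -1;
        got += (size_t)n;
    }
    return 0;
}

/* Creates a FIFO, lets a child "client" send one record, and returns the
   client_id that the parent "server" received (or -1 on failure). */
int fifo_roundtrip(int client_id)
{
    char path[64];
    struct request rq;
    int result = -1;

    /* Records no larger than _POSIX_PIPE_BUF are written atomically. */
    assert(sizeof(struct request) <= _POSIX_PIPE_BUF);

    snprintf(path, sizeof path, "/tmp/fifo_demo_%ld", (long)getpid());
    if (mkfifo(path, 0600) != 0)
        return -1;

    pid_t pid = fork();
    if (pid == 0) {                          /* client: send one request */
        struct request out;
        memset(&out, 0, sizeof out);
        out.client_id = client_id;
        strcpy(out.text, "status?");
        int wfd = open(path, O_WRONLY);      /* blocks until the server opens for reading */
        if (wfd < 0 || write(wfd, &out, sizeof out) != (ssize_t)sizeof out)
            _exit(1);
        close(wfd);
        _exit(0);
    }
    if (pid > 0) {                           /* server: receive the request */
        int rfd = open(path, O_RDONLY);      /* blocks until the client opens for writing */
        if (rfd >= 0) {
            if (read_record(rfd, &rq) == 0)
                result = rq.client_id;
            close(rfd);
        }
        waitpid(pid, NULL, 0);
    }
    unlink(path);
    return result;
}
```

With a fixed record size the server always knows how many bytes make up one request, which is what message mode gives a Windows named pipe for free. A reply would need a second, per-client FIFO, as noted above.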

104 Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server.\n");
    exit(3);
}
/* Write the request. */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out. */
while (ReadFile (hNamedPipe, Response.Record, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", Response.Record);
CloseHandle (hNamedPipe);
return 0;

105 Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request.\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n");
        exit(4);
    }
    printf ("Request is: %s\n", Request.Record);
    /* Send the file, one line at a time, to the client. */
    fp = fopen (File, "r");
    while ((fgets (Response.Record, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &Response.Record,
            (strlen(Response.Record) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation. */

106 Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
• Broadcast mechanism:
  – One-directional
  – Multiple writers/multiple readers (frequently: one-to-many communication)
  – Message delivery is unreliable
  – Can be located over a network domain
  – Message lengths are limited (Windows 2000: < 426 bytes)
• Operations on the mailslot:
  – Each reader (server) creates a mailslot with CreateMailslot()
  – A write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open fails if there are no waiting readers
  – A client's message can be read by all servers (readers)
• Client lookup: \\*\mailslot\mailslotname
  – The client will connect to every server in the network domain

107 Locate a server via mailslot
[Figure: application clients 0..n act as mailslot servers; the application server acts as a mailslot client and sends its status message periodically]
Mailslot servers (App client 0 ... App client n):
    hMS = CreateMailslot( "\\.\mailslot\status" );
    ReadFile(hMS, &ServStat);
    /* connect to server */
Mailslot client (App server):
    while (...) {
        Sleep(...);
        hMS = CreateFile( "\\.\mailslot\status" );
        ...
        WriteFile(hMS, &StatInfo ...
    }

108 Creating a mailslot
HANDLE CreateMailslot(LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa);
• lpszName points to a name of the form
  – \\.\mailslot\[path]name
  – The name must be unique; the mailslot is created locally
• cbMaxMsg is the maximum message size in bytes
• dwReadTimeout
  – A read operation waits for this many msec
  – 0: immediate return
  – MAILSLOT_WAIT_FOREVER: infinite wait

109 Opening a mailslot
• CreateFile with the following names:
  – \\.\mailslot\[path]name: retrieves a handle for a local mailslot
  – \\host\mailslot\[path]name: retrieves a handle for the mailslot on the specified host
  – \\domain\mailslot\[path]name: returns a handle representing all mailslots on machines in the domain
  – \\*\mailslot\[path]name: returns a handle representing mailslots on machines in the system's primary domain; maximum message length: 400 bytes
  – The client must specify the FILE_SHARE_READ flag
• GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts

110 Thoughts Change Life 意念改变生活

