Scheduling and Dispatch Instructor: Hengming Zou, Ph.D. In Pursuit of Absolute Simplicity 求于至简,归于永恒.

1 Scheduling and Dispatch Instructor: Hengming Zou, Ph.D. In Pursuit of Absolute Simplicity 求于至简,归于永恒

2 Content
• 4.1. The Concept of Processes and Threads
• 4.2. Windows Processes and Threads
• 4.3. Windows Process and Thread Internals
• 4.4. Windows OS Thread Scheduling
• 4.5. Advanced Windows Scheduling

3 33 Process Concept  An operating system executes programs: –Batch system – jobs –Time-shared systems – user programs or tasks  Process – a program in execution –Process execution must progress sequentially  A process includes: –CPU state (one or multiple threads) –Text & data section –Resources such as open files, handles, sockets

4 44 Process Concept  Traditionally, process used to be unit of scheduling –(i.e. no threads)  However, like most modern operating systems, Windows schedules threads  Our discussion assumes thread scheduling

5 Thread States
• Five-state diagram for thread scheduling:
  –init: The thread is being created
  –ready: The thread is waiting to be assigned to a CPU
  –running: The thread’s instructions are being executed
  –waiting: The thread is waiting for some event to occur
  –terminated: The thread has finished execution

6 Thread States
[Figure: state diagram with the following transitions]
  init -> ready: admitted
  ready -> running: scheduler dispatch
  running -> ready: interrupt, quantum expired
  running -> waiting: waiting for I/O or event
  waiting -> ready: I/O or event completion
  running -> terminated: exit
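The arrows of the state diagram can be written down as a small transition function. This is only a sketch of the diagram itself: the type and event names (`thread_state`, `thread_event`, `thread_step`) are made up for illustration and are not taken from any real kernel.

```c
#include <stdbool.h>

/* The five states named on the slide. */
typedef enum { T_INIT, T_READY, T_RUNNING, T_WAITING, T_TERMINATED } thread_state;

/* One event per arrow in the diagram (illustrative names). */
typedef enum {
    EV_ADMITTED,        /* init    -> ready      */
    EV_DISPATCH,        /* ready   -> running    */
    EV_QUANTUM_EXPIRED, /* running -> ready      */
    EV_WAIT,            /* running -> waiting    */
    EV_COMPLETION,      /* waiting -> ready      */
    EV_EXIT             /* running -> terminated */
} thread_event;

/* Apply one event. Returns false and leaves *s unchanged when the
   diagram has no such arrow out of the current state. */
bool thread_step(thread_state *s, thread_event ev)
{
    thread_state next;
    switch (*s) {
    case T_INIT:
        if (ev != EV_ADMITTED) return false;
        next = T_READY;
        break;
    case T_READY:
        if (ev != EV_DISPATCH) return false;
        next = T_RUNNING;
        break;
    case T_RUNNING:
        if (ev == EV_QUANTUM_EXPIRED) next = T_READY;
        else if (ev == EV_WAIT)       next = T_WAITING;
        else if (ev == EV_EXIT)       next = T_TERMINATED;
        else return false;
        break;
    case T_WAITING:
        if (ev != EV_COMPLETION) return false;
        next = T_READY;
        break;
    default:            /* T_TERMINATED has no outgoing arrows */
        return false;
    }
    *s = next;
    return true;
}
```

Note that a waiting thread cannot be dispatched directly: it has to pass through ready first, which is why `thread_step` rejects `EV_DISPATCH` in the waiting state.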

7 77 Process and Thread Control Blocks  Information associated with process: Process Control Block (PCB) –Memory management information –Accounting information –Process-global vs. thread-specific  Information associated with thread: Thread Control Block (TCB) –Program counter –CPU registers –CPU scheduling information –Pending I/O information

8 Process Control Block (PCB)
• Windows implementation of PCB is split in multiple data structures
[Figure: a PCB linked to a list of TCBs]
  PCB: Process ID (PID), Parent PID, …, Next Process Block, Handle Table, Image File Name, List of Thread Control Blocks, List of open files, …
  Thread Control Block (TCB): Program Counter, Registers, …, Next TCB

9 CPU Switch from Thread to Thread
[Figure: timeline of a switch between threads T1 and T2]
  1. Thread T1 executing; T2 ready or waiting
  2. Interrupt or system call: save state into TCB1, reload state from TCB2
  3. Thread T2 executing; T1 ready or waiting
  4. Interrupt or system call: save state into TCB2, reload state from TCB1
  5. Thread T1 executing

10 10 Context Switch  Save the state of the old thread and load the saved state for the new thread  Context-switch time is overhead  Thread context-switch can be implemented in kernel or user mode  Interaction with MMU is required when switching between threads in different processes

11 11 Thread Scheduling Queues  Ready queue –Maintains set of all threads ready and waiting to execute –There might be multiple ready queues, sorted by priorities  Device queue –Maintains set of threads waiting for an I/O device –There might be multiple queues for different devices  Threads migrate between the various queues

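12 Ready Queue and I/O Device Queues
[Figure: threads are dispatched from the ready queue to the CPU and leave on release; a time-out returns a thread to the ready queue; an I/O wait places it in the matching device queue (I/O 1 queue … I/O n queue) until the I/O occurs, after which it returns to the ready queue]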

13 13 Optimization Criteria  CPU scheduling uses heuristics to manage the tradeoffs among contradicting optimization criteria.  Schedulers are optimized for certain workloads –Interactive vs. batch processing –I/O-intense vs. compute-intense

14 14 Common Optimization Criteria  Maximize CPU utilization  Maximize throughput  Minimize turnaround time  Minimize waiting time  Minimize response time

15 15 Basic Scheduling Considerations  What invokes the scheduler?  Which assumptions should a scheduler rely on?  What are its optimization goals?  Rationale: –Multiprogramming maximizes CPU utilization –Thread execution experiences cycles of compute- and I/O-bursts –Scheduler should consider CPU burst distribution

16 Alternating Sequence of CPU and I/O Bursts
[Figure: a thread alternating between CPU bursts and I/O bursts]
  CPU burst: load val, inc val, read file
  I/O burst: wait for I/O
  CPU burst: inc count, add data, val, write file
  I/O burst: wait for I/O
  CPU burst: load val, inc val, read from file
  I/O burst: wait for I/O
  …
• Threads can be described as:
• I/O-bound – spends more time doing I/O than computations
  –many short CPU bursts
• CPU-bound – spends more time doing computations
  –few very long CPU bursts

17 Histogram of CPU-burst Times
[Figure: histogram; x-axis: burst duration (msec), y-axis: distribution]
• Many short CPU bursts are typical
• Exact figures vary greatly by process and computer

18 18 Schedulers  Long-term scheduler (or job scheduler) –Select which processes with their threads should be brought into the ready queue –Takes MM into consideration (swapped-out processes) –Controls degree of multiprogramming –Invoked infrequently, may be slow  Short-term scheduler (or CPU scheduler) –Select which thread should be executed next and allocate CPU –Invoked frequently, must be fast  Windows has no dedicated long-term scheduler

19 19 CPU Scheduler  Select from among the threads in memory that are ready to execute, and allocate the CPU to one of them  CPU scheduling decisions may take place when a thread –1.Switches from running to waiting state –2.Switches from running to ready state –3.Switches from waiting to ready –4.Terminates  Scheduling under 1 and 4 is nonpreemptive  All other scheduling is preemptive

20 20 Dispatcher  Dispatcher module gives control of CPU to the thread selected by the short-term scheduler; this involves: –switch context –switch to user mode –jump to proper location in user program to restart that program  Dispatch latency – time it takes for the dispatcher to stop one thread and start another running  Windows scheduling is event-driven –No central dispatcher module in the kernel

21 Scheduling Algorithms: FIFO
• First-In, First-Out
• Also known as First-Come, First-Served (FCFS)
  Thread  Burst Time
  T1      20
  T2      5
  T3      4
• Suppose threads arrive in the order: T1, T2, T3
  –The Gantt chart for the schedule is:

22 Scheduling Algorithms: FIFO
  | T1 (0-20) | T2 (20-25) | T3 (25-29) |
• Waiting time for T1 = 0; T2 = 20; T3 = 25
• Average waiting time: (0 + 20 + 25)/3 = 15
• Convoy effect:
  –short threads queued behind long threads experience long waiting times

23 FIFO Scheduling (Cont.)
• Suppose that the threads arrive in the order T2, T3, T1.
• The Gantt chart for the schedule is:
  | T2 (0-5) | T3 (5-9) | T1 (9-29) |
• Waiting time for T1 = 9; T2 = 0; T3 = 5
• Average waiting time: (9 + 0 + 5)/3 ≈ 4.67
• Much better than the previous case
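The two averages can be checked with a few lines of C. This is a minimal sketch that assumes every thread arrives at time 0 and runs to completion in array order; the function name `fifo_avg_wait` is illustrative.

```c
#include <stddef.h>

/* Average waiting time under FIFO when all threads arrive at time 0
   and run to completion in array order. */
double fifo_avg_wait(const int *burst, size_t n)
{
    long start = 0;       /* time at which the next thread starts */
    long total_wait = 0;  /* sum of the individual waiting times */

    for (size_t i = 0; i < n; i++) {
        total_wait += start;   /* thread i waited for everything ahead of it */
        start += burst[i];
    }
    return n ? (double)total_wait / (double)n : 0.0;
}
```

Called with the bursts `{20, 5, 4}` (order T1, T2, T3) it returns 15; with `{5, 4, 20}` (order T2, T3, T1) it returns 14/3, about 4.67. The only difference between the two runs is where the long burst sits in the queue, which is the convoy effect in miniature.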

24 Scheduling Algorithms: Round Robin (RR)
• Preemptive version of the FIFO scheduling algorithm
• Each thread gets a small unit of CPU time (quantum),
  –usually milliseconds
• After this time has elapsed, the thread is preempted and added to the end of the ready queue
• Each of n ready threads gets 1/n of the CPU time, in chunks of at most q time units (the quantum) at once
• Of n threads, none waits more than (n-1)q time units

25 Scheduling Algorithms: Round Robin (RR)
• Performance
  –q large: RR behaves like FIFO
  –q small: q must still be large with respect to the context-switch time, otherwise overhead is too high

26 Example of RR with Quantum = 10
• Assume we have:
  Thread  Burst Time
  T1      23
  T2      7
  T3      38
  T4      14
• Assume all threads have the same priority; the Gantt chart is:
  | T1 | T2 | T3 | T4 | T1 | T3 | T4 | T1 | T3 | T3 |
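The Gantt chart can be reproduced with a short simulation. The sketch below assumes all threads are ready at time 0, have equal priority, and that context switches cost nothing; under those assumptions, cycling through the array in order is equivalent to re-queuing each preempted thread at the tail. The names `rr_simulate` and `RR_MAX_THREADS` are illustrative.

```c
#include <stddef.h>

#define RR_MAX_THREADS 64

/* Simulate round robin for threads that are all ready at time 0.
   Writes each thread's completion time into finish[] and returns the
   total elapsed time, or -1 if the arguments are out of range. */
long rr_simulate(const int *burst, long *finish, size_t n, int quantum)
{
    int remaining[RR_MAX_THREADS];
    size_t left = 0;   /* threads that still need CPU time */
    long now = 0;

    if (n > RR_MAX_THREADS || quantum <= 0)
        return -1;
    for (size_t i = 0; i < n; i++) {
        remaining[i] = burst[i];
        finish[i] = 0;
        if (burst[i] > 0)
            left++;
    }
    while (left > 0) {
        for (size_t i = 0; i < n; i++) {
            if (remaining[i] <= 0)
                continue;                    /* already finished */
            int slice = remaining[i] < quantum ? remaining[i] : quantum;
            now += slice;                    /* thread i runs for one slice */
            remaining[i] -= slice;
            if (remaining[i] == 0) {
                finish[i] = now;
                left--;
            }
        }
    }
    return now;
}
```

With bursts `{23, 7, 38, 14}` and a quantum of 10, the slices come out in the order T1, T2, T3, T4, T1, T3, T4, T1, T3, T3, the threads finish at times 64, 17, 82 and 61, and the whole schedule takes 82 time units.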

27 27 Example of RR with Quantum = 10  Round-Robin favors CPU-intense over I/O-intense threads  Priority-elevation after I/O completion can provide a compensation  Windows uses Round-Robin with a priority-elevation scheme

28 Round Robin Performance
• Shorter quantum yields more context switches
• Longer quantum yields shorter average turnaround time
[Figure: number of context switches for a fixed thread execution time at different quantum sizes]
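The relationship between quantum size and context switches for a single CPU burst is simple arithmetic: a burst of length t is cut into ceil(t/q) slices, so it is preempted ceil(t/q) - 1 times. The helper below is a sketch of that calculation; its name and the sample values in the note are illustrative, not figures from the slide.

```c
/* Number of times one CPU burst of `burst` time units is preempted by
   quantum expiry: the burst is split into ceil(burst / quantum)
   slices, and every slice but the last ends in a preemption. */
long quantum_preemptions(long burst, long quantum)
{
    if (burst <= 0 || quantum <= 0)
        return 0;
    return (burst + quantum - 1) / quantum - 1;
}
```

For a burst of 10 time units, a quantum of 12 gives 0 preemptions, a quantum of 6 gives 1, and a quantum of 1 gives 9.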

29 29 Scheduling Algorithms: Priority Scheduling  A priority number (integer) is associated with each thread  CPU is allocated to the thread with the highest priority –Preemptive –Non-preemptive

30 30 Priority Scheduling - Starvation  Starvation is a problem: –low priority threads may never execute  Solutions: –1) Decreasing priority & aging: the Unix approach Decrease priority of CPU-intense threads Exponential averaging of CPU usage to slowly increase priority of blocked threads –2) Priority Elevation: the Windows/VMS approach Increase priority of a thread on I/O completion System gives starved threads an extra burst

31 31 Multilevel Queue  Ready queue is partitioned into separate queues: –Real-time (system, multimedia) –Interactive  Queues may have different scheduling algorithm –Real-Time – RR –Interactive – RR + priority-elevation + quantum stretching

32 32 Multilevel Queue  Scheduling must be done between the queues  Fixed priority scheduling (i.e., serve all from real-time threads then from interactive) –Possibility of starvation  Time slice – each queue gets a certain amount of CPU time which it can schedule amongst its threads –CPU reserves

33 Multilevel Queue Scheduling
• Windows uses strict Round-Robin for real-time threads
• Priority-elevation can be disabled for non-RT threads
[Figure: queues ordered from high priority to low priority]
  Real-time system threads
  Real-time user threads
  System threads
  Interactive user threads
  Background threads
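Fixed-priority scheduling between queues can be sketched as an array of FIFO queues that is always scanned from the top. The structure below is an illustration under simplifying assumptions (five levels to match the figure, a small fixed capacity, non-negative thread IDs); the names `mlq`, `mlq_push` and `mlq_pop` are made up and do not describe the Windows dispatcher database.

```c
#include <stddef.h>

#define MLQ_LEVELS   5    /* one level per band in the figure; 0 is highest */
#define MLQ_CAPACITY 16

typedef struct {
    int    ids[MLQ_CAPACITY];  /* ring buffer of thread IDs (>= 0) */
    size_t head;
    size_t count;
} mlq_queue;

typedef struct {
    mlq_queue level[MLQ_LEVELS];
} mlq;

/* Append a thread to the tail of one level. Returns 0, or -1 if the
   level does not exist or its queue is full. */
int mlq_push(mlq *m, size_t level, int id)
{
    if (level >= MLQ_LEVELS)
        return -1;
    mlq_queue *q = &m->level[level];
    if (q->count == MLQ_CAPACITY)
        return -1;
    q->ids[(q->head + q->count) % MLQ_CAPACITY] = id;
    q->count++;
    return 0;
}

/* Remove and return the thread at the head of the highest-priority
   non-empty queue, or -1 if every queue is empty. */
int mlq_pop(mlq *m)
{
    for (size_t l = 0; l < MLQ_LEVELS; l++) {
        mlq_queue *q = &m->level[l];
        if (q->count == 0)
            continue;
        int id = q->ids[q->head];
        q->head = (q->head + 1) % MLQ_CAPACITY;
        q->count--;
        return id;
    }
    return -1;
}
```

Because `mlq_pop` never looks at a lower level while a higher one has work, a steady supply of high-priority threads starves the lower queues. Re-pushing a preempted thread at the tail of its own level gives round robin within that level.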

34 Process Creation
• Parent process creates child processes, which create other processes, forming a tree of processes
  –Processes start with one initial thread
• Resource sharing models
  –Parent and children share all resources
  –Children share a subset of the parent’s resources
  –Parent and child share no resources
• Execution
  –Parent’s and children’s threads execute concurrently
  –Parent waits until children terminate

35 Process Creation (Cont.)
• How to set up an address space
  –Child can be a duplicate of the parent
  –Child may have a program loaded into it
• UNIX example
  –fork() system call creates a new process
  –exec() system call is used after a fork to replace the process’ memory space with a new program
• Windows example
  –CreateProcess() system call creates a new process and loads a program for execution
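The UNIX sequence of fork(), exec() and wait can be sketched in a few lines of POSIX C. This is a minimal example rather than production code: the helper name `run_child` is hypothetical, and error handling is reduced to returning -1.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program at `path` in a child process and wait for it.
   Returns the child's exit status, or -1 if fork/waitpid failed or
   the child did not exit normally. */
int run_child(const char *path, char *const argv[])
{
    int status;
    pid_t pid = fork();           /* child starts as a duplicate of the parent */

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv(path, argv);        /* replace the child's memory image */
        _exit(127);               /* reached only if execv failed */
    }
    if (waitpid(pid, &status, 0) < 0)   /* parent collects the return code */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, calling `run_child("/bin/sh", args)` with `args` set to `{"sh", "-c", "exit 7", NULL}` returns 7 on a system that has `/bin/sh`. CreateProcess() on Windows combines the two steps: there is no intermediate moment at which the child is a copy of the parent.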

36 36 Processes Tree on a UNIX System

37 37 Process Termination  Last thread inside a process executes last statement and returns control to operating system (exit) –Parent may receive return code (via wait) –Process’ resources are deallocated by OS

38 38 Process Termination  Parent may terminate children processes (kill) –Child has exceeded allocated resources –Task assigned to child is no longer required –Parent is exiting  OS typically does not allow child to continue if its parent terminates (depending on creation flags) –Cascading termination inside process groups

39 Single and Multithreaded Processes
[Figure: a single-threaded process has code, data, and files plus one thread with its own registers and stack; a multi-threaded process shares code, data, and files among several threads, each with its own registers and stack]

40 40 Benefits of Multithreading  Higher Responsiveness –Dedicated threads for handling user events  Simpler Resource Sharing –All threads in a process share same address space  Economy - fewer context switches –If threading implemented in user-space  Utilization of Multiprocessor Architectures –Multiple threads may run in parallel

41 41 User Threads  Thread management within a user-level threads library –Process is unit of CPU scheduling from kernel perspective  Examples –POSIX Pthreads –Mach C-threads –Solaris threads –Fibers on Windows

42 42 Kernel Threads  Supported by the Kernel –Thread is unit of CPU scheduling  Examples –Windows –Solaris –OSF/1 –Linux Tasks can act like threads by sharing kernel data structures

43 43 Multithreading Models  How are user-level threads mapped on kernel threads?  Many-to-One –Many user-mode threads mapped on a single kernel thread  One-to-One –Each user-mode thread mapped on a separate kernel thread  Many-to-Many –Set of user-mode threads mapped on set of kernel threads

44 Many-to-One Model
• Used on systems that do not support kernel threads
• Example:
  –POSIX Pthreads
  –Mach C-Threads
[Figure: several user threads mapped onto one kernel thread]

45 One-to-One Model
• Each user-level thread maps to a kernel thread
• Examples
  –Windows
  –OS/2
[Figure: each user thread mapped onto its own kernel thread]

46 Many-to-Many Model
• Allows many user-level threads to be mapped to many kernel threads.
• Allows the OS to create a sufficient number of kernel threads.
• Example
  –Solaris 2
[Figure: a set of user threads mapped onto a set of kernel threads]

47 47 Problems with Multithreading  Semantics of fork()/exec() or CreateProcess() system calls  Coordinated termination  Signal handling  Global data, errno, error handling  Thread specific data  Reentrant vs. non-reentrant system calls

48 Pthreads
• A POSIX standard (IEEE 1003.1c) API for thread creation and synchronization
• The API specifies the behavior of the thread library, not an implementation
• Implemented on many UNIX operating systems
• Services for Unix (SFU) implements Pthreads on Windows
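A minimal Pthreads program creates a thread with `pthread_create` and waits for it with `pthread_join`. The sketch below splits an array sum between the calling thread and one worker; the helper names (`sum_job`, `sum_worker`, `parallel_sum`) are illustrative, and on some toolchains the program must be built with `-pthread`.

```c
#include <pthread.h>
#include <stddef.h>

typedef struct {
    const int *data;
    size_t len;
    long sum;       /* result, written by the thread that runs the job */
} sum_job;

/* Thread start routine: sums one slice of the array. */
static void *sum_worker(void *arg)
{
    sum_job *job = arg;
    job->sum = 0;
    for (size_t i = 0; i < job->len; i++)
        job->sum += job->data[i];
    return NULL;
}

/* Sum an array using the calling thread plus one extra thread.
   Returns 0 on success, -1 if the thread could not be created or joined. */
int parallel_sum(const int *data, size_t len, long *out)
{
    size_t half = len / 2;
    sum_job lo = { data, half, 0 };
    sum_job hi = { data + half, len - half, 0 };
    pthread_t worker;

    if (pthread_create(&worker, NULL, sum_worker, &hi) != 0)
        return -1;
    sum_worker(&lo);                      /* caller handles the first half */
    if (pthread_join(worker, NULL) != 0)  /* wait until the worker is done */
        return -1;
    *out = lo.sum + hi.sum;
    return 0;
}
```

The two jobs touch disjoint memory, so no mutex is needed here; `pthread_join` is what makes it safe to read `hi.sum` afterwards.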

49 Windows Processes and Threads

50 50 Windows Processes  What is a process? –Represents an instance of a running program you create a process to run a program starting an application creates a process –Process defined by: Address space Resources (e.g. open handles) Security profile (token)  Every process starts with one thread –First thread executes the program’s “main” function Can create other threads in the same process Can create additional processes

51 51 Windows Threads  What is a thread? –An execution context within a process –Unit of scheduling (threads run, processes don’t run)  All threads in a process share same process address space –Services provided so threads can synchronize access to shared resources (critical sections, mutexes, events, semaphores)  All threads in the system are scheduled as peers to all others, without regard to their “parent” process

52 52 Per-Process Data  Virtual address space –program code, global storage, heap storage, threads’ stacks  Working set –physical memory “owned” by the process  Access token –includes security identifiers  Handle table for Windows kernel objects

53 53 Per-Process Data  Environment strings  Command line  These are common to all threads in the process, but separate and protected between processes

54 54 Per-Thread Data  User-mode stack –arguments passed to thread, automatic storage, call frames  Kernel-mode stack (for system calls)  Thread Local Storage (TLS) –array of pointers to allocate unique data  Scheduling state (Wait, Ready, Running, etc.) and priority

55 55 Per-Thread Data  Hardware context –Program counter, stack pointer, register values –Current access mode (user mode or kernel mode) –(saved in CONTEXT structure if not running)  Access token (optional -- overrides process’s if present)

56 56 Process and Thread Identifiers  Every process and every thread has an identifier  Generically: “client ID” (debugger shows as “CID”) –A.K.A. “process ID” and “thread ID”, respectively –Process IDs and thread IDs are in the same “number space”  ID identifies request process or thread to its subsystem server process, in API calls that need server’s help

57 57 Process and Thread Identifiers  Visible in: –PerfMon, Task Manager (for processes), –Process Viewer (for processes), kernel debugger, etc.  IDs are unique among all existing processes and threads –might be reused as soon as a process or thread is deleted

58 Process-Related Performance Counters
  Object: Counter | Function
  Process: %PrivilegedTime | Percentage of time that the threads in the process have run in kernel mode
  Process: %ProcessorTime | Percentage of CPU time that threads have used during the specified interval (%PrivilegedTime + %UserTime)
  Process: %UserTime | Percentage of time that the threads in the process have run in user mode
  Process: ElapsedTime | Total lifetime of process in seconds
  Process: ID Process | PID – process IDs are re-used
  Process: ThreadCount | Number of threads in a process

59 Thread-Related Performance Counters
  Object: Counter | Function
  Process: Priority Base | Base priority of process: starting priority for threads within the process
  Thread: %PrivilegedTime | Percentage of time that the thread has run in kernel mode
  Thread: %ProcessorTime | Percentage of CPU time that the thread has used during the specified interval (%PrivilegedTime + %UserTime)
  Thread: %UserTime | Percentage of time that the thread has run in user mode
  Thread: ElapsedTime | Total lifetime of thread in seconds
  Thread: ID Process | PID – process IDs are re-used
  Thread: ID Thread | Thread ID – re-used

60 Thread-Related Performance Counters (contd.)
  Object: Counter | Function
  Thread: Priority Base | Base priority of thread: may differ from the thread’s starting priority
  Thread: Priority Current | The thread’s current dynamic priority
  Thread: Start Address | The thread’s starting virtual address (the same for most threads)
  Thread: Thread State | Value from 0 through 7 – current state of thread
  Thread: Thread Wait Reason | Value from 0 through 19 – reason why the thread is in wait state

61 Tools for Obtaining Process & Thread Information
• Many overlapping tools
  –most show one item the others do not
• Built-in tools in Windows 2000/XP:
  –Task Manager, Performance Tool
  –Tasklist (new in XP)
• Support Tools
  –pviewer - process and thread details (GUI)
  –pmon - process list (character cell)
  –tlist - shows process tree, thread details (character cell)

62 Tools for Obtaining Process & Thread Information
• Resource Kit tools:
  –apimon - system call and page fault monitoring (GUI)
  –oh - display open handles (character cell)
  –pviewer - processes & threads and security details (GUI)
  –ptree - display process tree & kill remote processes (GUI)
  –pulist - lists processes and usernames (character cell)
  –pstat - process/threads & driver addresses (character cell)
  –qslice - can show process-relative thread activity (GUI)

63 Tools for Obtaining Process & Thread Information
• Tools from Sysinternals
• Process Explorer: super Task Manager
  –shows open files, loaded DLLs, security info, etc.
• Pslist
  –lists processes on local or remote systems
• Ntpmon
  –shows process/thread creates and deletes
  –and context switches on MP systems only
• Listdlls
  –displays full path of EXE & DLLs loaded in each process

64 64 What Are Task Manager’s “Applications”?  A meaningless term at the OS level –Not a list of processes –Not a list of “tasks” (another meaningless term) –It’s a list of top level visible windows in your session that meet certain criteria

65 What Are Task Manager’s “Applications”?
• What does the status column mean?
• Running:
  –Windows don’t run; threads do
  –“Running” is displayed only when the owning thread is waiting for a window message (i.e., not running!)
• Not Responding: the owning thread is not waiting for window messages
• To map a window to a process
  –right-click on a window and select “Go to process”

66 66 What Are Task Manager’s “Applications”?

67 67 Process Explorer (Sysinternals)  Super Task Manager  Shows: –full image path, command line, –environment variables, parent process, –security access token, open handles, –loaded DLLs & mapped files

68 68 Process Explorer (Sysinternals)

69 69 Lab: The Process List Run Process Explorer & maximize window Run Task Manager – click on Processes tab Arrange windows so you can see both Notice process tree vs flat list in Task Manager If parent has exited, process is left justified

70 70 Lab: The Process List  Sort on first column (“Process”) and note tree view disappears  Click on View->Show Process Tree (or CTRL+T) to bring it back  Notice description and company name columns  Hover mouse over image to see full path of image  Right click on a process and choose “Google”

71 71 Lab: Refresh Highlighting  Change update speed to paused by pressing space bar  Run Notepad  In ProcExp, hit F5 and notice new process  Exit Notepad  In ProcExp, hit F5 and notice Notepad in red  Uses –Understanding process startup sequences –Detecting appearance of processes coming and going

72 72 Process Performance  Click on Performance Tab of process properties –Note: all these numbers can be configured as columns

73 73 Thread Details  Process Explorer “Threads” tab shows which thread(s) are running –Start address represents where the thread began running (not where it is now) –Click Module to get details on module containing thread start address

74 74 Thread Start Functions  Process Explorer can map the addresses within a module to the names of functions –This can help identify which component within a process is responsible for CPU usage  Requires access to: –Symbol file for that module –Proper version of Dbghelp.dll

75 75 Thread Start Functions  By default, Process Explorer looks for:  Dbghelp.dll: –in the default Windows Debugging Tools install directory  Symbols: –_NT_SYMBOL_PATH environment variable  Can also specify with Options->Configure Symbols

76 76 Call Stacks Function 2 Function 1 Function 3  Process Explorer can also show the thread call stack –Represents sequence of functions called  Important if start address doesn’t indicate what the thread is doing –E.g. if it’s a generic library start routine

77 77 Call Stacks  Click Stack to view call stack –Lists functions in reverse chronological order  Note that start address on Threads tab is different than first function shown in stack –This is because all user threads start in a Windows library function which calls the programmed start address

78 78 Example: Viewing Stacks  Problem: Powerpoint was hanging for 1 minute on startup  Thread stack shows waiting on a printer driver

79 Suspending Processes
• Process Explorer can suspend a process
• Why would you want to do this?
  –You’ve started a long-running job but want to pause it to do something else
     Lowering the priority still leaves it running…
  –You’ve started a long download but want your network bandwidth back temporarily
  –Some multi-service system process activity is due to other processes calling upon their services
     Suspend a process that is consuming CPU time to see what that does to the system process in question

80 80 Lab: Suspend  Start Notepad  From a command prompt:  Suspend Notepad process with Process Explorer  Try to switch back to Notepad (should not respond)  Open Task Manager and look at Notepad’s status in the applications tab  Resume Notepad

81 Jobs
• Jobs are collections of processes
• Can be used to specify limits on CPU, memory, and security
• Enables control over some unique process & thread settings not available through any process or thread system call
  –E.g. length of thread time slice
[Figure: a job containing several processes]

82 82 Jobs  How do processes become part of a job?  Job object has to be created (CreateJobObject)  Then processes are explicitly added (AssignProcessToJob) –Processes created by processes in a job automatically are part of the job Unless restricted, processes can “break away” from a job  Then quotas and limits are defined (SetInformationJobObject) –Examples on next slide…

83 83 Process Lifetime  Created as an empty shell  Address space created with only ntdll and the main image unless created by POSIX fork()  Handle table created empty or populated via duplication from parent  Process is partially destroyed on last thread exit  Process totally destroyed on last dereference

84 Thread Lifetime
• Created within a process with a CONTEXT record
  –Starts running in the kernel but has a trap frame to return to user mode
• Threads run until:
  –The thread returns to the OS
  –ExitThread is called by the thread
  –TerminateThread is called on the thread
  –ExitProcess is called on the process

85 85 Why Do Processes Exit? (or Terminate?)  Normal: Application decides to exit (ExitProcess)  Usually due to a request from the UI  or: CRTL does ExitProcess when primary thread function (main, WinMain, etc.) returns to caller –this forces TerminateThread on the process’s remaining threads –or, any thread in the process can do an explicit ExitProcess

86 86 Why Do Processes Exit? (or Terminate?)  Orderly exit requested from the desktop (ExitProcess) –e.g. “End Task” from Task Manager “Tasks” tab –Task Manager sends a WM_CLOSE message to the window’s message loop… –…which should do an ExitProcess (or equivalent) on itself

87 87 Why Do Processes Exit? (or Terminate?)  Forced termination (TerminateProcess) –if no response to “End Task” in five seconds, Task Manager presents End Program dialog (which does a TerminateProcess) –or: “End Process” from Task Manager Processes tab  Unhandled exception –Covered in Unit 4.3 (Process and Thread Internals)

88 88 Why Do Processes Exit? (or Terminate?)

89 89 Job Settings  Quotas and restrictions: –Quotas: total CPU time, # active processes, per-process CPU time, memory usage –Run-time restrictions: priority of all the processes in job; processors threads in job can run on –Security restrictions: limits what processes can do Not acquire administrative privileges Not accessing windows outside the job, no reading/writing the clipboard

90 90 Job Settings –Scheduling class: number from 0-9 (5 is default) - affects length of thread timeslice (or quantum) E.g. can be used to achieve “class scheduling” (partition CPU)

91 91 Jobs  Examples where Windows OS uses jobs: –Add/Remove Programs (“ARP Job”) –WMI provider –RUNAS service (SecLogon) uses jobs to terminate processes at log out SU from NT4 ResKit didn’t do this  Process Explorer highlights processes that are members of jobs –Color can be configured with Options->Configure Highlighting –For processes in a job, click on Job tab in process properties to see details

92 92 Lab: WMI Job  Jobs are used by WMI –Example: run Psinfo (Sysinternals) and pause output

93 93 Lab: RUNAS Job 1. In a command prompt: RUNAS /USER:xxx CMD (where xxx is some other local account) 2. In ProcExp, find newly created cmd.exe process –Who is the father? 3. Run Notepad from new CMD window 4. Double click on newly highlighted process & click on Job tab

94 94 Programming Slides NOTE: The remaining slides are for use in a class that covers the programming aspects of the OS (vs a class aimed at system administrators who are not doing programming)

95 95 Process Windows APIs  CreateProcess  OpenProcess  GetCurrentProcessId - returns a global ID  GetCurrentProcess - returns a handle  ExitProcess  TerminateProcess - no DLL notification  Get/SetProcessShutdownParameters  GetExitCodeProcess  GetProcessTimes  GetStartupInfo

96 96 Windows Thread APIs  CreateThread  CreateRemoteThread  GetCurrentThreadId - returns global ID  GetCurrentThread - returns handle  SuspendThread/ResumeThread  ExitThread  TerminateThread - no DLL notification  GetExitCodeThread  GetThreadTimes  Windows 2000 adds: –OpenThread –new thread pooling APIs

97 97 Fibers  Implemented completely in user mode –no “internals” ramifications –Fibers are still scheduled as threads –Fiber APIs allow different execution contexts within a thread stack fiber-local storage some registers (essentially those saved and restored for a procedure call) cooperatively “scheduled” within the thread –Analogous to threading libraries under many Unix systems –Analogous to co-routines in assembly language –Allow easy porting of apps that “did their own threads” under other systems

98 Process Creation
    BOOL CreateProcess(
        LPCSTR lpApplicationName,
        LPSTR lpCommandLine,
        LPSECURITY_ATTRIBUTES lpProcessAttributes,
        LPSECURITY_ATTRIBUTES lpThreadAttributes,
        BOOL bInheritHandles,
        DWORD dwCreationFlags,
        LPVOID lpEnvironment,
        LPCSTR lpCurrentDirectory,
        LPSTARTUPINFO lpStartupInfo,
        LPPROCESS_INFORMATION lpProcessInformation);
• No parent/child relation in Win32
• CreateProcess() – new process with primary thread

99 Parameters
    typedef struct _PROCESS_INFORMATION {
        HANDLE hProcess;
        HANDLE hThread;
        DWORD dwProcessId;
        DWORD dwThreadId;
    } PROCESS_INFORMATION;
• dwCreationFlags (fdwCreate):
  –CREATE_SUSPENDED, DETACHED_PROCESS, CREATE_NEW_CONSOLE, CREATE_NEW_PROCESS_GROUP
• lpStartupInfo:
  –Main window appearance
  –Parent’s info: GetStartupInfo
  –hStdIn, hStdOut, hStdErr fields for I/O redirection
• lpProcessInformation:
  –Pointer to a structure that receives the handles & IDs of the new process and thread

100 100 UNIX & Win32 comparison  Windows API has no equivalent to fork()  CreateProcess() similar to fork()/exec()  UNIX $PATH vs. lpCommandLine argument –Win32 searches in dir of curr. Proc. Image; in curr. Dir.; in Windows system dir. (GetSystemDirectory); in Windows dir. (GetWindowsDirectory); in dir. Given in PATH  Windows API has no parent/child relations for processes  No UNIX process groups in Windows API –Limited form: group = processes to receive a console event

101 Windows API Thread Creation
    HANDLE CreateThread(
        LPSECURITY_ATTRIBUTES lpsa,
        DWORD cbStack,
        LPTHREAD_START_ROUTINE lpStartAddr,
        LPVOID lpvThreadParm,
        DWORD fdwCreate,
        LPDWORD lpIDThread);
• cbStack == 0: thread’s stack size defaults to the primary thread’s size
• lpStartAddr points to a function declared as DWORD WINAPI ThreadFunc(LPVOID)
• lpvThreadParm is a pointer-sized argument (32 bits on 32-bit Windows)
• lpIDThread points to a DWORD that receives the thread ID (non-NULL pointer!)

102 Exiting and Terminating a Process
    VOID ExitProcess(UINT uExitCode);
    BOOL TerminateProcess(HANDLE hProcess, UINT uExitCode);
    BOOL GetExitCodeProcess(HANDLE hProcess, LPDWORD lpExitCode);
• Shared resources must be freed before exiting
  –Mutexes, semaphores, events
  –Use structured exception handling (SEH)
• But: __finally and __except handlers are not executed on ExitProcess;
• no SEH on TerminateProcess

103 Windows API Thread Termination
    VOID ExitThread(DWORD dwExitCode);
• When the last thread in a process terminates, the process itself terminates (TerminateThread() does not execute final SEH)
• Thread continues to exist until the last handle is closed (CloseHandle())
    BOOL GetExitCodeThread(HANDLE hThread, LPDWORD lpdwExitCode);
• Returns exit code or STILL_ACTIVE

104 Suspending and Resuming Threads
• Each thread has a suspend count
• A thread can only execute if its suspend count == 0
• A thread can be created in the suspended state
    DWORD ResumeThread(HANDLE hThread);
    DWORD SuspendThread(HANDLE hThread);
• Both functions return the previous suspend count, or 0xFFFFFFFF on failure
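The suspend-count rule can be modelled without touching the Windows API at all. The portable sketch below only imitates the bookkeeping described on the slide; `model_thread`, `model_suspend`, `model_resume` and `model_runnable` are hypothetical names, not Win32 functions, and details such as the maximum suspend count are left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_FAILED 0xFFFFFFFFu   /* mirrors the failure value on the slide */

typedef struct {
    unsigned suspend_count;
} model_thread;

/* Like SuspendThread: bump the count and return the previous value. */
unsigned model_suspend(model_thread *t)
{
    if (t == NULL)
        return MODEL_FAILED;
    return t->suspend_count++;
}

/* Like ResumeThread: drop the count by one (never below zero) and
   return the previous value. */
unsigned model_resume(model_thread *t)
{
    if (t == NULL)
        return MODEL_FAILED;
    unsigned previous = t->suspend_count;
    if (previous > 0)
        t->suspend_count--;
    return previous;
}

/* A thread may be scheduled only while its suspend count is zero. */
bool model_runnable(const model_thread *t)
{
    return t->suspend_count == 0;
}
```

Suspends nest: after two calls to `model_suspend`, a single `model_resume` returns 2 and leaves the thread suspended; only the second resume, which returns 1, makes it runnable again.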

105 Synchronization & Remote Threads
• WaitForSingleObject() and WaitForMultipleObjects() with thread handles as arguments perform thread synchronization
  –Waits for the thread object to become signaled
  –ExitThread(), TerminateThread(), ExitProcess() set thread objects to the signaled state
• CreateRemoteThread() allows creation of a thread in another process
  –Not implemented in Windows 9x
• The default single-threaded C library is not thread-safe; use libcmt.lib instead
  –#define _MT before any include
  –Use _beginthreadex/_endthreadex instead of CreateThread/ExitThread

106 106 Windows Process and Thread Internals  Data Structures for each process/thread:  Executive process block (EPROCESS)  Executive thread block (ETHREAD)  Win32 process block  Process environment block  Thread environment block

107 107 Windows Process and Thread Internals Process environment block Thread environment block Process block (EPROCESS) Thread block (ETHREAD) Win32 process block Handle table... Process address space System address space

108 108 Process  Container for an address space and threads  Associated User-mode Process Environment Block (PEB)  Primary Access Token  Quota, Debug port, Handle Table etc  Unique process ID  Queued to the Job, global process list and Session list  MM structures like the WorkingSet, VAD tree, AWE etc

109 109 Thread  Fundamental schedulable entity in the system  Represented by ETHREAD that includes a KTHREAD  Queued to the process (both E and K thread)  IRP list, Impersonation Access Token  Unique thread ID  Associated User-mode Thread Environment Block (TEB)  User-mode stack, Kernel-mode stack  Processor Control Block (in KTHREAD) for CPU state when not running

110 Processes & Threads Internal Data Structures
[Figure: a process object linked to its handle table, its access token, the VAD objects (virtual address space descriptors), and its threads; each thread may carry its own access token]
• See kernel debugger commands:
  –dt (see next slide)
  –!process
  –!thread
  –!token
  –!handle
  –!object

111 111 Process/Thread Kernel Debugger Commands  !process [/s Session] [Address/Pid [Flags]] –!process – display current process (not full details) –!process 342 – display full details of process 342 –!process 829fa030 – display process identified by EPROCESS address –!process 0 0 – summary display of all processes –!process 0 7 – full details of all processes

112 112 Process/Thread Kernel Debugger Commands  !thread [Address [Flags]] –!thread – display current thread –!thread 826e8898 – display thread identified by ETHREAD address  To view user stack, must set process context: –.process –.context  !peb [Address]  !teb [Address]

113 113 Process Block (!process) [Screenshot of !process output. Callouts: physical address of the Page Directory; root of the process’s Virtual Address Descriptor tree]

114 114 Process Block Layout Quota Block Exit Status Primary Access Token Process ID Parent Process ID Exception Port Debugger Port Handle Table Process Environment Block Create and Exit Time Next Process Block Image File Name Process Priority Class Memory Management Information EPROCESS Kernel Process Block (or PCB) Image Base Address Win32 Process Block Dispatcher Header Processor Affinity Kernel Time User Time Inswap/Outswap List Entry Process Spin Lock Resident Kernel Stack Count Process Base Priority Default Thread Quantum Process State Thread Seed Disable Boost Flag Process Page Directory KTHREAD...

115 115 Process Block Layout
lkd> dt nt!_EPROCESS
   +0x000 Pcb : _KPROCESS
   +0x06c ProcessLock : _EX_PUSH_LOCK
   +0x070 CreateTime : _LARGE_INTEGER
   +0x078 ExitTime : _LARGE_INTEGER
   +0x080 RundownProtect : _EX_RUNDOWN_REF
   +0x084 UniqueProcessId : Ptr32 Void
   +0x088 ActiveProcessLinks : _LIST_ENTRY
   +0x090 QuotaUsage : [3] Uint4B
   +0x09c QuotaPeak : [3] Uint4B
   +0x0a8 CommitCharge : Uint4B
   +0x0ac PeakVirtualSize : Uint4B
   +0x0b0 VirtualSize : Uint4B.
 NOTE: Add “-r” to recurse through substructures

116 116 Thread Block (!thread)
THREAD 83160f60 Cid 9f.3d Teb: 7ffdc000 Win32Thread: e153d2c8 WAIT: (WrUserRequest) UserMode Non-Alertable
    808e9d60 SynchronizationEvent
Not impersonating
Owning Process 81b44880
WaitTime (seconds)
Context Switch Count 2697 LargeStack
UserTime 0:00:
KernelTime 0:00:
Start Address kernel32!BaseProcessStart (0x77e8f268)
Win32 Start Address 0x020d9d98
Stack Init f Current f7817bb0 Base f Limit f Call 0
Priority 14 BasePriority 8 PriorityDecrement 6 DecrementCount 13
Kernel stack not resident.
ChildEBP RetAddr Args to Child
f7817bb0 8008f ntoskrnl!KiSwapThreadExit
f7817c50 de0119ec ntoskrnl!KeWaitForSingleObject+0x2a0
f7817cc0 de0123f win32k!xxxSleepThread+0x23c
f7817d10 de01f2f win32k!xxxInternalGetMessage+0x504
f7817d80 800bab win32k!NtUserGetMessage+0x58
f7817df0 77d887d ntoskrnl!KiSystemServiceEndAddress+0x4
0012fef user32!GetMessageW+0x30
Callouts: Address of ETHREAD; Thread ID; Address of thread environment block; Objects being waited on; Thread state; Address of system service dispatch table; Priority Information; Actual thread start address; Stack trace; Address of user thread function; Process ID

117 117 Thread Block ETHREAD Create and Exit Time Process ID Thread Start Address Impersonation Information LPC Message Information EPROCESS Access Token KTHREAD Timer Information Pending I/O Requests Total User Time Total Kernel Time Thread Scheduling Information Synchronization Information List of Pending APCs Timer Block and Wait Blocks List of Objects Thread is Waiting On System Service Table TEB KTHREAD Thread Local Storage Array Kernel Stack Information Dispatcher Header Trap Frame

118 118 Thread Block (!strct ethread)
lkd> dt nt!_ETHREAD
   +0x000 Tcb : _KTHREAD
   +0x1c0 CreateTime : _LARGE_INTEGER
   +0x1c0 NestedFaultCount : Pos 0, 2 Bits
   +0x1c0 ApcNeeded : Pos 2, 1 Bit
   +0x1c8 ExitTime : _LARGE_INTEGER
   +0x1c8 LpcReplyChain : _LIST_ENTRY
   +0x1c8 KeyedWaitChain : _LIST_ENTRY
   +0x1d0 ExitStatus : Int4B
   +0x1d0 OfsChain : Ptr32 Void
   +0x1d4 PostBlockList : _LIST_ENTRY
   +0x1dc TerminationPort : Ptr32 _TERMINATION_PORT
   +0x1dc ReaperLink : Ptr32 _ETHREAD

119 119 Process Environment Block  Mapped in user space  Image loader, heap manager, Windows system DLLs use this info  View with !peb or dt nt!_peb Image base address Module list Thread-local storage data Code page data Critical section time-out Number of heaps Heap size info GDI shared handle table OS version no info Image version info Image process affinity mask Process heap

120 120 Thread Environment Block  User mode data structure  Context for image loader and various Windows DLLs  View with !teb or dt nt!_teb Exception list Stack base Stack limit Thread ID Active RPC handle LastError value Count of owned crit. sect. Current locale User32 client info GDI32 info OpenGL info TLS array Subsyst. TIB Fiber info PEB Winsock data

121 121 Flow of CreateProcess() 1. Open image file (.EXE) to be executed inside the process 2. Create Windows NT executive process object 3. Create initial thread (stack, context, Win NT executive thread object) 4. Notify Windows subsystem of new process so that it can set up for the new process and thread 5. Start execution of initial thread (unless CREATE_SUSPENDED was specified) 6. In context of new process/thread: complete initialization of address space (load DLLs) and begin execution of the program

122 122 Open EXE and create section object Create NT process object Create NT thread object Notify Windows subsystem Set up for new process and thread Start execution of the initial thread Return to caller Final process/image initialization Start execution at entry point to image Creating process Windows subsystem New process Stages Windows follows to create a process

123 123 CreateProcess: some notes  CreationFlags: independent bits for priority class -> if more than one is set, NT assigns the lowest-priority class set  Default priority class is Normal unless creator has priority class Idle  If real-time priority class is specified and creator has insufficient privileges: priority class High is used  Caller's current desktop is used if no desktop is specified  Priority classes: Real-time, High, Normal, Idle

124 124 Opening the image to be executed What kind of application is it? –Windows: use .EXE directly –Win16: run NTVDM.EXE –MS-DOS .EXE, .COM, or .PIF: run NTVDM.EXE –MS-DOS .BAT or .CMD: run CMD.EXE –POSIX: run POSIX.EXE –OS/2 1.x: run OS2.EXE

125 125 If executable has no Windows format...  CreateProcess uses Windows “support image”  No way to create non-Windows processes directly –OS2.EXE runs only on Intel systems –Multiple MS-DOS apps may share virtual DOS machine –.BAT or .CMD files are interpreted by CMD.EXE –Win16 apps may share virtual DOS machine (VDM) Flags: CREATE_SEPARATE_WOW_VDM, CREATE_SHARED_WOW_VDM Default: HKLM\System...\Control\WOW\DefaultSeparateVDM –Sharing of VDM only if apps run on same desktop under same security

126 126 If executable has no Windows format...  Debugger may be specified under (run instead of app !!) –\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options

127 127 Process Creation - next Steps...  CreateProcess has opened Windows executable and created a section object to map in proc‘s addr space  Now: create executive process object via NtCreateProcess –Set up EPROCESS block –Create initial process address space (page directory, hyperspace page, working set list) –Create kernel process block (set initial quantum) –Conclude setup of process address space VM, map NTDLL.DLL, map lang support tables, register process: PsActiveProcessHead –Set up Process Environment Block –Complete setup of executive process object

128 128 Further Steps...(contd.)  Create Initial Thread and Its Stack and Context –NtCreateThread; –new thread is suspended until CreateProcess returns  Notify Windows Subsystem about new process KERNEL32.DLL sends message to Windows subsystem including: –Process and thread handles –Entries in creation flags –ID of process‘s creator –Flag describing Windows app (CSRSS may show startup cursor)

129 129 Further Steps...(contd.)  Windows: duplicate handles (inc usage count), set priority class, bookkeeping –allocate CSRSS proc/thread block, init exception port, init debug port –Show cursor (arrow & hourglass), wait 2 sec for GUI call, then wait 5 sec for window

130 130 CreateProcess: final steps Process Initialization in context of new process:  Lower IRQL (dispatch -> APC level)  Enable working set expansion  Queue APC to execute LdrInitializeThunk in NTDLL.DLL  Lower IRQL to 0 – APC fires –Init loader, heap manager, NLS tables, TLS array, critical section structures –Load DLLs, call DLL_PROCESS_ATTACH function

131 131 CreateProcess: Final Steps  Debuggee: all threads are suspended –Send msg to proc‘s debug port Windows creates CREATE_PROCESS_DEBUG_INFO event  Image begins execution in user-mode (return from trap)

132 132 Process Rundown Sequence 1. DLL notification (unless TerminateProcess was used) 2. All handles to executive and kernel objects are closed 3. Terminate any active threads 4. Exit code changes from STILL_ACTIVE to the specified exit code: BOOL GetExitCodeProcess( HANDLE hProcess, LPDWORD lpdwExitCode); 5. Process object & thread objects become signaled 6. When handle and reference counts to process object == 0, process object is deleted

133 Thread Startup (in-context thread init.) Lower IRQL to APC Enable working set expansion Queue user-mode APC to run LdrInitializeThunk And lower IRQL to 0 Perform in-process context initialization (init loader, load DLLs) Process has debugger? Suspend all threads Send new thread message to subsystem Resume all threads Notify debugger process of new process and wait for reply Restore trap frame and dismiss exception Begin execution in user mode LPC send/ receive APC fires yes no User mode Inside CSRSS Kernel mode

134 134 Thread Rundown Sequence 1. DLL notification (unless TerminateThread was used) 2. All handles to Windows User and GDI objects are closed 3. Outstanding I/Os are cancelled 4. Thread stack is deallocated 5. Exit code changes from STILL_ACTIVE to the specified exit code: BOOL GetExitCodeThread( HANDLE hThread, LPDWORD lpdwExitCode); 6. Thread kernel object becomes signaled 7. When handle and reference counts == 0, thread object is deleted 8. If last thread in process, process exits

135 135 Start of Thread Wrapper  All threads in all processes appear to have one of just two different start addresses, regardless of.EXE running –One for thread 0 (start of process wrapper) –the other for all other threads (start of thread wrapper)  These “wrapper” functions are what Process Viewer shows as Thread Start Address for Windows apps

136 136 Start of Thread Wrapper  Start of process and start of thread wrappers have same behavior –Provides default exception handling, access to debugger, etc. –Forces thread exit when thread function returns  To find “real” Windows start address, use TLIST (or Kernel Debugger !thread command)

137 137 Start of Process/Thread Function (conceptual model)
// BaseThreadStart is basically the same
void BaseProcessStart(LPTHREAD_START_ROUTINE lpStartAddr, LPVOID lpvThreadParm)
{
    __try {
        DWORD dwThreadExitCode = lpStartAddr(lpvThreadParm);
        ExitThread(dwThreadExitCode);
    }
    __except (UnhandledExceptionFilter(GetExceptionInformation())) {
        ExitProcess(GetExceptionCode());
    }
}

138 138 Windows Unhandled Exception Filter
if process has a debugger attached
    return EXCEPTION_CONTINUE_SEARCH
if AUTO=0 {                      // run debugger automatically?
    Display message box;         // no - ask user what to do
    if (clicked OK) ExitProcess();
}
// either AUTO=1, or (AUTO=0 and user clicked CANCEL),
// so run debugger
GetProfileString("AEdebug", "debugger", ...);
hEvent = CreateEvent(...);
hProcess = CreateProcess(...);   // Create debugger - pass process id, event to signal
WaitForMultipleObjects([hEvent, hProcess]);
return EXCEPTION_CONTINUE_SEARCH;

139 139 Windows Unhandled Exception Filter  Implication: you can connect a debugger (VC++ or WinDbg) to a running process –C:\> msdev -p pid

140 140 Process Crashes (Windows 2000)  Registry defines behavior for unhandled exceptions –HKLM\Software\Microsoft\Windows NT\CurrentVersion\AeDebug –Debugger=filespec of debugger to run on app crash –Auto 1=run debugger immediately 0=ask user first

141 141 Process Crashes (Windows 2000)  Default on retail system is Auto=1; Debugger=DRWTSN32.EXE  Default with VC++ is Auto=0, Debugger=MSDEV.EXE

142 142  On XP & Server 2003, when an unhandled exception occurs: –System first runs DWWIN.EXE DWWIN creates a process microdump and XML file and offers the option to send the error report –Then runs debugger (default is Drwtsn32.exe) Process Crashes (XP & Server 2003)

143 143 Windows Error Reporting  Configurable with System Properties->Advanced->Error Reporting –HKLM\SOFTWARE\Microsoft\PCHealth\ErrorReporting  Configurable with group policies –HKLM\SOFTWARE\Policies\Microsoft\PCHealth

144 144 Scheduling Criteria  CPU utilization – keep the CPU as busy as possible  Throughput – # of processes/threads that complete their execution per time unit  Turnaround time – amount of time to execute a particular process/thread  Waiting time – amount of time a process/thread has been waiting in the ready queue  Response time – amount of time from when a request is submitted until the first response is produced, not the final output (e.g. the hourglass)

145 145 Windows Scheduler  Priority-driven, preemptive scheduling system  Highest-priority runnable thread always runs  A thread runs for one quantum (its time slice)  No single scheduler – event-based scheduling code spread across the kernel

146 146 Windows Scheduler  Dispatcher routines triggered by the following events: –Thread becomes ready for execution –Thread leaves running state (quantum expires, wait state) –Thread‘s priority changes (system call/NT activity) –Processor affinity of a running thread changes

147 147 Windows Scheduling Principles  32 priority levels  Threads within same priority are scheduled Round-Robin  Non-real-time priorities are adjusted dynamically –Priority elevation as response to certain I/O and dispatch –Quantum stretching to optimize responsiveness  Real-time priorities are assigned statically to threads

148 148 Scheduling  Multiple threads may be ready to run  “Who gets to use the CPU?”  From Windows API point of view:  Processes are given a priority class upon creation –Idle, Normal, High, Realtime –Windows 2000 added “Above normal” and “Below normal”  Threads have a relative priority within the class –Idle, Lowest, Below_Normal, Normal, –Above_Normal, Highest, and Time_Critical

149 149 Windows Scheduling-related APIs: Get/SetPriorityClass Get/SetThreadPriority Get/SetProcessAffinityMask SetThreadAffinityMask SetThreadIdealProcessor Suspend/ResumeThread Scheduling  From the kernel’s view: –Threads have priorities 0 through 31 –Threads are scheduled, not processes –Priority class is not used to make schedule decisions

150 150 Kernel: Thread Priority Levels [Figure: priority scale from 31 down to 0] –16-31: 16 “real-time” levels –1-15: 15 variable levels –0: used by zero page thread –i: used by idle thread(s)

151 151 Windows vs. NT Kernel Priorities

152 152 Windows vs. NT Kernel Priorities  Table shows base priorities –current or dynamic thread priority may be higher if base <15  Many utilities (such as Process Viewer) show the “dynamic priority” of threads rather than the base –Performance Monitor can show both  Drivers can set to any value with KeSetPriorityThread
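The table for the two slides above did not survive in this transcript. As a rough sketch of what it expresses, the function below maps a priority class plus a relative thread priority to a kernel base priority. The numeric values are the commonly documented Windows ones and are an assumption here, since the slide's table is not available; `PrioClass`, `RelPrio`, and `base_priority` are hypothetical names.

```c
#include <assert.h>

/* Priority classes, ordered so they can index the table below. */
typedef enum {
    CLASS_IDLE, CLASS_BELOW_NORMAL, CLASS_NORMAL,
    CLASS_ABOVE_NORMAL, CLASS_HIGH, CLASS_REALTIME
} PrioClass;

/* Relative thread priorities within a class. */
typedef enum {
    REL_IDLE, REL_LOWEST, REL_BELOW_NORMAL, REL_NORMAL,
    REL_ABOVE_NORMAL, REL_HIGHEST, REL_TIME_CRITICAL
} RelPrio;

/* Returns the kernel base priority (1..31) for a class/relative pair.
   Assumed values: class "normal" levels 4, 6, 8, 10, 13, 24. */
int base_priority(PrioClass c, RelPrio r) {
    static const int class_base[] = { 4, 6, 8, 10, 13, 24 };
    assert(c >= CLASS_IDLE && c <= CLASS_REALTIME);
    assert(r >= REL_IDLE && r <= REL_TIME_CRITICAL);

    int realtime = (c == CLASS_REALTIME);
    if (r == REL_IDLE)
        return realtime ? 16 : 1;    /* bottom of the class's range */
    if (r == REL_TIME_CRITICAL)
        return realtime ? 31 : 15;   /* top of the class's range */
    /* Lowest..Highest are offsets of -2..+2 around the class level. */
    return class_base[c] + ((int)r - (int)REL_NORMAL);
}
```

Under these assumptions a Normal-class thread at Normal relative priority lands on base priority 8, and nothing outside the Realtime class can reach 16 or above.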

153 153 Special Thread Priorities  Idle threads -- one per CPU  When no threads want to run, Idle thread “runs” –Not a real priority level - appears to have priority zero, but actually runs “below” priority 0 –Provides CPU idle time accounting (unused clock ticks are charged to the idle thread)  Loop: –Calls HAL to allow for power management –Processes DPC list; Dispatches to a thread if selected  Server 2003: –in certain cases, scans per-CPU ready queues for next thread

154 154 Special Thread Priorities  Zero page thread -- one per NT system –Zeroes pages of memory in anticipation of “demand zero” page faults –Runs at priority zero (lower than any reachable from Windows) –Part of the “System” process (not a complete process)

155 155 Thread Scheduling Priorities vs. Interrupt Request Levels (IRQLs) Passive_Level APC Dispatch/DPC Device 1... Device n Clock Interprocessor Interrupt Power fail High Hardware interrupts IRQLs Software interrupts Thread priorities 0-31

156 156 Single Processor Thread Scheduling  Priority driven, preemptive –32 queues (FIFO lists) of “ready” threads –UP: highest priority thread always runs –MP: one of the highest priority runnable threads will be running somewhere –No attempt to share processor(s) “fairly” among processes, only among threads  Time-sliced, round-robin within a priority level

157 157  Event-driven: –no guaranteed execution period before preemption –When a thread becomes Ready, it either runs immediately or is inserted at the tail of the Ready queue for its current (dynamic) priority Single Processor Thread Scheduling

158 158 Thread Scheduling  No central scheduler! –there is no always-instantiated routine called “scheduler”  The “code that does scheduling” is not a thread  Scheduling routines are simply called whenever events occur that change the Ready state of a thread

159 159 Thread Scheduling  Things that cause scheduling events include: –interval timer interrupts (for quantum end) –interval timer interrupts (for timed wait completion) –other hardware interrupts (for I/O wait completion) –one thread changes the state of a waitable object upon which other thread(s) are waiting –a thread waits on one or more dispatcher objects –a thread priority is changed

160 160 Thread Scheduling  Based on doubly-linked lists (queues) of Ready threads –Nothing that takes “order-n time” for n threads

161 161 Scheduling Data Structures Process thread Process thread Default base prio Default proc affinity Default quantum 31 0 Ready summaryIdle summary Base priority Current priority Processor affinity Quantum Bitmask for non-empty ready queues Bitmask for idle CPUs
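The ready queues and the ready summary bitmask can be sketched in portable C as follows. This is a simplified model: it uses singly linked lists with a tail pointer rather than the doubly linked lists the slides mention, and `Dispatcher`, `ReadyThread`, and the `ready_*` functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_PRIORITIES 32

typedef struct ReadyThread {
    int id;
    int priority;               /* current (dynamic) priority, 0..31 */
    struct ReadyThread *next;
} ReadyThread;

typedef struct {
    ReadyThread *head[NUM_PRIORITIES];
    ReadyThread *tail[NUM_PRIORITIES];
    uint32_t ready_summary;     /* bit p set <=> queue p is non-empty */
} Dispatcher;

void ready_init(Dispatcher *d) {
    for (int p = 0; p < NUM_PRIORITIES; p++)
        d->head[p] = d->tail[p] = NULL;
    d->ready_summary = 0;
}

/* Insert at the tail (normal case) or at the head (preempted thread). */
void ready_insert(Dispatcher *d, ReadyThread *t, int at_head) {
    assert(t->priority >= 0 && t->priority < NUM_PRIORITIES);
    int p = t->priority;
    t->next = NULL;
    if (d->head[p] == NULL) {
        d->head[p] = d->tail[p] = t;
    } else if (at_head) {
        t->next = d->head[p];
        d->head[p] = t;
    } else {
        d->tail[p]->next = t;
        d->tail[p] = t;
    }
    d->ready_summary |= (uint32_t)1 << p;
}

/* Remove and return the first thread of the highest non-empty queue.
   The cost depends on the 32 priority levels, not on the thread count. */
ReadyThread *ready_pick_highest(Dispatcher *d) {
    for (int p = NUM_PRIORITIES - 1; p >= 0; p--) {
        if (d->ready_summary & ((uint32_t)1 << p)) {
            ReadyThread *t = d->head[p];
            d->head[p] = t->next;
            if (d->head[p] == NULL) {
                d->tail[p] = NULL;
                d->ready_summary &= ~((uint32_t)1 << p);
            }
            t->next = NULL;
            return t;
        }
    }
    return NULL;  /* nothing ready: the idle thread would run */
}
```

The summary bitmask is what lets the dispatcher skip empty queues without walking them, which matches the "nothing that takes order-n time" goal.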

162 162 Scheduling Scenarios  Preemption –A thread becomes Ready at a higher priority than the running thread –Lower-priority Running thread is preempted –Preempted thread goes back to head of its Ready queue –Action: pick lowest priority thread to preempt  Voluntary switch –Waiting on a dispatcher object –Termination –Explicit lowering of priority –Action: scan for next Ready thread (starting at your priority and going down)

163 163 Scheduling Scenarios  Running thread experiences quantum end –Priority is decremented unless already at base priority –Thread goes to tail of ready queue for its new priority –May continue running if no equal or higher-priority threads are Ready action: pick next thread at same priority level

164 Scheduling Scenarios: Preemption [Figure labels: Running, Ready, from Wait state]  Preemption is strictly event-driven –does not wait for the next clock tick –no guaranteed execution period before preemption –threads in kernel mode may be preempted (unless they raise IRQL to >= 2)

165 Scheduling Scenarios: Ready after Wait [Figure labels: Running, Ready, from Wait state]  If newly-ready thread is no higher than running thread… –it is put at tail of ready queue for its current priority –If priority >=14 quantum is reset (t.b.d.) –If priority <14 and you’re about to be boosted and didn’t already have a boost, quantum is set to process quantum - 1

166 166 Scheduling Scenarios: Voluntary Switch [Figure labels: Running, Ready, to Waiting state]  When the running thread gives up the CPU… –Schedule the thread at head of next non-empty “ready” queue

167 167 Scheduling Scenarios: Quantum End (“time-slicing”)  When the running thread exhausts its CPU quantum, it goes to the end of its ready queue  Applies to both real-time and dynamic priority threads, user and kernel mode –Quantums can be disabled for a thread by a kernel function  Default quantum on Professional is 2 clock ticks, 12 on Server –standard clock tick is 10 msec; –might be 15 msec on some MP Pentium systems  if no other ready threads at that priority, same thread continues running (just gets new quantum)  if running at boosted priority, priority decays by one at quantum end (described later)

168 168 Scheduling Scenarios: Quantum End (“time-slicing”) Running Ready

169 169 Basic Thread Scheduling States Ready (1)Running (2) Waiting (5) voluntary switch preemption, quantum end

170 170 Watching Scheduling  CPUSTRES.EXE - Creating a Test Case  Run: cpustres.exe (Resource Kit)

171 171 Watching the Scheduler Performance Monitor - Threads Object Screen snapshot from: Programs | Admin. Tools | Performance Monitor select “Add to Chart”, and Object: Thread. use Ctrl-leftClick to select multiple items in a selection box

172 172 Watching the Scheduler Performance Monitor - Options | Chart Screen snapshot from: Performance Monitor Options menu | Chart command Set chart maximum vertical scale to 16 Set update interval to 0.1 seconds or less

173 173 Watching the Scheduler Performance Monitor Screen snapshot from: PerfMon main window, setup from previous slide Thread states are indicated by numbers (see thread state transition diagram on previous slide, or Perfmon Explain display for Thread State counter) 5 = waiting 2 = running 1 = ready

174 174 Priority Adjustments  Dynamic priority adjustments (boost and decay) are applied to threads in “dynamic” classes –Threads with base priorities 1-15 (technically, 1 through 14) –Disable if desired with SetThreadPriorityBoost or SetProcessPriorityBoost  Five types: –I/O completion –Wait completion on events or semaphores –When threads in the foreground process complete a wait –When GUI threads wake up for windows input –For CPU starvation avoidance

175 175 Priority Adjustments  No automatic adjustments in real-time class (16 or above)  Real time here really means “system won’t change the relative priorities of your real-time threads”  Hence, scheduling is predictable with respect to other “real- time” threads (but not for absolute latency)

176 176 To favor I/O intense threads:  After an I/O: specified by device driver –IoCompleteRequest( Irp, PriorityBoost ) Common boost values (see NTDDK.H) 1: disk, CD-ROM, parallel, Video 2: serial, network, named pipe, mailslot 6: keyboard or mouse 8: sound Priority Boosting

177 177  Other cases discussed in WIN Scheduling Internals Section –After a wait on executive event or semaphore –After any wait on a dispatcher object by a thread in the foreground process –GUI threads that wake up to process windowing input (e.g. windows messages) get a boost of 2 Priority Boosting

178 178 Thread Priority Boost and Decay  Behavior of these boosts: –Applied to thread’s base priority –Will not take you above priority 15 –After a boost, you get one quantum; the priority then decays 1 level and the thread runs another quantum, and so on until it is back at base priority

179 179 Thread Priority Boost and Decay [Figure: priority plotted against time. Labels: base priority; Run, Wait, Run; boost upon wait complete; priority decay at quantum end; preempt (before quantum end); round-robin at base priority; quantum]
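The boost-and-decay behavior can be sketched as a tiny state machine. This is a minimal model under the rules stated on the slides (boost relative to base priority, cap at 15, one level of decay per quantum end, no adjustment in the real-time range); `PThread`, `apply_boost`, and `on_quantum_end` are hypothetical names.

```c
#include <assert.h>

#define MAX_DYNAMIC_PRIORITY 15
#define FIRST_REALTIME       16

typedef struct {
    int base;     /* base priority, fixed by class and relative priority */
    int current;  /* current (dynamic) priority */
} PThread;

/* Apply a boost (e.g. the value a driver passes at I/O completion). */
void apply_boost(PThread *t, int boost) {
    assert(boost >= 0);
    if (t->base >= FIRST_REALTIME)
        return;                          /* real-time: never adjusted */
    int boosted = t->base + boost;       /* boost is relative to base */
    if (boosted > MAX_DYNAMIC_PRIORITY)
        boosted = MAX_DYNAMIC_PRIORITY;  /* never above 15 */
    if (boosted > t->current)
        t->current = boosted;            /* never lowers the priority */
}

/* At each quantum end a boosted thread decays by one level. */
void on_quantum_end(PThread *t) {
    if (t->current > t->base)
        t->current--;
}
```

With a base priority of 8 and a keyboard boost of 6, the thread runs one quantum at 14, then one at 13, and so on until it is back at 8.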

180 180 Thread Scheduling States (2000, XP) Ready (1)Running (2) Waiting (5) Ready = thread eligible to be scheduled to run Standby = thread is selected to run on CPU voluntary switch preemption, quantum end Init (0) Terminate (4) Transition (6) wait resolved after kernel stack made pageable Standby (3) preempt

181 181 Other Thread States  Transition –Thread was in a wait entered from user mode for 12 seconds or more –System was short on physical memory –Balance set manager (t.b.d.) marked the thread’s kernel stack as pageable (preparatory to “outswapping” the thread’s process) –Later, the thread’s wait was satisfied, but... –...Thread can’t become Ready until the system allocates a nonpageable kernel stack; it is in the “transition” state until then  Initiate –Thread is “under construction” and can’t run yet  Standby –One processor has selected a thread for execution on another processor  Terminate –Thread has executed its last code, but can’t be deleted until all handles and references to it are closed (object manager)

182 182 Scheduling Scenarios: Quantum Details  Quantum internally stored as “3 * number of clock ticks” –Default quantum is 6 on Professional, 36 on Server  Thread->Quantum field is decremented by 3 on every clock tick  Process and thread objects have a Quantum field –Process quantum is simply used to initialize thread quantum for all threads in the process  Quantum decremented by 1 when you come out of a wait –So that threads that get boosted after I/O completion won't keep running and never experience quantum end –Prevents I/O bound threads from getting unfair preference over CPU bound threads

183 183 Scheduling Scenarios: Quantum Details  When Thread->Quantum reaches zero (or less than zero): –you’ve experienced quantum end –Thread->Quantum = Process->Quantum;// restore quantum –for dynamic-priority threads, this is the only thing that restores the quantum –for real-time threads, quantum is also restored upon preemption  Interval timer interrupts when previous IRQL >= 2: –are not charged to the current thread’s “privileged” time –but do cause the thread “remaining quantum” counter to be decremented
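The quantum bookkeeping from these two slides can be modeled in a few lines. This is a sketch of the accounting only (3 units per clock tick, 1 unit per wait, reset from the process value at quantum end); `QThread`, `on_clock_tick`, and `on_wait_exit` are hypothetical names, and the special cases for real-time threads are left out.

```c
#include <assert.h>

#define UNITS_PER_TICK 3   /* charged on every clock tick */
#define UNITS_PER_WAIT 1   /* charged when a thread comes out of a wait */

typedef struct {
    int quantum;           /* remaining quantum units */
    int process_quantum;   /* reset value, e.g. 6 (Professional) or 36 (Server) */
} QThread;

/* Charge units; returns 1 if this caused quantum end (and a reset). */
static int charge(QThread *t, int units) {
    assert(units > 0);
    t->quantum -= units;
    if (t->quantum <= 0) {
        t->quantum = t->process_quantum;   /* restore quantum */
        return 1;
    }
    return 0;
}

int on_clock_tick(QThread *t) { return charge(t, UNITS_PER_TICK); }
int on_wait_exit(QThread *t)  { return charge(t, UNITS_PER_WAIT); }
```

Storing the quantum in thirds of a tick is what allows a partial charge per wait: a thread that keeps waiting and waking still reaches quantum end eventually.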

184 184 Quantum Stretching  Favoring foreground applications  If normal-priority process owns the foreground window, its threads may be given longer quantum –Set by Control Panel / System applet / Performance tab –Stored in…\System\CurrentControlSet\Control\PriorityControl Win32PrioritySeparation = 0, 1, or 2 –New behavior with NT 4.0; formerly implemented via priority shift

185 185 Quantum Stretching Screen snapshot from: Control Panel | System | Performance tab

186 186 Quantum Stretching  Resulting quantum: –“Maximum” = 6 ticks –(middle) = 4 ticks –“None” = 2 ticks  Quantum stretching does not happen on Server –Quantum on Server is always 12 ticks

187 187  As of Windows 2000, can choose short or long quantums (e.g. for Terminal Services) –NT Server 4.0 was always the same, regardless of slider bar Screen snapshot from: Control Panel | System | Advanced tab | Performance Windows 2000: XP: Quantum Selection

188 188 Quantum Control  Finer grained quantum control can be achieved by modifying –HKLM\System\CurrentControlSet\Control\PriorityControl\Win32PrioritySeparation –6-bit value made up of three 2-bit fields: Short vs. Long (bits 4-5), Variable vs. Fixed (bits 2-3), Quantum Boost (bits 0-1)

189 189 Quantum Control  Short vs. Long: 0, 3 = default (short for Pro, long for Server); 1 = long; 2 = short  Variable vs. Fixed: 0, 3 = default (yes for Pro, no for Server); 1 = yes; 2 = no  Quantum Boost: 0 = fixed (overrides above setting); 1 = double quantum of foreground threads; 2, 3 = triple quantum of foreground threads
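Decoding the registry value can be sketched as below. The field meanings follow the slide; the bit positions (Short vs. Long in bits 4-5, Variable vs. Fixed in bits 2-3, Quantum Boost in bits 0-1) are an assumption of this sketch, and `QuantumPolicy` and `decode_priority_separation` are hypothetical names.

```c
#include <assert.h>

typedef struct {
    int long_quantum;           /* 1 = long quantums, 0 = short */
    int variable;               /* 1 = foreground quantum may be stretched */
    int foreground_multiplier;  /* 1, 2 or 3 times the normal quantum */
} QuantumPolicy;

/* Decode a 6-bit Win32PrioritySeparation value.
   is_server selects the defaults for field values 0 and 3. */
QuantumPolicy decode_priority_separation(unsigned value, int is_server) {
    assert(value < 64);
    unsigned length = (value >> 4) & 3u;  /* assumed: bits 4-5 */
    unsigned var    = (value >> 2) & 3u;  /* assumed: bits 2-3 */
    unsigned boost  = value & 3u;         /* assumed: bits 0-1 */

    QuantumPolicy p;
    p.long_quantum = (length == 1) ? 1 : (length == 2) ? 0 : (is_server != 0);
    p.variable     = (var == 1) ? 1 : (var == 2) ? 0 : (is_server == 0);
    if (boost == 0)
        p.variable = 0;                   /* "fixed" overrides the field above */
    p.foreground_multiplier =
        !p.variable ? 1 : (boost == 1) ? 2 : 3;
    return p;
}
```

For example, the value 0x26 decodes (under these assumptions) to short, variable quantums with foreground threads getting three times the normal quantum.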

190 190 Controlling Quantum with Jobs  If a process is a member of a job, quantum can be adjusted by setting the “Scheduling Class” –Only applies if process is higher than Idle priority class –Only applies if system running with fixed quantums (the default on Servers)  Values are 0-9 –5 is default [Table: Scheduling class vs. Quantum units]

191 191 Common boost values (see NTDDK.H) 1: disk, CD-ROM, parallel, Video 2: serial, network, named pipe, mailslot 6: keyboard or mouse 8: sound  After an I/O: specified by device driver –IoCompleteRequest( Irp, PriorityBoost )  After a wait on executive event or semaphore –Boost value of 1 is used for these objects –Server 2003: for critical sections and pushlocks: Waiting thread is boosted to 1 more than setting thread’s priority (max boost is to 13) Setting thread loses boost (lock convoy issue) Priority Boosting

192 192  After any wait on a dispatcher object by a thread in the foreground process: –Boost value of 2 XP/2003: boost is lost after one full quantum –Goal: improve responsiveness of interactive apps  GUI threads that wake up to process windowing input (e.g. windows messages) get a boost of 2 –This is added to the current, not base priority –Goal: improve responsiveness of interactive apps Priority Boosting

193 193 Lab: Foreground Priority Boosts  See Book “EXPERIMENT: Watching Foreground Priority Boosts and Decays”, p.351  See Book “EXPERIMENT: Watching Priority Boosts on GUI Threads”, p.353

194 194 CPU Starvation Avoidance  Balance Set Manager (sys thread) looks for starved threads –This is a thread, running at priority 16 –Wakes up once per second and examines Ready queues –Looks for threads that have been Ready for 300 clock ticks (approximately 3 seconds on a 10 ms clock) –Attempts to resolve “priority inversions” (high priority thread (12 in diagram) waits on something locked by a lower thread (4), which can’t run because of a middle priority CPU-bound thread (7)), but not deterministically (no priority inheritance) [Figure labels: Wait, Run, Ready]

195 195  Priority is boosted to 15 (14 prior to NT 4 SP3) –Quantum is doubled on Win2000/XP and set to 4 on 2003 –At quantum end, returns to previous priority (no gradual decay) and normal quantum  Scans up to 16 Ready threads per priority level each pass  Boosts up to 10 Ready threads per pass  Like all priority boosts, does not apply in the real-time range (priority 16 and above) CPU Starvation Avoidance
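One pass of the starvation scan can be sketched as follows. The limits (300 ticks, 16 threads scanned per level, 10 boosts per pass, boost to 15, doubled quantum as on Windows 2000/XP) come from the slides; `SThread`, `starvation_pass`, and `starvation_boost_end` are hypothetical names, and the ready threads are given as a flat array rather than real ready queues.

```c
#include <assert.h>

#define STARVATION_TICKS    300
#define MAX_SCAN_PER_LEVEL  16
#define MAX_BOOSTS_PER_PASS 10
#define BOOST_PRIORITY      15

typedef struct {
    int priority;          /* current priority */
    int saved_priority;    /* priority to return to after the boost */
    int quantum;           /* quantum units */
    int ready_since_tick;  /* tick at which the thread became Ready */
} SThread;

/* One pass over the Ready threads; returns how many were boosted. */
int starvation_pass(SThread *ready, int n, int now_tick) {
    int scanned[BOOST_PRIORITY + 1] = { 0 };
    int boosts = 0;
    for (int i = 0; i < n && boosts < MAX_BOOSTS_PER_PASS; i++) {
        SThread *t = &ready[i];
        if (t->priority < 1 || t->priority > BOOST_PRIORITY)
            continue;                      /* real-time range is never boosted */
        if (scanned[t->priority]++ >= MAX_SCAN_PER_LEVEL)
            continue;                      /* per-level scan limit reached */
        if (now_tick - t->ready_since_tick < STARVATION_TICKS)
            continue;                      /* not starved yet */
        t->saved_priority = t->priority;
        t->priority = BOOST_PRIORITY;
        t->quantum *= 2;                   /* Windows 2000/XP behavior */
        boosts++;
    }
    return boosts;
}

/* At quantum end the thread drops straight back (no gradual decay). */
void starvation_boost_end(SThread *t, int normal_quantum) {
    assert(normal_quantum > 0);
    t->priority = t->saved_priority;
    t->quantum = normal_quantum;
}
```

Because the boost is a one-shot jump to 15 followed by an immediate return, it gives a starved lock holder a chance to run without implementing priority inheritance.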

196 196 Lab: CPU Starvation Resolution  See Book EXPERIMENT: Watching Priority Boosts for CPU Starvation, p.355 –CpuStres with two compute-bound threads (“maximum” activity level) –One is at lower priority than the other  See Book EXPERIMENT: “Listening to Priority Boosting”, p.357

197 197 Multiprocessor Scheduling  Threads can run on any CPU, unless specified otherwise –Tries to keep threads on same CPU (“soft affinity”) –Setting of which CPUs a thread will run on is called “hard affinity”  Fully distributed (no “master processor”) –Any processor can interrupt another processor to schedule a thread  Scheduling database: –Pre-Windows Server 2003: single system-wide list of ready queues –Windows Server 2003: per-CPU ready queues

198 198 Hard Affinity  Affinity is a bit mask where each bit corresponds to a CPU number –Hard Affinity specifies where a thread is permitted to run Defaults to all CPUs –Thread affinity mask must be subset of process affinity mask, which in turn must be a subset of the active processor mask
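The subset rule for affinity masks reduces to two bit tests. A minimal sketch, with hypothetical helper names and 32-bit masks assumed:

```c
#include <assert.h>
#include <stdint.h>

/* True if every bit set in inner is also set in outer. */
int mask_is_subset(uint32_t inner, uint32_t outer) {
    return (inner & ~outer) == 0;
}

/* Thread mask must be a non-empty subset of the process mask, which in
   turn must be a subset of the active processor mask. */
int affinity_is_valid(uint32_t thread_mask, uint32_t process_mask,
                      uint32_t active_mask) {
    return thread_mask != 0
        && mask_is_subset(thread_mask, process_mask)
        && mask_is_subset(process_mask, active_mask);
}

/* Mask with only the bit for the given CPU number set. */
uint32_t cpu_bit(int cpu) {
    assert(cpu >= 0 && cpu < 32);
    return (uint32_t)1 << cpu;
}
```

On a 4-CPU system (active mask 0xF), a process restricted to CPUs 0 and 1 (mask 0x3) can host a thread pinned to CPU 1 (mask 0x2) but not one pinned to CPU 2 (mask 0x4).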

199 199 Hard Affinity  Functions to change: –SetThreadAffinityMask, SetProcessAffinityMask, SetInformationJobObject  Tools to change: –Task Manager or Process Explorer Right click on process and choose “Set Affinity” –Psexec -a

200 200 Hard Affinity  Can also set an image affinity mask –See “Imagecfg” tool in Windows 2000 Server Resource Kit Supplement 1 E.g. Imagecfg –a 2 xyz.exe will run xyz on CPU 1  Can also set “uniprocessor only”: sets affinity mask to one processor –Imagecfg –u xyz.exe –System chooses 1 CPU for the process Rotates round robin at each process creation –Useful as temporary workaround for multithreaded synchronization bugs that appear on MP systems

201 201 Hard Affinity  NOTE: Setting hard affinity can lead to threads’ getting less CPU time than they normally would –More applicable to large MP systems running dedicated server apps –Also, OS may in some cases run your thread on CPUs other than your hard affinity setting (flushing DPCs, setting system time) Thread “system affinity” vs “user affinity”

202 202  Every thread has an “ideal processor” –System selects ideal processor for first thread in process (round robin across CPUs) –Next thread gets next CPU relative to the process seed –Can override with: SetThreadIdealProcessor ( HANDLE hThread,// handle to the thread to be changed DWORD dwIdealProcessor);// processor number Soft Processor Affinity

203 203  Hard affinity changes update ideal processor settings  Used in selecting where a thread runs next  For Hyperthreaded systems, new Windows API in Server 2003 to allow apps to optimize –GetLogicalProcessorInformation  For NUMA systems, new APIs to allow apps to optimize: –Use GetProcessAffinityMask to get list of processors Then GetNumaProcessorNode to get node # for each CPU –Or call GetNumaHighestNodeNumber and then GetNumaNodeProcessorMask to get processor #s for each node Soft Processor Affinity

204 Windows 2000/XP Dispatcher Database
[Figure: dispatcher database diagram. Labels: Process, Thread 1 to Thread 4, Ready Queues (0 to 31), Ready Summary (bits 31 to 0), Idle Summary Mask (bits 31 to 0), Active Processor Mask (bits 31 to 0), “MP Systems Only”.]

205 Choosing a CPU for a Ready Thread (Windows 2000)
 When a thread becomes ready to run (e.g., its wait completes, or it is just beginning execution), the scheduler needs to choose a processor for it to run on
 First, it checks whether any processors in the thread’s hard affinity mask are idle:
–If its “ideal processor” is idle, it runs there
–Else, if the previous processor it ran on is idle, it runs there
–Else, if the current processor is idle, it runs there
–Else it picks the highest-numbered idle processor in the thread’s affinity mask

206 Choosing a CPU for a Ready Thread (Windows 2000)
 If no processors are idle:
–If the ideal processor is in the thread’s affinity mask, it selects that
–Else, if the last processor is in the thread’s affinity mask, it selects that
–Else it picks the highest-numbered processor in the thread’s affinity mask
 Finally, it compares the priority of the new thread with the priority of the thread running on the selected processor (if any) to determine whether to perform a preemption
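
The Windows 2000 processor-selection rules can be written as a short function. This is a sketch of the slide's rules only, not kernel code: the function and parameter names are hypothetical, masks are plain integers, and `should_preempt` assumes that only a strictly higher priority preempts.

```python
def highest_bit(mask):
    """Number of the highest set bit in a non-zero mask."""
    return mask.bit_length() - 1


def choose_processor(affinity, idle, ideal, last, current):
    """Return (cpu, was_idle) for a thread that just became ready.

    `affinity` and `idle` are bit masks; `ideal`, `last` and `current`
    are CPU numbers. `affinity` is assumed to be non-zero.
    """
    idle_allowed = idle & affinity
    if idle_allowed:
        # Prefer ideal, then previous, then current processor if idle.
        for cpu in (ideal, last, current):
            if (idle_allowed >> cpu) & 1:
                return cpu, True
        return highest_bit(idle_allowed), True
    # No idle processor is permitted: fall back to ideal, last, highest.
    for cpu in (ideal, last):
        if (affinity >> cpu) & 1:
            return cpu, False
    return highest_bit(affinity), False


def should_preempt(new_priority, running_priority):
    """Assumption: preempt only if the new thread's priority is strictly higher."""
    return running_priority is None or new_priority > running_priority


# CPUs 1 and 2 are idle; the ideal CPU 0 is busy, the last CPU 2 is idle.
print(choose_processor(affinity=0b1111, idle=0b0110, ideal=0, last=2, current=3))
```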

207 Selecting a Thread to Run on a CPU (Windows 2000)
 System needs to choose a thread to run on a specific CPU:
–At quantum end
–When a thread enters a wait state
–When a thread removes its current processor from its hard affinity mask
–When a thread exits
 Starting with the first thread in the highest-priority non-empty ready queue, it scans the queue for the first thread that has the current processor in its hard affinity mask and:
–Ran last on the current processor, or
–Has its ideal processor equal to the current processor, or
–Has been in its ready queue for 3 or more clock ticks, or
–Has a priority >= 24

208 Selecting a Thread to Run on a CPU (Windows 2000)
 If it cannot find such a candidate, it selects the highest-priority thread that can run on the current CPU (i.e., whose hard affinity includes the current CPU)
–Note: this may mean going to a lower-priority ready queue to find a candidate
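
The thread-selection scan and its fallback can be sketched as follows. This is one reading of the slides, with hypothetical names throughout: the preferred-candidate scan covers only the highest-priority non-empty queue, and the fallback walks the queues in descending priority. The sketch returns the chosen thread without removing it from its queue.

```python
from dataclasses import dataclass


@dataclass
class ReadyThread:
    name: str
    affinity: int      # hard affinity bit mask
    last_cpu: int      # processor the thread last ran on
    ideal_cpu: int
    ready_ticks: int   # clock ticks spent in the ready queue


def select_thread(ready_queues, cpu):
    """Pick a thread for `cpu`; `ready_queues` maps priority -> FIFO list."""

    def can_run(thread):
        return bool((thread.affinity >> cpu) & 1)

    priorities = sorted((p for p, q in ready_queues.items() if q), reverse=True)
    if not priorities:
        return None

    # Scan the highest-priority non-empty queue for a preferred candidate.
    top = priorities[0]
    for thread in ready_queues[top]:
        if can_run(thread) and (
            thread.last_cpu == cpu
            or thread.ideal_cpu == cpu
            or thread.ready_ticks >= 3
            or top >= 24
        ):
            return thread

    # Fallback: highest-priority thread whose affinity includes this CPU,
    # which may come from a lower-priority queue.
    for priority in priorities:
        for thread in ready_queues[priority]:
            if can_run(thread):
                return thread
    return None


t1 = ReadyThread("T1", affinity=0b10, last_cpu=1, ideal_cpu=1, ready_ticks=5)
t2 = ReadyThread("T2", affinity=0b11, last_cpu=1, ideal_cpu=1, ready_ticks=1)
t3 = ReadyThread("T3", affinity=0b11, last_cpu=1, ideal_cpu=0, ready_ticks=0)
print(select_thread({8: [t1, t2, t3]}, cpu=0).name)
```

In this example T1 cannot run on CPU 0, T2 can but meets none of the four conditions, and T3 has CPU 0 as its ideal processor.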

209 Windows Server 2003 Dispatcher Database
[Figure: dispatcher database diagram. Labels: Process, Thread 1 to Thread 4, per-CPU Ready Queues (0 to 31) and Ready Summary (bits 31 to 0) for CPU 0 and CPU 1, Deferred Ready Queue.]

210 Server 2003 Enhancements
 Threads always go into the ready queue of their ideal processor
 Instead of locking the dispatcher database to look for a candidate to run, the per-CPU ready queue is checked first (after first acquiring the PRCB spinlock)
–If a thread has been selected to run on the CPU, the context swap is performed
–Else a scan of the other CPUs’ ready queues begins, looking for a thread to run
–This scan is done OUTSIDE the dispatcher lock; it just acquires the CPU’s PRCB lock
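
The locking idea, one lock per CPU instead of one global lock, can be illustrated in user-space Python. This is only an analogy with hypothetical names: `threading.Lock` stands in for the PRCB spinlock, thread names stand in for thread objects, and hard affinity is ignored during the scan of other CPUs to keep the sketch short.

```python
import threading
from collections import deque


class Prcb:
    """Per-processor block: one lock guarding that CPU's ready queues."""

    def __init__(self):
        self.lock = threading.Lock()
        self.ready = {}  # priority -> deque of thread names

    def enqueue(self, name, priority):
        with self.lock:
            self.ready.setdefault(priority, deque()).append(name)

    def pop_highest(self):
        """Remove and return the highest-priority ready thread, if any."""
        with self.lock:
            for priority in sorted(self.ready, reverse=True):
                if self.ready[priority]:
                    return self.ready[priority].popleft()
        return None


def pick_next(prcbs, cpu):
    """Check this CPU's queues first, then scan the other CPUs' queues.

    No global lock is taken; only one per-CPU lock is held at a time.
    """
    thread = prcbs[cpu].pop_highest()
    if thread is not None:
        return thread
    for other, prcb in enumerate(prcbs):
        if other != cpu:
            thread = prcb.pop_highest()
            if thread is not None:
                return thread
    return None


prcbs = [Prcb(), Prcb()]
prcbs[0].enqueue("A", priority=8)
prcbs[0].enqueue("B", priority=10)
print(pick_next(prcbs, cpu=0))  # taken from CPU 0's own queue
print(pick_next(prcbs, cpu=1))  # CPU 1 is empty, so it scans CPU 0
```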

211 Server 2003 Enhancements
 The dispatcher lock is still acquired to wait or unwait a thread and/or change the state of a dispatcher object
 Bottom line: the dispatcher lock is now held for a MUCH shorter time

212 Thread Scheduling States (Server 2003)
[Figure: state diagram. States: Init (0), Ready (1), Running (2), Standby (3), Terminate (4), Waiting (5), Transition (6), Deferred Ready (7). Transition labels: voluntary switch; preemption, quantum end; preempt.]

213 Server 2003 Enhancements
 Idle processor selection is further refined:
–NUMA system: if there are idle CPUs in the node containing the thread’s ideal processor, reduce to that set
–Hyperthreaded system: if one of the idle processors is a physical processor with all logical processors idle, reduce to that set
 Then try to eliminate idle CPUs that are sleeping
 If the thread ran last on a member of the set, pick that CPU
–Else pick the lowest-numbered CPU in the remaining set
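
The refinement steps read naturally as a chain of set filters, each applied only if it leaves at least one candidate. The sketch below follows that reading. All names are hypothetical: `node_of` and `core_of` are invented lookup tables mapping a logical CPU to its NUMA node and physical processor, and the sets hold CPU numbers rather than bit masks.

```python
def pick_idle_processor(idle, allowed, ideal, last, node_of, core_of,
                        sleeping=frozenset()):
    """Pick an idle CPU for a ready thread, or None if none is permitted.

    idle: all idle CPUs; allowed: CPUs in the thread's hard affinity.
    """
    candidates = set(idle) & set(allowed)
    if not candidates:
        return None

    # NUMA: prefer idle CPUs in the node of the ideal processor.
    same_node = {c for c in candidates if node_of[c] == node_of[ideal]}
    if same_node:
        candidates = same_node

    # Hyperthreading: prefer CPUs whose physical processor is fully idle.
    fully_idle = {
        c for c in candidates
        if all(s in idle for s in core_of if core_of[s] == core_of[c])
    }
    if fully_idle:
        candidates = fully_idle

    # Try to eliminate sleeping CPUs, unless that would leave nothing.
    awake = candidates - set(sleeping)
    if awake:
        candidates = awake

    # Prefer the CPU the thread last ran on, else the lowest-numbered one.
    return last if last in candidates else min(candidates)


# Invented topology: two nodes, each one physical processor with two logical CPUs.
node_of = {0: 0, 1: 0, 2: 1, 3: 1}
core_of = {0: 0, 1: 0, 2: 1, 3: 1}
print(pick_idle_processor({1, 2, 3}, {0, 1, 2, 3}, ideal=2, last=3,
                          node_of=node_of, core_of=core_of))
```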

214 Affinity Collisions
[Figure: two processors, CPU 1 and CPU 0, and three threads. Thread A: current priority 4, affinity mask 10. Thread B: current priority 8, affinity mask 11. Thread C: current priority 6, affinity mask 01.]
 Highest-priority n threads may not be running if thread affinity interferes
 NT guarantees the highest-priority thread will be Running
–But lower-priority n-1 Ready threads may not be…
–because the scheduler will not move running threads among CPUs
 Example: Threads became Ready in order A, B, C
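
The example can be traced with a small simulation. It is a simplified sketch with hypothetical names, not the real scheduler: each arriving thread takes the highest-numbered idle CPU its mask permits, otherwise it is compared against the thread running on the highest-numbered permitted CPU, and running threads are never migrated.

```python
def place(threads, num_cpus=2):
    """Place threads in arrival order; returns (running, ready).

    threads: list of (name, priority, affinity_mask).
    running: cpu -> (name, priority); ready: names left waiting.
    """
    running = {}
    ready = []
    for name, priority, mask in threads:
        allowed = [c for c in range(num_cpus) if (mask >> c) & 1]
        idle = [c for c in allowed if c not in running]
        if idle:
            running[max(idle)] = (name, priority)
            continue
        target = max(allowed)
        victim_name, victim_priority = running[target]
        if priority > victim_priority:
            running[target] = (name, priority)
            ready.append(victim_name)  # the preempted thread becomes ready
        else:
            ready.append(name)
    return running, ready


arrivals = [("A", 4, 0b10), ("B", 8, 0b11), ("C", 6, 0b01)]
running, ready = place(arrivals)
print(running)
print(ready)
```

Under these assumptions A takes CPU 1, B takes the idle CPU 0, and C can run only on CPU 0, where the higher-priority B is running. C therefore stays Ready even though the lower-priority A is Running on CPU 1.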

215 Thoughts Change Life 意念改变生活

