Presentation is loading. Please wait.

Presentation is loading. Please wait.

Windows Operating System Internals - by David A. Solomon and Mark E. Russinovich with Andreas Polze Unit OS6: Device Management 6.3. Windows I/O Processing.

Similar presentations

Presentation on theme: "Windows Operating System Internals - by David A. Solomon and Mark E. Russinovich with Andreas Polze Unit OS6: Device Management 6.3. Windows I/O Processing."— Presentation transcript:

1 Windows Operating System Internals - by David A. Solomon and Mark E. Russinovich with Andreas Polze Unit OS6: Device Management 6.3. Windows I/O Processing

2 2 Copyright Notice © 2000-2005 David A. Solomon and Mark Russinovich These materials are part of the Windows Operating System Internals Curriculum Development Kit, developed by David A. Solomon and Mark E. Russinovich with Andreas Polze Microsoft has licensed these materials from David Solomon Expert Seminars, Inc. for distribution to academic organizations solely for use in academic environments (and not for commercial use)

3 3 Roadmap for Section 6.3 Driver and Device Objects I/O Request Packets (IRP) Processing Driver Layering and Filtering Plug-and-Play (PnP) and Power Manager Operation Monitoring I/O Activity with Filemon

4 4 Driver Object A driver object represents a loaded driver Names are visible in the Object Manager namespace under \Drivers A driver fills in its driver object with pointers to its I/O functions e.g. open, read, write When you get the One or More Drivers Failed to Start message its because the Service Control Manager didnt find one or more driver objects in the \Drivers directory for drivers that should have started

5 5 Device Objects A device object represents an instance of a device Device objects are linked in a list off the driver object A driver creates device objects to represent the interface to the logical device, so each generally has a unique name visible under \Devices Device objects point back at the Driver object

6 6 Driver and Device Objects \TCPIP Driver Object \Device\TCP\Device\UDP\Device\IP Dispatch Table Open Write Read Loaded Driver Image Open(…) Read(…) Write(…) TCP/IP Drivers Driver and Device Objects

7 7 File Objects Represents open instance of a device (files on a volume are virtual devices) Applications and drivers open devices by name The name is parsed by the Object Manager When an open succeeds the object manager creates a file object to represent the open instance of the device and a file handle in the process handle table A file object links to the device object of the device which is opened File objects store additional information File offset for sequential access File open characteristics (e.g. delete-on-close) File name Accesses granted for convenience

8 8 I/O Request Packets System services and drivers allocate I/O request packets to describe I/O A request packet contains: File object at which I/O is directed I/O characteristics (e.g. synchronous, non-buffered) Byte offset Length Buffer location The I/O Manager locates the driver to which to hand the IRP by following the links: File Object Device Object Driver Object

9 9 Putting it Together: Request Flow Process NtDeviceIoControlFile Handle Table File Object Device Object Driver Object Dispatch Table DispatchDeviceControl( DeviceObject, Irp ) Driver Code User Mode Kernel Mode IRP DeviceIoControl

10 10 I/O Request Packet Environment subsystem or DLL Services I/O manager IRP header WRITE parameters File object Device object Driver object IRP stack location Dispatch routine(s) Start I/OISR DPC routine Device Driver 1) 1)An application writes a file to the printer, passing a handle to the file object 2)The I/O manager creates an IRP and initializes first stack location 3)The I/O manager uses the driver object to locate the WRITE dispatch routine and calls it, passing the IRP User mode Kernel mode

11 11 IRP data IRP consists of two parts: Fixed portion (header): Type and size of the request Whether request is synchronous or asynchronous Pointer to buffer for buffered I/O State information (changes with progress of the request) One or more stack locations: Function code Function-specific parameters Pointer to callers file object While active, IRPs are stored in a thread-specific queue I/O system may free any outstanding IRPs if thread terminates

12 12 I/O Processing – synch. I/O to a single-layered driver 1. The I/O request passes through a subsystem DLL 2. The subsystem DLL calls the I/O managers NtWriteFile() service 3. I/O manager sends the request in form of an IRP to the driver (a device driver) 4. The driver starts the I/O operation 5. When the device completes the operation and interrupts the CPU, the device driver services the int. 6. The I/O manager completes the I/O request

13 13 Completing an I/O request Servicing an interrupt: ISR schedules Deferred Procedure Call (DPC); dismisses int. DPC routine starts next I/O request and completes interrupt servicing May call completion routine of higher-level driver I/O completion: Record the outcome of the operation in an I/O status block Return data to the calling thread – by queuing a kernel-mode Asynchronous Procedure Call (APC) APC executes in context of calling thread; copies data; frees IRP; sets calling thread to signaled state I/O is now considered complete; waiting threads are released

14 14 Flow of Interrupts Peripheral Device Controller CPU Interrupt Controller CPU Interrupt Service Table 0 2 3 n ISR Address Spin Lock Dispatch Code Interrupt Object Read from device Acknowledge- Interrupt Request DPC Driver ISR Raise IRQL Lower IRQL KiInterruptDispatch Grab Spinlock Drop Spinlock

15 15 Servicing an Interrupt: Deferred Procedure Calls (DPCs) Used to defer processing from higher (device) interrupt level to a lower (dispatch) level Also used for quantum end and timer expiration Driver (usually ISR) queues request One queue per CPU. DPCs are normally queued to the current processor, but can be targeted to other CPUs Executes specified procedure at dispatch IRQL (or dispatch level, also DPC level) when all higher-IRQL work (interrupts) completed Maximum times recommended: ISR: 10 usec, DPC: 25 usec See queue head DPC object

16 16 DPC Delivering a DPC DPC routines can call kernel functions but cant call system services, generate page faults, or create or wait on objects DPC routines cant assume what process address space is currently mapped Interrupt dispatch table high Power failure Dispatch/DPC APC Low DPC 1. Timer expires, kernel queues DPC that will release all waiting threads Kernel requests SW int. DPC DPC queue 2. DPC interrupt occurs when IRQL drops below dispatch/DPC level dispatcher 3. After DPC interrupt, control transfers to thread dispatcher 4. Dispatcher executes each DPC routine in DPC queue

17 17 I/O Completion: Asynchronous Procedure Calls (APCs) Execute code in context of a particular user thread APC routines can acquire resources (objects), incur page faults, call system services APC queue is thread-specific User mode & kernel mode APCs Permission required for user mode APCs Executive uses APCs to complete work in thread space Wait for asynchronous I/O operation Emulate delivery of POSIX signals Make threads suspend/terminate itself (env. subsystems) APCs are delivered when thread is in alertable wait state WaitForMultipleObjectsEx(), SleepEx()

18 18 Asynchronous Procedure Calls (APCs) Special kernel APCs Run in kernel mode, at IRQL 1 Always deliverable unless thread is already at IRQL 1 or above Used for I/O completion reporting from arbitrary thread context Kernel-mode interface is linkable, but not documented Ordinary kernel APCs Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion) User mode APCs Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also, QueueUserApc Only deliverable when thread is in alertable wait Thread Object K U APC objects

19 19 Driver Layering and Filtering To divide functionality across drivers, provide added value, etc. Only the lowest layer talks to the I/O hardware Filter drivers attach their devices to other devices They see all requests first and can manipulate them Example filter drivers: File system filter driver Bus filter driver Process User Mode Kernel Mode System Services I/O Manager File System Driver Volume Manager Driver Disk Driver IRP

20 20 Driver Filtering: Volume Shadow Copy New to XP/Server 2003 Addresses the backup open files problem Volumes can be snapshotted Allows hot backup (including open files) Applications can tie in with mechanism to ensure consistent snapshots Database servers flush transactions Windows components such as the Registry flush data files Different snapshot providers can implement different snapshot mechanisms: Copy-on-writeMirroring Volsnap is the built-in provider: Built into Windows XP/Server 2003 Implements copy-on-write snapshots Saves volume changes in files on the volume Uses defrag API to determine where the file is and where paging file is to avoid tracking their changes

21 21 Volume Shadow Copy Driver (volsnap.sys) Volume Shadow Copy Driver (volsnap.sys) Mirror provider Oracle SQL Volume Shadow Copy Service Volume Shadow Copy Service Backup Application Backup Application 1. 1.Backup application requests shadow copy 2. Writers told to freeze activity 3. Providers asked to create volume shadow copies 4. Writers told to resume (thaw) activity Writers Providers 5. Backup application saves data from volume Shadow copies Volume Snapshots

22 22 Volsnap.sys File System Driver Volsnap.sys abc ad…bc Backup read of sector c Snapshot Backup Application Application Shadow Volume C: Application read of sector c All reads of sector d

23 23 Shadow Copies of Shared Folders When enabled, Server 2003 uses shadow copy to periodically create snapshots of volumes Schedule and space used is configurable

24 24 Shadow Copies on Shared Folders Shadow copies are only exposed as network shares Clients may install an Explorer extension that integrates with the file server and lets them View the state of folders and files within a snapshot Rollback individual folders and files to a snapshot

25 25 The PnP Manager In NT 4.0 each device driver is responsible for enumerating all supported busses in search of devices they support As of Windows 2000, the PnP Manager has bus drivers enumerate their busses and inform it of present devices If the device driver for a device not already present on the system, the PnP Manager in the kernel informs the user-mode PnP Manager to start the Hardware Wizard

26 26 The PnP Manager Once a device driver is located, the PnP Manager determines if the driver is signed If the driver is not signed, the systems driver signing policy determines whether or not the driver is installed If the driver is not signed, the systems driver signing policy determines whether or not the driver is installed After loading a driver, the PnP Manager calls the drivers AddDevice entry point The driver informs the PnP Manager of the devices resource requirements The PnP Manager reconfigures other devices to accommodate the new device

27 27 The PnP Manager Enumeration is recursive, and directed by bus drivers Bus drivers identify device on a bus As busses and devices are registered, a device tree is constructed, and filled in with devices Root ACPI PCI USB Video Disk Key- board Battery Device Tree

28 28 Resource Arbitration Devices require system hardware resources to function (e.g. IRQs, I/O ports) The PnP Manager keeps track of hardware resource assignments If a device requires a resource thats already been assigned, the PnP Manager tries to reassign resources in order to accommodate Example: 1. Device 1 can use IRQ 5 or IRQ 6 2. PnP Manager assigns it IRQ 5 3. Device 2 can only use IRQ 5 4. PnP Manager reassigns Device 1 IRQ 6 5. PnP Manager assigns Device 2 IRQ 5

29 29 Plug and Play (PnP) State Transitions PnP manager recognizes hardware, allocates resources, loads driver, notifies about config. changes Not started Started Pending stop Stopped Pending remove Surprise remove Removed Start-device command Query-stop command Stop command Query-remove command Surprise-remove command Remove command Device Plug and Play state transitions

30 30 The Power Manager A system must have an ACPI-compliant BIOS for full compatibility (APM gives limited power support) A number of factors guide the Power Managers decision to change power state: System activity level System battery level Shutdown, hibernate, or sleep requests from applications User actions, such as pressing the power button Control Panel power settings The system can go into low power modes, but it requires the cooperation of every device driver - applications can provide their input as well

31 31 The Power Manager There are different system power states: On Everything is fully on Standby Intermediate states Lower standby states must consume less power than higher ones Hibernating Save memory to disk in a file called hiberfil.sys in the root directory of the system volume Off All devices are off Device drivers manage their own power level Only a driver knows the capabilities of their device Some devices only have on and off, others have intermediate states Drivers can control their own power independently of system power Display can dim, disk spin down, etc.

32 32 Power Manager based on the Advanced Configuration and Power Interface (ACPI) State Power Consumption Software Resumption HW Latency S0 (fully on) Maximum Not applicable None S1 (sleeping) Less than S0, more than S2 System resumes where it left off (returns to S0) Less than 2 sec. S2 (sleeping) Less than S1, more than S3 System resumes where it left off (returns to S0) 2 or more sec. S3 (sleeping) Less than S2, processor is off System resumes where it left off (returns to S0) Same as S2 S4 (sleeping) Trickle current to power button and wake circuitry System restarts from hibernate file and resumes where it left off (returns to S0) Long and undefined S5 (fully off) Trickle current to power button System boot Long and undefined System Power-State Definitions

33 33 Troubleshooting I/O Activity Filemon can be a great help to understand and troubleshooting I/O problems Two basic techniques: Go to end of log and look backwards to where problem occurred or is evident and focused on the last things done Compare a good log with a bad log Often comparing the I/O activity of a failing process with one that works may point to the problem Have to first massage log file to remove data that differs run to run Delete first 3 columns (they are always different: line #, time, process id) Easy to do with Excel by deleting columns Then compare with FC (built in tool) or Windiff (Resource Kit)

34 34 Filemon # - operation number Process: image name + process id Request: internal I/O request code Result: return code from I/O operation Other: flags passed on I/O request

35 35 Using Filemon Start/stop logging (Control/E) Clear display (Control/X) Open Explorer window to folder containing file: Double click on a line does this Find – finds text within window Save to log file Advanced mode Network option

36 36 What Filemon Monitors By default Filemon traces all file I/O to: Local non-removable media Network shares Stores all output in listview Can exhaust virtual memory in long runs You can limit captured data with history depth You can limit what is monitored: What volumes to watch in Volumes menu What paths and processes to watch in Filter dialog What operations to watch in Filter dialog (reads, writes, successes and errors)

37 37 Filemon Filtering and Highlighting Include and exclude filters are substring matches against the process and path columns Exclude overrides include filter Be careful that you dont exclude potentially useful data Capture everything and save the log Then apply filters (you can always reload the log) Highlight matches all columns

38 38 Basic vs Advanced Mode Basic mode massages output to be sysadmin- friendly and target common troubleshooting Things you dont see in Basic mode: Raw I/O request names Various internal file system operations Activity in the System process Page file I/O Filemon file system activity

39 39 Understanding Disk Activity Use Filemon to see why youre hard disk is crunching Process performance counters show I/O activity, but not to where System performance counters show which disks are being hit, but not which files or which process Filemon pinpoints which file(s) are being accessed, by whom, and how frequently You can also use Filemon on a server to determine which file(s) were being accessed most frequently Import into Excel and make a pie chart by file name or operation type Move heavy-access files to a different disk on a different controller

40 40 Polling and File Change Notification Many applications respond to file and directory changes A poorly written application will poll for changes A well-written application will request notification by the system of changes Polling for changes causes performance degradation Context switches including TLB flush Cache invalidation Physical memory usage CPU usage Alternative: file change notification When you run Filemon on an idle system you should only see bursty system background activity Polling is visible as periodic accesses to the same files and directories File change notification is visible as directory queries that have no result

41 41 Example: Word Crash While typing in the document Word XP would intermittently close without any error message To troubleshoot ran Filemon on users system Set the history depth to 10,000 Asked user to send Filemon log when Word exited

42 42 Solution: Word Crash Working backwards, the first strange or unexplainable behavior are the constant reads past end of file to MSSP3ES.LEX User looked up what.LEX file was Related to Word proofing tools Uninstalled and reinstalled proofing tools & problem went away

43 43 Example: Useless Excel Error Message Excel reports an error Unable to read file" when starting

44 44 Solution: Useless Excel Error Message Filemon trace shows Excel reading file in XLStart folder All Office apps autoload files in their start folders Should have reported: Name and location of file Reason why it didnt like it

45 45 Further Reading Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004. I/O Processing (from pp. 561) The Plug and Play (PnP) Manager (from pp. 590) The Power Manager (from pp. 607) Troubleshooting File System Problems (from pp. 711)

46 46 Source Code References Windows Research Kernel sources \base\ntos\io – I/O Manager \base\ntos\inc\io.h – additional structure/type definitions \base\ntos\verifer – Driver Verifier \base\ntos\inc\verifier.h – additional structure/type definitions

Download ppt "Windows Operating System Internals - by David A. Solomon and Mark E. Russinovich with Andreas Polze Unit OS6: Device Management 6.3. Windows I/O Processing."

Similar presentations

Ads by Google