UNIX Internals – The New Frontiers: Device Drivers and I/O

Overview
- Device driver: an object that controls one or more devices and interacts with the kernel
- Usually written by a third-party vendor
- Isolates device-specific code in a module
- Drivers can be added easily, without kernel source code
- Gives the kernel a consistent view of all devices

System Call Interface and Device Driver Interface [figure]

Hardware Configuration
- Bus: ISA, EISA, MASSBUS, UNIBUS, PCI
- Two components:
  - Controller (adapter): connects one or more devices and has a set of CSRs (control and status registers) for each
  - Device

Hardware Configuration (2)
- I/O space: the set of all device registers
- Frame buffer: separate from main memory
- Memory-mapped I/O
- Transfer methods:
  - PIO: programmed I/O
  - Interrupt-driven I/O
  - DMA: direct memory access

Device Interrupts
- Each device interrupt has a fixed ipl (interrupt priority level)
- An interrupt invokes a routine that:
  - Saves the registers and raises the ipl to the system ipl
  - Calls the handler
  - Restores the ipl and the registers
- spltty(): raises the ipl to that of the terminal; splx(): lowers the ipl to a previously saved value
- Identifying the handler:
  - Vectored: an interrupt vector number indexes the interrupt vector table
  - Polled: many handlers share one vector number
- Handlers should be short and quick

Device Driver Framework
Classifying devices and drivers:
- Block devices: data in fixed-size, randomly accessed blocks (hard disk, floppy disk, CD-ROM)
- Character devices: arbitrary-sized data, often one byte at a time, interrupt-driven (terminals, printers, the mouse, sound cards); also non-block devices such as the time clock and the memory-mapped screen
- Pseudodevices: the mem driver, the null device, the zero device

Invoking Driver Code
Driver code is invoked for:
- Configuration: initialization, done only once
- I/O: reading or writing data (synchronous)
- Control: control requests (synchronous)
- Interrupts (asynchronous)

Parts of a Device Driver
Two parts:
- Top half: synchronous routines that execute in process context. They may access the address space and the u area of the calling process, and may put the process to sleep if necessary.
- Bottom half: asynchronous routines that run in system context and usually have no relation to the currently running process. They are not allowed to access the current user address space or the u area, and they are not allowed to sleep, since that may block an unrelated process.
The two halves need to synchronize their activities. If an object is accessed by both halves, the top-half routines must block interrupts while manipulating it; otherwise the device may interrupt while the object is in an inconsistent state, with unpredictable results.
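As an illustration of that rule, here is a minimal sketch of a top-half routine protecting a request queue that the interrupt handler also manipulates. The splbio()/splx() bracket follows the classic BSD-style spl convention introduced on the previous slide; the queue, the request type, and the mydev names are invented for this sketch.

    /* Hypothetical top-half routine: queues a request that the bottom
     * half (the interrupt handler) will consume.  Raising the ipl with
     * splbio() keeps the handler from running while the shared queue is
     * in an inconsistent state. */
    struct mydev_req {
        struct mydev_req *next;
        /* ... request parameters ... */
    };

    static struct mydev_req *mydev_queue;   /* shared with the bottom half */

    void
    mydev_enqueue(struct mydev_req *req)
    {
        int s;

        s = splbio();               /* block disk interrupts */
        req->next = mydev_queue;    /* safe to touch the shared queue */
        mydev_queue = req;
        splx(s);                    /* restore the previous ipl */
    }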

The Device Switches
A data structure that defines the entry points each device driver must support:

    struct bdevsw {
        int (*d_open)();
        int (*d_close)();
        int (*d_strategy)();
        int (*d_size)();
        int (*d_xhalt)();
        /* ... */
    } bdevsw[];

    struct cdevsw {
        int (*d_open)();
        int (*d_close)();
        int (*d_read)();
        int (*d_write)();
        int (*d_ioctl)();
        int (*d_mmap)();
        int (*d_segmap)();
        int (*d_xpoll)();
        int (*d_xhalt)();
        struct streamtab *d_str;
    } cdevsw[];

Driver Entry Points
- d_open(), d_close()
- d_strategy(): read/write for a block device
- d_size(): determine the size of a disk partition
- d_read(): read from a character device
- d_write(): write to a character device
- d_ioctl(): defines a set of control commands for a character device
- d_segmap(): map the device memory into the process address space
- d_mmap()
- d_xpoll(): check for pending events
- d_xhalt()

The I/O Subsystem
- The portion of the kernel that controls the device-independent part of I/O
- Major and minor numbers:
  - Major number: identifies the device type (and thus the driver)
  - Minor number: identifies the device instance
- The kernel dispatches through the switch tables, e.g. (*bdevsw[getmajor(dev)].d_open)(dev, ...)
- dev_t:
  - Earlier releases: 16 bits, 8 each for major and minor
  - SVR4: 32 bits, 14 for major and 18 for minor
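The sketch below shows how such a 32-bit dev_t could be composed and decomposed, and how the major number drives the switch-table dispatch. The macro definitions and the dev32_t name are illustrative of the SVR4 layout described above, not copied from any actual release.

    /* Illustrative SVR4-style dev_t: 14-bit major, 18-bit minor. */
    typedef unsigned long dev32_t;

    #define getmajor(dev)        ((int)(((dev) >> 18) & 0x3FFF))
    #define getminor(dev)        ((int)((dev) & 0x3FFFF))
    #define makedevice(maj, mnr) (((dev32_t)(maj) << 18) | (mnr))

    /* Abbreviated switch declaration (full version on an earlier slide). */
    extern struct bdevsw { int (*d_open)(); /* ... */ } bdevsw[];

    /* The major number selects the driver; the whole dev, including the
     * minor number, is passed on so the driver can tell instances apart. */
    int
    dev_open(dev32_t dev, int flag)
    {
        return (*bdevsw[getmajor(dev)].d_open)(dev, flag);
    }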

Device Files
- A special file located in the file system and associated with a specific device
- Users can treat the device file like an ordinary file; it has an inode
  - di_mode: IFBLK or IFCHR
  - di_rdev: the device's major and minor numbers
- mknod(path, mode, dev) creates a device file
- Access control and protection: read/write/execute permissions for owner, group, and others
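A short user-level sketch of mknod(2) creating a character device file. The path and the major/minor numbers are made up for illustration; on Linux with glibc, makedev() lives in <sys/sysmacros.h>, though the header location varies across systems.

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <sys/sysmacros.h>   /* makedev(); location is system-dependent */
    #include <stdio.h>

    int
    main(void)
    {
        /* Hypothetical device: major 31, minor 0; rw for owner, r for group. */
        dev_t dev = makedev(31, 0);

        if (mknod("/dev/mydev0", S_IFCHR | 0640, dev) == -1) {
            perror("mknod");     /* typically requires privilege */
            return 1;
        }
        return 0;
    }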

The specfs File System
- A special file system type for device files
- specfs vnode: all operations on the file are routed to it
- snode: the specfs node for a device
- Example: looking up /dev/lp
  - ufs_lookup() goes from the vnode of /dev to the vnode of lp and finds the file type is IFCHR
  - It calls specvp(), which searches the snode hash table by device number
  - If there is no snode, it creates an snode and a specfs vnode, storing a pointer to the vnode of /dev/lp in s_realvp
  - It returns the pointer to the specfs vnode to ufs_lookup(), which passes it back to open()

Data Structures [figure]

The Common snode
- There may be more device files than real devices, so several vnodes can refer to one device
- Multiple closes: if a device is open through several device files, the kernel must recognize the situation and call the device close operation only after all of them have been closed
- Page addressing: without a common object, several pages could represent one device and become inconsistent

Device Cloning
- Used when a user does not care which instance of a device is used, e.g. for network access
- Multiple active connections can be created, each with a different minor device number
- Cloning is supported by a dedicated clone driver: its major device number is that of the clone device, and the minor device number is the major device number of the real driver
- Example: the clone driver has major number 63 and the TCP driver has major number 31, so /dev/tcp has major number 63 and minor number 31; tcpopen() then generates an unused minor device number for the connection

I/O to a Character Device
- open(): creates the snode, the common snode, and the open file object
- read():
  - The file descriptor leads to the file object and its vnode; after validation, VOP_READ resolves to spec_read()
  - spec_read() checks the vnode type and looks up cdevsw[] indexed by the major number in v_rdev, then calls d_read() with a uio structure as the read parameter
  - The driver calls uiomove() to copy the data
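A minimal sketch of what a character driver's read entry point might look like with the DDI/DKI uiomove() routine. It assumes the kernel environment (<sys/uio.h> and friends); the device buffer and the mydev names are invented for illustration.

    /* Hypothetical character-device read entry point.  It copies data
     * from a device-resident buffer into the user buffers described by
     * the uio structure. */
    static char   mydev_buf[1024];   /* invented device data buffer */
    static size_t mydev_len;         /* bytes currently available */

    int
    mydevread(dev_t dev, struct uio *uiop)
    {
        if (mydev_len == 0)
            return 0;                /* nothing to read */

        /* uiomove(addr, nbytes, rwflag, uiop) copies between kernel
         * space and the buffers in *uiop, updating the uio offsets. */
        return uiomove(mydev_buf, mydev_len, UIO_READ, uiop);
    }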

The poll System Call
- Multiplexes I/O over several descriptors
- With one fd per connection, a read on a single fd blocks; how can we read from whichever fd has data?
- poll(fds, nfds, timeout):
  - fds: an array[nfds] of struct pollfd
  - timeout: 0, -1 (INFTIM), or a time in milliseconds

    struct pollfd {
        int   fd;
        short events;     /* a bit mask of requested events */
        short revents;    /* events that occurred */
    };

- Events: POLLIN, POLLOUT, POLLERR, POLLHUP
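A small user-level usage sketch: waiting up to five seconds for input on two descriptors that are assumed to be open already (for example, two sockets).

    #include <poll.h>
    #include <stdio.h>

    /* Wait up to 5 seconds for input on either of two open descriptors. */
    int
    wait_for_input(int fd_a, int fd_b)
    {
        struct pollfd fds[2];
        int n;

        fds[0].fd = fd_a;  fds[0].events = POLLIN;
        fds[1].fd = fd_b;  fds[1].events = POLLIN;

        n = poll(fds, 2, 5000);      /* timeout in milliseconds */
        if (n <= 0)
            return n;                /* 0 = timed out, -1 = error */

        if (fds[0].revents & POLLIN)
            printf("fd_a is readable\n");
        if (fds[1].revents & POLLIN)
            printf("fd_b is readable\n");
        return n;
    }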

poll Implementation
Structures:
- pollhead: associated with a device file; maintains a queue of polldat structures
- polldat describes:
  - A blocked process (via its proc structure)
  - The events it is waiting for
  - A link to the next polldat

Poll [figure]

VOP_POLL
- error = VOP_POLL(vp, events, anyyet, &revents, &php)
- spec_poll() indexes cdevsw[] and calls d_xpoll(), which checks for the events: it updates revents and returns; if anyyet is 0, it also returns a pointer to the pollhead
- Back in poll(), the kernel checks revents and anyyet:
  - If both are 0, it takes the pollhead php, allocates a polldat, adds it to the pollhead's queue, points it at the proc, records the event mask, links it to the caller's other polldat entries, and blocks
  - If revents is nonzero, it removes all the polldat entries from their queues, frees them, and adds the number of ready descriptors to anyyet
- While the process is blocked, the driver keeps track of the events; when one occurs, it calls pollwakeup() with the event and the pollhead

Block I/O
- Formatted I/O: access through files in a file system
- Unformatted I/O: access directly through the device file
- Block I/O occurs when:
  - Reading or writing a file
  - Reading or writing a device file
  - Accessing memory mapped to a file
  - Paging to or from a swap device

Block Device Read [figure]

The buf Structure
- The only interface between the kernel and the block device driver
- Describes the request:
  - Starting block number
  - Byte count (in whole sectors)
  - Location in memory
  - Flags: read/write, sync/async
  - Address of the completion routine
- Holds the completion status:
  - Flags
  - Error code
  - Residual byte count
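The fields above correspond roughly to the sketch below. The member names follow the traditional buf layout, but details differ between releases (for example, the data address classically sits inside a union), so treat this as representative rather than definitive.

    /* Placeholder typedefs so the sketch stands alone. */
    typedef long   daddr_t;    /* disk block address */
    typedef char  *caddr_t;    /* core (memory) address */

    /* Representative sketch of the classic buf structure. */
    struct buf {
        int       b_flags;    /* B_READ/B_WRITE, B_ASYNC, B_DONE, B_ERROR */
        daddr_t   b_blkno;    /* starting block number on the device */
        unsigned  b_bcount;   /* transfer size in bytes (whole sectors) */
        caddr_t   b_addr;     /* data location in memory */
        void    (*b_iodone)(struct buf *);  /* completion routine */
        int       b_error;    /* error code, valid if B_ERROR is set */
        unsigned  b_resid;    /* bytes not transferred (completion status) */
    };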

Buffer Cache
The buf structure also carries administrative information for a cached block:
- A pointer to the vnode of the device file
- Flags that specify whether the buffer is free
- The aged flag
- Pointers for an LRU freelist
- Pointers for a hash queue

Interaction with the Vnode
- A disk block is addressed by specifying a vnode and an offset in that vnode:
  - The device vnode and the physical offset: used only when the file system is not mounted
  - An ordinary file: the file vnode and the logical offset
- VOP_GETPAGE resolves (via ufs) to spec_getpage(): it checks whether the page is in memory; if not, it calls ufs_bmap() to get the physical block, allocates the page and a buf, calls d_strategy() to read the block, and wakes the sleeping process when the read completes
- VOP_PUTPAGE resolves (via ufs) to spec_putpage()

Device Access Methods
- Pageout operations:
  - Via the vnode: VOP_PUTPAGE
  - Device file: spec_putpage(), then d_strategy()
  - Ordinary file: ufs_putpage(), using ufs_bmap()
- Mapped I/O to a file:
  - exec: a page fault leads to segvn_fault(), then VOP_GETPAGE
- Ordinary file I/O:
  - ufs_read(): segmap_getmap(), uiomove(), segmap_release()
- Direct I/O to a block device:
  - spec_read(): segmap_getmap(), uiomove(), segmap_release()

Raw I/O to a Block Device
- Buffered block I/O copies the data twice: from user space to the kernel, then from the kernel to the disk
- Caching is beneficial, but not for large data transfers; mmap is one alternative
- Raw I/O provides unbuffered access: d_read() or d_write() calls physiock(), which:
  - Validates the request
  - Allocates a buf
  - Calls as_fault() to fault in and lock the user pages
  - Calls d_strategy() and sleeps until the transfer completes
  - Unlocks the pages and returns
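A sketch of how a raw read entry point might hand off to physiock(). The argument order shown (strategy routine, buf, device, direction, partition size, uio) follows the SVR4 DDI/DKI style, but treat the exact signature as an assumption; the mydev names and the size helper are invented.

    /* Hypothetical raw-read entry point for a block driver.  physiock()
     * validates the transfer against the partition size, allocates a buf
     * when bp is NULL, locks the user pages, calls the strategy routine,
     * and sleeps until the I/O completes. */
    extern int     mydevstrategy(struct buf *bp);
    extern daddr_t mydev_nblocks(dev_t dev);   /* invented size helper */

    int
    mydevread(dev_t dev, struct uio *uiop, cred_t *crp)
    {
        return physiock(mydevstrategy, NULL, dev, B_READ,
                        mydev_nblocks(dev), uiop);
    }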

The DDI/DKI Specification
- DDI/DKI: Device-Driver Interface / Driver-Kernel Interface
- Five sections:
  - S1: data definitions
  - S2: driver entry point routines
  - S3: kernel routines
  - S4: kernel data structures
  - S5: kernel #define statements
- Three parts:
  - Driver-kernel: the driver entry points and the kernel support routines
  - Driver-hardware: machine-dependent interactions
  - Driver-boot: incorporating a driver into the kernel

General Recommendations
- Do not directly access system data structures; access only the fields described in S4
- Do not define arrays of the structures defined in S4
- Only set or clear flags with masks; never assign directly to the field
- Some structures are opaque and may be accessed only through the provided routines
- Use the functions in S3 to read or modify the structures in S4
- Include ddi.h
- Declare any private routines or global variables as static

Section 3 Functions
- Synchronization and timing
- Memory management
- Buffer management
- Device number operations
- Direct memory access
- Data transfers
- Device polling
- STREAMS
- Utility routines

Other Sections
- S1: specifies the driver prefix (e.g. dk for a disk driver) and prefixdevflag, with flags such as:
  - D_DMA
  - D_TAPE
  - D_NOBRKUP
- S2: specifies the driver entry points
- S4: describes data structures shared by the kernel and the drivers
- S5: the relevant kernel #define values
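To make the prefix convention concrete, here is a minimal skeleton for a hypothetical disk driver using the dk prefix. The kernel locates the devflag word and the entry points by this naming convention; the flag value and the empty bodies are purely illustrative.

    #include <sys/types.h>

    int dkdevflag = 0;              /* e.g. D_DMA if the device uses DMA */

    int
    dkopen(dev_t dev, int flag)     /* S2 entry point: prefix + "open" */
    {
        return 0;                   /* nothing to do in this sketch */
    }

    int
    dkclose(dev_t dev, int flag)    /* S2 entry point: prefix + "close" */
    {
        return 0;
    }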

Newer SVR4 Releases
- MP-safe drivers:
  - Protect most global data by using multiprocessor synchronization primitives
- SVR4/MP:
  - Adds a set of functions that allow drivers to use its new synchronization facilities
  - Three kinds of locks: basic, read/write, and sleep locks
  - Adds functions to allocate and manipulate the different synchronization objects
  - Adds a D_MP flag for the driver's prefixdevflag
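A hedged sketch of basic-lock usage in an MP-safe driver. The names LOCK_ALLOC(), LOCK(), and UNLOCK() and their arguments follow the SVR4/MP DDI style, but treat the exact signatures, the priority values, and the lkinfo argument as assumptions to be checked against the real manual pages.

    /* Sketch only: types (lock_t, pl_t, lkinfo_t) and constants (plbase,
     * plhi, KM_SLEEP) are assumed to come from the SVR4/MP DDI headers. */
    static lkinfo_t mydev_lkinfo;        /* placeholder lock info */
    static lock_t  *mydev_lock;

    void
    mydevinit(void)
    {
        /* hierarchy level 1 and minimum ipl plbase are placeholders */
        mydev_lock = LOCK_ALLOC(1, plbase, &mydev_lkinfo, KM_SLEEP);
    }

    void
    mydev_update(void)
    {
        pl_t opl;

        opl = LOCK(mydev_lock, plhi);   /* acquire, raising the ipl */
        /* ... manipulate shared driver data ... */
        UNLOCK(mydev_lock, opl);        /* release, restoring the ipl */
    }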

Dynamic Loading and Unloading
- SVR4.2 supports dynamic operation for:
  - Device drivers
  - Host bus adapter and controller drivers
  - STREAMS modules
  - File systems
  - Miscellaneous modules
- Dynamic loading involves:
  - Relocation and binding of the driver's symbols
  - Driver and device initialization
  - Adding the driver to the device switch tables, so that the kernel can access the switch routines
  - Installing the interrupt handler

SVR4.2 Routines
- prefix_load()
- prefix_unload()
- mod_drvattach()
- mod_drvdetach()
- Wrapper macros:
  - MOD_DRV_WRAPPER
  - MOD_HDRV_WRAPPER
  - MOD_STR_WRAPPER
  - MOD_FS_WRAPPER
  - MOD_MISC_WRAPPER

Future Directions
- Divide driver code into a device-dependent part and a controller-dependent part
- The PDI standard provides:
  - A set of S2 functions that each host bus adapter driver must implement
  - A set of S3 functions that perform common tasks required by SCSI devices
  - A set of S4 data structures that are used by the S3 functions

Linux I/O
- Elevator scheduler:
  - Maintains a single queue for disk read and write requests
  - Keeps the list of requests sorted by block number
  - The drive moves in a single direction to satisfy each request
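A toy illustration of the sorted insertion an elevator queue performs, so the head can sweep in one direction. This is generic illustrative code, not the actual Linux implementation.

    #include <stddef.h>

    /* Toy elevator queue: a singly linked list kept sorted by block
     * number. */
    struct request {
        long            blkno;   /* starting block number */
        struct request *next;
    };

    void
    elevator_add(struct request **queue, struct request *req)
    {
        struct request **p = queue;

        /* Walk to the first entry with a larger block number and
         * insert before it, preserving the sort order. */
        while (*p != NULL && (*p)->blkno <= req->blkno)
            p = &(*p)->next;
        req->next = *p;
        *p = req;
    }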

Linux I/O
- Deadline scheduler:
  - Uses three queues
  - Each incoming request is placed in the sorted elevator queue
  - Read requests also go to the tail of a read FIFO queue
  - Write requests also go to the tail of a write FIFO queue
  - Each request has an expiration time

Linux I/O [figure]

Linux I/O
- Anticipatory I/O scheduler (in Linux 2.6):
  - After satisfying a read request, delays a short period to see if a new nearby request will be made (principle of locality), to increase performance
  - Superimposed on the deadline scheduler
  - A request is first dispatched to the anticipatory scheduler; if no other read request arrives within the delay, deadline scheduling is used

Linux Page Cache (in Linux 2.4 and later)
- A single unified page cache is involved in all traffic between disk and main memory
- Benefits:
  - When it is time to write dirty pages back to disk, a collection of them can be ordered properly and written out efficiently
  - Pages in the page cache are likely to be referenced again before they are flushed from the cache, saving disk I/O operations