UNIX Internals – The New Frontiers: Device Drivers and I/O


1 UNIX Internals – The New Frontiers: Device Drivers and I/O

2 Overview
- Device driver
  - An object that controls one or more devices and interacts with the kernel
  - Written by a third-party vendor
  - Isolates device-specific code in a module
  - Easy to add without kernel source code
  - The kernel has a consistent view of all devices

3 [Figure: System Call Interface and Device Driver Interface]

4 Hardware Configuration
- Bus:
  - ISA, EISA
  - MASSBUS, UNIBUS
  - PCI
- Two components
  - Controller or adapter
    - Connects one or more devices
    - A set of CSRs (control and status registers) for each
  - Device

6 Hardware Configuration (2)
- I/O space
  - The set of all device registers
  - Frame buffer
  - Separate from main memory
  - Memory-mapped I/O
- Transfer methods
  - PIO: programmed I/O
  - Interrupt-driven I/O
  - DMA: direct memory access

7 Device Interrupts
- Each device interrupts at a fixed interrupt priority level (ipl)
- An interrupt invokes a routine that
  - Saves the registers and raises the ipl to the system ipl
  - Calls the handler
  - Restores the ipl and the registers
- spltty(): raises the ipl to that of the terminal
- splx(): lowers the ipl to a previously saved value
- Identifying the handler
  - Vectored: an interrupt vector number indexes the interrupt vector table
  - Polled: many handlers share one number
- Handlers should be short and quick

8 Device Driver Framework
- Classifying devices and drivers
  - Block
    - Data in fixed-size, randomly accessed blocks
    - Hard disk, floppy disk, CD-ROM
  - Character
    - Arbitrary-sized data
    - One byte at a time, interrupt-driven
    - Terminals, printers, the mouse, and sound cards
    - Other non-block devices: time clock, memory-mapped screen
  - Pseudodevice
    - mem driver, null device, zero device

9 Invoking Driver Code
- The kernel invokes driver code for:
  - Configuration: initialization, only once
  - I/O: read or write data (synchronous)
  - Control: control requests (synchronous)
  - Interrupts (asynchronous)

10 Parts of a Device Driver
- Two parts:
  - Top half: synchronous routines that execute in process context. They may access the address space and the u area of the calling process, and may put the process to sleep if necessary.
  - Bottom half: asynchronous routines that run in system context and usually have no relation to the currently running process. They may not access the current user address space or the u area, and they may not sleep, since that may block an unrelated process.
- The two halves need to synchronize their activities. If an object is accessed by both halves, the top-half routines must block interrupts while manipulating it. Otherwise the device may interrupt while the object is in an inconsistent state, with unpredictable results.
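
The synchronization rule above can be sketched with the spltty() and splx() routines from slide 7. This is a user-space simulation, not kernel code: the numeric ipl values, cur_ipl, tty_chars, interrupt_deliverable(), and tty_queue_char() are all assumptions made for illustration.

```c
/* User-space simulation of interrupt priority levels (ipl).
 * The numeric levels and every name except spltty()/splx() are hypothetical. */
enum { IPL_BASE = 0, IPL_TTY = 5 };

int cur_ipl = IPL_BASE;      /* ipl the simulated CPU is running at */
int tty_chars;               /* object shared by the top and bottom halves */

/* Raise the ipl to the terminal level (never lower it); return the old ipl. */
int spltty(void)
{
    int old = cur_ipl;
    if (cur_ipl < IPL_TTY)
        cur_ipl = IPL_TTY;
    return old;
}

/* Restore a previously saved ipl. */
void splx(int saved)
{
    cur_ipl = saved;
}

/* An interrupt is delivered only if its ipl is above the current ipl. */
int interrupt_deliverable(int dev_ipl)
{
    return dev_ipl > cur_ipl;
}

/* Top-half routine: updates the shared counter with terminal interrupts
 * blocked. Returns 1 if a terminal interrupt would have been held off
 * during the update. */
int tty_queue_char(void)
{
    int s = spltty();
    int held_off = !interrupt_deliverable(IPL_TTY);
    tty_chars++;
    splx(s);
    return held_off;
}
```

Saving the old level and restoring it with splx(), rather than dropping to a fixed level, lets the routine be called from code that has already raised the ipl.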

11 The Device Switches
- A data structure that defines the entry points each device must support.

    struct bdevsw {
        int (*d_open)();
        int (*d_close)();
        int (*d_strategy)();
        int (*d_size)();
        int (*d_xhalt)();
        …
    } bdevsw[];

    struct cdevsw {
        int (*d_open)();
        int (*d_close)();
        int (*d_read)();
        int (*d_write)();
        int (*d_ioctl)();
        int (*d_mmap)();
        int (*d_segmap)();
        int (*d_xpoll)();
        int (*d_xhalt)();
        struct streamtab *d_str;
    } cdevsw[];

12 Driver Entry Points
- d_open():
- d_close():
- d_strategy(): read/write for a block device
- d_size(): determines the size of a disk partition
- d_read(): reads from a character device
- d_write(): writes to a character device
- d_ioctl(): defines a set of commands for a character device
- d_segmap(): maps the device memory into the process address space
- d_mmap():
- d_xpoll(): checks whether the polled events have occurred
- d_xhalt():

13 The I/O Subsystem
- The portion of the kernel that controls the device-independent part of I/O
- Major and minor numbers
  - Major number: device type
  - Minor number: device instance
  - (*bdevsw[getmajor(dev)].d_open)(dev, …)
- dev_t:
  - Earlier releases: 16 bits, 8 each for major and minor
  - SVR4: 32 bits, 14 for major, 18 for minor
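
The major/minor split and the dispatch through the switch table can be sketched in user space as follows. The 14/18-bit layout comes from the slide; the names dev32_t, make_dev32(), getmajor32(), getminor32(), cdevsw_tbl, null_open(), and dev_open() are hypothetical stand-ins, and the table holds only d_open to keep the sketch short.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t dev32_t;                 /* stand-in for SVR4's 32-bit dev_t */

#define MINOR_BITS 18u
#define MINOR_MASK ((1u << MINOR_BITS) - 1u)
#define MAJOR_MASK ((1u << 14) - 1u)

dev32_t make_dev32(unsigned maj, unsigned min)
{
    return ((maj & MAJOR_MASK) << MINOR_BITS) | (min & MINOR_MASK);
}

unsigned getmajor32(dev32_t dev) { return (dev >> MINOR_BITS) & MAJOR_MASK; }
unsigned getminor32(dev32_t dev) { return dev & MINOR_MASK; }

/* Reduced character device switch: one entry point only. */
struct cdevsw_entry {
    int (*d_open)(dev32_t dev, int flag);
};

int last_minor = -1;                      /* records which instance was opened */

int null_open(dev32_t dev, int flag)
{
    (void)flag;
    last_minor = (int)getminor32(dev);
    return 0;
}

#define NCHRDEV 4
struct cdevsw_entry cdevsw_tbl[NCHRDEV] = {
    [2] = { null_open },                  /* hypothetical driver at major 2 */
};

/* Device-independent open: index the switch by major number and call
 * the driver's entry point. Returns -1 if no driver is configured. */
int dev_open(dev32_t dev, int flag)
{
    unsigned maj = getmajor32(dev);
    if (maj >= NCHRDEV || cdevsw_tbl[maj].d_open == NULL)
        return -1;
    return (*cdevsw_tbl[maj].d_open)(dev, flag);
}
```

For example, dev_open(make_dev32(2, 7), 0) reaches null_open(), which sees minor number 7.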

14 Device Files
- A special file located in the file system and associated with a specific device
- Users can use the device file like an ordinary file; its inode holds
  - di_mode: IFBLK, IFCHR
  - di_rdev: the device number
- mknod(path, mode, dev)
  - Creates a device file
- Access control and protection
  - Read/write/execute permissions for owner, group, and others
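
A device file's type can be inspected from user space with stat(). The sketch below uses the POSIX S_ISCHR and S_ISBLK macros, which test the same file-type bits that IFCHR and IFBLK name in the inode; device_kind() is a hypothetical helper, and the usage note assumes /dev/null exists as a character device.

```c
#include <sys/stat.h>

/* Classify a path: 'c' for a character device, 'b' for a block device,
 * '-' for any other file type, '?' if stat() fails. */
char device_kind(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return '?';
    if (S_ISCHR(st.st_mode))
        return 'c';
    if (S_ISBLK(st.st_mode))
        return 'b';
    return '-';
}
```

On a typical UNIX system device_kind("/dev/null") yields 'c' and device_kind("/") yields '-'.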

15 The specfs File System
- A special file system type
- specfs vnode
  - All operations on the device file are routed to it
- snode
- Example: /dev/lp
  - ufs_lookup() finds the vnode of /dev, then the vnode of lp, and sees that the file type is IFCHR
  - It calls specvp(), which searches the snode hash table
  - If no snode is found, specvp() creates an snode and a specfs vnode, and stores the pointer to the vnode of /dev/lp in s_realvp
  - It returns the pointer to the specfs vnode to ufs_lookup(), which passes it back to open()

16 Data Structures

17 The Common snode
- There may be more device files than real devices
- Closing
  - If several device files for the same device are open, the kernel should recognize the situation and call the device close operation only after all of them are closed
- Page addressing
  - Several pages may represent the same device data and may become inconsistent

19 Device Cloning
- Used when a user does not care which instance of a device is used, e.g. for network access
- Multiple active connections can be created, each with a different minor device number
- Cloning is supported by a dedicated clone driver: the device file's major number is that of the clone driver, and its minor number is the major number of the real device
- Example: the clone driver has major number 63 and the TCP driver has major number 31, so /dev/tcp has major number 63 and minor number 31; tcpopen() generates an unused minor device number
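
The /dev/tcp example above can be sketched as follows. The major numbers 63 and 31 come from the slide; the 18-bit minor field follows the SVR4 dev_t layout; make_dev(), tcp_pick_minor() (standing in for the minor-number selection inside tcpopen()), clone_open(), and the limit of four instances are assumptions for illustration.

```c
#include <stdint.h>

#define MINOR_BITS  18u
#define MINOR_MASK  ((1u << MINOR_BITS) - 1u)
#define CLONE_MAJOR 63u   /* from the slide's example */
#define TCP_MAJOR   31u   /* from the slide's example */
#define TCP_NMINOR  4     /* hypothetical limit on TCP instances */

uint32_t make_dev(unsigned maj, unsigned min)
{
    return ((uint32_t)maj << MINOR_BITS) | (min & MINOR_MASK);
}

unsigned char tcp_minor_used[TCP_NMINOR];

/* Pick an unused minor number for the TCP driver, or -1 if none is free. */
int tcp_pick_minor(void)
{
    for (int m = 0; m < TCP_NMINOR; m++) {
        if (!tcp_minor_used[m]) {
            tcp_minor_used[m] = 1;
            return m;
        }
    }
    return -1;
}

/* Open through the clone device: the minor number of the clone device
 * names the real driver's major number. Returns the device number of
 * the new instance, or -1 on failure. */
int64_t clone_open(uint32_t dev)
{
    unsigned maj = dev >> MINOR_BITS;
    unsigned real_major = dev & MINOR_MASK;

    if (maj != CLONE_MAJOR || real_major != TCP_MAJOR)
        return -1;
    int m = tcp_pick_minor();
    if (m < 0)
        return -1;
    return (int64_t)make_dev(real_major, (unsigned)m);
}
```

Each successful clone_open(make_dev(63, 31)) returns a device number with major 31 and a fresh minor, so every connection gets its own instance.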

20 I/O to a Character Device
- Open:
  - Creates an snode, a common snode, and a file structure
- Read:
  - From the file structure, get the vnode and validate the request
  - VOP_READ invokes spec_read(), which checks the vnode type and looks up cdevsw[], indexed by the major number in v_rdev
  - d_read() receives a uio structure as the read parameter
  - uiomove() copies the data

21 The poll System Call
- Multiplexes I/O over several descriptors
  - With one fd per connection, a read on one fd may block
  - Which descriptor, if any, can be read?
- poll(fds, nfds, timeout):
  - fds: an array[nfds] of struct pollfd
  - timeout: 0, -1, or INFTIM
  - struct pollfd {
        int fd;
        short events;
        short revents;
    };
- Events (a bit mask)
  - POLLIN, POLLOUT, POLLERR, POLLHUP
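
A minimal user-space sketch of the call, using the POSIX poll() interface on a pipe. The helper names poll_for_input() and demo_pipe_poll() are hypothetical; the timeout is given in milliseconds.

```c
#include <poll.h>
#include <unistd.h>

/* Poll one descriptor for readable data. Returns the revents bit mask,
 * 0 if the timeout expired first, or -1 on error. */
int poll_for_input(int fd, int timeout_ms)
{
    struct pollfd pfd;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    int n = poll(&pfd, 1, timeout_ms);
    if (n < 0)
        return -1;
    return n == 0 ? 0 : pfd.revents;
}

/* Returns 1 if an empty pipe polls as not readable and becomes
 * readable once a byte has been written to it. */
int demo_pipe_poll(void)
{
    int p[2];

    if (pipe(p) != 0)
        return -1;

    int before = poll_for_input(p[0], 0);      /* nothing written yet */
    int ok = (write(p[1], "x", 1) == 1);
    int after = ok ? poll_for_input(p[0], 1000) : -1;

    close(p[0]);
    close(p[1]);
    return before == 0 && after > 0 && (after & POLLIN) != 0;
}
```

With several descriptors, the same call takes a larger pollfd array and the caller scans revents in each entry to find the ones that are ready.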

22 poll Implementation
- Structures
  - pollhead: associated with a device file; maintains a queue of polldat structures
  - polldat:
    - A blocked process (proc pointer)
    - The events
    - A link

23 Poll

24 VOP_POLL
- error = VOP_POLL(vp, events, anyyet, &revents, &php)
- spec_poll() indexes cdevsw[] and calls d_xpoll(), which checks whether the events have occurred and updates revents; if anyyet is 0, it also returns a pointer to the pollhead
- Back in poll(), the kernel checks revents and anyyet
  - If both are 0: take the pollhead php, allocate a polldat and add it to the queue; the polldat holds a pointer to the proc and the mask of events, and is linked to the others; then block
  - If revents is nonzero: remove all the polldat structures from the queues, free them, and add the number of events to anyyet
- While the process is blocked, the driver keeps track of the events; when one occurs, it calls pollwakeup() with the event and the php

25 Block I/O
- Formatted
  - Accessed through files
- Unformatted
  - Accessed directly through the device file
- Block I/O occurs when:
  - Reading or writing a file
  - Reading or writing a device file
  - Accessing memory mapped to a file
  - Paging to/from a swap device

26 Block Device Read

27 The buf Structure
- The only interface between the kernel and the block device driver
  - Starting block number
  - Byte count, in whole sectors
  - Location in memory
  - Flags: read/write, sync/async
  - Address of the completion routine
- Completion status
  - Flags
  - Error code
  - Residual byte count
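
The role of the buf structure can be sketched with a RAM-backed "disk" in user space. struct sketch_buf is a simplified stand-in that carries only the fields listed on the slide, not the real SVR4 definition; the SB_* flag values, ramdisk_strategy(), count_done(), and ramdisk_roundtrip() are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE     512
#define RAMDISK_SECTORS 16

#define SB_READ  0x1   /* transfer direction: set for read, clear for write */
#define SB_ERROR 0x2   /* completion status: the request failed */
#define SB_DONE  0x4   /* completion status: the request has finished */

struct sketch_buf {
    long    blkno;                          /* starting block number */
    size_t  bcount;                         /* bytes to transfer */
    char   *addr;                           /* location in memory */
    int     flags;                          /* request and status flags */
    void  (*iodone)(struct sketch_buf *);   /* completion routine */
    int     error;                          /* error code */
    size_t  resid;                          /* bytes not transferred */
};

char ramdisk[RAMDISK_SECTORS * SECTOR_SIZE];
int completions;

void count_done(struct sketch_buf *bp) { (void)bp; completions++; }

/* Strategy routine: everything the driver needs arrives in the buf,
 * and everything it reports goes back through the buf. */
void ramdisk_strategy(struct sketch_buf *bp)
{
    size_t off = (size_t)bp->blkno * SECTOR_SIZE;

    if (bp->blkno < 0 || off >= sizeof ramdisk) {
        bp->flags |= SB_ERROR;
        bp->error = 1;                      /* placeholder error code */
        bp->resid = bp->bcount;
    } else {
        size_t n = bp->bcount;
        if (n > sizeof ramdisk - off)
            n = sizeof ramdisk - off;       /* clip at the end of the disk */
        if (bp->flags & SB_READ)
            memcpy(bp->addr, ramdisk + off, n);
        else
            memcpy(ramdisk + off, bp->addr, n);
        bp->resid = bp->bcount - n;
    }
    bp->flags |= SB_DONE;
    if (bp->iodone)
        bp->iodone(bp);
}

/* Returns 1 if a sector written to block 3 reads back unchanged. */
int ramdisk_roundtrip(void)
{
    char in[SECTOR_SIZE], out[SECTOR_SIZE];
    memset(in, 'A', sizeof in);
    memset(out, 0, sizeof out);

    struct sketch_buf w = { 3, sizeof in, in, 0, count_done, 0, 0 };
    struct sketch_buf r = { 3, sizeof out, out, SB_READ, count_done, 0, 0 };
    ramdisk_strategy(&w);
    ramdisk_strategy(&r);

    return completions == 2 && r.resid == 0 && !(r.flags & SB_ERROR)
        && memcmp(in, out, sizeof in) == 0;
}
```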

28 Buffer Cache
- Administrative information for a cached block
  - A pointer to the vnode of the device file
  - Flags that specify whether the buffer is free
  - The aged flag
  - Pointers on an LRU free list
  - Pointers in a hash queue

29 Interaction with the Vnode
- A disk block is addressed by specifying a vnode and an offset in that vnode
  - The device vnode and the physical offset
    - Only when the file system is not mounted
  - Ordinary file
    - The file vnode and the logical offset
- VOP_GETPAGE invokes ufs_getpage() or spec_getpage()
  - Checks whether the page is in memory; if not, ufs_bmap() returns the physical block, a page and a buf are allocated, d_strategy() reads the data, and the process is woken up
- VOP_PUTPAGE invokes ufs_putpage() or spec_putpage()

30 Device Access Methods
- Pageout operations
  - Vnode, VOP_PUTPAGE
  - spec_putpage(), d_strategy()
  - ufs_putpage(), ufs_bmap()
- Mapped I/O to a file
  - exec: page fault, segvn_fault(), VOP_GETPAGE
- Ordinary file I/O
  - ufs_read: segmap_getmap(), uiomove(), segmap_release()
- Direct I/O to a block device
  - spec_read: segmap_getmap(), uiomove(), segmap_release()

31 Raw I/O to a Block Device
- Buffered I/O copies the data twice
  - From user space to the kernel
  - From the kernel to the disk
- Caching is beneficial
  - But not for large data transfers
- mmap
- Raw I/O: unbuffered access
  - d_read() or d_write()
  - physiock()
    - Validates the request
    - Allocates a buf
    - Calls as_fault()
    - Locks the pages
    - Calls d_strategy()
    - Sleeps
    - Unlocks the pages
    - Returns

32 The DDI/DKI Specification
- DDI/DKI: Device-Driver Interface and Driver-Kernel Interface
- 5 sections:
  - S1: data definitions
  - S2: driver entry point routines
  - S3: kernel routines
  - S4: kernel data structures
  - S5: kernel #define statements
- 3 parts:
  - Driver-kernel: the driver entry points and the kernel support routines
  - Driver-hardware: machine-dependent
  - Driver-boot: incorporating a driver into the kernel

33 General Recommendations
- Drivers should not directly access system data structures
- Only access the fields described in S4
- Do not define arrays of the structures defined in S4
- Only set or clear flags with masks; never assign directly to the field
- Some structures are opaque and can be accessed only through the provided routines
- Use the functions in S3 to read or modify the structures in S4
- Include ddi.h
- Declare any private routines or global variables as static
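
The recommendation about flags can be illustrated with a short sketch. The flag names and struct xbuf are hypothetical stand-ins for a kernel structure; the point is that setting and clearing individual bits with masks leaves the other bits, which the kernel may own, untouched.

```c
/* Hypothetical flag bits and structure, for illustration only. */
#define XF_BUSY  0x01u
#define XF_ERROR 0x02u
#define XF_DONE  0x04u

struct xbuf {
    unsigned flags;
};

/* Set one bit without disturbing the others. */
void xbuf_set_error(struct xbuf *bp)
{
    bp->flags |= XF_ERROR;
}

/* Clear one bit without disturbing the others. */
void xbuf_clear_busy(struct xbuf *bp)
{
    bp->flags &= ~XF_BUSY;
}

/* Starts with BUSY and DONE set, sets ERROR, clears BUSY, and returns
 * the resulting flag word: DONE survives both operations. */
unsigned xbuf_demo(void)
{
    struct xbuf b = { XF_BUSY | XF_DONE };

    xbuf_set_error(&b);
    xbuf_clear_busy(&b);
    return b.flags;
}
```

A direct assignment such as b.flags = XF_ERROR would have silently cleared XF_DONE as well, which is the mistake the recommendation guards against.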

34 Section 3 Functions
- Synchronization and timing
- Memory management
- Buffer management
- Device number operations
- Direct memory access
- Data transfers
- Device polling
- STREAMS
- Utility routines

36 Other Sections
- S1: specifies the driver prefix and prefixdevflag (e.g. disk -> dk)
  - D_DMA
  - D_TAPE
  - D_NOBRKUP
- S2: specifies the driver entry points
- S4: describes data structures shared by the kernel and the drivers
- S5: the relevant kernel #define values

37 Newer SVR4 Releases
- MP-safe drivers
  - Protect most global data by using multiprocessor synchronization primitives
- SVR4/MP
  - Adds a set of functions that allow drivers to use its new synchronization facilities
  - Three kinds of locks: basic, read/write, and sleep locks
  - Adds functions to allocate and manipulate the different synchronization objects
  - Adds a D_MP flag to the prefixdevflag of the driver

38 Dynamic Loading & Unloading
- SVR4.2 supports dynamic loading and unloading of:
  - Device drivers
  - Host bus adapter and controller drivers
  - STREAMS modules
  - File systems
  - Miscellaneous modules
- Dynamic loading involves:
  - Relocation and binding of the driver's symbols
  - Driver and device initialization
  - Adding the driver to the device switch tables, so that the kernel can access the switch routines
  - Installing the interrupt handler

39 SVR4.2 Routines
- prefix_load()
- prefix_unload()
- mod_drvattach()
- mod_drvdetach()
- Wrapper macros
  - MOD_DRV_WRAPPER
  - MOD_HDRV_WRAPPER
  - MOD_STR_WRAPPER
  - MOD_FS_WRAPPER
  - MOD_MISC_WRAPPER

40 Future Directions
- Divide the code into a device-dependent part and a controller-dependent part
- PDI standard
  - A set of S2 functions that each host bus adapter must implement
  - A set of S3 functions that perform common tasks required by SCSI devices
  - A set of S4 data structures that are used in the S3 functions

41 Linux I/O
- Elevator scheduler
  - Maintains a single queue for disk read and write requests
  - Keeps the list of requests sorted by block number
  - The drive moves in a single direction to satisfy each request
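
The elevator idea can be sketched as a sorted array plus a one-directional sweep. This is an illustration, not the Linux implementation: the struct, the function names, the fixed queue size, and the choice to wrap around to the lowest block when nothing is left ahead of the head are all assumptions.

```c
#include <stddef.h>

#define ELV_MAX 32

struct elevator {
    unsigned long blk[ELV_MAX];   /* pending block numbers, ascending */
    size_t n;
};

/* Insert a request, keeping the queue sorted by block number. */
int elv_add(struct elevator *q, unsigned long block)
{
    if (q->n == ELV_MAX)
        return -1;
    size_t i = q->n;
    while (i > 0 && q->blk[i - 1] > block) {
        q->blk[i] = q->blk[i - 1];
        i--;
    }
    q->blk[i] = block;
    q->n++;
    return 0;
}

/* Take the next request at or beyond the head position; when none is
 * left in that direction, wrap to the lowest block (an assumption). */
int elv_next(struct elevator *q, unsigned long head, unsigned long *out)
{
    if (q->n == 0)
        return -1;
    size_t i = 0;
    while (i < q->n && q->blk[i] < head)
        i++;
    if (i == q->n)
        i = 0;
    *out = q->blk[i];
    for (size_t j = i + 1; j < q->n; j++)
        q->blk[j - 1] = q->blk[j];
    q->n--;
    return 0;
}

/* Returns 1 if requests 50, 10, 30 with the head at block 20 are
 * served in the order 30, 50, 10. */
int elv_demo(void)
{
    struct elevator q = { {0}, 0 };
    unsigned long head = 20, got[3];

    elv_add(&q, 50);
    elv_add(&q, 10);
    elv_add(&q, 30);
    for (int k = 0; k < 3; k++) {
        if (elv_next(&q, head, &got[k]) != 0)
            return 0;
        head = got[k];
    }
    return got[0] == 30 && got[1] == 50 && got[2] == 10;
}
```

Note that the request for block 10 arrived second but is served last: a steady stream of requests ahead of the head could delay it indefinitely, which is the problem expiration times are meant to address.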

42 Linux I/O
- Deadline scheduler
  - Uses three queues
  - Each incoming request is placed in the sorted elevator queue
  - Read requests also go to the tail of a read FIFO queue
  - Write requests also go to the tail of a write FIFO queue
  - Each request has an expiration time
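
A simplified sketch of the deadline policy follows. It models the three queues as three orderings over one small request pool (by block number, and by arrival order per direction) rather than as three linked lists, and it always takes the lowest block from the sorted order. The expiration values, the pool size, and all names are assumptions, not the Linux implementation.

```c
#include <stddef.h>

#define DL_POOL      16
#define READ_EXPIRE  5     /* hypothetical: ticks until a read expires */
#define WRITE_EXPIRE 50    /* hypothetical: ticks until a write expires */

struct dl_req {
    unsigned long block;
    int is_write;
    unsigned long expires;   /* absolute tick at which the request expires */
    unsigned long seq;       /* arrival order, i.e. position in its FIFO */
    int active;
};

struct dl_sched {
    struct dl_req pool[DL_POOL];
    unsigned long next_seq;
};

int dl_add(struct dl_sched *s, unsigned long block, int is_write,
           unsigned long now)
{
    for (size_t i = 0; i < DL_POOL; i++) {
        if (!s->pool[i].active) {
            unsigned long exp = now + (is_write ? WRITE_EXPIRE : READ_EXPIRE);
            s->pool[i] = (struct dl_req){ block, is_write, exp,
                                          s->next_seq++, 1 };
            return 0;
        }
    }
    return -1;
}

/* Head of the read or write FIFO: the oldest request of that kind. */
static struct dl_req *fifo_head(struct dl_sched *s, int is_write)
{
    struct dl_req *best = NULL;
    for (size_t i = 0; i < DL_POOL; i++) {
        struct dl_req *r = &s->pool[i];
        if (r->active && r->is_write == is_write &&
            (best == NULL || r->seq < best->seq))
            best = r;
    }
    return best;
}

/* Head of the sorted queue: the request with the lowest block number. */
static struct dl_req *sorted_head(struct dl_sched *s)
{
    struct dl_req *best = NULL;
    for (size_t i = 0; i < DL_POOL; i++) {
        struct dl_req *r = &s->pool[i];
        if (r->active && (best == NULL || r->block < best->block))
            best = r;
    }
    return best;
}

/* Serve an expired FIFO head if there is one, else the sorted queue. */
int dl_dispatch(struct dl_sched *s, unsigned long now, unsigned long *out)
{
    struct dl_req *r = fifo_head(s, 0);
    if (r == NULL || r->expires > now) {
        r = fifo_head(s, 1);
        if (r == NULL || r->expires > now)
            r = sorted_head(s);
    }
    if (r == NULL)
        return -1;
    *out = r->block;
    r->active = 0;           /* leaves all three orderings at once */
    return 0;
}

/* Returns 1 if the requests are served as 10 (sorted order), then 90
 * and 40 (expired reads, in arrival order), after which the pool is empty. */
int dl_demo(void)
{
    struct dl_sched s = { 0 };
    unsigned long got[3], extra;

    dl_add(&s, 90, 0, 0);
    dl_add(&s, 10, 1, 0);
    dl_add(&s, 40, 0, 0);
    if (dl_dispatch(&s, 1, &got[0]) != 0) return 0;
    if (dl_dispatch(&s, 6, &got[1]) != 0) return 0;
    if (dl_dispatch(&s, 6, &got[2]) != 0) return 0;
    return got[0] == 10 && got[1] == 90 && got[2] == 40
        && dl_dispatch(&s, 6, &extra) == -1;
}
```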

43 Linux I/O

44 Linux I/O
- Anticipatory I/O scheduler (in Linux 2.6):
  - Delays for a short period after satisfying a read request to see whether a new nearby request arrives (principle of locality), in order to increase performance
  - Superimposed on the deadline scheduler
  - A request is first dispatched to the anticipatory scheduler; if there is no other read request within the delay, deadline scheduling is used

45 Linux Page Cache (in Linux 2.4 and later)
- A single unified page cache is involved in all traffic between disk and main memory
- Benefits:
  - When it is time to write dirty pages back to disk, a collection of them can be ordered properly and written out efficiently
  - Pages in the page cache are likely to be referenced again before they are flushed from the cache, thus saving a disk I/O operation

