Presentation on theme: "Input/Output III Disks CS 423, Fall 2007 Klara Nahrstedt/Sam King 6/3/20141."— Presentation transcript:
Input/Output III Disks CS 423, Fall 2007 Klara Nahrstedt/Sam King 6/3/20141
Administrative MP3 – Deadline, November 5, 8am Re-grading of MP2, HW1, Midterm is closed 6/3/20142
3 Data Centers Core of enterprise computing – A cluster of specialized servers – Multiple tiers Storage Database File Applications Email Web
6/3/2014 4 Data Path Disks, storage server, and database server Widely used caching memory – Large access speed gaps – Different sizes – Various granularity >300 cycles ~125 ns 14 cycles 6 ns ~40 us > 5 ms
6/3/2014 5 Two Performance Trends The gaps are increasing large Source: Zhifeng Chen, Optimization of Data Access for Database Applications, PhD Thesis, 2005, UIUC
6/3/20146 Disks First, Then File Systems Form factor:.5-1 4 5.7 Storage: 18-73GB Form factor:.4-.7 2.7 3.9 Storage: 4-27GB Form factor:.2-.4 2.1 3.4 Storage: 170MB-1GB
6/3/20147 Disk Technology Trends Disks are getting smaller for similar capacity – Spin faster, less rotational delay, higher bandwidth – Less distance for head to travel (faster seeks) – Lighter weight (for portables) Disk data is getting denser – More bits/square inch – Tracks are closer together – Doubles density every 18 months Disks are getting cheaper ($/MB) – Factor of ~2 per year since 1991 – Head close to surface
6/3/20148 Disk Organization Disk surface – Circular disk coated with magnetic material Tracks – Concentric rings around disk surface, bits laid out serially along each track Sectors – Each track is split into arc of track (min unit of transfer) sector
6/3/20149 More on Disks CDs and floppies come individually, but magnetic disks come organized in a disk pack Cylinder – Certain track of the platter Disk arm – Seek the right cylinder seek a cylinder
6/3/201411 Disk Performance Seek – Position heads over cylinder, typically 5.3 8 ms Rotational delay – Wait for a sector to rotate underneath the heads – Typically 8.3 6.0 ms (7,200 – 10,000RPM) or ½ rotation takes 4.15- 3ms Transfer bytes – Average transfer bandwidth (15-37 Mbytes/sec) Performance of transfer 1 Kbytes – Seek (5.3 ms) + half rotational delay (3ms) + transfer (0.04 ms) – Total time is 8.34ms or 120 Kbytes/sec! What block size can get 90% of the disk transfer bandwidth?
6/3/2014 12 Disk Behaviors There are more sectors on outer tracks than inner tracks – Read outer tracks: 37.4MB/sec – Read inner tracks: 22MB/sec Seek time and rotational latency dominate the cost of small reads – A lot of disk transfer bandwidth is wasted – Need algorithms to reduce seek time Block Size (Kbytes) % of Disk Transfer Bandwidth 1Kbytes0.5% 8Kbytes3.7% 256Kbyte s 55% 1Mbytes83% 2Mbytes90%
6/3/201413 Observations Getting first byte from disk read is slow – high latency Peak bandwidth high, but rarely achieved Need to mitigate disk performance impact – Do extra calculations to speed up disk access Schedule requests to shorten seeks – Move some disk data into main memory – file system caching
6/3/201414 RAID Use parallel processing to speed up CPU performance Use parallel I/O to improve disk performance, reliability (1988, Patterson) Design new class of I/O devices called RAID – Redundant Array of Inexpensive Disks (also Redundant Array of Independent Disks) Use the RAID in OS as a SLED (Single Large Expensive Disk), but with better performance and reliability
6/3/201415 RAID (cont.) RAID consists of RAID SCSI controller plus a box of SCSI disks Data are divided into strips and distributed over disks for parallel operation RAID 0 … RAID 5 levels RAID 0 organization writes consecutive strips over the drives in round-robin fashion – operation is called striping RAID 1 organization uses striping and duplicates all disks RAID 2 uses words, even bytes and stripes across multiple disks; uses error codes, hence very robust scheme RAID 3, 4, 5 alterations of the previous ones
Kernel Components Affected by Block Device Op. 6/3/201417 Block Device Driver Block Device Driver I/O Scheduler Layer Generic Block Layer Disk Filesystem Disk Filesystem Disk Filesystem Mapping Layer Disk Caches VFS
6/3/201418 Disk Scheduling Which disk request is serviced first? – FCFS – Shortest seek time first – Elevator (SCAN) – LOOK – C-SCAN (Circular SCAN) – C-LOOK Looks familiar?
6/3/201419 FIFO (FCFS) order Method – First come first serve Pros – Fairness among requests – In the order applications expect Cons – Arrival may be on random spots on the disk (long seeks) – Wild swing can happen Analogy: – Can elevator scheduling use FCFS? 0199 98, 183, 37, 122, 14, 124, 65, 67 53
6/3/201420 SSTF ( Shortest Seek Time First ) Method – Pick the one closest on disk – Rotational delay is in calculation Pros – Try to minimize seek time Cons – Starvation Question – Is SSTF optimal? – Can we avoid starvation? Analogy: elevator 0199 98, 183, 37, 122, 14, 124, 65, 67 (65, 67, 37, 14, 98, 122, 124, 183) 53
6/3/201421 Elevator (SCAN) Method – Take the closest request in the direction of travel – Real implementations do not go to the end (called LOOK) Pros – Bounded time for each request Cons – Request at the other end will take a while 0199 98, 183, 37, 122, 14, 124, 65, 67 (37, 14, 0, 65, 67, 98, 122, 124, 183) 53
6/3/201422 C-SCAN (Circular SCAN) Method – Like SCAN – But, wrap around – Real implementation doesnt go to the end (C-LOOK) Pros – Uniform service time Cons – Do nothing on the return 0199 98, 183, 37, 122, 14, 124, 65, 67 (65, 67, 98, 122, 124, 183, 199, 0, 14, 37) 53
6/3/201423 LOOK and C-LOOK SCAN and C-SCAN move the disk arm across the full width of the disk In practice, neither algorithm is implemented this way More commonly, the arm goes only as far as the final request in each direction. Then, it reverses direction immediately, without first going all the way to the end of the disk. These versions of SCAN and C-SCAN are called LOOK and C-LOOK
6/3/201424 Group Discussion Questions The disk scheduling algorithm that may cause starvation is: – FCFS or SSTF or C-SCAN or LOOK ?? From the list of disk-scheduling algorithms (FCFS, SSTF, SCAN, C- SCAN, LOOK, C-LOOK), SSTF will always give the least head movement for any set of cylinder-number requests to the disk scheduler: – True or False ?? The cylinder numbers on a disk are 0,1,…10. Currently, there are five cylinder requests on the disk scheduler queue in the following order: 1,5,4,8,7 and the head is located at position 2 and moving in the direction of increasing block numbers. The time to serve a request is proportional to the distance from the head to the cylinder number requested. If T(X) is the time it takes to service the requests currently in the queue using scheduling algorithm X, then: – T(SSTF) < T(SCAN) < T(FCFS) or – T(FCFS) < T(SSTF) < T(SCAN) or – T(SSTF) < T(FCFS) < T(SCAN) or – None of the above???
6/3/201425 History of Disk-related Concerns When memory was expensive – Do as little bookkeeping as possible When disks were expensive – Get every last sector of usable space When disks became more common – Make them much more reliable When processor got much faster – Make them appear faster
6/3/201426 Disk Versus Memory Memory Latency in 10s of processor cycles Transfer rate 300+MB/s Contiguous allocation gains ~10x Disk Latency in milliseconds Transfer rate 5-50MB/s Contiguous allocation gains ~1000x
6/3/201427 On-Disk Caching Method – Put RAM on disk controller to cache blocks Seagate ATA disk has.5MB, IBM Ultra160 SCSI has 16MB Some of the RAM space stores firmware (an OS) – Blocks are replaced usually in LRU order Pros – Good for reads if you have locality Cons – Expensive – Need to deal with reliable writes
6/3/201428 Disk Block Caches Main memory rather than disk may hold disk blocks 85% or more of all I/O requests by file system and applications can be satisfied by disk block cache BSD UNIX provides a disk block cache as part of the block-oriented device software layer It consists of between 100 and 1000 individual buffers.
6/3/201429 CD-ROM Compact Disk – Read Only Memory, Optical Disk – 1980 – Philips and Sony developed CD CD is prepared (WRITE OPERATION) – using a high-power infrared laser to burn 0.8 micron diameter holes in a coated glass master disk; – from the master disk, mold is created, processed and reflective layer is deposited on polycarbonate; – depressions on the polycarbonate substrate are called pits, the unburned areas between pits are called lands CD is read (READ OPERATION) – Low-power laser diode shines infrared light with a wavelength of 0.78 micron on pits and lands as they stream by. – Laser is on the polycarbonate side, so pits stick out toward the laser as bumps in the flat surface. – Pits and lands return different light to the players photodetector (pit returns less light than light bouncing off land), hence the player tells pit from land. Pit length is 0.6 micrometers. – Pit/land and land/pit transitions represent 1, absence of transition is 0. CD-Recordable, CD-Rewritables, DVD
6/3/201430 Disk I/O Summary Disk is an important I/O device Disk must be fast and reliable Error Handling is important; check checksum, called ECC (Error Correcting Code) – Track a bad sector – Substitute a spare for the bad sector – Shift al sectors to bypass the bad one RAID protects against few bad sectors, but does not protect against write errors laying down bad data Stable Storage may be needed