Presentation on theme: "Data Storage John Ortiz. Lecture 17Data Storage2 Overview Database stores data on secondary storage Disk has distinct storage and access characteristics."— Presentation transcript:
Lecture 17Data Storage2 Overview Database stores data on secondary storage Disk has distinct storage and access characteristics DBMS must bring data from disk to main memory buffer for processing & write data from memory buffer to disk for storage Low level DBMS software must effectively manage disk and buffer space How disks are accessed? How to manage a buffer?
Lecture 17Data Storage3 Storage Hierarchy Architecture of a typical computer Processor: fast, slow, RISC, cache, pipelined. Typical speed: 100 500 1000 MIPS Main Memory: fast, slow, volatile, read-only Access time: 1 s – 1ns P M Secondary Storage C
Lecture 17Data Storage5 Disks DBMS stores information on (“hard”) disks. This has major implications for DBMS design! Data must be transferred from disk to main memory (for read) & vice versa (for write) Transfer unit: block (= 1 or more sectors) Why not store everything in main memory? Costs too much. $147 will buy you 128MB of RAM or 40GB of disk (Oct. 2000). Main memory is volatile. We want data to be saved between runs.
Lecture 17Data Storage6 Components of A Disk Platters Spindle Disk head Arm movement Arm assembly Tracks Sector Top view
Lecture 17Data Storage8 Components of A Disk (cont.) Division of tracks into sectors is hard coded on disk surface A sector is subdivided into one or more blocks Formatting sets the block size Interblock gaps make the difference between formatted and unformatted capacity
Lecture 17Data Storage9 block x in memory ? I want block X Disk Access Time Access Time = Seek Time + Rotational Delay + Transfer Time + Other
Lecture 17Data Storage10 Seek Time Time to move arms to position disk head on a given track 3 or 5x x 1N Cylinders Traveled Time
Lecture 17Data Storage11 Average Random Seek Time Typical S: 8 ms 40 ms N = number of tracks/surface (= # cylinders)
Lecture 17Data Storage12 Rotational Delay Time to wait for block to rotate under head also called rotational latency Average R = 1/2 revolution Typically R = 4.2 ms (7200 RPM) Complication: may need to wait for track start Head Here Start Track Block I Want
Lecture 17Data Storage13 Transfer Time Time for actually moving data to/from disk surface Typical transfer rate tr: 16 100 MB/s Block Transfer Time btt = block size / tr Typically, 4 KB: about 1 ms Other Delay CPU time to issue I/O Contention for controller Contention for bus, memory Typical value: 0
Lecture 17Data Storage14 Formulas tr = track size * rpm(transfer rate) rd = ½ * 1/rpm(rotational delay) btt = B/tr(block transfer time) B = Block Size btr = ( B/(B + G) * tr)(bulk transfer rate) G = interblock gap size Read ‘K’ consecutive blocks: s + rd + (B/btr) * K Read ‘K’ random blocks (s + rd + (B/btr) ) * K
Lecture 17Data Storage15 Formulas Log base N of X = log X/log N If no base indicated, assume base 10 Log 2 500 = log 500/log 2 bfr = floor(B/record size) File Size (in blocks) = ceiling (# records/bfr) (unspanned) = ceiling(# bytes in file/B) (spanned)
Lecture 17Data Storage16 Units!!! The units are just as important as the value Use the units to check your result Example: tr = track size * rpm Given track size = 32768 bytes and 3600 rpm, what is the transfer rate? 32768 * 3600 = 117,964,800 which SEEMS very unreasonable! However, it is 117,964,800 bytes per minute! 1,966,080 bytes per second, = 1.875 MB/sec Pretty Slow for a modern disk!
Lecture 17Data Storage17 How many bytes? 1 kilobyte = 1024 bytes (most of the time!) 1 megabyte = 1024 2 = 1,048,576 bytes 1 gigabyte = 1024 3 bytes 1 terabyte = 1024 4 bytes Disk manufacturers often use 1000 bytes as a kilobyte for marketing purposes! My 200 GB disk drive, is actually 186 GB, though it’s 200 billion bytes RAM marketing has so far not followed suit.
Lecture 17Data Storage18 What about Reading Next Block? If we do things right (double buffer, stagger blocks …) Time = T + Negligible Skip gap, change track, change cylinder Rule of Thumb: Random I/O: Expensive Sequential I/O: Much less EX: For 1 KB block Random I/O : about 20ms Sequential I/O: about 1ms
Lecture 17Data Storage19 What About Writing and Updating? Cost of writing is similar to reading, unless we need to verify Add full rotation + T Cost of modifying a block: (a) Read Block (b) Modify in Memory (c) Write Block [(d) Verify?] Block Address: Device, Cylinder #, Surface #, Sector
Lecture 17Data Storage20 Disk Example: IBM Ultrastar 36LP Formatted capacity: 36.9 GB Sector size: 512 to 528 variable (2-byte inc) Platters: 10 Max. recording density: 350000 BPI Track density: 18400 TPI (per surface) Rotation speed: 7200 RPM Average rotational delay: 4.17 ms Sustained data rate: 19.5-31.9 MB Seek time: average 6.8ms, next track 0.6ms
Lecture 17Data Storage21 Arranging Pages on Disk Blocks in a file should be arranged sequentially on disk (by next block), to minimize seek and rotational delay. The concept of next block : 1.Next block on the same track 2.1 st block on next track of the same cylinder 3.1 st block on 1 st track of the next cylinder For a sequential scan, pre-fetching several pages at a time is a big win!
Lecture 17Data Storage22 Disk Space Management Lowest layer of DBMS software manages space on disk. Higher levels call upon this layer to: allocate/de-allocate a page read/write a page One such “higher level” is the buffer manager, which receives a request to bring a page into memory and then, if needed, requests the disk space layer to read the page into the buffer pool.
Lecture 17Data Storage23 Buffer Management in a DBMS Data must be in RAM for DBMS to operate on it! Table of pairs is maintained. DB MAIN MEMORY DISK disk page free frame Page Requests from Higher Levels BUFFER POOL choice of frame dictated by replacement policy
Lecture 17Data Storage24 When a Page is Requested... If requested page is not in pool: Choose a frame for replacement If the frame is dirty, write it to disk Read requested page into chosen frame Else pin the frame (so it can not be replaced) Return its address. If requests can be predicted (e.g., sequential scans) pages can be pre-fetched several pages at a time!
Lecture 17Data Storage25 Buffer Replacement Policy A frame is chosen for replacement by a replacement policy: Least-recently-used (LRU), Clock, MRU, etc. Policy can have big impact on # of I/O’s; depends on the access pattern. Sequential flooding: Nasty situation caused by LRU + repeated sequential scans. # buffer frames < # pages in file means each page request causes an I/O. MRU much better in this situation (but not in all situations, of course).
Lecture 17Data Storage26 Representing Records A record is a collection of related data items (called fields) Fields of an Employee record: id, name, salary, date-of-hire,... Main choices: Format: fixed vs variable Length: fixed vs variable A schema (of record) specifies # of fields, type of each field, order in record, meaning of each field
Lecture 17Data Storage27 Example: Fixed Format and Length Employee record (1) Eid, 2 byte integer (2) Ename, 10 char. Schema (3) Dept, 2 byte code 55 s m i t h 02 83 j o n e s 01 Records
Lecture 17Data Storage28 File of Records Logically, a file is a sequence of records Physically, a file is a set of blocks R1R2R3R4 Rn assume fixed length blocks assume a single file (for now) a file
Lecture 17Data Storage29 Packing Records into Blocks Unspanned: records must be within one block Spanned: A record may be split between blocks R1R2R3R4R5 Block 1Block 2 … R1R2 R3 (a) R3 (b) R6R5R4 R7 (a) Block 1Block 2 … Unspanned: Simple, but may waste space Spanned: Necessary if record size > block size
Lecture 17Data Storage30 File Size: Spanned vs Unspanned Blocking Factor (bf F ): number of records per block. Ex: Block size B = 2048 bytes Number of records R = 200,000 Record size = 600 bytes (fixed length) Blocking factor = 2048/600 = 3 RPB (unspanned, 248 bytes unused) File size = 200000/3 = 66667 blocks (unspan.) = 200000 600/2048 = 58594 (span.)
Lecture 17Data Storage31 Summary Disks provide cheap, non-volatile storage. Random access, but cost depends on location of page on disk; important to arrange data sequentially to minimize seek and rotation delays. Buffer manager brings pages into RAM. Page stays in RAM until released by requestor(s).
Lecture 17Data Storage32 Summary (cont.) Written to disk when frame chosen for replacement (which is after all requestors release the page), or earlier. Choice of frame to replace based on replacement policy. File layer keeps track of pages in a file, and supports abstraction of a collection of records.
Lecture 17Data Storage33 Look Ahead Next topic: Hashing and Indexing Read textbook: Chapter 6.1-6.3