CPSC-608 Database Systems, Fall 2008. Instructor: Jianer Chen. Office: HRBB 309B. Phone: 845-4259. Email: chen@cs.tamu.edu. Notes #6

Optimizing Disk Access

(By reducing the seek time and rotational delay via Operating Systems and Disk Controllers)

[Figure: seek time as a function of the number of cylinders traveled]
Average seek time = 20 ms; shortest seek time = 5 ms; average rotational delay = 8 ms

Dealing with many random accesses (disk scheduling)
Suppose that we have a large (dynamic) sequence of disk read/write tasks on blocks randomly distributed over the disk. How do we order the tasks so that the total time is minimized?
One answer: the elevator algorithm.

The disk head makes sweeps across the disk, stops at a cylinder if a task reads/writes a block in that cylinder, and reverses its direction if there are no read/write tasks (at that moment) ahead.
– Intuitively good, in particular when a large number of tasks read/write blocks uniformly distributed over the disk
– Works in a real-time manner
– Precise analysis is difficult
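The sweep-and-reverse behavior described above can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption: the set of pending requests is a fixed snapshot, whereas the slides consider a dynamic sequence. The names `elevator_order` and `total_travel` are hypothetical, not from the notes.

```python
def elevator_order(requests, head, direction=1):
    """Return the cylinders in the order an elevator sweep would visit them.

    requests:  cylinder numbers with pending read/write tasks (static snapshot)
    head:      cylinder where the disk head currently sits
    direction: +1 if the head is moving toward higher cylinders, -1 otherwise
    """
    pending = sorted(set(requests))  # tasks on the same cylinder share one stop
    order = []
    while pending:
        if direction > 0:
            ahead = [c for c in pending if c >= head]
        else:
            ahead = [c for c in pending if c <= head][::-1]
        if not ahead:
            direction = -direction   # nothing ahead: reverse the sweep
            continue
        order.extend(ahead)          # stop at every requested cylinder on the way
        head = ahead[-1]
        pending = [c for c in pending if c not in ahead]
    return order


def total_travel(order, head):
    """Total number of cylinders the head crosses when serving `order`."""
    travel = 0
    for c in order:
        travel += abs(c - head)
        head = c
    return travel


requests = [10, 50, 30, 90, 70]
sweep = elevator_order(requests, head=40, direction=1)
print(sweep)                          # [50, 70, 90, 30, 10]
print(total_travel(sweep, 40))        # 130 cylinders
print(total_travel(requests, 40))     # 170 cylinders in arrival order
```

In this small example the sweep crosses fewer cylinders than serving the requests in arrival order, which is the intuition behind the algorithm.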

Dealing with a long sequence of data on disk
– Data in consecutive cylinders
– Larger buffer
– Pre-fetch/double buffering
– Disk arrays
– Mirrored disks

Example. Sorting on disk (again)
– A relation R of 10M tuples takes 100K blocks
– Main memory can store 6400 blocks
– A disk block read/write takes 40 ms: seek time = 31 ms, rotational delay = 8 ms, transfer time = 1 ms
– Two-phase multiway sorting on randomly distributed blocks takes about 4.5 hours (each block is read and written once in each phase: 4 * 100K * 40 ms = 16,000,000 ms)
– Also assume that a track holds 500 blocks and that traversing one cylinder takes 5 ms

Data in consecutive cylinders
In phase 1, suppose that the input relation is stored in consecutive tracks, so that we can read/write 6400 consecutive blocks at a time between main memory and disk.
Phase 1 read/write: 2*(100K/6400)(31 + 8 + 12*5 + 6400*1) ≈ 208000 ms < 4 minutes, with 100K/6400 rounded up to 16 sublists (saves about 2 hours)

Larger Buffer
In phase 2, we have 16 sublists, each taking one block in main memory, with 6384 blocks left. If we use all these 6384 blocks as an output buffer and write them to disk only when they are all full:
Phase 2 writing: (100K/6384)(31 + 8 + 12*5 + 6384*1) ≈ 104000 ms < 2 minutes (saves about 1 hour)
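The arithmetic in this sorting example can be checked with a short script. It is a sketch that follows the formulas on the slides: the number of chunks is rounded up to 16, and every chunk is charged as a full one, as the slides do. The constant and function names are assumptions introduced for illustration.

```python
import math

BLOCKS = 100_000             # relation R occupies 100K blocks
MEMORY = 6_400               # main memory holds 6400 blocks
SEEK, ROT, XFER = 31, 8, 1   # ms per block: seek, rotational delay, transfer
CYL_MOVE = 5                 # ms to traverse one cylinder
EXTRA_CYLS = 12              # cylinder crossings per chunk, as used on the slides


def chunked_io_ms(total_blocks, chunk_blocks):
    """Time to move total_blocks in runs of chunk_blocks consecutive blocks."""
    chunks = math.ceil(total_blocks / chunk_blocks)
    per_chunk = SEEK + ROT + EXTRA_CYLS * CYL_MOVE + chunk_blocks * XFER
    return chunks * per_chunk


# Baseline: every block is read and written once in each of the two phases.
random_total = 4 * BLOCKS * (SEEK + ROT + XFER)      # 16,000,000 ms
phase1 = 2 * chunked_io_ms(BLOCKS, MEMORY)           # 207,968 ms
phase2_write = chunked_io_ms(BLOCKS, MEMORY - 16)    # 103,728 ms

print(f"random blocks, both phases: {random_total / 3_600_000:.2f} hours")
print(f"phase 1 read+write, consecutive: {phase1 / 60_000:.2f} minutes")
print(f"phase 2 write, large buffer: {phase2_write / 60_000:.2f} minutes")
```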

However, the reading in phase 2 seems harder to improve: it is essentially random, since which sublist needs its next block depends on the data.

Double Buffering
For applications where the read/write pattern is predictable. Suppose we have a program that must:
» Process B1
» Process B2
» Process B3
...

Single Buffer Solution
1. Read B1 → Buffer
2. Process Data in Buffer
3. Read B2 → Buffer
4. Process Data in Buffer
...
Let P = time to process one block, R = time to read in one block, n = # blocks.
Single buffer time = n(P+R)

Double Buffering
[Figure: animation of double buffering. The disk holds blocks A through G and memory holds two buffers. Block A is read first; while A is processed, B is read into the other buffer; while B is processed, C is read; and so on.]

If P ≥ R:
Double buffering time = R + nP
Single buffering time = n(R+P)
where P = processing time/block, R = I/O time/block, n = # blocks
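The two formulas can be checked with a small timeline simulation. This is a sketch under stated assumptions: one disk that reads blocks strictly in order, exactly two buffers, and a single processor. The function names are hypothetical.

```python
def single_buffer_time(n, P, R):
    """Read and process strictly alternate: n(P + R)."""
    return n * (P + R)


def double_buffer_time(n, P, R):
    """Simulate two buffers: block i+1 is read while block i is processed."""
    read_done = 0        # time at which the most recent read finished
    proc_done_prev = 0   # time at which block i-1 finished processing
    proc_done_prev2 = 0  # time at which block i-2 finished processing
    for _ in range(n):
        # The buffer for this block is free once block i-2 has been processed.
        read_start = max(read_done, proc_done_prev2)
        read_done = read_start + R
        # Processing needs the data in memory and the processor idle.
        proc_start = max(read_done, proc_done_prev)
        proc_done_prev2 = proc_done_prev
        proc_done_prev = proc_start + P
    return proc_done_prev


n, P, R = 3, 5, 2                      # a case with P >= R
print(single_buffer_time(n, P, R))     # 21 = n(P + R)
print(double_buffer_time(n, P, R))     # 17 = R + nP
```

When P ≥ R, only the first read is not overlapped with processing, which is where the R + nP total comes from.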

Disk Arrays
Take advantage of the fact that disk reads/writes can be done in parallel between a single CPU and multiple disks. The array is logically one disk.
This does not help if the blocks of interest are all on the same disk.

Mirrored Disks
Duplicate disks so that multiple reads of the same disk can be done in parallel.
[Figure: blocks A and B, each stored on two disks]
Writing is more expensive, but not by much.

Disk Failures
– Partial vs. Total
– Intermittent vs. Permanent

Coping with Disk Failures
– Detection: checksum
– Correction: redundancy

At what level do we cope?
– Operating System Level (Stable Storage): a logical block is stored as Copy A and Copy B
– Database System Level (Log File): Log, Current DB, Yesterday’s DB

Intermittent Failure Detection (Checksums)
Idea: add n parity bits every m data bits
– Ex.: m=8, n=1
Block A: 01101000:1 (odd # of 1’s)
Block B: 11101110:0 (even # of 1’s)
But suppose Block A instead contains
Block A’: 01000000:1 (also has odd # of 1’s)
→ 50% chance of detection per parity bit
More parity bits decrease the probability of an undetected failure to 1/2^n (with n ≤ m independent parity bits)
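The m=8, n=1 example can be reproduced directly. The sketch below uses the convention shown on the slide, where the parity bit is 1 when the data bits contain an odd number of 1's. The helper names `parity_bit` and `passes_check` are hypothetical.

```python
def parity_bit(bits: str) -> int:
    """Return 1 if the bit string has an odd number of 1's, else 0."""
    return bits.count("1") % 2


def passes_check(bits: str, stored_parity: int) -> bool:
    """A block passes when its recomputed parity matches the stored bit."""
    return parity_bit(bits) == stored_parity


print(parity_bit("01101000"))        # 1  (Block A: three 1's)
print(parity_bit("11101110"))        # 0  (Block B: six 1's)

# Block A corrupted into A': still an odd number of 1's, so the
# single parity bit does not notice the damage.
print(passes_check("01000000", 1))   # True (corruption goes undetected)

# A corruption that flips a single bit is caught.
print(passes_check("01101001", 1))   # False

# Chance that a random corruption slips past n independent parity bits.
for n in (1, 2, 8):
    print(n, 1 / 2 ** n)
```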

Disk Crash (Disk Arrays)
RAIDs (Redundant Arrays of Inexpensive Disks): logically one disk

Disk Arrays
RAID Level 1 (Mirroring)
– Keep an exact copy of the data on redundant disks
[Figure: blocks A and B, each stored on a disk and on its mirror]

Disk Arrays
RAID Level 4
– Keep only one redundant disk
– Entire parity blocks on the redundant disk
[Figure: data disks A, B, C and one parity disk P]

Parity Blocks & Modulo-2 Sums
Have an array of 3 data disks
– Disk 1, block 1: 11110000
– Disk 2, block 1: 10101010
– Disk 3, block 1: 00111000
… and 1 parity disk
– Disk 4, block 1: 01100010
Note:
- Sum over each column is always an even # of 1’s
- Mod-2 sum can recover any missing single row (e.g., a logical block)

Using Mod-2 Sums for Error Recovery
Suppose we have:
– Disk 1, block 1: 11110000
– Disk 2, block 1: ????????
– Disk 3, block 1: 00111000
– Disk 4, block 1: 01100010 (parity)
Mod-2 sums for block 1 over disks 1, 3, 4:
→ Disk 2, block 1: 10101010
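The mod-2 sum is a bitwise XOR, so both the parity block and the recovered block in this example can be verified in a few lines. This is a minimal sketch that treats each 8-bit block as an integer; the name `mod2_sum` is hypothetical.

```python
from functools import reduce


def mod2_sum(blocks):
    """Bitwise modulo-2 sum (XOR) of a list of equal-sized blocks."""
    return reduce(lambda a, b: a ^ b, blocks)


disk1 = 0b11110000
disk2 = 0b10101010
disk3 = 0b00111000

# The parity block is the mod-2 sum of the data blocks.
parity = mod2_sum([disk1, disk2, disk3])
print(format(parity, "08b"))      # 01100010

# If disk 2 is lost, the mod-2 sum of the surviving blocks restores it.
recovered = mod2_sum([disk1, disk3, parity])
print(format(recovered, "08b"))   # 10101010
```

The same computation recovers any single missing block, including the parity block itself, because every column sums to an even number of 1's.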

Disk Arrays
RAID Level 5 (Striping)
– Like level 4, but with balanced read & write load
[Figure: disks A, B, C, D]
→ Parity partition on each disk
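One way to picture "a parity partition on each disk" is to rotate the parity block from stripe to stripe. The slides do not say which rotation is used, so the rule below (parity starts on the last disk and moves one disk to the left per stripe) is purely an assumption for illustration, and the function names are hypothetical.

```python
def parity_disk(stripe: int, num_disks: int) -> int:
    """Index of the disk holding parity for a stripe (assumed rotation rule)."""
    return (num_disks - 1 - stripe) % num_disks


def layout(num_stripes: int, num_disks: int):
    """One row per stripe: 'P' marks the parity block, 'D' a data block."""
    rows = []
    for stripe in range(num_stripes):
        p = parity_disk(stripe, num_disks)
        rows.append(["P" if d == p else "D" for d in range(num_disks)])
    return rows


for row in layout(num_stripes=4, num_disks=4):
    print(" ".join(row))
# D D D P
# D D P D
# D P D D
# P D D D
```

With this rotation every disk holds parity for one stripe in four, so parity updates are spread across all disks rather than concentrated on a single redundant disk as in level 4.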

Disk Arrays
RAID Level 6 (error-correcting code): more powerful, can recover from more than one disk crash.
