CPSC-608 Database Systems Fall 2009 Instructor: Jianer Chen Office: HRBB 309B Phone: 845-4259 Notes #5.

1 CPSC-608 Database Systems Fall 2009 Instructor: Jianer Chen Office: HRBB 309B Phone: 845-4259 Email: chen@cs.tamu.edu Notes #5

2 Computer Memory Hierarchy

3 A Typical Computer [Figure: a typical computer. Labels: CPU, main memory, disk controller, bus, disks (secondary storage).]

4 Main Memory: fast; small capacity (gigabytes); volatile. Disks: slow; large capacity (100s of gigabytes); non-volatile.

5 Typical Disk Terms: Platter, Head, Actuator, Cylinder, Track, Sector, Gap …

6 Top View Track Sector Gap

7 A “typical” disk: 5 platters (thus 10 surfaces). A surface has 20,000 tracks. A track has 500 sectors (about a million bytes). A sector has several thousand bytes. The disk makes 5000 revolutions per minute (so one rotation takes 12 milliseconds, i.e. roughly 10 milliseconds).

8 Blocks: a (logical) block = one or several sectors. Block address: physical device, cylinder #, surface #, sector.

9 Disk Access Time [Figure: a request "I want block X" is issued; is block X in memory?] Time = Seek Time + Rotational Delay + Transfer Time + Other

10 Seek Time [Figure: seek time plotted against the number of cylinders traveled (1 to N); the time axis runs from x up to 3x or 5x.]

11 Average Random Seek Time
$$S = \frac{\sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} \mathrm{SEEKTIME}(i \to j)}{N(N-1)}$$
typical seek time: 10 ms to 40 ms
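The average above can be checked numerically. The following is a minimal sketch, not the instructor's code: `average_seek_time` sums SEEKTIME(i → j) over all ordered pairs of distinct cylinders and divides by N(N-1). The linear seek model `linear_seek` and its constants are hypothetical, chosen only so the function has something to average.

```python
def average_seek_time(n_cylinders, seek_time):
    """Average of seek_time(i, j) over all ordered pairs with i != j."""
    total = 0.0
    for i in range(1, n_cylinders + 1):
        for j in range(1, n_cylinders + 1):
            if j != i:
                total += seek_time(i, j)
    return total / (n_cylinders * (n_cylinders - 1))


def linear_seek(i, j, startup_ms=1.0, per_cylinder_ms=0.01):
    """Hypothetical model: fixed startup cost plus a cost per cylinder traveled."""
    return startup_ms + per_cylinder_ms * abs(i - j)


avg = average_seek_time(100, linear_seek)
print(f"average seek time over 100 cylinders: {avg:.4f} ms")
```

Under a linear model the average distance between two distinct cylinders is (N+1)/3, so the average seek is roughly the time to cross one third of the disk.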

12 Rotational Delay [Figure: top view of a track marking the head position ("Head here") and the target block ("Block I want").] Average rotational delay R = 1/2 revolution; typical rotational delay = 8 ms

13 Transfer Rate: typical t = 80 MB/second = 80 KB/millisecond; transfer time = block size / t

14 Other Delays: CPU time to issue I/O; contention for controller; contention for bus, memory. Typical value: ≈ 0

15 Thus, reading a block of 16K bytes: Time = Seek Time + Rotational Delay + Transfer Time + Other ~ 30 ms + 8 ms + 16/80 ms + 0 ~ 40 ms
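The arithmetic on this slide can be reproduced with a small helper. This is a sketch under the slide's own figures (30 ms seek, 8 ms rotational delay, 80 KB/ms transfer rate, negligible other delays); the function name and parameter names are illustrative only.

```python
def block_access_time_ms(block_kb, seek_ms=30.0, rotational_ms=8.0,
                         transfer_kb_per_ms=80.0, other_ms=0.0):
    """Time = Seek Time + Rotational Delay + Transfer Time + Other (all in ms)."""
    transfer_ms = block_kb / transfer_kb_per_ms
    return seek_ms + rotational_ms + transfer_ms + other_ms


t = block_access_time_ms(16)
print(f"{t:.1f} ms")  # 38.2 ms, which the slide rounds to about 40 ms
```

Note that the transfer time (0.2 ms) is negligible next to the seek and rotational delays, which is why the block size barely affects the cost of a single random access.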

16 Main Memory: fast (read/write: 10-100 nanoseconds); small capacity (gigabytes); volatile. Disks: slow (read/write: 1-40 milliseconds); large capacity (100s of gigabytes); non-volatile. Disks are about 10^5 to 10^6 times slower than main memory.

17 I/O Model of Computation. Dominance of I/O cost: if a block needs to be moved between disk and main memory, then the time taken to perform the read/write is much larger than the time likely to be used to manipulate that data in main memory. Thus the number of disk block reads/writes is a good approximation of the running time of the entire computation.

18 Example: sorting on disk. Each tuple (with a key) takes 160 bytes. Each block holds 100 tuples (16 KB). A relation R has 10M tuples (1.6 GB, 100K blocks). Main memory has 100 MB (6400 blocks). A disk read/write takes 40 ms.

19 Main memory sorting algorithms. Heap sort: 10M * log_2(10M) ≈ 230M disk block reads/writes = 9200M ms = 9,200,000 seconds > 100 days. Quick sort and merge sort: 2 * 100K (blocks) * log_2(10M) ≈ 4.6M disk block reads/writes = 184M ms = 184,000 seconds > 2 days.
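These estimates can be recomputed directly from the parameters of the example. The sketch below assumes, as the slide appears to, that log_2(10M) ≈ 23.25 is rounded down to 23; the constant and function names are illustrative.

```python
import math

TUPLES = 10_000_000   # tuples in R
BLOCKS = 100_000      # blocks in R
IO_MS = 40            # one disk block read/write, in milliseconds

# Assumption: the slide rounds log2(10M) ~ 23.25 down to 23.
LOG_N = int(math.log2(TUPLES))

heap_ios = TUPLES * LOG_N        # roughly one block access per tuple per level
merge_ios = 2 * BLOCKS * LOG_N   # read and write every block once per pass


def ios_to_days(ios):
    """Convert a number of block reads/writes to days at IO_MS per access."""
    return ios * IO_MS / 1000 / 86_400


print(heap_ios, ios_to_days(heap_ios))
print(merge_ios, ios_to_days(merge_ios))
```

The two counts differ by a factor of 50 because quick sort and merge sort touch whole blocks of 100 tuples sequentially, while heap sort is charged one block access per tuple comparison step.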

20 Two-phase Multiway MergeSort
Phase 1 (making sorted sublists). Repeat: fill the main memory with remaining tuples in R and sort them; write the sorted sublist (of 6400 blocks) back to disk.
Phase 2 (merging). Repeat: bring in a block from each of the sorted sublists; merge them and put the result in an “output” block; write the “output” block back to disk when it is full.
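The two phases can be sketched in a few lines of Python. This is a simplified illustration, not the course's implementation: plain lists stand in for the disk, `memory_capacity` is a hypothetical parameter counting tuples rather than blocks, and `heapq.merge` plays the role of phase 2 by lazily pulling the next smallest element from each sorted sublist.

```python
import heapq


def two_phase_multiway_merge_sort(tuples, memory_capacity):
    """Sort `tuples` while sorting at most `memory_capacity` of them at a time.

    Returns the sorted list and the number of sorted sublists created.
    """
    # Phase 1: fill "memory", sort it, and write the sorted sublist out.
    sublists = []
    for start in range(0, len(tuples), memory_capacity):
        chunk = sorted(tuples[start:start + memory_capacity])
        sublists.append(chunk)  # stands in for a write to disk

    # Phase 2: merge all sublists in one pass, consuming each one in order.
    merged = list(heapq.merge(*sublists))
    return merged, len(sublists)


data = [(7 * k) % 101 for k in range(50)]
result, n_sublists = two_phase_multiway_merge_sort(data, memory_capacity=8)
print(n_sublists, result[:5])
```

A real implementation would read and write whole blocks and keep only one input block per sublist plus one output block in memory during phase 2; the sketch hides that buffering behind the iterators.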

21 Two-phase Multiway MergeSort: # sublists = 100K/6400 ≈ 16; thus, in phase 2, we can easily hold a block for each sublist in main memory. Each block is read once and written once in each of the two phases, so disk block reads/writes: 100K (blocks) * 4 = 400K disk block reads/writes = 16M ms = 16,000 seconds < 4.5 hours.
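The cost estimate above can be recomputed as follows. This is a sketch using the numbers of the running example (100K blocks, 6400 memory blocks, 40 ms per access); the variable names are illustrative.

```python
import math

blocks = 100_000        # blocks in R
memory_blocks = 6_400   # blocks that fit in main memory
io_ms = 40              # one disk block read/write, in milliseconds

# Phase 1 produces one sorted sublist per memory load.
sublists = math.ceil(blocks / memory_blocks)

# Phase 2 needs one input buffer per sublist plus one output buffer.
fits_in_one_merge_pass = sublists + 1 <= memory_blocks

# Every block is read and written once in each of the two phases.
total_ios = 4 * blocks
total_seconds = total_ios * io_ms / 1000
total_hours = total_seconds / 3600

print(sublists, fits_in_one_merge_pass, total_ios, total_seconds, total_hours)
```

Compared with the multi-day estimates for running a main-memory sort directly against the disk, the same relation is sorted in under four and a half hours.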

