CPSC-608 Database Systems Fall 2015 Instructor: Jianer Chen Office: HRBB 315C Phone: 845-4259 Notes #5.

1 CPSC-608 Database Systems Fall 2015 Instructor: Jianer Chen Office: HRBB 315C Phone: 845-4259 Email: chen@cse.tamu.edu Notes #5

2 Graduate Database [DBMS architecture diagram; components: database administrator, DDL language, DDL compiler, database programmer, DML (query) language, DML compiler, query execution engine, index/file manager, buffer manager, main memory buffers, file manager, secondary storage (disks), transaction manager, concurrency control, lock table, logging & recovery]

4 Computer Memory Hierarchy

5 A Typical Computer [diagram: CPU, main memory, disk controller, and secondary storage (disks), connected by a bus]

6 Main Memory: fast; small capacity (gigabytes); volatile. Disks: slow; large capacity (100’s of gigabytes); non-volatile.

7 Typical Disk Terms: Platter, Head, Cylinder, Track, Sector, Gap …

8 Top View [diagram labels: Track, Sector, Gap]

9 A “typical” disk: 5 platters (thus 10 surfaces). A surface has 20,000 tracks. A track has 500 sectors (a few million bytes). A sector has several thousand bytes. The disk makes 5000 revolutions per minute (so about 12 milliseconds per rotation).
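The figures on this slide can be checked with a few lines of arithmetic. The sketch below is only illustrative: the sector size of 4096 bytes is an assumption standing in for "several thousand bytes", and the function names are hypothetical.

```python
# Disk geometry from the slide; BYTES_PER_SECTOR is an assumed value
# ("several thousand bytes" in the slide).
SURFACES = 10
TRACKS_PER_SURFACE = 20_000
SECTORS_PER_TRACK = 500
BYTES_PER_SECTOR = 4096  # assumption


def track_bytes():
    """Bytes stored on a single track."""
    return SECTORS_PER_TRACK * BYTES_PER_SECTOR


def disk_capacity_bytes():
    """Total capacity over all surfaces and tracks."""
    return SURFACES * TRACKS_PER_SURFACE * track_bytes()


def rotation_time_ms(rpm):
    """Time for one full revolution, in milliseconds."""
    return 60_000 / rpm


if __name__ == "__main__":
    print(track_bytes())           # bytes per track
    print(disk_capacity_bytes())   # bytes per disk
    print(rotation_time_ms(5000))  # ms per revolution
```

Under the assumed sector size, a track holds about 2 million bytes and the disk holds a few hundred gigabytes, in line with the "100's gigabytes" quoted for disks.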

10 Blocks. A (logical) block = one or several sectors (typical size 16KB). Block address: physical device, cylinder #, surface #, sector.

11 Disk Access Time [diagram: request "I want block X"; result: block X in memory] Time = Seek Time + Rotational Delay + Transfer Time + Other

12 Seek Time [plot: Time vs. Cylinders Traveled (1 to N); time rises from x to about 3x or 5x]

13 Average Random Seek Time: $S = \dfrac{\sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} \mathrm{SEEKTIME}(i \to j)}{N(N-1)}$; typical seek time: 10 ms to 40 ms
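The average above can be evaluated by brute force for a small number of cylinders. This is a minimal sketch: the linear seek model `linear_seek_ms` and its constants are hypothetical, chosen only to give the double sum something to average; real drives have their own seek curves.

```python
def average_seek_time(n, seek_time):
    """Average of seek_time(i, j) over all ordered pairs i != j of n cylinders."""
    total = sum(
        seek_time(i, j)
        for i in range(1, n + 1)
        for j in range(1, n + 1)
        if i != j
    )
    return total / (n * (n - 1))


def linear_seek_ms(i, j, startup_ms=1.0, ms_per_cylinder=0.002):
    """Hypothetical seek model: fixed startup cost plus a per-cylinder cost."""
    return startup_ms + ms_per_cylinder * abs(i - j)


if __name__ == "__main__":
    print(average_seek_time(1000, linear_seek_ms))
```

With a pure distance function `abs(i - j)`, the average comes out to (N + 1) / 3 cylinders, so a random seek travels roughly a third of the way across the disk.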

14 Rotational Delay [diagram: current head position ("Head here") vs. the wanted block ("Block I want")] Average Rotational Delay R = 1/2 revolution; typical rotational delay = 8 ms

15 Transfer Rate: typical: t = 80 MB/second = 80 KB/millisecond transfer time: block size / t ~ 10/80 < 1 ms

16 Other Delays CPU time to issue I/O Contention for controller Contention for bus, memory Typical value: ≈ 0

17 Thus, reading a block of 16K bytes: Time = Seek Time + Rotational Delay + Transfer Time + Other ~ 30 ms + 8 ms + 16/80 ms + 0 ~ 40 ms
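The sum on this slide can be written as a small function. This is a sketch using the slide's typical values as defaults; the function and parameter names are hypothetical.

```python
def block_access_time_ms(block_kb, seek_ms=30.0, rotational_ms=8.0,
                         transfer_kb_per_ms=80.0, other_ms=0.0):
    """Seek time + rotational delay + transfer time + other delays, in ms."""
    transfer_ms = block_kb / transfer_kb_per_ms
    return seek_ms + rotational_ms + transfer_ms + other_ms


if __name__ == "__main__":
    # A 16 KB block: the transfer itself is only 0.2 ms of the total.
    print(block_access_time_ms(16))
```

The transfer term is tiny next to seek time and rotational delay, which is why the slide rounds the total to about 40 ms.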

18 Main Memory: fast (read/write: 10-100 nanoseconds); small capacity (gigabytes); volatile. Disks: slow (read/write: 1~40 milliseconds); large capacity (100’s of gigabytes); non-volatile. Disks are about 10^5 ~ 10^6 times slower than main memory.

19 I/O Model of Computation. Dominance of I/O cost: if a block needs to be moved between disk and main memory, then the time taken to perform the read/write is much larger than the time likely to be used to manipulate that data in main memory. The number of disk block reads/writes is therefore a good approximation to the cost of the entire computation.

20 Example. Sorting on disk. Each tuple (with a key) takes 160 bytes. Each block holds 100 tuples (16KB). A relation R has 10M tuples (1.6 GB, 100K blocks). Main memory has 100MB (6400 blocks). A disk read/write: 40 ms.

21 Main memory sorting algorithms. Heap sort: 10M * log_2(10M) ≈ 230M disk block reads/writes = 9200M ms = 9,200,000 seconds > 100 days. Quick sort and merge sort: 2 * 100K (blocks) * log_2(10M) ≈ 4.6M disk block reads/writes = 184M ms = 184,000 seconds > 2 days.
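The two estimates can be reproduced as follows. This is a sketch of the slide's arithmetic only; the function names are hypothetical, and the code uses the exact value of log_2(10M) (about 23.25) where the slide rounds it to 23, so the results come out slightly larger.

```python
import math

IO_MS = 40              # one disk block read/write, from the example
TUPLES = 10_000_000     # tuples in relation R
BLOCKS = 100_000        # blocks in relation R
MS_PER_DAY = 24 * 60 * 60 * 1000


def heap_sort_days():
    """One block access per tuple comparison step: TUPLES * log2(TUPLES) I/Os."""
    ios = TUPLES * math.log2(TUPLES)
    return ios * IO_MS / MS_PER_DAY


def quick_or_merge_sort_days():
    """Each of log2(TUPLES) passes reads and writes every block once."""
    ios = 2 * BLOCKS * math.log2(TUPLES)
    return ios * IO_MS / MS_PER_DAY


if __name__ == "__main__":
    print(heap_sort_days())
    print(quick_or_merge_sort_days())
```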

22 Two-phase Multiway MergeSort. Phase 1, making sorted sublists: repeat { fill the main memory with remaining tuples in R and sort them; write the sorted sublist (of 6400 blocks) back to disk }. Phase 2, merging: repeat { bring in a block from each of the sorted sublists; merge them and put the result in an “output” block; write the “output” block back to disk when it is full }.
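The two phases can be sketched in memory as follows. This is an illustration under stated assumptions, not the course's implementation: plain Python lists stand in for the disk, sizes are counted in tuples rather than bytes, and all function names are hypothetical. The standard-library `heapq.merge` does the k-way merge.

```python
import heapq


def make_sorted_sublists(relation, memory_tuples):
    """Phase 1: sort one memory-load at a time and 'write' each sorted sublist."""
    return [
        sorted(relation[start:start + memory_tuples])
        for start in range(0, len(relation), memory_tuples)
    ]


def read_blockwise(sublist, block_tuples):
    """Yield the tuples of one sublist, fetching one simulated block at a time."""
    for start in range(0, len(sublist), block_tuples):
        block = sublist[start:start + block_tuples]  # simulated block read
        yield from block


def merge_sublists(sublists, block_tuples):
    """Phase 2: k-way merge holding one block per sublist plus an output block."""
    output, out_block = [], []
    readers = (read_blockwise(s, block_tuples) for s in sublists)
    for item in heapq.merge(*readers):
        out_block.append(item)
        if len(out_block) == block_tuples:
            output.extend(out_block)  # simulated write of a full output block
            out_block = []
    output.extend(out_block)          # flush the last, partly filled block
    return output


def two_phase_multiway_merge_sort(relation, memory_tuples, block_tuples):
    sublists = make_sorted_sublists(relation, memory_tuples)
    return merge_sublists(sublists, block_tuples)


if __name__ == "__main__":
    data = [42, 7, 19, 3, 88, 1, 56, 23, 9, 64, 5]
    print(two_phase_multiway_merge_sort(data, memory_tuples=4, block_tuples=2))
```

In a real system the sublists would live in files and the block reads would be actual I/O, but the control flow is the same: one pass to produce sorted runs, one pass to merge them.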

23-35 Two-phase Multiway MergeSort, First Phase [animation frames: main memory is filled from disk, sorted ("Sort it"), and the sorted sublist is written back to disk; repeated for each sublist]

36-43 Two-phase Multiway MergeSort, Second Phase [animation frames: one block per sublist is held in main memory; the blocks are merged and the output is written back to disk]

44 # sublists = 100K/6400 ≈ 16; thus, in phase 2, we can easily hold a block for each sublist in main memory. Disk block reads/writes: 100K (blocks) * 4 = 400K disk block reads/writes = 16M ms = 16,000 seconds < 4.5 hours.
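The cost estimate can be reproduced with a short function. This is a sketch with hypothetical names; the factor of 4 reflects that every block is read once and written once in each of the two phases, and the feasibility check (one buffer per sublist plus one output buffer) is an assumption about how memory is divided.

```python
import math


def two_phase_cost(blocks, memory_blocks, io_ms=40):
    """Return (number of sublists, block I/Os, seconds) for two-phase merge sort."""
    sublists = math.ceil(blocks / memory_blocks)
    # Assumption: phase 2 needs one input buffer per sublist plus one output buffer.
    if sublists + 1 > memory_blocks:
        raise ValueError("too many sublists to merge in a single pass")
    ios = 4 * blocks  # read + write in phase 1, read + write in phase 2
    seconds = ios * io_ms / 1000
    return sublists, ios, seconds


if __name__ == "__main__":
    sublists, ios, seconds = two_phase_cost(100_000, 6400)
    print(sublists, ios, seconds, seconds / 3600)
```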

45 Graduate Database [DBMS architecture diagram; components: database administrator, DDL language, DDL compiler, database programmer, DML (query) language, DML compiler, query execution engine, index/file manager, buffer manager, main memory buffers, file manager, secondary storage (disks), transaction manager, concurrency control, lock table, logging & recovery]

46 Read Chapter 13 for more details on memory structures
