


Slide 1: 7.11 External Sorting
Access to secondary storage is orders of magnitude slower than memory access, so the goal is to minimize accesses to secondary storage (tape or disk). It may also be necessary to read the data sequentially, as with tapes.

Slide 2: 7.11 External Sorting
Simple merge example: sorting M records at a time (M=3), with 4 tapes (Ta1, Ta2, Tb1, Tb2)
Ta1: 81 94 11 ; 96 12 35 ; 17 99 28 ; 58 41 75 ; 15
Ta2: empty
Tb1, Tb2: empty

Slide 3: 7.11 External Sorting
After the run-constructing pass:
Ta1, Ta2: empty
Tb1: 11 81 94 ; 17 28 99 ; 15
Tb2: 12 35 96 ; 41 58 75
After the first merge pass:
Ta1: 11 12 35 81 94 96 ; 15
Ta2: 17 28 41 58 75 99
Tb1, Tb2: empty
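The run-constructing pass above can be sketched in Python as follows. This is a minimal illustration, not the slides' own code: tapes are modeled as in-memory lists of runs, and the helper names `build_runs` and `distribute` are hypothetical. Dealing the runs round-robin onto the output tapes is an assumption that happens to reproduce the Tb1/Tb2 layout shown.

```python
def build_runs(records, m):
    """Read m records at a time and sort each chunk internally (one run per chunk)."""
    return [sorted(records[i:i + m]) for i in range(0, len(records), m)]


def distribute(runs, k):
    """Deal runs round-robin onto k output tapes; each tape is a list of runs."""
    tapes = [[] for _ in range(k)]
    for i, run in enumerate(runs):
        tapes[i % k].append(run)
    return tapes


# Contents of Ta1 from the example, with M = 3 and two output tapes.
data = [81, 94, 11, 96, 12, 35, 17, 99, 28, 58, 41, 75, 15]
tb1, tb2 = distribute(build_runs(data, 3), 2)
print(tb1)  # [[11, 81, 94], [17, 28, 99], [15]]
print(tb2)  # [[12, 35, 96], [41, 58, 75]]
```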

Slide 4: 7.11 External Sorting
– read M records at a time and sort them internally
– a set of sorted records is called a run
– merging will require ⌈log_2(N/M)⌉ passes, plus the initial run-constructing pass
– given 10 million records of 128 bytes each and 4 MB of internal memory:
  N = 10*10^6, M = 4*10^6/128
  # of runs = N/M = 320
  # of passes = ⌈log_2(N/M)⌉ + 1 = 9 + 1 = 10
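The arithmetic on this slide can be checked with a few lines of Python. The variable names are illustrative, and the helper `ceil_log` is a hypothetical integer-only routine used here to avoid floating-point rounding in the logarithm.

```python
def ceil_log(x, base):
    """Smallest integer p such that base**p >= x, using integer arithmetic only."""
    p, power = 0, 1
    while power < x:
        power *= base
        p += 1
    return p


n_records = 10 * 10**6        # N: 10 million records
record_size = 128             # bytes per record
memory_bytes = 4 * 10**6      # 4 MB of internal memory

m = memory_bytes // record_size   # M: records that fit in memory at once
runs = -(-n_records // m)         # ceiling division: number of initial runs
passes = ceil_log(runs, 2) + 1    # merge passes plus the run-constructing pass

print(m, runs, passes)  # 31250 320 10
```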

Slide 5: 7.11 External Sorting
After the second merge pass:
Ta1, Ta2: empty
Tb1: 11 12 17 28 35 41 58 75 81 94 96 99
Tb2: 15
After the third merge pass:
Ta1: 11 12 15 17 28 35 41 58 75 81 94 96 99
Ta2: empty
Tb1, Tb2: empty
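The two-way merge passes traced in this example can be simulated with the sketch below. It is a simplified model under stated assumptions: tapes are Python lists of runs rather than sequential devices, the function names `merge_two` and `two_way_pass` are hypothetical, and merged runs are written alternately to the two output tapes.

```python
def merge_two(a, b):
    """Merge two sorted runs into one sorted run."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def two_way_pass(t1, t2):
    """Merge the g-th run of each input tape; alternate merged runs across two output tapes."""
    out = [[], []]
    for g in range(max(len(t1), len(t2))):
        r1 = t1[g] if g < len(t1) else []
        r2 = t2[g] if g < len(t2) else []
        out[g % 2].append(merge_two(r1, r2))
    return out


# State after the run-constructing pass in the example (M = 3).
tapes = [[[11, 81, 94], [17, 28, 99], [15]],
         [[12, 35, 96], [41, 58, 75]]]

merge_passes = 0
while sum(len(t) for t in tapes) > 1:   # stop when a single run remains
    tapes = two_way_pass(*tapes)
    merge_passes += 1

print(merge_passes)   # 3
print(tapes[0][0])    # [11, 12, 15, 17, 28, 35, 41, 58, 75, 81, 94, 96, 99]
```

With 5 initial runs, the loop performs 3 merge passes, which agrees with ⌈log_2(5)⌉ = 3.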

Slide 6: 7.11 External Sorting
Multiway Merge
– k input devices instead of just 2
– e.g., k=3 for the previous example
Ta1: 81 94 11 ; 96 12 35 ; 17 99 28 ; 58 41 75 ; 15
Ta2, Ta3: empty
Tb1, Tb2, Tb3: empty

Slide 7: 7.11 External Sorting
After the run-constructing pass:
Ta1, Ta2, Ta3: empty
Tb1: 11 81 94 ; 41 58 75
Tb2: 12 35 96 ; 15
Tb3: 17 28 99
After the first merge pass:
Ta1: 11 12 17 28 35 81 94 96 99
Ta2: 15 41 58 75
Ta3: empty
Tb1, Tb2, Tb3: empty

Slide 8: 7.11 External Sorting
After the second merge pass:
Ta1, Ta2, Ta3: empty
Tb1: 11 12 15 17 28 35 41 58 75 81 94 96 99
Tb2, Tb3: empty
– it will require ⌈log_k(N/M)⌉ passes, plus the initial run-constructing pass
– for N = 10*10^6, M = 4*10^6/128, and k = 5: # of passes = ⌈log_5(10*128/4)⌉ + 1 = ⌈log_5(320)⌉ + 1 = 4 + 1 = 5
Skip rest of Chapter 7
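A k-way merge pass can be sketched with the standard-library `heapq.merge`, which merges any number of sorted inputs. As before, this is only an illustration: tapes are lists of runs held in memory, `kway_pass` is a hypothetical name, and writing merged runs round-robin to the output tapes is an assumption that matches the k=3 trace above.

```python
import heapq


def kway_pass(in_tapes):
    """Merge the g-th run from every input tape; deal merged runs across k output tapes."""
    k = len(in_tapes)
    out = [[] for _ in range(k)]
    for g in range(max(len(t) for t in in_tapes)):
        group = [t[g] for t in in_tapes if g < len(t)]
        out[g % k].append(list(heapq.merge(*group)))
    return out


# State after the run-constructing pass for k = 3, M = 3.
tapes = [[[11, 81, 94], [41, 58, 75]],
         [[12, 35, 96], [15]],
         [[17, 28, 99]]]

merge_passes = 0
while sum(len(t) for t in tapes) > 1:   # stop when a single run remains
    tapes = kway_pass(tapes)
    merge_passes += 1

print(merge_passes)   # 2
print(tapes[0][0])    # [11, 12, 15, 17, 28, 35, 41, 58, 75, 81, 94, 96, 99]
```

Going from 2 to 3 input tapes cuts the merge passes for this 5-run example from 3 to 2, at the cost of needing 2k tapes in total.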

