# Sorting Really Big Files: Sorting Part 3

## Presentation transcript

Sorting Really Big Files Sorting Part 3

Using K Temporary Files

Given:

- N records in file F
- M records will fit into internal memory
- Use K temp files, where K = N / M (rounded up)

Create K sorted files from F, then merge them.

Problems:

- computers compare 2 values at once, not K values
- merging only 2 of K runs at once creates LOTS of temp files
- in the illustration on the next page, notice that we soon begin merging small runs with big temp files, which means too many comparisons
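
The run-creation step above can be sketched as follows. This is a minimal illustration, not the presenter's code: it assumes the file F holds one integer per line, and the names `create_sorted_runs`, `max_records` (playing the role of M), and the `run1.txt`, `run2.txt`, ... file names are hypothetical.

```python
import os
from itertools import islice


def create_sorted_runs(input_path, max_records, tmp_dir):
    """Split a one-integer-per-line file into sorted run files.

    max_records is M, the number of records assumed to fit in memory.
    Returns the paths of the K = ceil(N / M) run files, in order.
    """
    run_paths = []
    with open(input_path) as src:
        while True:
            # Read at most M records into memory.
            chunk = [int(line) for line in islice(src, max_records)]
            if not chunk:
                break  # input exhausted
            chunk.sort()  # internal sort of one memory-sized chunk
            run_path = os.path.join(tmp_dir, f"run{len(run_paths) + 1}.txt")
            with open(run_path, "w") as out:
                out.writelines(f"{value}\n" for value in chunk)
            run_paths.append(run_path)
    return run_paths
```

With N = 10 records and M = 4, this would produce K = 3 run files, the last one holding only 2 records.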

Alternative Merging Strategy

[Figure: merge trees over runs R1 to R5, with nodes labeled T1, T2, T3, S1, S2, and F.]

R1 = Run 1, R2 = Run 2, etc.

What would these trees look like with 8 runs?

N-Way Merge

We can create that tree using just 4 temp files:

- 2 are input and 2 are output; the pairs alternate between being input and output files

Algorithm:

- Write Run 1 into T1
- Write Run 2 into T2
- Write Run 3 into T1
- Write Run 4 into T2
- ...
- Merge first runs in T1 and T2 into T3
- Merge second runs in T1 and T2 into T4
- Merge third runs in T1 and T2 into T3
- ...
- Merge first runs in T3 and T4 into T1
- Merge second runs in T3 and T4 into T2
- ...
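
The four-file algorithm above can be sketched in memory as follows. This is a simplified model under stated assumptions, not the presenter's implementation: each temp file is represented as a Python list of runs rather than a file on disk, and the function names `merge_two` and `balanced_merge` are hypothetical.

```python
def merge_two(a, b):
    """Standard two-way merge of two sorted lists."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])  # at most one of these two tails is non-empty
    out.extend(b[j:])
    return out


def balanced_merge(runs):
    """Merge sorted runs using two input 'files' and two output 'files'."""
    if not runs:
        return []
    # Distribute runs alternately: R1 -> T1, R2 -> T2, R3 -> T1, ...
    inputs = [runs[0::2], runs[1::2]]
    while len(inputs[0]) + len(inputs[1]) > 1:
        outputs = [[], []]
        first, second = inputs
        for k in range(len(first)):
            # Pair the k-th run of each input file; the second file may be
            # one run short, in which case the run is copied through.
            other = second[k] if k < len(second) else []
            outputs[k % 2].append(merge_two(first[k], other))
        inputs = outputs  # the output pair becomes the next input pair
    return inputs[0][0]
```

For example, `balanced_merge([[1, 5], [2, 9], [0, 7]])` merges the first two runs into one output file, copies the third run into the other, and then merges those two results.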

N-Way Merge

Files contain runs:

| Step Number | T1 | T2 | T3 | T4 |
|---|---|---|---|---|
| 1 | R1, R3, R5, R7, R9 | R2, R4, R6, R8, R10 | - | - |
| 2 | - | - | R1-R2, R5-R6, R9-R10 | R3-R4, R7-R8 |
| 3 | R1-R4, R9-R10 | R5-R8 | - | - |
| 4 | - | - | R1-R8 | R9-R10 |
| 5 | R1-R10 | - | - | - |

[Figure: merge tree over files T1, T2, T3, T4, and F.]
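
The step-by-step file contents above can be reproduced with a small trace. This is only a bookkeeping sketch: it tracks run labels rather than real records, represents a merged run such as R1-R4 as the pair `(1, 4)`, and uses the hypothetical name `trace_balanced_merge`.

```python
def trace_balanced_merge(num_runs):
    """Return, for each step, which runs each of the four files holds.

    A run is a (first, last) pair of original run numbers, so (1, 4)
    stands for the merged run R1-R4.
    """
    files = {"T1": [], "T2": [], "T3": [], "T4": []}
    # Step 1: odd-numbered runs go to T1, even-numbered runs to T2.
    for r in range(1, num_runs + 1):
        files["T1" if r % 2 else "T2"].append((r, r))
    steps = [{name: list(rs) for name, rs in files.items()}]

    src, dst = ("T1", "T2"), ("T3", "T4")
    while len(files[src[0]]) + len(files[src[1]]) > 1:
        a, b = files[src[0]], files[src[1]]
        for k in range(len(a)):
            # Merge the k-th runs; an unpaired run is copied unchanged.
            last = b[k][1] if k < len(b) else a[k][1]
            files[dst[k % 2]].append((a[k][0], last))
        files[src[0]], files[src[1]] = [], []  # inputs are now empty
        steps.append({name: list(rs) for name, rs in files.items()})
        src, dst = dst, src  # swap the input and output pairs
    return steps


for number, contents in enumerate(trace_balanced_merge(10), start=1):
    print(number, contents)
```

Calling `trace_balanced_merge(8)` instead gives the distribution for 8 runs.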

Analysis

Number of Comparisons:

- N-Way Merge: $O(n \log_2 n)$
- K Temp Files: $O(n^2)$

Disk Space

Could the run size be one record?

- In other words, is the internal sort necessary?
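
The difference in comparison counts can be checked empirically. The sketch below is an illustration under assumptions rather than part of the original slides: it works on in-memory lists, uses runs of one record each, and the names `merge_count`, `sequential_merge_cost`, and `balanced_merge_cost` are hypothetical. The first strategy merges each new run into one growing result, and the second merges runs pairwise, pass after pass.

```python
def merge_count(a, b):
    """Merge two sorted lists; return (merged, number_of_comparisons)."""
    out, i, j, comparisons = [], 0, 0, 0
    while i < len(a) and j < len(b):
        comparisons += 1
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out, comparisons


def sequential_merge_cost(runs):
    """Merge run 1 with run 2, that result with run 3, and so on."""
    merged, total = list(runs[0]), 0
    for run in runs[1:]:
        merged, c = merge_count(merged, run)
        total += c
    return merged, total


def balanced_merge_cost(runs):
    """Merge adjacent runs pairwise until a single run remains."""
    runs, total = [list(r) for r in runs], 0
    while len(runs) > 1:
        next_runs = []
        for k in range(0, len(runs), 2):
            if k + 1 < len(runs):
                merged, c = merge_count(runs[k], runs[k + 1])
                total += c
            else:
                merged = runs[k]  # unpaired run is carried forward
            next_runs.append(merged)
        runs = next_runs
    return runs[0], total


one_record_runs = [[v] for v in range(64)]  # run size of one record
print(sequential_merge_cost(one_record_runs)[1])  # 2016 comparisons
print(balanced_merge_cost(one_record_runs)[1])    # 192 comparisons
```

For this already-ordered input of 64 one-record runs, the sequential strategy costs 1 + 2 + ... + 63 = 2016 comparisons, while the balanced strategy costs 32 comparisons in each of its 6 passes, 192 in total. With one-record runs the balanced scheme still sorts correctly, but it needs more passes over the data than it would with larger, internally sorted runs.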
