
1 Cosequential Processing Chapter 8

2 Cosequential processing model
Two or more input files sorted the same way on the same keys
set current record to first record in each file
loop till no more current records
– compare all current records
– “smallest” current record copied to output
– read next record in all files with “smallest” current record
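The loop on this slide can be sketched in Python as follows. This is a minimal illustration, not the textbook's code: the function name `cosequential_merge` is hypothetical, in-memory lists stand in for the input files, and records are assumed to be comparable values other than `None`.

```python
def cosequential_merge(*files):
    """Merge sorted inputs; a key held by several inputs at once is output once."""
    iters = [iter(f) for f in files]
    # Current record of each file; None marks a file with no more records.
    current = [next(it, None) for it in iters]
    output = []
    while any(rec is not None for rec in current):
        # Compare all current records and pick the smallest.
        smallest = min(rec for rec in current if rec is not None)
        output.append(smallest)
        # Read the next record in every file whose current record was the smallest.
        for i, rec in enumerate(current):
            if rec == smallest:
                current[i] = next(iters[i], None)
    return output


print(cosequential_merge([1, 3, 5], [2, 3, 6]))  # [1, 2, 3, 5, 6]
```

Because every file holding the smallest key is advanced together, a key that appears in several inputs is written only once, which is what the last step on the slide implies.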

3 8.3 K-way merge
Comparison loop: find smallest key from among k input files
Maintaining a selection tree reduces the number of comparisons per output record from about k to about log₂ k
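A rough sketch of a k-way merge with logarithmic selection cost is shown below. Note the substitution: it uses a binary heap from Python's standard `heapq` module rather than the selection tree named on the slide. Both keep the cost of finding the next smallest key at roughly log₂ k comparisons. The function name `k_way_merge` is hypothetical, and lists stand in for files.

```python
import heapq

_DONE = object()  # sentinel marking an exhausted input


def k_way_merge(lists):
    """Merge k sorted lists, keeping duplicates, using a heap of (key, list index)."""
    iters = [iter(lst) for lst in lists]
    heap = []
    for i, it in enumerate(iters):
        first = next(it, _DONE)
        if first is not _DONE:
            heap.append((first, i))
    heapq.heapify(heap)

    merged = []
    while heap:
        key, i = heapq.heappop(heap)      # smallest current key among all inputs
        merged.append(key)
        nxt = next(iters[i], _DONE)       # refill from the list that supplied it
        if nxt is not _DONE:
            heapq.heappush(heap, (nxt, i))
    return merged


print(k_way_merge([[3, 5, 12], [7, 8, 11], [2, 6, 7]]))
# [2, 3, 5, 6, 7, 7, 8, 11, 12]
```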

4 Selection tree example
list 1: 3, 5, 12, 15...
list 2: 7, 8, 11, 19...
list 3: 12, 15, 23, 25...
list 4: 11, 16, 17, 25...
list 5: 6, 13, 18, 21...
list 6: 2, 6, 7, 15...
list 7: 9, 14, 20, 21...
list 8: 10, 16, 16, 23...
Winners of adjacent pairs: 3 (list 1), 11 (list 4), 2 (list 6), 9 (list 7)
Winners at the next level: 3 (list 1), 2 (list 6)
How will the tree look after the next record is processed?
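The tree on this slide can be rebuilt with a short sketch. It assumes the list labels pair with the sequences in the order they appear in the transcript (list 1 starts with 3, list 2 with 7, list 3 with 12, list 4 with 11, list 5 with 6, list 6 with 2, list 7 with 9, list 8 with 10), which is consistent with every winner shown. The function name `build_selection_tree` is hypothetical, and the number of lists is assumed to be a power of two.

```python
def build_selection_tree(heads):
    """heads: (key, list number) for the first record of each input list.

    Returns the winner levels from the bottom of the tree up to the root.
    """
    levels = []
    level = heads
    while len(level) > 1:
        # Each parent holds the smaller of its two children.
        level = [min(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


heads = [(3, 1), (7, 2), (12, 3), (11, 4), (6, 5), (2, 6), (9, 7), (10, 8)]
levels = build_selection_tree(heads)
print(levels[0])  # [(3, 1), (11, 4), (2, 6), (9, 7)]
print(levels[1])  # [(3, 1), (2, 6)]
print(levels[2])  # [(2, 6)]
```

After the root record is output, only the nodes on the path from that list's leaf to the root need to be recomputed, which is where the roughly log₂ k comparison count comes from.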

5 8.4 Heapsort
Animated demonstration: http://odin.ee.uwa.edu.au/~morris/Year2/PLDS210/heapsort.html
Each item may be inserted into the heap using O(log₂ n) comparisons
Overlapped I/O
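The insertion cost mentioned here can be seen in a small min-heap sketch. The function names are hypothetical and the heap is stored in a plain Python list. Each insertion walks at most one path from a leaf to the root, so it needs on the order of log₂ n comparisons. Inserting one key at a time also means the heap can be built as records arrive, which is presumably the point of the "Overlapped I/O" note.

```python
def heap_insert(heap, key):
    """Append key, then sift it up until its parent is no larger."""
    heap.append(key)
    child = len(heap) - 1
    while child > 0:
        parent = (child - 1) // 2
        if heap[parent] <= heap[child]:
            break
        heap[parent], heap[child] = heap[child], heap[parent]
        child = parent


def heap_pop_min(heap):
    """Remove and return the smallest key, then sift the moved key down."""
    smallest = heap[0]
    last = heap.pop()
    if heap:
        heap[0] = last
        parent, n = 0, len(heap)
        while True:
            left, right = 2 * parent + 1, 2 * parent + 2
            least = parent
            if left < n and heap[left] < heap[least]:
                least = left
            if right < n and heap[right] < heap[least]:
                least = right
            if least == parent:
                break
            heap[parent], heap[least] = heap[least], heap[parent]
            parent = least
    return smallest


def heapsort(keys):
    heap = []
    for key in keys:              # could run while later records are still being read
        heap_insert(heap, key)
    return [heap_pop_min(heap) for _ in range(len(heap))]


print(heapsort([9, 4, 7, 1, 8, 2]))  # [1, 2, 4, 7, 8, 9]
```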

6 8.5.6 Replacement Selection
Each time a record is written from the current heap, compare it to the next incoming record to see if the incoming record can be included in the current run (i.e., if it comes after the record just written in sorted order).
New records that can’t go in the current run are added to a secondary heap.

7 Replacement selection example Secondary Heap: 8 records Primary Heap: 12 records Next record to be output Total memory capacity: 20 records Next record in input file

8 Replacement selection example
All records in the secondary heap are “smaller” than the last record already written to the current output file, and therefore cannot be included in the current run.
Output the next (root) record from the primary heap, making room in memory for the next record to be read in.

9 Secondary Heap: 8 records Primary Heap: 12 records Next record output to current run Total memory capacity: 20 records Next record read into memory

10 If new record key is smaller than the one just output, add it to the secondary heap. Secondary Heap: 8 records (becomes 9) Primary Heap: 12 records (becomes 11) Total memory capacity: 20 records

11 ... and readjust both heaps Secondary Heap: 9 records Primary Heap: 11 records Total memory capacity: 20 records

12 Otherwise, add it to the primary heap and readjust, so it will be included in the current run. Secondary Heap: 8 records Primary Heap: 12 records Total memory capacity: 20 records

13 Replacement selection example
Secondary heap grows as primary heap shrinks
When last record is written from primary heap, and secondary heap fills memory:
– close the file for the current run
– start a new run, using the current secondary heap as the primary heap
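The walkthrough on slides 7 to 13 can be condensed into a sketch like the one below. It is an illustration under stated assumptions, not the textbook's implementation: the function name `replacement_selection` is hypothetical, runs are collected in lists rather than written to files, and `capacity` plays the role of the 20-record memory limit.

```python
import heapq
from itertools import islice

_DONE = object()  # sentinel marking the end of the input


def replacement_selection(records, capacity):
    """Split records into sorted runs using at most `capacity` records of memory."""
    source = iter(records)
    primary = list(islice(source, capacity))   # fill memory once
    heapq.heapify(primary)
    secondary = []
    runs, current_run = [], []

    while primary:
        smallest = heapq.heappop(primary)      # next record output to current run
        current_run.append(smallest)
        incoming = next(source, _DONE)         # next record read into memory
        if incoming is not _DONE:
            if incoming >= smallest:
                heapq.heappush(primary, incoming)    # still fits in the current run
            else:
                heapq.heappush(secondary, incoming)  # must wait for the next run
        if not primary:
            # Close the current run; the secondary heap becomes the primary heap.
            runs.append(current_run)
            current_run = []
            primary, secondary = secondary, []
    return runs


print(replacement_selection([5, 1, 9, 3, 7, 2, 8, 4, 6], capacity=3))
# [[1, 3, 5, 7, 8, 9], [2, 4, 6]]
```

In this small example the first run holds six records even though memory holds only three, which illustrates why replacement selection tends to produce runs longer than the available memory.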

14 Replacement selection question
Would the overall efficiency of the merge sort be improved by just writing the new record out to a separate file instead of keeping it in memory in a secondary heap?
– How would this affect the size of the current and subsequent runs?
– How would this affect the number of times each record must be read and written?

