# Lecture 1: Merge Sort Algorithm (Computer Science I, Martin Hardwick)

## Presentation transcript

### Slide 1: Merge Sort Algorithm

- Merge sort has two phases.
  - First, it divides the data into smaller and smaller lists until each has size 2 or 1.
  - Second, as the calls return, it merges the lists using a merge algorithm.
- Consider the following data set: 78 45 34 20 18 15 96 10
  - The first two sublists for merging are 45 78 and 20 34.
  - After the merge: 20 34 45 78
  - The next two sublists are 15 18 and 10 96.
  - After the merge: 10 15 18 96
  - The final merge yields 10 15 18 20 34 45 78 96
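The two phases can be sketched as a short recursive function. This is only one possible arrangement: the names `merge_sorted` and `merge_sort` are made up for the illustration, and the sketch splits all the way down to lists of size 1 (the slide stops at size 2 or 1), so the sorted pairs from the slide appear as intermediate results.

```cpp
#include <cstddef>
#include <vector>

// Merge two already-sorted vectors into one sorted vector.
std::vector<int> merge_sorted(const std::vector<int>& a, const std::vector<int>& b) {
    std::vector<int> out;
    out.reserve(a.size() + b.size());
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i] < b[j]) out.push_back(a[i++]);
        else             out.push_back(b[j++]);
    }
    while (i < a.size()) out.push_back(a[i++]);  // leftovers from a
    while (j < b.size()) out.push_back(b[j++]);  // leftovers from b
    return out;
}

// Phase 1: split in half until a list has at most one item.
// Phase 2: merge the sorted halves as the recursive calls return.
std::vector<int> merge_sort(const std::vector<int>& data) {
    if (data.size() <= 1) return data;
    auto mid = data.begin() + static_cast<std::ptrdiff_t>(data.size() / 2);
    std::vector<int> left(data.begin(), mid);
    std::vector<int> right(mid, data.end());
    return merge_sorted(merge_sort(left), merge_sort(right));
}
```

Calling `merge_sort` on the slide's data set should reproduce the final merged list shown above.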

### Slide 2: Merge

```cpp
#include <vector>
using namespace std;

// Merge two sorted lists into one sorted list.
// Element type int matches the integer data used in the examples.
vector<int> merge(vector<int> list1, vector<int> list2)
{
    vector<int> result;
    int i1 = 0;   // next unread position in list1
    int i2 = 0;   // next unread position in list2
    while (i1 < list1.size() || i2 < list2.size()) {
        if (i1 < list1.size() && i2 < list2.size()) {
            // Both lists still have items: take the smaller front item.
            if (list1[i1] < list2[i2])
                result.push_back(list1[i1++]);
            else
                result.push_back(list2[i2++]);
        } else {
            // One list is used up: copy what remains of the other.
            while (i1 < list1.size())
                result.push_back(list1[i1++]);
            while (i2 < list2.size())
                result.push_back(list2[i2++]);
        }
    }
    return result;
}
```

- This algorithm repeatedly picks the smaller of the two front items of the lists.
- When it reaches the end of one list, it fills in the remaining items from the other list.
- Consider
  - List1: 18 23 45 78
  - List2: 19 21 80 90
- List 1 will be consumed first.
  - Result: 18 19 21 23 45 78
  - i1 = 4
  - i2 = 2
- We then add 80 and 90 from list 2.
  - Result: 18 19 21 23 45 78 80 90
  - i1 = 4
  - i2 = 4
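For comparison, the C++ standard library already provides `std::merge` in `<algorithm>`, which performs the same merge of two sorted ranges. The wrapper name `merge_with_std` is an assumption made for this sketch; it is not part of the lecture code, and it can serve as a cross-check for a hand-written merge.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Merge two sorted vectors using the standard library algorithm.
std::vector<int> merge_with_std(const std::vector<int>& list1,
                                const std::vector<int>& list2) {
    std::vector<int> result;
    result.reserve(list1.size() + list2.size());
    std::merge(list1.begin(), list1.end(),
               list2.begin(), list2.end(),
               std::back_inserter(result));  // appends each merged item
    return result;
}
```

Running it on List1 and List2 from the slide should give the same result as the hand-written version.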

### Slide 3: Data management issues

- The given algorithm is not very efficient.
  - It adds too many items to too many vectors using `push_back`.
  - The system may run out of space and have to garbage collect.
- A more efficient approach is to define the space needed at the beginning of the program. Either:
  - Create a vector with a specific size: `vector<int> result(10000);`
  - Use an ordinary array with enough room: `int result[1000];`
- Another issue for the largest test (128,000 items) is running out of memory in your program. To fix this:
  - Go to Project/Properties.
  - Select Linker/System.
  - Set the stack sizes to 1,000,000,000.
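The two options can be sketched side by side. Both functions fill a container with squares purely as placeholder work; the names `squares_vector`, `squares_array` and `kCapacity` are assumptions for this illustration.

```cpp
#include <cstddef>
#include <vector>

constexpr std::size_t kCapacity = 1000;  // hypothetical upper bound for the array

// Option 1: a vector created with its final size, then written by index.
std::vector<int> squares_vector(std::size_t n) {
    std::vector<int> result(n);  // n elements, all zero to start with
    for (std::size_t i = 0; i < n; ++i)
        result[i] = static_cast<int>(i * i);
    return result;
}

// Option 2: an ordinary array of fixed capacity supplied by the caller.
// Returns how many elements were actually written.
std::size_t squares_array(int (&result)[kCapacity], std::size_t n) {
    if (n > kCapacity) n = kCapacity;  // never write past the end
    for (std::size_t i = 0; i < n; ++i)
        result[i] = static_cast<int>(i * i);
    return n;
}
```

The array version has to guard against overflow itself, while the vector version simply asks for as many elements as it needs.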

### Slide 4: Memory management

- The memory is divided into three areas.
- Static area (not shown)
  - This area is of fixed size.
  - It stores the program code and any static (global) variables.
- Stack area
  - This area stores the stack frames of all the currently executing functions.
  - It needs to be big enough for all of the local data for all the functions.
- Heap area
  - This area stores all the data whose size cannot be predicted.

[Figure: memory diagram showing a column of call stack frames, a gap, and the top.]
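As a rough illustration of the three areas, the sketch below marks where each piece of data would typically live. The names `call_count` and `sum_first` are made up for this example, and the exact placement is ultimately decided by the compiler and operating system.

```cpp
#include <cstddef>
#include <vector>

int call_count = 0;  // static area: exists for the whole run of the program

int sum_first(std::size_t n) {
    ++call_count;
    int total = 0;               // stack: part of this call's stack frame
    std::vector<int> values(n);  // elements on the heap: n is only known at run time
    for (std::size_t i = 0; i < n; ++i) {
        values[i] = static_cast<int>(i) + 1;
        total += values[i];
    }
    return total;                // the frame, and the vector with it, go away here
}
```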

### Slide 5: Memory management

- If the heap grows too large, it will collide with the stack and the system will run out of memory.
- The heap contains items of varying size, some of which are no longer used.
  - A garbage collector can go and find these unused locations (pass 1).
  - It then compresses all the space by squeezing out the gaps (pass 2).
  - This leaves new space at the top of the heap.
- However, running the garbage collector is very expensive.
  - We can help by not wasting memory.
  - We can tell the system when something is going to grow big: `vector<int> result(10000);`
  - Then the system does not waste space by first creating a small vector, then a middle-size copy, then a large copy, then a very large copy, then an enormous copy.
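The chain of ever-larger copies can be observed by watching a vector's `capacity()` while items are appended. The helper name `count_regrowths` is an assumption for this sketch. It uses `reserve`, which sets aside room without creating elements, rather than the sized constructor shown on the slide. How often a vector regrows depends on the library implementation, so no particular count is claimed.

```cpp
#include <cstddef>
#include <vector>

// Append n ints and count how often the vector had to move to a larger block.
std::size_t count_regrowths(std::size_t n, bool reserve_first) {
    std::vector<int> v;
    if (reserve_first) v.reserve(n);  // tell the vector up front how big it will get
    std::size_t regrowths = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {  // a larger block was allocated
            ++regrowths;
            last_capacity = v.capacity();
        }
    }
    return regrowths;
}
```

With `reserve_first` set to `true` the count stays at zero, because the space is already there before the first `push_back`.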

### Slide 6: Memory management

- A vector object is divided into two components.
- The header, containing fixed-size information
  - Current number of elements
  - Pointer to the data elements
  - Stored on the stack
- The data elements
  - The data items, in sequence, stored in the heap
- More on this in CS 2
  - How to make and use pointers
  - How to get your own data on the heap

[Figure: memory diagram showing the headers of vectors V1 and V2, a gap, the top, and the data elements for v1.]
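One way to see the split between header and data is to compare `sizeof` of a vector object with the space its elements occupy. The struct `VectorFootprint` and the function `footprint` are hypothetical names for this sketch, and the actual header size differs between compilers.

```cpp
#include <cstddef>
#include <vector>

struct VectorFootprint {
    std::size_t header_bytes;   // size of the vector object itself
    std::size_t element_bytes;  // space taken by the elements currently in use
};

VectorFootprint footprint(const std::vector<int>& v) {
    // sizeof(v) does not depend on how many elements v holds.
    return {sizeof(v), v.size() * sizeof(int)};
}
```

A vector of 10 items and a vector of 100,000 items report the same `header_bytes`; only `element_bytes` grows.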

### Slide 7: Merge made more efficient

```cpp
#include <vector>
using namespace std;

// Merge two sorted lists into a result vector that is sized up front.
// Element type int matches the integer data used in the examples.
vector<int> merge(vector<int> list1, vector<int> list2)
{
    vector<int> result(list1.size() + list2.size());
    int i1 = 0;
    int i2 = 0;
    int resi = 0;   // next free position in result
    while (i1 < list1.size() || i2 < list2.size()) {
        if (i1 < list1.size() && i2 < list2.size()) {
            if (list1[i1] < list2[i2])
                result[resi++] = list1[i1++];
            else
                result[resi++] = list2[i2++];
        } else {
            while (i1 < list1.size())
                result[resi++] = list1[i1++];
            while (i2 < list2.size())
                result[resi++] = list2[i2++];
        }
    }
    return result;
}
```

- This algorithm sets a size for the new list.
- Therefore, it does not need to use `push_back`.
- It is still slow compared to the array solution, however.
- Arrays always use a big block of contiguous memory.
- Paging is not such an issue. (Paging occurs when the OS has to get data for your program from the disk.)
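The slide mentions an array solution without showing it. One possible shape, offered only as a sketch, merges two sorted arrays into an output array supplied by the caller. The name `merge_arrays` and its parameter names are assumptions.

```cpp
#include <cstddef>

// Merge sorted arrays a[0..na) and b[0..nb) into out.
// The caller must make sure out has room for na + nb items.
void merge_arrays(const int* a, std::size_t na,
                  const int* b, std::size_t nb,
                  int* out) {
    std::size_t ia = 0, ib = 0, io = 0;
    while (ia < na && ib < nb)
        out[io++] = (a[ia] < b[ib]) ? a[ia++] : b[ib++];
    while (ia < na) out[io++] = a[ia++];  // rest of a
    while (ib < nb) out[io++] = b[ib++];  // rest of b
}
```

Because the output space is allocated once by the caller, the function itself never allocates or copies a container.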

### Slide 8: Operator Overloading

```cpp
// Compare two accounts by account number.
bool operator>(acct a, acct b)
{
    return a.get_num() > b.get_num();
}

// Print an account's name and balance to a stream.
ostream& operator<<(ostream& os, acct a)
{
    os << "Name: " << a.get_name();
    os << "Balance: " << a.get_bal();
    return os;
}

// Return a copy of a with b's balance added.
acct operator+(acct a, acct b)
{
    return acct(a.get_num(), a.get_name(), a.get_bal() + b.get_bal());
}
```

- Remember the bank account example.
- We can enrich this example by using operator overloading.
- The code above defines
  - The meaning of `>` for bank accounts
  - A special version of `<<` for bank accounts
  - A plus function for bank accounts that returns the value of `a` with `b`'s balance added
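To try these operators in isolation, a cut-down stand-in for the account class is enough. The class below is a hypothetical minimal version, not the course's full `acct`. The operators take their arguments by `const` reference to avoid copies, the output adds a space before `Balance:`, and `describe` is a helper invented for this sketch.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical minimal account class with only what the operators need.
class acct {
public:
    acct(int anum, std::string aname, double abal)
        : num(anum), name(std::move(aname)), balance(abal) {}
    int get_num() const { return num; }
    const std::string& get_name() const { return name; }
    double get_bal() const { return balance; }
private:
    int num;
    std::string name;
    double balance;
};

// Accounts compare by account number.
bool operator>(const acct& a, const acct& b) {
    return a.get_num() > b.get_num();
}

// Write the owner and balance to any output stream.
std::ostream& operator<<(std::ostream& os, const acct& a) {
    os << "Name: " << a.get_name() << " Balance: " << a.get_bal();
    return os;
}

// A copy of a with b's balance added.
acct operator+(const acct& a, const acct& b) {
    return acct(a.get_num(), a.get_name(), a.get_bal() + b.get_bal());
}

// Helper for the sketch: format an account through operator<<.
std::string describe(const acct& a) {
    std::ostringstream out;
    out << a;
    return out.str();
}
```

Because `operator<<` returns the stream, account output can be chained with other items in a single `cout` statement.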

### Slide 9: Operator overloading and sort

```cpp
class acct {                 // bank account data
private:
    int num;                 // account number
    string name;             // owner of account
    double balance;          // balance in account
public:
    acct();
    acct(int anum, string aname, double abal);
    double get_bal();
    string get_name();
    int get_num();
    void put_num(int num);
    void put_name(string name);
    void put_bal(double bal);
    bool is_bankrupt();
    bool operator<(acct b);
    bool operator>(acct b);
};
```

- There is a sort function in `<algorithm>` that can be used to sort a vector of any type of data.
- All we have to do is define the `>` and `<` operators for this data.

### Slide 10: Implementation

```cpp
// Accounts are ordered alphabetically by owner name.
bool acct::operator<(acct b)
{
    return get_name() < b.get_name();
}

bool acct::operator>(acct b)
{
    return get_name() > b.get_name();
}
```

- In the implementation we give a meaning to the `>` and `<` operators.
- In this case we are defining them using the alphabetic order of the names.
- These are object functions, so `get_name()` returns the name of this object.
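A minimal sketch of how `std::sort` picks up the member `operator<` follows. The class is a hypothetical cut-down `acct`, and `sorted_names` is a helper invented for the example. The operator is declared `const` and takes a `const` reference, which is a common variation on the slide's signature and not the lecture's own code.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical cut-down account class.
class acct {
public:
    acct(int anum, std::string aname, double abal)
        : num(anum), name(std::move(aname)), balance(abal) {}
    int get_num() const { return num; }
    const std::string& get_name() const { return name; }
    double get_bal() const { return balance; }
    // Alphabetic order of owner names, as on the slide.
    bool operator<(const acct& b) const { return name < b.name; }
private:
    int num;
    std::string name;
    double balance;
};

// Sort a copy of the accounts and return the owner names in sorted order.
std::vector<std::string> sorted_names(std::vector<acct> accounts) {
    std::sort(accounts.begin(), accounts.end());  // uses acct::operator<
    std::vector<std::string> names;
    names.reserve(accounts.size());
    for (const acct& a : accounts) names.push_back(a.get_name());
    return names;
}
```

The two-argument form of `std::sort` used here compares elements with `<` only.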

### Slide 11: Usage

```cpp
// Fragment from the command loop of the bank account example.
else if (command == "sort") {               // SORT COMMAND
    // Report neighbouring accounts that are out of order.
    // Writing loc + 1 < size() keeps the test safe for an empty bank.
    for (loc = 0; loc + 1 < my_bank.size(); loc++) {
        if (my_bank.get(loc) > my_bank.get(loc + 1))
            cout << "Bank is out of order at location " << loc << endl;
    }

    // Copy the accounts out, sort them, and put them back.
    vector<acct> temp = my_bank.get_all();
    sort(temp.begin(), temp.end());
    my_bank.put_all(temp);

    cout << "Sorted List of accounts:" << endl;
    for (loc = 0; loc < my_bank.size(); loc++) {
        account = my_bank.get(loc);
        cout << "Account: " << account.get_num()
             << "\tOwner: " << account.get_name()
             << "\tBalance: " << account.get_bal() << endl;
    }
}
```

- You must include `<algorithm>` at the top of your program.
- This code tests for unsorted data.
- Then it does a sort.
- Then it prints the new data.
- After the sort the accounts will be sorted by name, not by number!
  - Do not try using the insert functionality in the bank account example after doing this sort.
  - A better solution will put the sort into a member function on the bank.
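The last suggestion could look roughly like the sketch below. The slides do not show the bank class, so this `bank`, its member names and the cut-down `acct` struct are all hypothetical. It also adds a `sort_by_number` member, an assumption of this sketch, so that account-number order can be restored before inserting again.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical cut-down account: only the fields the sorts need.
struct acct {
    int num;
    std::string name;
};

// Hypothetical bank class that owns its accounts and sorts them itself.
class bank {
public:
    void add(acct a) { accounts.push_back(std::move(a)); }

    // Sort alphabetically by owner name.
    void sort_by_name() {
        std::sort(accounts.begin(), accounts.end(),
                  [](const acct& a, const acct& b) { return a.name < b.name; });
    }

    // Restore account-number order, for example before inserting again.
    void sort_by_number() {
        std::sort(accounts.begin(), accounts.end(),
                  [](const acct& a, const acct& b) { return a.num < b.num; });
    }

    const acct& get(std::size_t loc) const { return accounts[loc]; }
    std::size_t size() const { return accounts.size(); }

private:
    std::vector<acct> accounts;
};
```

Keeping the sort inside the class means callers no longer copy the accounts out and back, and the comparison functions passed to `std::sort` make the chosen order explicit at each call.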
