A Computer Science Tapestry 10.1 From practice to theory and back again In theory there is no difference between theory and practice, but not in practice.


From practice to theory and back again

In theory there is no difference between theory and practice, but not in practice

- We’ve studied binary search: requires a sorted vector
    - Much faster than sequential search (how much?)
    - Add elements in sorted order or sort vector after adding
- Many sorting algorithms have been well-studied
    - Slower ones are often “good enough” and simple to implement
    - Some fast algorithms are better than others
        - Always fast, fast most-of-the-time
        - Good in practice even if flawed theoretically?
- New algorithms still discovered
    - Quick sort in 1960, revised and updated in 1997

Tools for algorithms and programs

- We can time different methods, but how to compare timings?
    - Different on different machines, what about “workload”?
    - Mathematical tools can help analyze/discuss algorithms
- We often want to sort by different criteria
    - Sort list of stocks by price, shares traded, volume traded
    - Sort directories/files by size, alphabetically, or by date
    - Object-oriented concepts can help in implementing sorts
- We often want to sort different kinds of vectors: string and int
    - Don’t want to duplicate the code, that leads to errors
    - Generic programming helps, in C++ we use templates

To code or not to code, that is the …

- Should you call an existing sorting routine or write your own?
    - If you can, don’t rewrite code that is already written and accessible
    - Sometimes you don’t know what to call
    - Sometimes you can’t call the existing library routine
- In C++ there are standard sort functions that can be used with built-in arrays and with both vectors and tvectors
    - These are accessible via #include <algorithm>
    - These are robust and fast, call sort(…) or stable_sort(…)
        - Can’t study the code, it’s not legible
- We’ll use sorts in #include "sortall.h"
    - Work only with tvector, as efficient as standard sorts, but code is legible
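As a rough illustration of calling the standard routines rather than writing a sort, the sketch below uses `std::sort` and `std::stable_sort` from `<algorithm>` on a `std::vector`. The function names `SortedCopy` and `ByLength` are hypothetical, and `std::vector` stands in for `tvector` here.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns the words in alphabetical order.
std::vector<std::string> SortedCopy(std::vector<std::string> words)
{
    std::sort(words.begin(), words.end());   // compares with operator<
    return words;
}

// Hypothetical helper: orders words by length. stable_sort keeps
// equal-length words in their original relative order.
std::vector<std::string> ByLength(std::vector<std::string> words)
{
    std::stable_sort(words.begin(), words.end(),
                     [](const std::string& a, const std::string& b)
                     { return a.length() < b.length(); });
    return words;
}
```

For a built-in array `int a[n]`, the same routine is called as `std::sort(a, a + n)`.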

On to sorting: Selection Sort

- Find smallest element, move into first array location
    - Find next smallest element, move into second location
    - Generalize and repeat

    void SelectSort(tvector<int> & a)
    // precondition: a contains a.size() elements
    // postcondition: elements of a are sorted
    {
        int k, index, numElts = a.size();
        // invariant: a[0]..a[k-1] in final position
        for(k=0; k < numElts - 1; k+=1)
        {
            index = MinIndex(a,k,numElts - 1);  // find min element
            Swap(a[k],a[index]);
        }
    }

- How many elements compared? Swapped?
    - Total number of elements examined? N + (N-1) + … + 1
    - How many elements swapped?
    - This sort is easy to code, works fine for “small” vectors
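The slide calls `MinIndex` and `Swap` without showing them. A minimal sketch of how such helpers might look follows; the `MinIndex` body is an assumption, `std::swap` replaces `Swap`, and `std::vector<int>` stands in for the `tvector` class.

```cpp
#include <utility>
#include <vector>

// Assumed helper: index of the smallest element in a[first..last].
// pre: 0 <= first <= last < a.size()
int MinIndex(const std::vector<int>& a, int first, int last)
{
    int smallest = first;
    for (int k = first + 1; k <= last; k++)
    {
        if (a[k] < a[smallest]) smallest = k;
    }
    return smallest;
}

void SelectSort(std::vector<int>& a)
{
    int numElts = static_cast<int>(a.size());
    // invariant: a[0]..a[k-1] in final position
    for (int k = 0; k < numElts - 1; k++)
    {
        int index = MinIndex(a, k, numElts - 1);  // find min element
        std::swap(a[k], a[index]);
    }
}
```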

Selection Sort: The Code (selectsort2.cpp)

    void SelectSort(tvector<int> & a)
    // pre: a contains a.size() elements
    // post: elements of a are sorted in non-decreasing order
    {
        int j,k,temp,minIndex,numElts = a.size();
        // invariant: a[0]..a[k-1] in final position
        for(k=0; k < numElts - 1; k++)
        {
            minIndex = k;             // minimal element index
            for(j=k+1; j < numElts; j++)
            {
                if (a[j] < a[minIndex])
                {
                    minIndex = j;     // new min, store index
                }
            }
            temp = a[k];              // swap min and k-th elements
            a[k] = a[minIndex];
            a[minIndex] = temp;
        }
    }
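One way to check how much work the nested loops do is to instrument them. The sketch below counts element comparisons and swaps; for N elements the inner loop runs (N-1) + (N-2) + … + 1 = N(N-1)/2 times and the outer loop swaps N-1 times. The `SortCounts` struct and `CountingSelectSort` name are hypothetical, and `std::vector` stands in for `tvector`.

```cpp
#include <utility>
#include <vector>

// Hypothetical record of the work done by one sort.
struct SortCounts
{
    long comparisons;
    long swaps;
};

SortCounts CountingSelectSort(std::vector<int>& a)
{
    SortCounts counts = {0, 0};
    int numElts = static_cast<int>(a.size());
    for (int k = 0; k < numElts - 1; k++)
    {
        int minIndex = k;
        for (int j = k + 1; j < numElts; j++)
        {
            counts.comparisons++;              // one a[j] < a[minIndex] test
            if (a[j] < a[minIndex]) minIndex = j;
        }
        std::swap(a[k], a[minIndex]);
        counts.swaps++;                        // one swap per outer pass
    }
    return counts;
}
```

The counts do not depend on the initial order of the data, only on N.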

What changes if we sort strings?

- The parameter changes, the definition of temp changes
    - Nothing else changes, code independent of type
        - We must be able to write a[j] < a[k] for vector a
    - We can use features of language to capture independence
- We can have different versions of function for different array types, with same name but different parameter lists
    - Overloaded function: parameters different so compiler can determine which function to call
    - Still problems, duplicated code, new algorithm means …?
- With function templates we replace duplicated code maintained by programmer with compiler generated code
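To make the duplication problem concrete, here is a small sketch of two overloaded functions that differ only in the parameter type. The one-argument `MinIndex` overloads are hypothetical, and `std::vector` stands in for `tvector`.

```cpp
#include <string>
#include <vector>

// Overload for int vectors.
// pre: a is not empty
int MinIndex(const std::vector<int>& a)
{
    int smallest = 0;
    for (int k = 1; k < static_cast<int>(a.size()); k++)
    {
        if (a[k] < a[smallest]) smallest = k;
    }
    return smallest;
}

// Overload for string vectors: same name, same body, different parameter.
// pre: a is not empty
int MinIndex(const std::vector<std::string>& a)
{
    int smallest = 0;
    for (int k = 1; k < static_cast<int>(a.size()); k++)
    {
        if (a[k] < a[smallest]) smallest = k;
    }
    return smallest;
}
```

The compiler picks the overload from the argument type, but any bug fix has to be made in both copies.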

Creating a function template

    template <class Type>
    void SelectSort(tvector<Type> & a)
    // pre: a contains a.size() elements
    // post: elements of a are sorted in non-decreasing order
    {
        int j,k,minIndex,numElts = a.size();
        Type temp;
        // invariant: a[0]..a[k-1] in final position
        for(k=0; k < numElts - 1; k++)
        {
            minIndex = k;             // minimal element index
            for(j=k+1; j < numElts; j++)
            {
                if (a[j] < a[minIndex])
                {
                    minIndex = j;     // new min, store index
                }
            }
            temp = a[k];              // swap min and k-th elements
            a[k] = a[minIndex];
            a[minIndex] = temp;
        }
    }

- When the user calls this code, different versions are compiled
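A sketch of what "different versions are compiled" can look like in practice: the same template is called with two element types, and the compiler generates one function per type. `std::vector` stands in for `tvector`, and `SortBoth` is a hypothetical caller.

```cpp
#include <string>
#include <vector>

template <class Type>
void SelectSort(std::vector<Type>& a)
{
    int numElts = static_cast<int>(a.size());
    // invariant: a[0]..a[k-1] in final position
    for (int k = 0; k < numElts - 1; k++)
    {
        int minIndex = k;
        for (int j = k + 1; j < numElts; j++)
        {
            if (a[j] < a[minIndex]) minIndex = j;
        }
        Type temp = a[k];          // swap min and k-th elements
        a[k] = a[minIndex];
        a[minIndex] = temp;
    }
}

// Hypothetical caller: each call makes the compiler stamp out a function.
void SortBoth(std::vector<int>& nums, std::vector<std::string>& words)
{
    SelectSort(nums);    // instantiates SelectSort<int>
    SelectSort(words);   // instantiates SelectSort<std::string>
}
```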

Some template details

- Function templates permit us to write once, use several times for several different types of vector
    - Template function “stamps out” real function
    - Maintenance is saved, code still large (why?)
- What properties must hold for vector elements?
    - Comparable using < operator
    - Elements can be assigned to each other
- Template functions capture property requirements in code
    - Part of generic programming
    - Some languages support this better than others
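The two requirements (comparable with `<`, assignable) can be met by a user-defined type as well. The sketch below uses a hypothetical `SimpleDate` struct, which is not the course library's Date class, and sorts it with `std::sort`, which relies on `operator<` the same way a templated sort would.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in type for illustration only.
struct SimpleDate
{
    int year, month, day;
};

// Supplying operator< makes SimpleDate usable by a templated sort.
// Assignment works already: the compiler generates operator= for the struct.
bool operator<(const SimpleDate& a, const SimpleDate& b)
{
    if (a.year != b.year) return a.year < b.year;
    if (a.month != b.month) return a.month < b.month;
    return a.day < b.day;
}

void SortDates(std::vector<SimpleDate>& dates)
{
    std::sort(dates.begin(), dates.end());   // relies on operator< above
}
```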

From practical to theoretical

- We want a notation for discussing differences between algorithms, avoid empirical details at first
    - Empirical studies needed in addition to theoretical studies
    - As we’ll see, theory hides some details, but still works
- Binary search: examines roughly 10 entries in a 1,000 element vector
    - What is exact relationship? How to capture “roughly”?
    - Compared to sequential/linear search?
- We use O-notation, big-Oh, to capture properties but avoid details
    - N^2 is the same as 13N^2 is the same as 13N^2 plus a lower-order term in N
    - O(N^2), in the limit everything is the same
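The "roughly 10 entries" figure can be checked by counting how many elements a binary search examines. The sketch below is one possible way to do that; `ProbesUsed` is a hypothetical name and `std::vector` stands in for `tvector`. Each probe halves the remaining range, so the count grows like log2(N), and 2^10 = 1024 is just above 1,000.

```cpp
#include <vector>

// Returns how many elements are examined while searching the sorted
// vector a for key (whether or not key is found).
int ProbesUsed(const std::vector<int>& a, int key)
{
    int low = 0;
    int high = static_cast<int>(a.size()) - 1;
    int probes = 0;
    while (low <= high)
    {
        int mid = low + (high - low) / 2;
        probes++;                          // a[mid] is examined
        if (a[mid] == key) return probes;
        if (a[mid] < key) low = mid + 1;   // discard left half
        else high = mid - 1;               // discard right half
    }
    return probes;
}
```

A sequential search for a missing key in the same vector would examine all 1,000 elements.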

Running 10^6 instructions/sec

    N              O(log N)   O(N)     O(N log N)   O(N^2)
    …              …          …        …            … min
    100,…          …          …        …            … hr
    1,000,…        …          …        …            … day
    1,000,000,…    …          … min    18.3 hr      318 centuries

What does table show? Hide?

- Can we sort a million element vector with selection sort?
    - How can we do this, what’s missing in the table?
    - What are hidden constants, low-order terms?
- Can we sort a billion-element vector? Are there other sorts?
    - We’ll see quicksort, an efficient (most of the time) method
    - O(N log N), what does this mean?
- Sorting code for different algorithms in sortall.h/sortall.cpp
    - Template functions, prototypes in .h file, implementations in .cpp file, must have both (template isn’t code!!)

Templates and function objects

- In a templated sort function vector elements must have certain properties (as noted previously)
    - Comparable using operator <
    - Assignable using operator =
    - Ok for int, string, what about Date? ClockTime?
- What if we want to sort by different criteria?
    - Sort strings by length instead of lexicographically
    - Sort students by age, grade, name, …
    - Sort stocks by price, shares traded, profit, …
        - We can’t change how operator < works
    - Alternative: write sort function that does NOT use <
    - Alternative: encapsulate comparison in parameter, pass it

Function object concept

- To encapsulate comparison (like operator <) in a parameter
    - Need convention for parameter: name and behavior
    - Other issues needed in the sort function, concentrate on being clients of the sort function rather than implementors
- Name convention: class/object has a method named compare
    - Two parameters, the vector elements being compared (might not be just vector elements, any two parameters)
- Behavior convention: compare returns an int
    - zero if elements equal
    - +1 (positive) if first > second
    - -1 (negative) if first < second

Function object example

    class StrLenComp
    {
      public:
        int compare(const string& a, const string& b) const
        // post: returns -1/+1/0 as a.length() is less than,
        //       greater than, or equal to b.length()
        {
            if (a.length() < b.length()) return -1;
            if (a.length() > b.length()) return 1;
            return 0;
        }
    };

    // to use this:
    StrLenComp scomp;
    if (scomp.compare("hello", "goodbye") < 0) …

- We can use this to sort, see strlensort.cpp
    - Call of sort: InsertSort(vec, vec.size(), scomp);
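The slides show the call `InsertSort(vec, vec.size(), scomp)` but not the sort itself. The sketch below is one way such a sort could use the `compare` convention; the signature and body are assumptions modeled on that call, not the code in sortall.h, and `std::vector` stands in for `tvector`.

```cpp
#include <string>
#include <vector>

class StrLenComp
{
  public:
    // returns -1/+1/0 as a is shorter than, longer than, or as long as b
    int compare(const std::string& a, const std::string& b) const
    {
        if (a.length() < b.length()) return -1;
        if (a.length() > b.length()) return 1;
        return 0;
    }
};

// Assumed signature: sorts a[0..size-1] using comp.compare instead of <.
template <class Type, class Comparer>
void InsertSort(std::vector<Type>& a, int size, const Comparer& comp)
{
    for (int k = 1; k < size; k++)
    {
        Type hold = a[k];
        int loc = k;
        // shift larger elements right to open a slot for hold
        while (loc > 0 && comp.compare(hold, a[loc - 1]) < 0)
        {
            a[loc] = a[loc - 1];
            loc--;
        }
        a[loc] = hold;
    }
}
```

Because the sort never mentions `<`, passing a different comparer object changes the ordering without touching the sort code.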

From smarter code to algorithm

- We’ve seen selection sort, other O(N^2) sorts include
    - Insertion sort: better on nearly sorted data, fewer comparisons, potentially more data movements (than selection)
    - Bubble sort: slow, don’t use it, but simple to describe
- Efficient sorts are trickier to code, but not too complicated
    - Often recursive as we’ll see, use divide and conquer
    - Quicksort and Mergesort are two standard examples
- Mergesort: divide and conquer
    - Divide vector in two, sort both halves, merge together
    - Merging is easier because subvectors sorted, why?
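A minimal mergesort sketch, to make the divide, sort, and merge steps concrete. The function names and the use of a temporary vector are assumptions for illustration, and `std::vector<int>` stands in for `tvector`. Merging is easy because each half is already sorted, so the smallest remaining element is always at the front of one of the two halves.

```cpp
#include <vector>

// Merge sorted ranges a[first..mid] and a[mid+1..last] into one sorted range.
void Merge(std::vector<int>& a, int first, int mid, int last)
{
    std::vector<int> merged;
    merged.reserve(last - first + 1);
    int left = first, right = mid + 1;
    while (left <= mid && right <= last)
    {
        // take the smaller front element; ties go to the left half
        if (a[right] < a[left]) merged.push_back(a[right++]);
        else merged.push_back(a[left++]);
    }
    while (left <= mid) merged.push_back(a[left++]);     // leftovers
    while (right <= last) merged.push_back(a[right++]);
    for (int k = 0; k < static_cast<int>(merged.size()); k++)
    {
        a[first + k] = merged[k];                         // copy back
    }
}

// original call is MergeSort(a, 0, a.size()-1)
void MergeSort(std::vector<int>& a, int first, int last)
{
    if (first < last)
    {
        int mid = first + (last - first) / 2;
        MergeSort(a, first, mid);        // sort left half
        MergeSort(a, mid + 1, last);     // sort right half
        Merge(a, first, mid, last);      // combine
    }
}
```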

Quicksort, an efficient sorting algorithm

- Step one, partition the vector, moving smaller elements left, larger elements right
    - Formally: choose a pivot element, all elements less than pivot moved to the left (of pivot), greater moved right
    - After partition/pivot, sort left half and sort right half

[Figure panels: original; partition on 14; partition on …]

Quicksort details

    void Quick(tvector<string> & a,int first,int last)
    // pre: first <= last
    // post: a[first] <= ... <= a[last]
    {
        int piv;
        if (first < last)
        {
            piv = Pivot(a,first,last);
            Quick(a,first,piv-1);
            Quick(a,piv+1,last);
        }
    }
    // original call is Quick(a,0,a.size()-1);

- How do we make progress towards basecase? What’s a good pivot versus a bad pivot? What changes?
    - What about the code for Pivot?
    - What about type of element in vector?

How is Pivot similar to Dutch Flag?

    int Pivot(tvector<string> & a,int first, int last)
    // post: returns piv so: k in [first..piv], a[k] <= a[piv]
    //                       k in (piv,last], a[piv] < a[k]
    {
        int k,p=first;
        string piv = a[first];
        for(k=first+1;k<=last;k++)
        {
            if (a[k] <= piv)
            {
                p++;
                Swap(a[k],a[p]);
            }
        }
        Swap(a[p],a[first]);
        return p;
    }

- Partition around a[first], can change this later, why is p initially first?
    - What is invariant?

[Figure: invariant diagram of the vector from first to last, with regions labeled <=, > and ???? and index markers p and k]
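Putting `Quick` and `Pivot` together gives a complete sort. The sketch below is a templated variant, which is one possible answer to the question about the element type; `std::vector` and `std::swap` stand in for `tvector` and `Swap`, and the template parameter is an assumption rather than the course code.

```cpp
#include <string>
#include <utility>
#include <vector>

// Partition a[first..last] around a[first]; returns the pivot's final index.
template <class Type>
int Pivot(std::vector<Type>& a, int first, int last)
{
    int p = first;
    Type piv = a[first];
    // invariant: a[first+1..p] <= piv and a[p+1..k-1] > piv
    for (int k = first + 1; k <= last; k++)
    {
        if (a[k] <= piv)
        {
            p++;
            std::swap(a[k], a[p]);
        }
    }
    std::swap(a[p], a[first]);   // move pivot between the two regions
    return p;
}

// original call is Quick(a, 0, a.size()-1)
template <class Type>
void Quick(std::vector<Type>& a, int first, int last)
{
    if (first < last)
    {
        int piv = Pivot(a, first, last);
        Quick(a, first, piv - 1);    // pivot is excluded, so each
        Quick(a, piv + 1, last);     // recursive range is smaller
    }
}
```

Since the pivot is left out of both recursive calls, every call works on a strictly smaller range, which is how the recursion reaches the base case.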

What is complexity?

- We’ve used O-notation (big-Oh) to describe algorithms
    - Binary search is O(log n)
    - Sequential search is O(n)
    - Selection sort is O(n^2)
    - Quicksort is O(n log n) on average
- What do these measures tell us about “real” performance?
    - When is selection sort better than quicksort?
    - What are the advantages of sequential search?
- Describing the complexity of algorithms rather than implementations is important and essential
    - Empirical validation of theory is important too
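Big-Oh labels can hide bad cases. As a small empirical check, the sketch below counts comparisons made by a quicksort that always pivots on the first element. On input that is already sorted the pivot is always the smallest element, so the count is (n-1) + (n-2) + … + 1 = n(n-1)/2, the same as selection sort. The function names are hypothetical and `std::vector<int>` stands in for `tvector`.

```cpp
#include <utility>
#include <vector>

// Quicksort with a[first] as pivot; comparisons counts a[k] <= piv tests.
void CountingQuick(std::vector<int>& a, int first, int last, long& comparisons)
{
    if (first >= last) return;
    int p = first;
    int piv = a[first];
    for (int k = first + 1; k <= last; k++)
    {
        comparisons++;
        if (a[k] <= piv)
        {
            p++;
            std::swap(a[k], a[p]);
        }
    }
    std::swap(a[p], a[first]);
    CountingQuick(a, first, p - 1, comparisons);
    CountingQuick(a, p + 1, last, comparisons);
}

// Comparisons used to sort the already-sorted vector 0, 1, ..., n-1.
long ComparisonsOnSorted(int n)
{
    std::vector<int> a(n);
    for (int k = 0; k < n; k++) a[k] = k;
    long comparisons = 0;
    CountingQuick(a, 0, n - 1, comparisons);
    return comparisons;
}
```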

Do it fast, do it slow, can we do it at all?

- Some problems can be solved quickly using a computer
    - Searching a sorted list
- Some problems can be solved, but it takes a long time
    - Towers of Hanoi
- Some problems can be solved, we don’t know how quickly
    - Traveling salesperson, optimal class scheduling
- Some problems can’t be solved at all using a computer
    - The halting problem, first shown to be unsolvable by Alan Turing
- The halting problem: can we write one program used to determine if an arbitrary program (any program) stops?
    - One program that reads other programs, must work for every program being checked, computability
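Towers of Hanoi is a standard example of a solvable but slow problem. The sketch below counts the moves made by the usual recursive solution: move n-1 disks aside, move the largest disk, then move the n-1 disks back on top. The count doubles (plus one) with each added disk, giving 2^n - 1 moves. `HanoiMoves` is a hypothetical name.

```cpp
// Number of single-disk moves the recursive solution makes for n disks.
// pre: n >= 0
long HanoiMoves(int n)
{
    if (n == 0) return 0;
    // n-1 disks out of the way, one move for the largest, n-1 disks back
    return HanoiMoves(n - 1) + 1 + HanoiMoves(n - 1);
}
```

At 10^6 moves per second, 20 disks take about a second, while 64 disks would take hundreds of thousands of years.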