Computer Science 112
Fundamentals of Programming II
Searching, Sorting, and Complexity Analysis
Things to Desire in a Program
  Correctness
  Robustness
  Maintainability
  Efficiency
Measuring Efficiency
  Empirical: use a clock to get actual running times for different inputs
  Problems:
    Different machines have different running times
    Some running times are so long that it is impractical to check them
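The empirical approach can be sketched in a few lines. This is a minimal illustration, not a benchmarking tool: the helper name `time_call` and the choice of `sum` as the function under test are assumptions made for the example, and the numbers it prints will differ from machine to machine, which is exactly the first problem listed above.

```python
import time

def time_call(func, *args, repeats=3):
    """Return the best wall-clock time in seconds over several runs of func(*args).

    time_call is a hypothetical helper written for this sketch.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best

if __name__ == "__main__":
    # Time the same operation on inputs of growing size.
    for size in (1_000, 10_000, 100_000):
        data = list(range(size))
        print(size, time_call(sum, data))
```

Taking the best of several runs is one common way to reduce noise from other processes; it does not make the result machine independent.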
Measuring Efficiency
  Analytical: use pencil and paper to determine the abstract amount of work that a program does for different inputs
  Advantages:
    Machine independent
    Can predict running times of programs that are impractical to run
Complexity Analysis
  Pick an instruction that will run most often in the code
  Determine the number of times this instruction will be executed as a function of the size of the input data
  Focus on abstract units of work, not actual running time
Example: Search for the Minimum

    def ourMin(lyst):
        minSoFar = lyst[0]
        for item in lyst:
            minSoFar = min(minSoFar, item)
        return minSoFar

  We focus on the assignment (=) inside the loop and ignore the other instructions.
    For a list of length 1, 1 assignment
    For a list of length 2, 2 assignments
    ...
    For a list of length n, n assignments
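The count can be checked by instrumenting the loop. The sketch below mirrors the minimum search above and adds a counter; the name `count_assignments` is an assumption introduced for the example.

```python
def count_assignments(lyst):
    """Run the minimum search and count the assignments made inside the loop.

    count_assignments is a hypothetical name used only for this sketch.
    Returns (minimum, number_of_assignments).
    """
    min_so_far = lyst[0]
    assignments = 0
    for item in lyst:
        min_so_far = min(min_so_far, item)
        assignments += 1  # one unit of work per loop iteration
    return min_so_far, assignments

if __name__ == "__main__":
    print(count_assignments([5, 3, 8]))  # (3, 3): three items, three assignments
```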
Big-O Notation
  Big-O notation expresses the amount of work a program does as a function of the size of the input
  O(N) stands for order of magnitude N, or order of N for short
  Search for the minimum is O(N), where N is the size of the input (a list, a number, etc.)
Common Orders of Magnitude
  Constant      O(k)
  Logarithmic   O(log2 n)
  Linear        O(n)
  Quadratic     O(n^2)
  Exponential   O(k^n)
[Figure: graphs of O(n) and O(n^2)]
Common Orders of Magnitude

     n  O(log2 n)    O(n)    O(n^2)  O(2^n)
     2          1       2         4  4
     4          2       4        16  16
     8          3       8        64  256
    16          4      16       256  65536
    32          5      32      1024  4294967296
    64          6      64      4096  20 digits
   128          7     128     16384  yikes!
   256          8     256     65536
   512          9     512    262144
  1024         10    1024   1048576
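The table entries can be recomputed directly. The sketch below assumes n is a power of two, as in the table, so that log2 n is a whole number; `growth_row` is a name made up for the example.

```python
import math

def growth_row(n):
    """Return (n, log2 n, n^2, 2^n) for a power-of-two n.

    growth_row is a hypothetical helper for this sketch.
    """
    return (n, int(math.log2(n)), n ** 2, 2 ** n)

if __name__ == "__main__":
    for n in (2, 4, 8, 16, 32, 64):
        size, log_n, squared, exponential = growth_row(n)
        # Report only the digit count of 2^n once it gets long.
        digits = len(str(exponential))
        print(size, log_n, squared, exponential if digits <= 10 else f"{digits} digits")
```

Python integers have arbitrary precision, so 2 ** n is exact even when the result no longer fits in a machine word.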
Approximations
  Suppose an algorithm requires exactly 3N + 3 steps
  As N gets very large, the difference between N and N + K becomes negligible (where K is a constant)
  As N gets very large, a constant factor (N / K or N * K) does not change the rate of growth, so it is dropped as well
  Thus 3N + 3 steps is O(N)
  Use the highest-degree term in a polynomial and drop the others
    (N^2 - N) / 2  ->  N^2
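One way to see why the lower-order term and the constant factor can be dropped is to compare the exact step count with the dominant term as N grows. The sketch below uses the polynomial (N^2 - N) / 2 from the slide; the function names are assumptions for the example.

```python
def exact_steps(n):
    """Exact step count (n^2 - n) / 2; always a whole number."""
    return (n * n - n) // 2

def ratio_to_n_squared(n):
    """Ratio of the exact count to the dominant term n^2."""
    return exact_steps(n) / (n * n)

if __name__ == "__main__":
    for n in (10, 100, 1_000, 1_000_000):
        print(n, ratio_to_n_squared(n))
```

The ratio settles toward the constant 1/2 as n grows, so the exact count and n^2 differ only by a constant factor for large n, and both are described as order n^2.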
Example Approximations

     n    O(n)  O(n) + 2    O(n^2)  O(n^2) + n
     2       2         4         4           6
     4       4         6        16          20
     8       8        10        64          72
    16      16        18       256         272
    32      32        34      1024        1056
    64      64        66      4096        4160
   128     128       130     16384       16512
   256     256       258     65536       65792
   512     512       514    262144      262656
  1024    1024      1026   1048576     1049600
Example: Sequential Search

    def ourIn(target, lyst):
        for item in lyst:
            if item == target:
                return True     # Found target
        return False            # Target not there

  Which instruction do we pick?
  How fast is its rate of growth as a function of n?
  Is there a worst case and a best case? An average case?
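The questions above can be explored by counting the == tests for different targets. The sketch below is an instrumented copy of the sequential search; `count_equality_tests` is a hypothetical name, and the sample list is the one used later in the slides.

```python
def count_equality_tests(target, lyst):
    """Sequential search that also reports how many == tests were made.

    Returns (found, tests). The name is an assumption for this sketch.
    """
    tests = 0
    for item in lyst:
        tests += 1
        if item == target:
            return True, tests
    return False, tests

if __name__ == "__main__":
    data = [34, 41, 56, 63, 72, 89, 95]
    print(count_equality_tests(34, data))   # target is first: 1 test
    print(count_equality_tests(95, data))   # target is last: 7 tests
    print(count_equality_tests(100, data))  # target is absent: 7 tests
```

In this sketch the best case costs one test regardless of the list length, while the worst case costs one test per item.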
Improving Search
  Assume data are in ascending order
  Go to the midpoint and look there
  If the target is not there, repeat the search to the left or to the right of the midpoint

  target: 89

    value:  34  41  56  63  72  89  95
    index:   0   1   2   3   4   5   6

  left = 0, right = 6, midpoint = 3
Improving Search (continued)
  The target 89 is greater than the 63 at midpoint 3, so the search repeats to the right of the midpoint

  target: 89

    value:  34  41  56  63  72  89  95
    index:   0   1   2   3   4   5   6

  left = 4, right = 6, midpoint = 5
Example: Binary Search

    def ourIn(target, sortedLyst):
        left = 0
        right = len(sortedLyst) - 1
        while left <= right:
            midpoint = (left + right) // 2
            if target == sortedLyst[midpoint]:
                return True
            elif target < sortedLyst[midpoint]:
                right = midpoint - 1
            else:
                left = midpoint + 1
        return False
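To connect the code with the earlier picture of the search for 89, the sketch below records each midpoint that the loop visits. It follows the same logic as the binary search above; `probe_sequence` is a name invented for the example.

```python
def probe_sequence(target, sorted_lyst):
    """Return the list of midpoints examined by binary search, in order.

    probe_sequence is a hypothetical helper for this sketch.
    """
    left = 0
    right = len(sorted_lyst) - 1
    probes = []
    while left <= right:
        midpoint = (left + right) // 2
        probes.append(midpoint)
        if target == sorted_lyst[midpoint]:
            break                  # found the target
        elif target < sorted_lyst[midpoint]:
            right = midpoint - 1   # continue in the left half
        else:
            left = midpoint + 1    # continue in the right half
    return probes

if __name__ == "__main__":
    data = [34, 41, 56, 63, 72, 89, 95]
    print(probe_sequence(89, data))  # [3, 5]
```

For the target 89 the midpoints are 3 and then 5, matching the two steps shown in the diagram.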
Analysis

    while left <= right:
        midpoint = (left + right) // 2
        if target == sortedLyst[midpoint]:
            return True
        elif target < sortedLyst[midpoint]:
            right = midpoint - 1
        else:
            left = midpoint + 1

  How many times will == be executed in the worst case?
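One way to investigate the question is to count the == tests for a target that is larger than every item, which forces the loop to keep moving into the larger remaining half. The sketch below assumes that this is a worst-case input for the loop as written; `worst_case_equality_tests` is a hypothetical name.

```python
def worst_case_equality_tests(n):
    """Count the == tests binary search makes on n sorted items
    when the target is larger than all of them (an assumed worst case)."""
    sorted_lyst = list(range(n))
    target = n  # not in the list, larger than every item
    left = 0
    right = n - 1
    tests = 0
    while left <= right:
        midpoint = (left + right) // 2
        tests += 1
        if target == sorted_lyst[midpoint]:
            break
        elif target < sorted_lyst[midpoint]:
            right = midpoint - 1
        else:
            left = midpoint + 1
    return tests

if __name__ == "__main__":
    for n in (7, 8, 1000, 1024):
        print(n, worst_case_equality_tests(n))
```

For the sizes tried here the count equals the number of times n can be halved (rounding down) before reaching zero, that is floor(log2 n) + 1, which grows logarithmically rather than linearly.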
Sorting a List

    value:  89  56  63  72  41  34  95
    index:   0   1   2   3   4   5   6

  sort

    value:  34  41  56  63  72  89  95
    index:   0   1   2   3   4   5   6
Selection Sort
  For each position i in the list
    Select the smallest element from i to n - 1
    Swap it with the ith one
Trace
  Step 1: find the smallest element (i = 0, smallest at index 5)

    value:  89  56  63  72  41  34  95
    index:   0   1   2   3   4   5   6

  Step 2: swap it with the element at position i (here, the first element)

    value:  34  56  63  72  41  89  95
    index:   0   1   2   3   4   5   6

  Step 3: advance i (now i = 1) and go to step 1

    value:  34  56  63  72  41  89  95
    index:   0   1   2   3   4   5   6
Design of Selection Sort

    for each i from 0 to n - 1
        minIndex = minInRange(lyst, i, n)
        if minIndex != i
            swap(lyst, i, minIndex)

  minInRange returns the index of the smallest element
  swap exchanges the elements at the specified positions
Implementation

    def selectionSort(lyst):
        n = len(lyst)
        for i in range(n):
            minIndex = minInRange(lyst, i, n)
            if minIndex != i:
                swap(lyst, i, minIndex)

    def minInRange(lyst, i, n):
        minValue = lyst[i]
        minIndex = i
        for j in range(i, n):
            if lyst[j] < minValue:
                minValue = lyst[j]
                minIndex = j
        return minIndex

    def swap(lyst, i, j):
        lyst[i], lyst[j] = lyst[j], lyst[i]
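To compare the implementation with the earlier trace, the sketch below runs selection sort on a copy of the list and records its state after each pass. It is an inlined variant written for illustration, not the code above: `selection_sort_trace` is a hypothetical name, and the inner scan starts at i + 1 instead of i, which does not change which element is selected.

```python
def selection_sort_trace(lyst):
    """Selection-sort a copy of lyst and return the list state after each pass."""
    items = list(lyst)   # work on a copy so the caller's list is untouched
    snapshots = []
    n = len(items)
    for i in range(n):
        # Find the index of the smallest element in positions i .. n - 1.
        min_index = i
        for j in range(i + 1, n):
            if items[j] < items[min_index]:
                min_index = j
        # Swap it into position i if it is not already there.
        if min_index != i:
            items[i], items[min_index] = items[min_index], items[i]
        snapshots.append(list(items))
    return snapshots

if __name__ == "__main__":
    for state in selection_sort_trace([89, 56, 63, 72, 41, 34, 95]):
        print(state)
```

The first recorded state is 34 56 63 72 41 89 95, the same list shown after step 2 of the trace.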
Analysis of Selection Sort
  The main loop runs n times
  Thus, the function minInRange runs n times
  Within the function minInRange, a loop runs n - i times
Analysis of Selection Sort (continued)
  Overall, the number of comparisons performed in function minInRange is
    (n - 1) + (n - 2) + (n - 3) + ... + 1 = (n^2 - n) / 2
  which is O(n^2)
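The sum can be checked by counting loop iterations without sorting anything. In the sketch below, `count_comparisons` and its `start_offset` parameter are assumptions for the example: an offset of 1 starts each scan at i + 1 and gives the count in the formula above, while an offset of 0 starts at i, as the minInRange code in the slides does, and adds one comparison per pass.

```python
def count_comparisons(n, start_offset=1):
    """Count the < comparisons made by selection sort's scans on n items.

    start_offset=1: each scan covers i + 1 .. n - 1 (the count in the formula).
    start_offset=0: each scan covers i .. n - 1 (as in the minInRange code).
    """
    comparisons = 0
    for i in range(n):
        for _ in range(i + start_offset, n):
            comparisons += 1
    return comparisons

if __name__ == "__main__":
    n = 7
    print(count_comparisons(n), (n * n - n) // 2)     # 21 21
    print(count_comparisons(n, 0), (n * n + n) // 2)  # 28 28
```

Either way the count is dominated by the n^2 term, so both variants are O(n^2).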
For Wednesday
  Finding Faster Algorithms