Searching, Sorting, and Asymptotic Complexity

Searching, Sorting, and Asymptotic Complexity Lecture 13 CS2110 – Fall 2014

Prelim 1 Tuesday, March 11, 5:30pm or 7:30pm. The review sheet is on the website, and there will be a review session on Sunday, 1–3. If you have a conflict, meaning you cannot take it at 5:30 or at 7:30, contact me (or Maria Witlox) with your issue.

Readings, Homework Textbook: Chapter 4. Homework: Recall our discussion of linked lists from two weeks ago. What is the worst-case complexity for appending N items to a linked list? For testing whether the list contains X? What would be the best-case complexity for these operations? If we were going to talk about O() complexity for a list, which of these makes more sense: worst-, average-, or best-case complexity? Why?
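
For reference while working on the homework, here is a minimal sketch of the kind of singly linked list the questions assume (the class and method names are hypothetical, not from the textbook). Counting the loop iterations in each method is exactly the kind of analysis this lecture develops.

/** A minimal singly linked list (hypothetical sketch for the homework above). */
class IntList {
    private static class Node {
        int val;
        Node next;
        Node(int val) { this.val = val; }
    }

    private Node head; // first node, or null if the list is empty

    /** Append v at the end of the list. */
    public void append(int v) {
        Node n = new Node(v);
        if (head == null) { head = n; return; }
        Node p = head;
        while (p.next != null) p = p.next; // one step per node already in the list
        p.next = n;
    }

    /** Return true iff the list contains x. */
    public boolean contains(int x) {
        for (Node p = head; p != null; p = p.next) {
            if (p.val == x) return true;
        }
        return false;
    }
}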

What Makes a Good Algorithm? Suppose you have two possible algorithms or data structures that basically do the same thing; which is better? Well… what do we mean by better? Faster? Less space? Easier to code? Easier to maintain? Required for homework? How do we measure time and space for an algorithm?

Sample Problem: Searching Determine whether sorted array b contains integer v. First solution: Linear Search (check each element).

/** Return true iff v is in b. */
static boolean find(int[] b, int v) {
    for (int i = 0; i < b.length; i++) {
        if (b[i] == v) return true;
    }
    return false;
}

An equivalent version using a for-each loop:

static boolean find(int[] b, int v) {
    for (int x : b) {
        if (x == v) return true;
    }
    return false;
}

Neither version makes use of the fact that b is sorted.

Sample Problem: Searching Second solution: Binary Search. Still returns true iff v is in a. The loop keeps this invariant true: all occurrences of v are in a[low..high].

static boolean find(int[] a, int v) {
    int low = 0;
    int high = a.length - 1;
    while (low <= high) {
        int mid = (low + high) / 2;
        if (a[mid] == v) return true;
        if (a[mid] < v) low = mid + 1;
        else high = mid - 1;
    }
    return false;
}

Linear Search vs Binary Search Which one is better? Linear: easier to program. Binary: faster… isn’t it? How do we measure speed? Experiment? Proof? What inputs do we use? Simplifying assumption #1: use the size of the input rather than the input itself; for the sample search problem, the input size is n, the array size. Simplifying assumption #2: count the number of “basic steps” rather than computing exact times.

One Basic Step = One Time Unit Each of these counts as one basic step: input/output of a scalar value; accessing the value of a scalar variable, array element, or object field; assigning to a variable, array element, or object field; one arithmetic or logical operation; a method invocation (not counting argument evaluation and execution of the method body). For a conditional: the number of basic steps on the branch that is executed. For a loop: (number of basic steps in the loop body) × (number of iterations). For a method: the number of basic steps in the method body (including steps needed to prepare the stack frame).
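
As a concrete illustration (this example is not on the slide), here is the tally for a simple summation loop under those rules:

/** Sum 1..n, with an approximate basic-step tally per the rules above. */
static int sum(int n) {
    int sum = 0;                   // 1 step: assignment
    for (int i = 1; i <= n; i++) { // 1 step for the init; ~2 steps (test, increment) per iteration
        sum = sum + i;             // 2 steps (add, assign) per iteration
    }
    return sum;                    // 1 step
}
// Total: roughly 4n + 3 basic steps – proportional to n.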

Runtime vs Number of Basic Steps Is this cheating? The runtime is not the same as the number of basic steps: time per basic step varies depending on computer, compiler, details of code… Well… yes, in a way. But the number of basic steps is proportional to the actual runtime. Which is better: n or n² time? 100n or n² time? 10,000n or n² time? As n gets large, multiplicative constants become less important. Simplifying assumption #3: ignore multiplicative constants.

Using Big-O to Hide Constants We say f(n) is order of g(n) if f(n) is bounded by a constant times g(n). Notation: f(n) is O(g(n)). Formal definition: f(n) is O(g(n)) if there exist constants c and N such that for all n ≥ N, f(n) ≤ c·g(n). Roughly, f(n) is O(g(n)) means that f(n) grows like g(n) or slower, to within a constant factor. "Constant" means fixed and independent of n. Example: n² + n is O(n²). We know n ≤ n² for n ≥ 1, so n² + n ≤ 2n² for n ≥ 1; so by definition, n² + n is O(n²) for c = 2 and N = 1.
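
A claimed pair of constants (c, N) can also be spot-checked mechanically (the next slide calls such a pair a witness pair). Here is a small sketch, with names of my own choosing; note that a finite check is only evidence, not a proof:

import java.util.function.LongUnaryOperator;

/** Check f(n) <= c*g(n) for all n in N..upTo. Evidence for a witness pair, not a proof. */
static boolean checkWitness(LongUnaryOperator f, LongUnaryOperator g,
                            long c, long N, long upTo) {
    for (long n = N; n <= upTo; n++) {
        if (f.applyAsLong(n) > c * g.applyAsLong(n)) return false;
    }
    return true;
}

// Example from the slide: n² + n is O(n²) with (c, N) = (2, 1):
// checkWitness(n -> n*n + n, n -> n*n, 2, 1, 1_000_000) returns true.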

A Graphical View [Figure: plots of f(n) and c·g(n); for all n beyond N, the curve f(n) lies below c·g(n).] To prove that f(n) is O(g(n)): find N and c such that f(n) ≤ c·g(n) for all n > N. The pair (c, N) is a witness pair for proving that f(n) is O(g(n)).

Big-O Examples Claim: 100n + log n is O(n). We know log n ≤ n for n ≥ 1, so 100n + log n ≤ 101n for n ≥ 1; so by definition, 100n + log n is O(n) for c = 101 and N = 1. Claim: log_B n is O(log_A n), since log_B n = (log_B A)(log_A n) – the base of a logarithm does not matter inside O(). Question: Which grows faster: n or log n?

Big-O Examples Let f(n) = 3n² + 6n − 7. Then f(n) is O(n²), and f(n) is O(n³), … Let g(n) = 4n log n + 34n − 89. Then g(n) is O(n log n), and g(n) is O(n²). Let h(n) = 20·2ⁿ + 40n. Then h(n) is O(2ⁿ). Let a(n) = 34. Then a(n) is O(1). Only the leading term (the term that grows most rapidly) matters.

Problem-Size Examples Consider a computing device that can execute 1000 operations per second; how large a problem can we solve?

           1 second    1 minute    1 hour
n          1000        60,000      3,600,000
n log n    140         4893        200,000
n²         31          244         1897
3n²        18          144         1096
n³         10          39          153
2ⁿ         9           15          21
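
The table entries can be reproduced with a small search: for each cost function, find the largest n whose cost fits in the time budget. A sketch, assuming 1000 operations per second as above (logs taken base 2):

import java.util.function.DoubleUnaryOperator;

/** Largest n with cost(n) <= budget operations (simple linear scan). */
static long largestN(DoubleUnaryOperator cost, double budgetOps) {
    long n = 0;
    while (cost.applyAsDouble(n + 1) <= budgetOps) n++;
    return n;
}

// One second = 1000 operations:
// largestN(n -> n, 1000)                                 // 1000
// largestN(n -> n * (Math.log(n) / Math.log(2)), 1000)   // ~140
// largestN(n -> n * n, 1000)                             // 31
// largestN(n -> Math.pow(2, n), 1000)                    // 9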

Commonly Seen Time Bounds

O(1)         constant      excellent
O(log n)     logarithmic   excellent
O(n)         linear        good
O(n log n)   n log n       pretty good
O(n²)        quadratic     OK
O(n³)        cubic         maybe OK
O(2ⁿ)        exponential   too slow

Worst-Case/Expected-Case Bounds It may be difficult to determine time bounds for all imaginable inputs of size n. Simplifying assumption #4: determine the number of steps for either the worst case or the expected (average) case. Worst case: determine how much time is needed for the worst possible input of size n. Expected case: determine how much time is needed on average over all inputs of size n.
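
For instance, in a successful linear search the worst case examines all n elements, while searching for a uniformly random element that is present examines about (n + 1)/2 elements on average – both are O(n). A small simulation sketch (hypothetical, just to illustrate the distinction):

import java.util.Random;

/** Average number of elements linear search examines for a random present value. */
static double avgProbes(int n, int trials) {
    int[] b = new int[n];
    for (int i = 0; i < n; i++) b[i] = i;  // distinct values 0..n-1
    Random rng = new Random();
    long probes = 0;
    for (int t = 0; t < trials; t++) {
        int v = rng.nextInt(n);            // a value known to be in b
        for (int i = 0; i < n; i++) {
            probes++;
            if (b[i] == v) break;
        }
    }
    return (double) probes / trials;       // ≈ (n + 1) / 2
}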

Simplifying Assumptions Use the size of the input rather than the input itself – n. Count the number of “basic steps” rather than computing exact times. Ignore multiplicative constants and small inputs (order-of, big-O). Determine the number of steps for either the worst case or the expected case. These assumptions allow us to analyze algorithms effectively.

Worst-Case Analysis of Searching Linear Search:

// Return true iff v is in b.
static boolean find(int[] b, int v) {
    for (int x : b) {
        if (x == v) return true;
    }
    return false;
}

Worst-case time: O(n). Binary Search:

// Return h that satisfies b[0..h] <= v < b[h+1..].
static int bSearch(int[] b, int v) {
    int h = -1;
    int t = b.length;
    while (h != t - 1) {
        int e = (h + t) / 2;
        if (b[e] <= v) h = e;
        else t = e;
    }
    return h;
}

Always takes ~(log n + 1) iterations. Worst-case and expected times: O(log n).
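
A quick way to gain confidence in both methods is to spot-check them against each other and against the loop invariant. A test-harness sketch, assuming the find and bSearch methods written above (run with assertions enabled, java -ea):

import java.util.Random;

public static void main(String[] args) {
    Random rng = new Random();
    for (int trial = 0; trial < 1000; trial++) {
        int[] b = rng.ints(20, 0, 50).sorted().toArray(); // random sorted array
        int v = rng.nextInt(50);
        int h = bSearch(b, v);
        assert h < 0 || b[h] <= v;                   // b[0..h] <= v
        assert h + 1 >= b.length || v < b[h + 1];    // v < b[h+1..]
        assert find(b, v) == (h >= 0 && b[h] == v);  // the two searches agree
    }
}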

Comparison of linear and binary search [Figure: chart comparing the running times of linear and binary search.]

Analysis of Matrix Multiplication Multiply n-by-n matrices A and B. By convention, matrix problems are measured in terms of n, the number of rows and columns; the input size is really 2n², not n. Worst-case time: O(n³). Expected-case time: O(n³).

for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
        c[i][j] = 0;
        for (int k = 0; k < n; k++) {
            c[i][j] += a[i][k] * b[k][j];
        }
    }
}
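
For comparison (the summary at the end cites this bound), the matrix-vector product y = A·x has one fewer nested loop and hence O(n²) worst-case time – a minimal sketch:

/** y = A·x for an n-by-n matrix A: two nested loops, so O(n²) time. */
static double[] matVec(double[][] a, double[] x) {
    int n = x.length;
    double[] y = new double[n];
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            y[i] += a[i][j] * x[j];
        }
    }
    return y;
}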

Remarks Once you get the hang of this, you can quickly zero in on what is relevant for determining asymptotic complexity. Example: you can usually ignore everything that is not in the innermost loop. Why? Because the innermost loop body executes the most times, so it dominates the step count. One difficulty: determining the runtime of recursive programs, which depends on the depth of recursion.
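
For instance, a recursive binary search does O(1) work per call and halves the range on each call, so its running time is proportional to the recursion depth, O(log n). A sketch of that reasoning in code (this recursive variant is illustrative, not from the slides):

/** Recursive binary search: is v in b[low..high]? O(1) work per call;
    the range halves each call, so recursion depth – and time – is O(log n). */
static boolean rfind(int[] b, int v, int low, int high) {
    if (low > high) return false;
    int mid = (low + high) / 2; // can overflow for huge arrays; fine for a sketch
    if (b[mid] == v) return true;
    if (b[mid] < v) return rfind(b, v, mid + 1, high);
    return rfind(b, v, low, mid - 1);
}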

Why Bother with Runtime Analysis? Computers are so fast that we can do whatever we want using simple algorithms and data structures, right? Not really – data-structure/algorithm improvements can be a very big win. Scenario: A runs in n² msec, A' runs in n²/10 msec, B runs in 10·n·log n msec. Problem of size n = 10³: A takes 10³ sec ≈ 17 minutes; A' takes 10² sec ≈ 1.7 minutes; B takes 10² sec ≈ 1.7 minutes. Problem of size n = 10⁶: A takes 10⁹ sec ≈ 30 years; A' takes 10⁸ sec ≈ 3 years; B takes 2·10⁵ sec ≈ 2 days. (1 day = 86,400 sec ≈ 10⁵ sec; 1,000 days ≈ 3 years.)

Algorithms for the Human Genome Human genome = 3.5 billion nucleotides (~1 GB). At 1 base-pair instruction per μsec: n² → 388,445 years; n log n → 30.824 hours; n → 1 hour.

Limitations of Runtime Analysis Big-O can hide a very large constant (examples: selection; small problems). The specific problem you want to solve may not be the worst case (example: the Simplex method for linear programming). Your program may not be run often enough to make analysis worthwhile (example: one-shot vs. every day). You may be analyzing and improving the wrong part of the program – a very common situation; you should use profiling tools.

Summary Asymptotic complexity is used to measure the time (or space) required by an algorithm; it is a measure of the algorithm, not the problem. Searching a sorted array: linear search takes O(n) worst-case time; binary search takes O(log n) worst-case time. Matrix operations (with n = number of rows = number of columns): the matrix-vector product takes O(n²) worst-case time; matrix-matrix multiplication takes O(n³) worst-case time. More later with sorting and graph algorithms.