Efficiency of Algorithms Csci 107 Lecture 6-7. Topics –Data cleanup algorithms Copy-over, shuffle-left, converging pointers –Efficiency of data cleanup.


Efficiency of Algorithms Csci 107 Lecture 6-7

Topics
–Data cleanup algorithms: copy-over, shuffle-left, converging pointers
–Efficiency of the data cleanup algorithms
–Order of magnitude: Θ(1), Θ(n), Θ(n²)
–Binary search

(Time) Efficiency of an Algorithm
–Worst-case efficiency: the maximum number of steps that an algorithm can take for any collection of data values.
–Best-case efficiency: the minimum number of steps that an algorithm can take for any collection of data values.
–Average-case efficiency: the efficiency averaged over all possible inputs; this requires assuming a distribution of the input, and we normally assume a uniform distribution (all inputs equally probable).
If the input has size n, efficiency is a function of n.

Order of Magnitude
Worst case of sequential search: 3n+5 comparisons.
–Are these constants accurate? Can we ignore them?
Simplification: ignore the constants and look only at the order of magnitude.
–n, 0.5n, 2n, 4n, 3n+5, 2n+100, 0.1n+3, … are all linear; we say that their order of magnitude is n.
–3n+5 is order of magnitude n: 3n+5 = Θ(n)
–2n+100 is order of magnitude n: 2n+100 = Θ(n)
–0.1n+3 is order of magnitude n: 0.1n+3 = Θ(n)

Data Cleanup Algorithms
What are they? A systematic strategy for removing errors from data.
Why are they important? Errors occur in all real computing situations.
How are they related to the search algorithm? To remove errors from a series of values, each value must be examined to determine whether it is an error.
E.g., suppose we have a list d of data values from which we want to remove all the zeroes (they mark errors) and pack the good values to the left. Legit is the number of good values remaining when we are done.
(Figure: a sample list d1, d2, …, d8 with a legit counter.)

Data Cleanup: Copy-Over Algorithm
Idea: scan the list from left to right and copy non-zero values to a new list.

Copy-Over Algorithm (Fig 3.2)
Variables: n, A1, …, An, newposition, left, B1, …, Bn
Get values for n and the list of n values A1, A2, …, An
Set left to 1
Set newposition to 1
While left ≤ n do
  If A_left is non-zero then
    Copy A_left into B_newposition (i.e., into position newposition in the new list)
    Increase left by 1
    Increase newposition by 1
  Else increase left by 1
Stop
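The copy-over pseudocode translates almost line for line into Python. A sketch (Python lists are 0-based, unlike the slides' A1…An; the explicit index variables are replaced by a loop and `append`):

```python
def copy_over(a):
    """Copy-over data cleanup: copy the non-zero values into a new list B.

    Returns the new list; its length plays the role of legit.
    """
    b = []                  # the new list B, uses up to n extra space
    for value in a:         # 'left' sweeps the original list A
        if value != 0:      # keep only legitimate (non-zero) entries
            b.append(value) # copy A_left into B_newposition
    return b

print(copy_over([0, 24, 16, 0, 36, 42, 23, 21, 0, 27]))
# [24, 16, 36, 42, 23, 21, 27]
```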

Efficiency of Copy-Over
–Best case: all values are zero, so no copying and no extra space.
–Worst case: no zero values, so n elements are copied and n extra spaces are used.
  Time: Θ(n)
  Extra space: n

Data Cleanup: The Shuffle-Left Algorithm
Idea:
–Go over the list from left to right. Every time we see a zero, shift all subsequent elements one position to the left.
–Keep track of the number of legitimate (non-zero) entries.
How does this work? How many loops do we need?

Shuffle-Left Algorithm (Fig 3.1)
Variables: n, A1, …, An, legit, left, right
1 Get values for n and the list of n values A1, A2, …, An
2 Set legit to n
3 Set left to 1
4 Set right to 2
5 Repeat steps 6–14 until left > legit
6   If A_left ≠ 0 then
7     Increase left by 1
8     Increase right by 1
9   Else
10    Reduce legit by 1
11    Repeat steps 12–13 until right > n
12      Copy A_right into A_(right−1)
13      Increase right by 1
14    Set right to left + 1
15 Stop
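As a Python sketch of the same idea (an in-place, 0-based rendering rather than a line-by-line transcription of Fig 3.1; the inner loop shifts a whole block, which is where the Θ(n²) worst case comes from):

```python
def shuffle_left(a):
    """Shuffle-left data cleanup, done in place on list a.

    Returns legit, the number of good (non-zero) values; the cleaned
    data occupies a[0:legit] afterwards.
    """
    n = len(a)
    legit = n
    left = 0
    while left < legit:
        if a[left] != 0:
            left += 1
        else:
            legit -= 1
            # shift every element after 'left' one position to the left
            for right in range(left + 1, n):
                a[right - 1] = a[right]
    return legit

data = [0, 24, 16, 0, 36, 42, 23, 21, 0, 27]
legit = shuffle_left(data)
print(data[:legit])   # [24, 16, 36, 42, 23, 21, 27]
```

Note that, unlike copy-over, no second list is needed, but each zero costs a full Θ(n) shift.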

Exercising the Shuffle-Left Algorithm
(Figure: a trace of shuffle-left on a sample list d1, d2, …, d8, showing legit decreasing as each zero is removed.)

Efficiency of Shuffle-Left
Space: no extra space (except a few variables).
Time:
–Best case: no zero values, so no copying: order of n = Θ(n)
–Worst case: all values are zero, so every element requires copying n−1 values one position to the left: n × (n−1) = n² − n = order of n² = Θ(n²) (why?)
–Average case: half of the values are zero: n/2 × (n−1) = (n² − n)/2 = order of n² = Θ(n²)

Data Cleanup: The Converging-Pointers Algorithm
Idea:
–One finger moves left to right, one moves right to left.
–Move the left finger over non-zero values.
–If we encounter a zero value, copy the element at the right finger into this position and shift the right finger one to the left.

Converging-Pointers Algorithm (Fig 3.3)
Variables: n, A1, …, An, legit, left, right
1 Get values for n and the list of n values A1, A2, …, An
2 Set legit to n
3 Set left to 1
4 Set right to n
5 Repeat steps 6–10 until left ≥ right
6   If the value of A_left ≠ 0 then increase left by 1
7   Else
8     Reduce legit by 1
9     Copy the value of A_right into A_left
10    Reduce right by 1
11 If A_left = 0 then reduce legit by 1
12 Stop
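A 0-based Python sketch of Fig 3.3 (each zero now costs one copy instead of a whole block shift):

```python
def converging_pointers(a):
    """Converging-pointers data cleanup, done in place on list a.

    Returns legit; the cleaned data occupies a[0:legit].
    """
    legit = len(a)
    left = 0
    right = len(a) - 1
    while left < right:
        if a[left] != 0:
            left += 1
        else:
            legit -= 1
            a[left] = a[right]   # one copy, instead of shifting a block
            right -= 1
    # pointers have met: the element they meet on may itself be a zero
    if a and a[left] == 0:
        legit -= 1
    return legit

data = [0, 24, 16, 0, 36, 42, 23, 21, 0, 27]
legit = converging_pointers(data)
print(data[:legit])   # [27, 24, 16, 21, 36, 42, 23]
```

One design consequence worth noticing: the good values survive, but their original order does not (the first zero is overwritten by the last element), whereas shuffle-left preserves order.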

Exercising the Converging-Pointers Algorithm
(Figure: a trace of converging pointers on a sample list d1, d2, …, d8, showing legit decreasing as each zero is overwritten.)

Efficiency of the Converging-Pointers Algorithm
Space: no extra space used (except a few variables).
Time:
–Best case: no zero values, so no copying: order of n = Θ(n)
–Worst case: all values are zero, so one copy at each step: n−1 copies, order of n = Θ(n)
–Average case: half of the values are zero: n/2 copies, order of n = Θ(n)

Data Cleanup Algorithms: Summary
Copy-over
–Worst case: time Θ(n), extra space n
–Best case: time Θ(n), no extra space
Shuffle-left
–Worst case: time Θ(n²), no extra space
–Best case: time Θ(n), no extra space
Converging pointers
–Worst case: time Θ(n), no extra space
–Best case: time Θ(n), no extra space

Order of magnitude Θ(n²)
Any algorithm that does cn² work, for any constant c:
–2n² is order of magnitude n²: 2n² = Θ(n²)
–0.5n² is order of magnitude n²: 0.5n² = Θ(n²)
–100n² is order of magnitude n²: 100n² = Θ(n²)

Another Example
Problem: suppose we have n cities, and the distances between the cities are stored in a table where entry [i,j] stores the distance from city i to city j.
–How many distances in total?
–An algorithm to write out these distances:
  For each row 1 through n do
    For each column 1 through n do
      Print out the distance in this row and column
–Analysis?
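The nested loops above are the canonical Θ(n²) pattern. A small Python sketch with a made-up 3-city table (the distances are invented for illustration):

```python
def print_distances(dist):
    """Print all n*n entries of an n-by-n distance table.

    Two nested loops of n iterations each: Θ(n²) work in total.
    """
    n = len(dist)
    count = 0
    for i in range(n):          # for each row (city i)
        for j in range(n):      # for each column (city j)
            print(f"distance from city {i} to city {j}: {dist[i][j]}")
            count += 1
    return count

# hypothetical distances between 3 cities
table = [[0, 5, 9],
         [5, 0, 3],
         [9, 3, 0]]
print(print_distances(table))   # n = 3 cities, so 9 distances printed
```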

Comparison of Θ(n) and Θ(n²)
Θ(n): n, 2n+5, 0.01n, 100n, 3n+10, …
Θ(n²): n², 10n², 0.01n², n²+3n, n²+10, …
We do not distinguish between constants… then why do we distinguish between n and n²?
–Compare the shapes: n² grows much faster than n.
–Anything that is order of magnitude n² will eventually be larger than anything that is of order n, no matter what the constant factors are.
–Fundamentally, n² is more time-consuming than n: Θ(n²) is larger (less efficient) than Θ(n).
–0.1n² is larger than 10n (for large enough n); n² is larger than 1000n (for large enough n).

The Tortoise and the Hare
Does algorithm efficiency matter? …just buy a faster machine!
Example:
–Pentium Pro: 1 GHz (10^9 instructions per second), $2000
–Cray computer: 10,000 GHz (10^13 instructions per second), $30 million
Run a Θ(n) algorithm on the Pentium and a Θ(n²) algorithm on the Cray. For what values of n is the Pentium faster?
–For n > 10,000 the Pentium leaves the Cray in the dust.
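The crossover point follows from the speeds quoted on the slide: the Pentium wins when n / 10^9 < n² / 10^13, i.e. when n > 10^4 = 10,000. A quick check in Python:

```python
# Θ(n) algorithm on the slow machine vs. Θ(n²) algorithm on the fast one,
# using the instruction rates from the slide (10**9 vs 10**13 instr/sec).
def pentium_time(n):
    return n / 10**9        # Θ(n) work at 10^9 instructions/second

def cray_time(n):
    return n**2 / 10**13    # Θ(n²) work at 10^13 instructions/second

for n in [1_000, 10_000, 100_000, 1_000_000]:
    winner = "Pentium" if pentium_time(n) < cray_time(n) else "Cray"
    print(f"n = {n:>9}: {winner} is faster")
```

(This counts one "step" as one instruction, ignoring constant factors, which is exactly the order-of-magnitude simplification the lecture makes.)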

Searching
Problem: find a target in a list of values.
Sequential search:
–Best case: Θ(1) comparisons, the target is found immediately.
–Worst case: Θ(n) comparisons, the target is not found.
–Average case: Θ(n) comparisons, the target is found in the middle.
Can we do better? No… unless we have the input list in sorted order.
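A minimal Python sketch of sequential search, returning the (0-based) position of the target or −1 when it is absent:

```python
def sequential_search(a, target):
    """Return the index of target in list a, or -1 if absent."""
    for i, value in enumerate(a):
        if value == target:
            return i        # best case: found at the front, Θ(1)
    return -1               # worst case: examined all n values, Θ(n)

print(sequential_search([7, 3, 9, 4], 9))   # 2
print(sequential_search([7, 3, 9, 4], 5))   # -1
```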

Searching a Sorted List
Problem: find a target in a sorted list.
–How can we exploit that the list is sorted, and come up with an algorithm faster than sequential search in the worst case?
–How do we search in a phone book?
–Can we come up with an algorithm? Check the middle value: if it is smaller than the target, go right; otherwise, go left.

Binary Search
Get values for the list A1, A2, …, An, n, and target
Set start = 1, set end = n
Set found = NO
Repeat until ??
  Set m = the middle position between start and end
  If target = A_m then
    Print: target found at position m
    Set found = YES
  Else if target < A_m then set end = m − 1
  Else set start = m + 1
If found = NO then print "Not found"
End
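A 0-based Python sketch of the pseudocode above; the loop condition spells out one answer to the "repeat until ??" question:

```python
def binary_search(a, target):
    """Binary search on a sorted list; return a 0-based index or -1."""
    start, end = 0, len(a) - 1
    while start <= end:          # loop ends when found or start > end
        m = (start + end) // 2   # middle position between start and end
        if a[m] == target:
            return m
        elif target < a[m]:
            end = m - 1          # target can only be in the left half
        else:
            start = m + 1        # target can only be in the right half
    return -1

sorted_list = [3, 7, 11, 15, 21, 29, 35]
print(binary_search(sorted_list, 21))   # 4
print(binary_search(sorted_list, 4))    # -1: not in the list
```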

Efficiency of Binary Search
What is the best case? What is the worst case?
–Initially the size of the list is n.
–After the first iteration through the repeat loop, if the target is not found, then either start = m+1 or end = m−1, so the size of the list on which we search is about n/2.
–Every time through the repeat loop, the size of the list is halved: n, n/2, n/4, …
–How many times can a number be halved before it reaches 1?
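The halving question can be answered empirically; the count below is the worst-case number of binary-search iterations, i.e. roughly log₂(n):

```python
def halvings(n):
    """Count how many times n can be halved before it reaches 1."""
    count = 0
    while n > 1:
        n //= 2          # halve the remaining list size
        count += 1
    return count

print(halvings(8))       # 8 -> 4 -> 2 -> 1: three halvings
print(halvings(1000))    # 9, i.e. floor(log2(1000))
```

So binary search on a sorted list of n values takes Θ(log₂ n) comparisons in the worst case, versus Θ(n) for sequential search.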