Today
- Data cleanup algorithms
  - Copy-over
  - Shuffle-left
  - Converging pointers
- Algorithm efficiency
  - Efficiency of data cleanup algorithms
- Reading: Chapter 3

Comparing Algorithms
- Algorithm
  - Design
  - Correctness
  - Efficiency
  - Also: clarity, elegance, ease of understanding
- There are many ways to solve a problem
  - Conceptually
  - Also, different ways to write pseudocode for the same conceptual idea
- How to compare algorithms?

Efficiency of Algorithms
- Efficiency: the amount of resources used by an algorithm
  - Space (number of variables)
  - Time (number of instructions)
- When designing an algorithm, you must be aware of its use of resources
- If there is a choice, pick the more efficient algorithm!

Efficiency of Algorithms
- Does efficiency matter? Computers are so fast these days…
- Yes, efficiency matters a lot!
- There are problems (actually a lot of them) for which all known algorithms are so inefficient that they are impractical
- Remember the shortest-path-through-all-cities problem from Lab 1…

Data Cleanup Algorithms
- What are they?
  - A systematic strategy for removing errors from data.
- Why are they important?
  - Errors occur in all real computing situations.
- How are they related to the search algorithm?
  - To remove errors from a series of values, each value must be examined to determine whether it is an error.
- Example: suppose we have a list d of data values from which we want to remove all the zeroes (they mark errors) and pack the good values to the left. legit is the number of good values remaining when we are done.
[Figure: the list d1, d2, …, d8 and the value of legit]

Data Cleanup: Copy-Over Algorithm
- Idea: scan the list from left to right and copy the non-zero values to a new list

Copy-Over Algorithm (Fig 3.2)
  Get values for n and the list of n values A[1], A[2], …, A[n]
  Set left to 1
  Set newposition to 1
  While left <= n do
    If A[left] is non-zero
      Copy A[left] into B[newposition] (copy it into position newposition in the new list)
      Increase left by 1
      Increase newposition by 1
    Else increase left by 1
  Stop

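The copy-over pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the textbook's code: the function name `copy_over` and the sample list are made up for the example, indices are 0-based (the slide counts from 1), and `append` stands in for the explicit newposition counter.

```python
def copy_over(a):
    """Return a new list holding the non-zero values of `a`, in their original order."""
    n = len(a)
    b = []        # the new list B
    left = 0      # 0-based index; the slide's pseudocode starts at 1
    while left < n:
        if a[left] != 0:
            # Appending plays the role of "copy A[left] into B[newposition]
            # and increase newposition by 1".
            b.append(a[left])
        left += 1  # both branches of the pseudocode advance left
    return b


# Hypothetical sample data: zeroes mark errors.
data = [0, 24, 16, 0, 36, 42, 23, 21, 0, 27]
cleaned = copy_over(data)
print(cleaned)       # [24, 16, 36, 42, 23, 21, 27]
print(len(cleaned))  # 7 good values
```

The original list is left untouched, which is why this approach needs extra space for the new list.
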
Data Cleanup: The Shuffle-Left Algorithm
- Idea: go over the list from left to right. Every time we see a zero, shift all subsequent elements one position to the left.
- Keep track of the number of legitimate (non-zero) entries
- How does this work?
- How many loops do we need?

Shuffle-Left Algorithm (Fig 3.1)
1  Get values for n and the list of n values A[1], A[2], …, A[n]
2  Set legit to n
3  Set left to 1
4  Set right to 2
5  Repeat steps 6-14 until left > legit
6    if A[left] ≠ 0
7      Increase left by 1
8      Increase right by 1
9    else
10     Reduce legit by 1
11     Repeat steps 12-13 until right > n
12       Copy A[right] into A[right-1]
13       Increase right by 1
14     Set right to left + 1
15 Stop

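A rough Python rendering of the shuffle-left steps above, assuming 0-based indexing (so left starts at 0 and right at 1). The function name `shuffle_left` and the sample data are illustrative choices, not part of the slides.

```python
def shuffle_left(a):
    """Pack the non-zero values of `a` to the left, in place, and return legit."""
    n = len(a)
    legit = n
    left = 0    # slide: left = 1
    right = 1   # slide: right = 2
    while left < legit:              # slide: repeat until left > legit
        if a[left] != 0:
            left += 1
            right += 1
        else:
            legit -= 1
            # Shift everything to the right of the zero one position left.
            while right < n:         # slide: repeat until right > n
                a[right - 1] = a[right]
                right += 1
            right = left + 1
    return legit


# Hypothetical sample data: zeroes mark errors.
data = [0, 24, 16, 0, 36, 42, 23, 21, 0, 27]
legit = shuffle_left(data)
print(legit)         # 7
print(data[:legit])  # [24, 16, 36, 42, 23, 21, 27]
```

Only the first legit positions are meaningful afterwards; the tail of the list holds leftover copies.
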
Exercising the Shuffle-Left Algorithm
[Figure: the list d1, d2, …, d8 and the value of legit]

Data Cleanup: The Converging-Pointers Algorithm
- Idea: one finger moves left to right, one moves right to left
- Move the left finger over non-zero values
- If it encounters a zero value, then
  - Copy the element at the right finger into this position
  - Shift the right finger to the left

Converging Pointers Algorithm (Fig 3.3)
1  Get values for n and the list of n values A[1], A[2], …, A[n]
2  Set legit to n
3  Set left to 1
4  Set right to n
5  Repeat steps 6-10 until left ≥ right
6    If the value of A[left] ≠ 0 then increase left by 1
7    Else
8      Reduce legit by 1
9      Copy the value of A[right] to A[left]
10     Reduce right by 1
11 If A[left] = 0 then reduce legit by 1
12 Stop

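The converging-pointers steps above might look like this in Python. It is a sketch under a few assumptions: indices are 0-based, the name `converging_pointers` and the sample data are invented for illustration, and the empty-list guard is an addition that the pseudocode does not contain.

```python
def converging_pointers(a):
    """Pack the non-zero values of `a` into its first legit positions, in place."""
    n = len(a)
    if n == 0:
        return 0            # guard added for this sketch; not in the pseudocode
    legit = n
    left = 0                # slide: left = 1
    right = n - 1           # slide: right = n
    while left < right:     # slide: repeat until left >= right
        if a[left] != 0:
            left += 1
        else:
            legit -= 1
            a[left] = a[right]   # overwrite the zero with the rightmost value
            right -= 1
    # The pointers have met; the value they point at has not been checked yet.
    if a[left] == 0:
        legit -= 1
    return legit


# Hypothetical sample data: zeroes mark errors.
data = [0, 24, 16, 0, 36, 42, 23, 21, 0, 27]
legit = converging_pointers(data)
print(legit)         # 7
print(data[:legit])  # [27, 24, 16, 21, 36, 42, 23]
```

Unlike copy-over and shuffle-left, this version does not preserve the original order of the good values.
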
Exercising the Converging Pointers Algorithm
[Figure: the list d1, d2, …, d8 and the value of legit]

Efficiency of Algorithms
- How to measure time efficiency?
- Running time: let it run and see how long it takes
  - On what machine?
  - On what inputs?
- We want a measure of time efficiency that is independent of machine, speed, etc.
  - Look at an algorithm's pseudocode and estimate its running time
  - Look at the pseudocode of two algorithms and compare them

Time Efficiency
- (Time) efficiency of an algorithm: the number of pseudocode instructions (steps) executed
- Is this accurate?
  - Not all instructions take the same amount of time…
  - But it is a good approximation of running time in most cases
- Time efficiency depends on input
  - Example: the sequential search algorithm
  - In the best case, how fast can the algorithm halt?
  - In the worst case, how fast can the algorithm halt?

Efficiency of an Algorithm
- Worst-case efficiency is the maximum number of steps that an algorithm can take for any collection of data values.
- Best-case efficiency is the minimum number of steps that an algorithm can take for any collection of data values.
- Average-case efficiency
  - the efficiency averaged over all possible inputs
  - must assume a distribution of the input
  - we normally assume a uniform distribution (all keys are equally probable)
- If the input has size n, efficiency will be a function of n

Worst Case Efficiency for Sequential Search

Step  Instruction                                               Times executed
1     Get the value of target, n, and the list of n values      1
2     Set index to 1                                            1
3     Set found to false                                        1
4     Repeat steps 5-8 until found = true or index > n          n
5       if the value of list[index] = target then               n
6         Output the index                                      0
7         Set found to true                                     0
8       else Increment the index by 1                           n
9     if not found then                                         1
10      Print a message that target was not found               0
11    Stop                                                      1
      Total                                                     3n + 5

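To make the tally above concrete, here is a small Python sketch of sequential search that counts steps as it runs. The function name `sequential_search` and the charging scheme are assumptions made for illustration: each executed pseudocode line costs one step, except the two output lines (steps 6 and 10), which are left uncharged so that the worst-case total agrees with 3n + 5. Positions are 0-based.

```python
def sequential_search(values, target):
    """Return (position, steps): position is None if target is absent."""
    n = len(values)
    steps = 3            # steps 1-3: get the input, set index, set found
    index = 0            # 0-based; the slide starts at 1
    found = False
    position = None
    while not found and index < n:
        steps += 1       # step 4: one pass through the loop
        steps += 1       # step 5: compare list[index] with target
        if values[index] == target:
            position = index   # step 6: output, not charged (assumption)
            found = True
            steps += 1   # step 7: set found to true
        else:
            index += 1
            steps += 1   # step 8: increment the index
    steps += 1           # step 9: test "not found"
    # step 10: print a message, not charged (assumption, matches the tally)
    steps += 1           # step 11: stop
    return position, steps


n = 10
values = list(range(n))
print(sequential_search(values, -1))  # worst case: (None, 35), i.e. 3n + 5
print(sequential_search(values, 0))   # best case: (0, 8)
```

The worst case grows with n, while the best case stays constant no matter how long the list is.
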
Analysis of Sequential Search
- Time efficiency
  - Best case: 1 comparison (target is found immediately)
  - Worst case: 3n + 5 comparisons (target is not found)
  - Average case: 3n/2 + 4 comparisons (target is found in the middle)
- Space efficiency
  - How much space is used in addition to the input?

Order of Magnitude
- Worst case of sequential search: 3n + 5 comparisons
  - Are these constants accurate? Can we ignore them?
- Simplification: ignore the constants, look only at the order of magnitude
  - n, 0.5n, 2n, 4n, 3n + 5, 2n + 100, 0.1n + 3, … are all linear
  - We say that their order of magnitude is n
    - 3n + 5 is order of magnitude n: 3n + 5 = Θ(n)
    - 2n + 100 is order of magnitude n: 2n + 100 = Θ(n)
    - 0.1n + 3 is order of magnitude n: 0.1n + 3 = Θ(n)
    - …

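One way to see why the constants can be dropped, assuming the usual definition that a function has order of magnitude n (written Θ(n)) when it lies between two constant multiples of n for all sufficiently large n. The constants 3 and 8 below are just one workable choice.

```latex
\[
  3n \;\le\; 3n + 5 \;\le\; 8n \qquad \text{for all } n \ge 1,
\]
\[
  \text{so } 3n + 5 = \Theta(n).
\]
```

The right-hand inequality holds because 3n + 5 ≤ 8n is the same as 5 ≤ 5n, which is true whenever n ≥ 1.
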
Efficiency of Copy-Over
- Best case: all values are zero
  - No copying, no extra space
- Worst case: no zero value
  - n elements copied, n extra space
- Time: Θ(n)
- Extra space: n

Efficiency of Shuffle-Left
- Space: no extra space (except a few variables)
- Time
  - Best case: no zero value
    - No copying ==> order of n = Θ(n)
  - Worst case: all zero values
    - Every element requires copying n-1 values one position to the left
    - n × (n-1) = n^2 - n = order of n^2 = Θ(n^2) (why?)
  - Average case: half of the values are zero
    - n/2 × (n-1) = (n^2 - n)/2 = order of n^2 = Θ(n^2)

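The best-case and worst-case copy counts above can be checked empirically. The sketch below re-implements shuffle-left with a copy counter; the name `shuffle_left_copies` is hypothetical, indices are 0-based, and the input is copied first so the caller's list is not modified.

```python
def shuffle_left_copies(values):
    """Run shuffle-left on a copy of `values` and return how many copies it made."""
    a = list(values)
    n = len(a)
    legit, left, right = n, 0, 1
    copies = 0
    while left < legit:
        if a[left] != 0:
            left += 1
            right += 1
        else:
            legit -= 1
            while right < n:
                a[right - 1] = a[right]   # one copy operation
                copies += 1
                right += 1
            right = left + 1
    return copies


for n in (4, 8, 16):
    all_zero = shuffle_left_copies([0] * n)   # worst case
    no_zero = shuffle_left_copies([1] * n)    # best case
    print(n, all_zero, n * (n - 1), no_zero)
```

For all-zero input the measured count equals n × (n-1), and for input without zeroes it is 0, in line with the analysis above.
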
Efficiency of Converging Pointers Algorithm
- Space: no extra space used (except a few variables)
- Time
  - Best case: no zero value
    - No copying => order of n = Θ(n)
  - Worst case: all values zero
    - One copy at each step => n-1 copies
    - Order of n = Θ(n)
  - Average case: half of the values are zero
    - n/2 copies

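The same kind of check works for converging pointers. This is a sketch with a hypothetical name, `converging_pointers_copies`, using 0-based indices and working on a copy of the input.

```python
def converging_pointers_copies(values):
    """Run converging pointers on a copy of `values` and return the number of copies."""
    a = list(values)
    n = len(a)
    copies = 0
    left, right = 0, n - 1
    while left < right:
        if a[left] != 0:
            left += 1
        else:
            a[left] = a[right]   # one copy operation
            copies += 1
            right -= 1
    return copies


for n in (4, 8, 16):
    all_zero = converging_pointers_copies([0] * n)   # worst case
    no_zero = converging_pointers_copies([1] * n)    # best case
    print(n, all_zero, n - 1, no_zero)
```

With all-zero input the count is n-1, so the number of copies grows linearly with n rather than quadratically as in shuffle-left.
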
Data Cleanup Algorithms
- Copy-over
  - Worst case: time Θ(n), extra space n
  - Best case: time Θ(n), no extra space
- Shuffle-left
  - Worst case: time Θ(n^2), no extra space
  - Best case: time Θ(n), no extra space
- Converging pointers
  - Worst case: time Θ(n), no extra space