1Computer Sciences Department
Reference
Book: Introduction to Algorithms, by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein.
Electronic: engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/
The Role of Algorithms in Computing
Lecture Contents (objectives)
- The Role of Algorithms in Computing
- The problem of sorting
- What kinds of problems are solved by algorithms?
- Hard problems
- Algorithms as a technology
- Insertion sort
- Analysis of insertion sort
- Example of insertion sort
- (Best, Worst and Average) case analysis
- Merge sort
- The divide-and-conquer approach
- Analyzing merge sort
- Example of merge sort
- Recursion tree
The Role of Algorithms in Computing
- What are algorithms?
- Why is the study of algorithms worthwhile?
- What is the role of algorithms relative to other technologies used in computers?
Algorithms
Informally, an algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output. An algorithm is thus a sequence of computational steps that transform the input into the output.
Design and Analysis of Algorithms
- Analysis: predict the cost of an algorithm in terms of resources and performance.
- Design: design algorithms that minimize the cost.
Algorithms
Computational problem.
In general, an instance of a problem consists of the input (satisfying whatever constraints are imposed in the problem statement) needed to compute a solution to the problem.
The problem of sorting
Input: a sequence $\langle a_1, a_2, \ldots, a_n \rangle$ of $n$ numbers.
Output: a permutation $\langle a'_1, a'_2, \ldots, a'_n \rangle$ of the input such that $a'_1 \le a'_2 \le \cdots \le a'_n$.
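The input/output specification of sorting can be turned into a small checker. The following is a minimal sketch in Python; the function name `is_sorted_permutation` is an illustrative assumption, not something from the slides. It tests the two conditions separately: the output must be a permutation of the input, and it must be in nondecreasing order.

```python
from collections import Counter


def is_sorted_permutation(original, candidate):
    """Return True if candidate is a sorted rearrangement of original.

    Hypothetical helper for illustration; both arguments are lists.
    """
    # Permutation: the same elements with the same multiplicities.
    same_elements = Counter(original) == Counter(candidate)
    # Ordering: a'_1 <= a'_2 <= ... <= a'_n.
    nondecreasing = all(x <= y for x, y in zip(candidate, candidate[1:]))
    return same_elements and nondecreasing
```

For instance, `is_sorted_permutation([3, 1, 2], [1, 2, 3])` holds, while `[1, 2, 2]` fails the permutation condition and `[2, 1, 3]` fails the ordering condition.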
An algorithm is said to be correct if, for every input instance, it halts with the correct output.
What kinds of problems are solved by algorithms?
- The Human Genome Project has the goals of identifying all the 100,000 genes in human DNA, determining the sequences of the 3 billion chemical base pairs that make up human DNA, storing this information in databases, and developing tools for data analysis.
- The Internet enables people all around the world to quickly access and retrieve large amounts of information; algorithms are needed to manage and manipulate this large volume of data.
What kinds of problems are solved by algorithms? (cont’d)
- Electronic commerce.
- In manufacturing and other commercial settings.
Example: an equation ax ≡ b (mod n), where a, b, and n are integers, and we wish to find all the integers x, modulo n, that satisfy the equation. There may be zero, one, or more than one such solution. We can simply try x = 0, 1, ..., n − 1 in order, but Chapter 31 shows a more efficient method.
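The "simply try x = 0, 1, ..., n − 1 in order" approach can be sketched in a few lines of Python. The function name `solve_linear_congruence` is an assumption made for this illustration; this is the exhaustive search only, not the more efficient method of Chapter 31.

```python
def solve_linear_congruence(a, b, n):
    """Return every x in 0..n-1 with a*x ≡ b (mod n), by exhaustive search.

    Hypothetical helper for illustration; runs in time proportional to n.
    """
    if n <= 0:
        raise ValueError("modulus n must be positive")
    # Try each residue in order and keep those that satisfy the equation.
    return [x for x in range(n) if (a * x - b) % n == 0]


print(solve_linear_congruence(2, 4, 6))  # two solutions: [2, 5]
print(solve_linear_congruence(2, 3, 6))  # no solution: []
print(solve_linear_congruence(3, 1, 7))  # one solution: [5]
```

The three calls show the three possibilities named above: more than one solution, zero solutions, and exactly one solution.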
Data structure
A data structure is a way to store and organize data in order to facilitate access and modifications.
Technique
Hard problems
- Efficient algorithms.
- NP-complete problems (decision problems).
- Although no efficient algorithm for an NP-complete problem has ever been found, nobody has ever proven that an efficient algorithm for one cannot exist. It is unknown whether or not efficient algorithms exist for NP-complete problems.
Algorithms as a technology
If computers were infinitely fast and memory were free, would you have any reason to study algorithms? YES.
- If computers were infinitely fast, any correct method for solving a problem would do.
- Computers may be fast, but they are not infinitely fast.
- Memory may be cheap, but it is not free.
Computing time is therefore a bounded resource, and so is space in memory. These resources should be used wisely, and algorithms that are efficient in terms of time or space will help you do so.
Which computer/algorithm is faster? Very important
Getting Started
Running time
- The running time depends on the input: an already sorted sequence is easier to sort.
- Major simplifying convention: parameterize the running time by the size of the input, since short sequences are easier to sort than long ones.
- $T_A(n)$ = time of algorithm A on inputs of length n.
- Generally, we seek upper bounds on the running time, to have a guarantee of performance.
Kinds of analyses
- Worst-case (usually): T(n) = maximum time of the algorithm on any input of size n (an upper bound on the running time).
- Average-case (sometimes): T(n) = expected time of the algorithm over all inputs of size n.
- Best-case: T(n) = minimum time of the algorithm on any input of size n, that is, the fastest time to complete, with optimal inputs chosen (a lower bound on the running time).
Worst-case and average-case analysis
- For insertion sort, the best case is the one in which the input array is already sorted, and the worst case is the one in which the input array is reverse sorted.
- The worst-case running time of an algorithm is an upper bound on the running time for any input.
- How long does it take to determine where in sub-array A[1 .. j − 1] to insert element A[j]?
Average case
Suppose that we randomly choose n numbers and apply insertion sort. How long does it take to determine where in sub-array A[1 .. j − 1] to insert element A[j]? On average, half the elements in A[1 .. j − 1] are less than A[j], and half the elements are greater. On average, therefore, we check half of the sub-array A[1 .. j − 1], so $t_j = j/2$.
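As a worked step, the estimate $t_j = j/2$ can be summed over the insertions j = 2, ..., n. This is a sketch under the assumption that $t_j$ counts the checks made while inserting A[j] and that the outer loop runs for j = 2 to n, as in the textbook presentation:

```latex
\sum_{j=2}^{n} t_j
  = \sum_{j=2}^{n} \frac{j}{2}
  = \frac{1}{2}\left(\frac{n(n+1)}{2} - 1\right)
  = \frac{n^2 + n - 2}{4}
```

The total is still a quadratic function of n, so under this estimate the average case grows like the worst case, only with a smaller constant factor.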
Insertion sort
- The numbers that we wish to sort are also known as the keys.
- Insertion sort is an efficient algorithm for sorting a small number of elements.
- Idea: start with one element, then repeatedly take the next element, compare it with the elements already in place (moving them as needed), and insert it into the correct position.
Insertion sort (cont’d)
“Pseudo-code conventions”
[Figure: array A[1 .. n], with Length[A] = n, showing indices i and j, the current key, and the sorted portion of the array]
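Since the pseudocode itself did not survive on this slide, the following is a minimal Python sketch of insertion sort, using the same roles for i, j, and key. It assumes 0-based indexing (the slides use 1-based arrays), and the function name `insertion_sort` is chosen here for illustration.

```python
def insertion_sort(a):
    """Sort the list a in place into nondecreasing order and return it."""
    for j in range(1, len(a)):
        key = a[j]  # element to insert into the sorted prefix a[0..j-1]
        i = j - 1
        # Shift every prefix element greater than key one slot to the right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key  # place key in the gap that was opened
    return a
```

Before each iteration of the outer loop, the prefix a[0..j-1] holds the elements originally in those positions, in sorted order.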
Analyzing algorithms
- Analyzing an algorithm means predicting the resources that the algorithm requires.
- Random-access machine (RAM) model: instructions are executed one after another, with no concurrent operations.
- The data types in the RAM model are integer and floating point, and there is a limit on the size of each word of data.
- Is exponentiation a constant-time instruction? In general, no; but when k is small enough to fit in a word, $2^k$ can be computed in constant time with a “shift left” instruction.
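The shift-left remark can be illustrated with a short Python sketch. The function name `power_of_two` is an assumption for this example. Note that Python integers are not limited to one machine word, so this shows only the arithmetic identity, not the constant-time behavior of a hardware shift.

```python
def power_of_two(k):
    """Return 2**k by shifting the integer 1 left by k bit positions."""
    if k < 0:
        raise ValueError("k must be non-negative")
    # Each left shift by one position doubles the value.
    return 1 << k


print(power_of_two(10))  # 1024
```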
Analysis of insertion sort
- The time taken by the INSERTION-SORT procedure depends on the input.
- INSERTION-SORT can take different amounts of time to sort two input sequences of the same size, depending on how nearly sorted they already are.
- Sorting a thousand numbers takes longer than sorting three numbers.
- The running time of an algorithm on a particular input is the number of primitive operations or “steps” executed.
Analysis of insertion sort (cont’d)
Best case
Worst case
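One way to see the gap between the best and worst cases is to count key comparisons directly. The sketch below is an illustration, not the slide's cost analysis: it counts only comparisons between the key and array elements, and the function name `insertion_sort_comparisons` is an assumption.

```python
def insertion_sort_comparisons(values):
    """Sort a copy of values and return the number of key comparisons made."""
    a = list(values)
    comparisons = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0:
            comparisons += 1  # one comparison of key against a[i]
            if a[i] <= key:
                break  # key is already in the right place
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return comparisons


n = 10
print(insertion_sort_comparisons(range(n)))         # sorted input: n - 1 = 9
print(insertion_sort_comparisons(range(n, 0, -1)))  # reversed: n(n - 1)/2 = 45
```

On sorted input each insertion stops after a single comparison, giving a linear count; on reverse-sorted input each key is compared with every element before it, giving a quadratic count.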
Explanation
The worst-case running time can be written as $T(n) = an^2 + bn - c$ for constants a, b, and c; it is thus a quadratic function of n.
Example of insertion sort
[Figures: insertion sort traced step by step on a sample array]
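The figures for this example did not survive, so here is a small Python sketch that prints the array after each insertion step. The input [8, 2, 4, 9, 3, 6] is an assumed sample, not necessarily the array used on the slides, and `insertion_sort_trace` is a name chosen for this illustration.

```python
def insertion_sort_trace(values):
    """Return the array contents before sorting and after each insertion."""
    a = list(values)
    snapshots = [list(a)]
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]  # shift larger elements right
            i -= 1
        a[i + 1] = key
        snapshots.append(list(a))  # record the state after inserting key
    return snapshots


for step, state in enumerate(insertion_sort_trace([8, 2, 4, 9, 3, 6])):
    print(step, state)
# 0 [8, 2, 4, 9, 3, 6]
# 1 [2, 8, 4, 9, 3, 6]
# 2 [2, 4, 8, 9, 3, 6]
# 3 [2, 4, 8, 9, 3, 6]
# 4 [2, 3, 4, 8, 9, 6]
# 5 [2, 3, 4, 6, 8, 9]
```

Step 3 leaves the array unchanged because the key 9 is already larger than everything in the sorted prefix.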
The divide-and-conquer approach
The merge sort algorithm closely follows the divide-and-conquer paradigm.
Divide the n-element sequence to be sorted into two subsequences of n/2 elements each.
MERGE-SORT A[1 .. n]
1. If n = 1, done.
2. Recursively sort A[1 .. ⌈n/2⌉] and A[⌈n/2⌉ + 1 .. n].
3. “Merge” the 2 sorted lists.
Analyzing merge sort
MERGE-SORT A[1 .. n], with total running time T(n):
1. If n = 1, done. Cost: $\Theta(1)$
2. Recursively sort A[1 .. ⌈n/2⌉] and A[⌈n/2⌉ + 1 .. n]. Cost: $2T(n/2)$
3. “Merge” the 2 sorted lists. Cost: $\Theta(n)$
Recurrence for merge sort:
$$T(n) = \begin{cases} \Theta(1) & \text{if } n = 1; \\ 2T(n/2) + \Theta(n) & \text{if } n > 1. \end{cases}$$
Merge sort
MERGE(A, p, q, r), where A is an array and p, q, and r are indices numbering elements of the array such that p ≤ q < r.
Merging two sorted arrays
[Figures: two sorted arrays merged step by step]
Time = $\Theta(n)$ to merge a total of n elements (linear time).
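The linear-time merge can be sketched as follows in Python. This version builds a new list rather than merging in place, and the name `merge_sorted` and the sample inputs are assumptions for illustration. Each loop iteration moves exactly one element to the output, which is why the total work is proportional to n.

```python
def merge_sorted(left, right):
    """Merge two sorted lists into a new sorted list in linear time."""
    merged = []
    i = j = 0
    # Repeatedly take the smaller of the two front elements.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of the lists still has elements; copy them over.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sorted([2, 7, 13, 20], [1, 9, 11, 12]))
# [1, 2, 7, 9, 11, 12, 13, 20]
```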
Merge sort - “Pseudo-code conventions”
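The pseudocode for this slide is missing, so the following Python sketch shows one way merge sort could be written with the MERGE(A, p, q, r) interface. It assumes 0-based, inclusive indices (the slides use 1-based arrays) and copies the two runs into temporary lists; the function names are illustrative, not taken from the slides.

```python
def merge(a, p, q, r):
    """Merge the sorted runs a[p..q] and a[q+1..r] (inclusive) in place."""
    left = a[p:q + 1]
    right = a[q + 1:r + 1]
    i = j = 0
    for k in range(p, r + 1):
        # Take from left when right is exhausted, or when left still has
        # elements and its front element is not larger than right's.
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1


def merge_sort(a, p=0, r=None):
    """Sort a[p..r] in place by divide and conquer and return a."""
    if r is None:
        r = len(a) - 1
    if p < r:                    # more than one element in the range
        q = (p + r) // 2         # divide
        merge_sort(a, p, q)      # conquer the left half
        merge_sort(a, q + 1, r)  # conquer the right half
        merge(a, p, q, r)        # combine
    return a
```

Calling `merge_sort([5, 2, 4, 7, 1, 3, 2, 6])` sorts the list in place; using `<=` in the comparison keeps equal keys in their original relative order.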
Recursion tree
Solve $T(n) = 2T(n/2) + cn$, where $c > 0$ is constant.
- Expand T(n) into a root of cost cn with two subtrees T(n/2); each T(n/2) expands into a node of cost cn/2 with two subtrees T(n/4), and so on, down to leaves of cost $\Theta(1)$.
- The tree has height h = lg n, and the costs on each level sum to cn.
- #leaves = n, so the leaf level contributes $\Theta(n)$.
- Total: $\Theta(n \lg n)$.
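The recursion-tree total can be checked numerically. The sketch below evaluates the recurrence directly, under two assumptions that the slides leave open: n is a power of two, and the base case is T(1) = c. Under those assumptions the tree has lg n levels of cost cn plus n leaves of cost c, giving the closed form cn lg n + cn.

```python
def merge_sort_cost(n, c=1):
    """Evaluate T(n) = 2*T(n/2) + c*n with T(1) = c, for n a power of two."""
    if n < 1 or (n & (n - 1)) != 0:
        raise ValueError("n must be a positive power of two")
    if n == 1:
        return c  # assumed base case: one leaf costs c
    return 2 * merge_sort_cost(n // 2, c) + c * n


for k in range(5):
    n = 2 ** k
    # Recurrence value next to the closed form c*n*lg(n) + c*n (with c = 1).
    print(n, merge_sort_cost(n), n * k + n)
```

Both columns agree for every power of two, which matches the $\Theta(n \lg n)$ total read off the tree.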
Conclusions
- $\Theta(n \lg n)$ grows more slowly than $\Theta(n^2)$.
- Therefore, merge sort asymptotically beats insertion sort in the worst case.
- In practice, merge sort beats insertion sort for n > 30 or so.