
1 Complexity Analysis, Sorting & Searching
1005ICT lecture 5: Complexity Analysis, Sorting & Searching. Written by Rob Baltrusch; some parts adapted from Sun.

2 What is complexity analysis?
For every algorithm we create, we should consider how increasing the amount of input data increases:
execution time
memory required for execution
An algorithm is said to be efficient if it executes in less time and consumes less memory than another similar algorithm. Complexity analysis is the examination of the efficiency of an algorithm; it involves measuring the complexity, typically in terms of execution time.

3 How do we measure complexity?
Complexity can be measured by expressing execution time for varying inputs as a growth-rate function. The actual execution time is subject to many factors, such as the compiler, CPU, operating system, etc., so we do not want to measure time complexity in seconds; instead we measure asymptotic complexity (growth rate).

4 For example, consider the simple sum algorithm below
int sum(int[] a){
    int result = 0;                         //assignment: time = t1
    for(int i = 0; i < a.length; i++){      //loop: time = t2
        result += a[i];                     //assignment: time = t3
    }
    return result;                          //return: time = t4
}

executionTime = t1 + n * (t2 + t3) + t4
              = k1 + n * k2
              ≈ n * k2
where k1 and k2 are method-dependent constants, since each statement executes in constant time (k1 becomes negligible as n grows)

5 It is common to express the complexity in terms of Big-Oh notation
Observations:
the execution time is linearly dependent on the array's length
as the array length increases, the contribution of k1 is negligible
if we double the length of the array, we (roughly) double the execution time
Therefore, for the previous sum algorithm: executionTime = O(n) (read: "execution time is of order n")
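To see this doubling behaviour empirically, a rough (and deliberately unscientific) timing sketch can be run against the sum method from the previous slide; the class name and array sizes below are arbitrary choices, not part of the lecture code:

import java.util.Arrays;

public class SumTiming {
    static int sum(int[] a){
        int result = 0;
        for(int i = 0; i < a.length; i++){
            result += a[i];
        }
        return result;
    }

    public static void main(String[] args){
        //doubling n should roughly double the time taken by sum
        for(int n = 1_000_000; n <= 8_000_000; n *= 2){
            int[] a = new int[n];
            Arrays.fill(a, 1);                   //values don't affect the timing
            long start = System.nanoTime();
            sum(a);
            long elapsed = System.nanoTime() - start;
            System.out.println("n = " + n + ", time = " + elapsed + " ns");
        }
    }
}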

6 Big-Oh notation
Big-Oh describes the upper-bound complexity of an algorithm; for example, f(n) is O(log n) if f(n) <= c · log n for all n >= 1, for some constant c
[Figure: plot of time complexity f(n) against n, bounded above by the O(log n) curve]

7 no distinction is made between:
Big-Oh is an upper bound (i.e. the algorithm's complexity should grow no faster than the asymptotic function of n), so no distinction is made between:
executionTime = c * n, which is O(n)
executionTime = n / c, which is also O(n)
Some statements take constant time, e.g. result += a[i]; is O(1), where O(1) means some constant (could be 1, 2000, 23, etc.) – it should take the same time whenever this code fragment is executed; the actual time is not important
Always ignore constant factors, since they are O(1)

8 executionTime = t1 + n * (t2 + n * (t3 + t4)) + t5
int processArray(int[] a){
    int result = 0;                             //t1
    for(int i = 0; i < a.length; i++){          //t2
        for(int j = 0; j < a.length; j++){      //t3
            result += a[i] * a[j];              //t4
        }
    }
    return result;                              //t5
}

executionTime = t1 + n * (t2 + n * (t3 + t4)) + t5
              = k1 + n * (k2 + n * k3) + k4
              ≈ n * n
              = O(n²)
Therefore, the algorithm is of order n²

9 in the previous example, we can see that f(n) is O(n²)
[Figure: plot of time complexity f(n) = k1 + n * (k2 + n * k3) + k4 against n, showing quadratic growth]

10 Names of common Big-Oh values
O(1)         constant
O(log n)     logarithmic
O(n)         linear
O(n log n)   n log n
O(n²)        quadratic
O(n³)        cubic
O(2ⁿ)        exponential

11 if f(n) is O(n) then f(n) is also O(n²)
Comments:
if f(n) is O(n) then f(n) is also O(n²), since Big-Oh is an upper bound
always keep only the largest term, e.g. O(n² + n) = O(n²)
[Figure: growth curves for O(1), O(log n), O(n), O(n log n), O(n²), O(n³) and O(2ⁿ), plotting time complexity against n]

12 Exercise
Using Big-Oh notation, describe the complexity of the following algorithm:

int sumMatrix(int[][] a){
    int sum = 0;                                    //t1
    for(int i = 0; i < a.length; i++){              //t2
        for(int j = 0; j < (a.length / 2); j++){    //t3
            sum += a[i][j];                         //t4
        }
    }
    return sum;                                     //t5
}

executionTime = t1 + n * (t2 + (n / 2) * (t3 + t4)) + t5
              = k1 + n * (k2 + (n / 2) * k3) + k4
              ≈ n * n / 2
              = O(n²)

13 give a simplified Big-Oh order for the following run times for algorithms with input n:
10n³ + 6n
5n log n + 30n
n + 3
n / log n
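For reference, one possible set of simplifications, assuming the expressions parse as listed above (keep the fastest-growing term and drop constant factors):

\begin{align*}
10n^3 + 6n     &= O(n^3)\\
5n\log n + 30n &= O(n\log n)\\
n + 3          &= O(n)\\
n/\log n       &= O(n/\log n)
\end{align*}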

14 Search algorithms
There are many algorithms used for searching for elements in a data structure (e.g. array, HashSet, ArrayList, TreeMap, etc.). For the purposes of this lecture, we will limit ourselves to searching for an element in an array of primitives; however, these algorithms can be adapted to other data structures. In particular, the algorithms we shall discuss are:
linear search
binary search

15 linear search worst: O(n), average: O(n), best: O(1)
//returns index of search value (key) in array (a), else -1 if not found
int search(int[] a, int key){
    for(int i = 0; i < a.length; i++){
        if(a[i] == key){
            return i;
        }
    }
    return -1;
}

Execution time varies depending on whether the key is found immediately (best case), half the values are searched (average case), or the key is not found (worst case):
bestCase = t1 + t2 + t3 = O(1)
averageCase = (n / 2) * (t1 + t2) + t3 = O(n)
worstCase = n * (t1 + t2) = O(n)
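A minimal usage sketch (the demo class and test values below are invented for illustration):

public class LinearSearchDemo {
    static int search(int[] a, int key){
        for(int i = 0; i < a.length; i++){
            if(a[i] == key){
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args){
        int[] data = {7, 3, 9, 4};
        System.out.println(search(data, 9));   //prints 2: found at index 2
        System.out.println(search(data, 5));   //prints -1: every element examined (worst case)
    }
}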

16 binary search worst: O(log n), average: O(log n), best: O(1)
The linear search algorithm presented earlier makes no assumption about the order of the array – what if the array were ordered? Binary search assumes the array is ordered. There are two versions:
iterative
recursive

17 iterative binary search
//returns index of search value (key) in array (a), else -1 if not found
//pre: array a is in ascending order
int search(int[] a, int key){
    int left, middle, right;
    left = 0;
    right = a.length - 1;   //last valid index (a.length would let middle run past the end)
    while(left <= right){
        middle = (left + right) / 2;
        if(a[middle] == key){
            return middle;
        }else if(a[middle] < key){
            left = middle + 1;
        }else{
            right = middle - 1;
        }
    }
    return -1;
}

18 recursive binary search
//returns index of search value (key) in array (a), else -1 if not found
//pre: array a is in ascending order
public int search(int[] a, int key, int left, int right){
    int middle;
    if(left > right){
        return -1;
    }else{
        middle = (left + right) / 2;
        if(a[middle] == key){
            return middle;
        }else if(a[middle] < key){
            return search(a, key, middle + 1, right);
        }else{
            return search(a, key, left, middle - 1);
        }
    }
}
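A convenience overload (an assumption for illustration, not shown on the slide) can supply the initial bounds so callers don't have to:

public int search(int[] a, int key){
    //hypothetical wrapper: searches the whole array
    return search(a, key, 0, a.length - 1);
}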

19 Binary search is harder to analyse because it doesn’t have a for loop
The search interval halves each time we iterate. The sequence of search-interval sizes is:
n, n/2, n/4, ..., 8, 4, 2, 1
It's not obvious how long this sequence is, but if we take logs (base 2) of each term, it becomes:
log₂ n, log₂ n - 1, log₂ n - 2, ..., 3, 2, 1, 0

20 Therefore binary search is an O(log₂ n) algorithm
Since the second sequence decrements by 1 each time, down to 0, its length must be log₂ n + 1. Each test of binary search takes only constant time, so the total running time is proportional to the number of iterations, which is log₂ n + 1. Therefore binary search is an O(log₂ n) algorithm. Since the base of the log doesn't matter in an asymptotic bound, we can simply write that binary search is O(log n).
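The same count can be read off directly from the halving: after k iterations the interval has size n/2^k, and the search stops when this reaches 1 (a sketch, assuming n is a power of 2):

\[
\frac{n}{2^k} = 1
\;\Longrightarrow\;
2^k = n
\;\Longrightarrow\;
k = \log_2 n
\]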

21 Sort algorithms
Again, we will limit ourselves to sorting an array of primitives; however, these algorithms can be adapted to other data structures. The sort algorithms we shall discuss are:
bubble sort
selection sort
insertion sort
merge sort
quick sort

22 Many common sort algorithms are used in computer science
Many common sort algorithms are used in computer science. They are often classified by:
Computational complexity (worst-, average- and best-case behaviour) in terms of the size of the list (n). Typically, good behaviour is O(n log n) and bad behaviour is O(n²)
Memory usage (and use of other computer resources)
Stability: a sort algorithm is stable if, whenever there are two records R and S with the same key and with R appearing before S in the original list, R will appear before S in the sorted list
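A tiny illustration of stability (the record encoding and values here are invented for the example; java.util.Arrays.sort for objects is a stable merge sort):

import java.util.Arrays;
import java.util.Comparator;

public class StabilityDemo {
    public static void main(String[] args){
        //records "R:5" and "S:5" share the key 5; a stable sort keeps R before S
        String[] records = {"R:5", "A:3", "S:5", "B:1"};
        Arrays.sort(records, Comparator.comparingInt(r -> r.charAt(2) - '0'));
        System.out.println(Arrays.toString(records));   //[B:1, A:3, R:5, S:5]
    }
}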

23 Some sorting algorithms follow (n is the number of objects to be sorted, k the size of the key space):
[Table of sorting algorithms and their complexities; not reproduced in this transcript]

24 bubble sort worst: O(n²), average: O(n²), best: O(n)
Bubble sort is a simple sorting algorithm. It makes repeated passes through the list to be sorted, comparing two adjacent items at a time and swapping them if they are in the wrong order. Bubble sort gets its name from the fact that the items that belong at the top of the list gradually "float" up there.

25 Bubble sort needs O(n²) comparisons to sort n items and can sort in place.
It is one of the simplest sorting algorithms to understand, but is generally too inefficient for serious work sorting large numbers of elements. It is essentially equivalent to insertion sort – it compares and swaps the same pairs of elements, just in a different order. Naive implementations of bubble sort (like the one in the following slides) usually perform badly on already-sorted lists (O(n²)), while insertion sort needs only O(n) operations in this case.

26 improving efficiency:
The best-case complexity can be reduced to O(n) if a flag is used to record whether any swaps occurred during a pass of the inner loop: a pass with no swaps indicates an already-sorted list, so the sort can stop early, as in the sketch below.
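A minimal sketch of that optimisation, assuming the adjacent-swap form of bubble sort shown on the following slides (this variant is not in the original slides):

void bubbleSortWithFlag(int[] a){
    for(int i = 0; i < a.length - 1; i++){
        boolean swapped = false;
        for(int j = 0; j < a.length - 1 - i; j++){
            if(a[j] > a[j + 1]){
                int temp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = temp;
                swapped = true;
            }
        }
        if(!swapped){
            return;   //no swaps in this pass: the array is already sorted
        }
    }
}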

27 The bubble sort algorithm is:
1. Compare adjacent elements. If the first is greater than the second, swap them
2. Do this for each pair of adjacent elements, starting with the first two and ending with the last two. At this point the last element should be the greatest
3. Repeat the steps for all elements except the last one
4. Keep repeating for one fewer element each time, until there are no more pairs to compare

28 A single pass through the inner for loop...
42 23 74 11 65 58 94 36 99 87   (start)
23 42 74 11 65 58 94 36 99 87
23 42 11 74 65 58 94 36 99 87
23 42 11 65 74 58 94 36 99 87
23 42 11 65 58 74 94 36 99 87
23 42 11 65 58 74 36 94 99 87
23 42 11 65 58 74 36 94 87 99   (end: the largest element, 99, has bubbled to the end)

29 void bubbleSort(int[] a){
    int i, j, temp;
    for(i = 0; i < a.length - 1; i++){
        //after pass i, the largest remaining element has bubbled to a[a.length - 1 - i]
        for(j = 0; j < a.length - 1 - i; j++){
            if(a[j] > a[j + 1]){
                //adjacent pair out of order: swap
                temp = a[j];
                a[j] = a[j + 1];
                a[j + 1] = temp;
            }
        }
    }
}

30 selection sort worst: O(n²), average: O(n²), best: O(n²)
The selection sort is one of the easiest ways to sort data. Rather than continually swapping neighbours, this algorithm finds the smallest element of the array and interchanges it with the element in the first position of the array. After that it re-examines the remaining elements of the array to find the second smallest element. The element that is found is interchanged with the element in the second position of the array. This process continues until all elements are placed in their proper order. The order is defined by the user (i.e. descending or ascending).

31 For each position in the array:
1. Scan the unsorted part of the data
2. Select the smallest value
3. Switch the smallest value with the first value in the unsorted part of the data

32 void selectionSort(int[] a){
    for (int i = 0; i < a.length - 1; i++){
        int minIndex = i;
        //find the smallest value in the unsorted part a[i..]
        for (int j = i + 1; j < a.length; j++){
            if (a[j] < a[minIndex]){
                minIndex = j;
            }
        }
        //swap it into position i if it is not already there
        if (minIndex != i){
            int temp = a[i];
            a[i] = a[minIndex];
            a[minIndex] = temp;
        }
    }
}

33 Selection sort is O(n²) in the best, worst, and expected cases
The selection sort is easy to understand, which makes it easy to implement correctly and thus makes the sorting procedure easy to write.

34 However, there are some reasons why programmers refrain from using this algorithm:
1. Its performance of O(n²) becomes quite slow on large amounts of data
2. At present many application databases have sorted lists of data, and these lists are updated on a regular basis. Most of the time these databases are, for the most part, already in order. An ideal method would recognize this fact and work only on the unsorted items. The selection sort is unable to do this

35 Here is an example (the | divides the sorted region from the unsorted):
| 3 5 2 6 4    (2 is the smallest, swap with 3)
2 | 5 3 6 4    (3 is the smallest, swap with 5)
2 3 | 5 6 4    (4 is the smallest, swap with 5)
2 3 4 | 6 5    (5 is the smallest, swap with 6)
2 3 4 5 | 6    (6 is the only value left, done)

36 insertion sort worst: O(n²), average: O(n²), best: O(n)
Insertion sort is a simple sort algorithm in which the result is built up one entry at a time. In abstract terms, each iteration of an insertion sort removes an element from the input data and inserts it at the correct position in the already-sorted list, until no elements are left in the input. The choice of which element to remove from the input is arbitrary.

37 Sorting is done in-place
Sorting is done in-place. The result array after n iterations contains the first n entries of the input array, in sorted order. In each step, the first remaining entry of the input is removed and inserted into the result at the right position, thus extending the result (each element greater than the item being inserted – here, 4 – is copied one place to the right as it is compared against that item):
iteration n:     1 3 7 9 4 10 2 6
iteration n + 1: 1 3 4 7 9 10 2 6

38 The algorithm can be described as:
1. Start with the result being the first element of the input
2. Loop over the input until it is empty, "removing" the first remaining (leftmost) element
3. Compare the removed element against the current result, starting from the highest (rightmost) element and working left towards the lowest element

39 4. If the removed input element is lower than the current result element, copy that value into the following element to make room for the new element below, and repeat with the next-lowest result element
5. Otherwise, the new element is in the correct location; save it in the cell left by copying the last examined result up, and start again from (2) with the next input element

40 Insertion sort is very similar to bubble sort
In bubble sort, after n passes through the array, the n largest elements have bubbled to the top (or the n smallest elements have bubbled to the bottom, depending on which way you do it). In insertion sort, after n passes through the array, you have a run of n sorted elements at the bottom of the array; each pass inserts another element into the sorted run. So with bubble sort, each pass takes less time than the previous one, but with insertion sort, each pass may take more time than the previous one.

41 In the best case of an already sorted array, this implementation of insertion sort takes O(n) time:
in each iteration, the first remaining element of the input is only compared with the last element of the result. Insertion sort takes O(n²) time in the average and worst cases, which makes it impractical for sorting large numbers of elements. However, its inner loop is very fast, which often makes it one of the fastest algorithms for sorting small numbers of elements – typically fewer than 10 or so.

42 void insertionSort(int[] a){
    for (int i = 1; i < a.length; i++){
        int itemToInsert = a[i];
        int j = i - 1;
        //shift larger elements one place to the right
        while (j >= 0){
            if (itemToInsert < a[j]){
                a[j + 1] = a[j];
                j--;
            }else{
                break;
            }
        }
        a[j + 1] = itemToInsert;   //insert into the gap
    }
}

43 merge sort worst: O(n log n), average: O(n log n), best: O(n log n)
The sorting algorithm Mergesort produces a sorted sequence by sorting its two halves and merging them. With a time complexity of O(n log n), mergesort is asymptotically optimal among comparison-based sorts.

44 Similar to Quicksort, the Mergesort algorithm is based on a divide and conquer strategy
First, the sequence to be sorted is decomposed into two halves (divide). Each half is sorted independently (conquer). Then the two sorted halves are merged into a sorted sequence (combine).
[Diagram: mergeSort(n) splits the array to be sorted into two halves, recursively sorts each half with mergeSort(n/2), then combines the results]

45 private void mergeSort(int[] a, int left, int right){
    if(left < right){
        int middle = (left + right) / 2;
        mergeSort(a, left, middle);
        mergeSort(a, middle + 1, right);
        merge(a, left, middle, right);
    }
}

private void merge(int[] a, int left, int middle, int right){
    int[] b = new int[right - left + 1];
    int j = left, k = middle + 1, i;
    //merge a[left..middle] and a[middle+1..right] to b[0..right-left]
    for(i = 0; i <= right - left; i++){
        if(j > middle){                 //left run exhausted
            b[i] = a[k++];
        }else if(k > right){            //right run exhausted
            b[i] = a[j++];
        }else if(a[j] <= a[k]){
            b[i] = a[j++];
        }else{
            b[i] = a[k++];
        }
    }
    //copy b[0..right-left] back to a[left..right]
    for(i = 0, j = left; i <= right - left; i++, j++){
        a[j] = b[i];
    }
}
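Since both methods above are private, a public entry point is needed to start the recursion; a minimal sketch (assumed here, not shown on the slide):

public void mergeSort(int[] a){
    //hypothetical convenience wrapper: sorts the whole array in place
    if(a.length > 1){
        mergeSort(a, 0, a.length - 1);
    }
}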

46 Procedure merge requires 2n steps (n steps for copying the sequence to the intermediate array b, and another n steps for copying it back to array a). A full proof of the complexity is beyond the scope of this course; however, the complexity of merge sort is:
worst: O(n log n)
average: O(n log n)
best: O(n log n)
The disadvantage of merge sort is that it requires O(n) extra space for the temporary array b
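Although the full proof is out of scope, the shape of the argument is a recurrence: each level of recursion does O(n) merging work and halves the problem, giving log₂ n levels (a sketch, not a formal proof):

\[
T(n) = 2\,T(n/2) + cn,\qquad T(1) = c_0
\;\Longrightarrow\;
T(n) = cn\log_2 n + c_0\,n = O(n\log n)
\]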

47 Quicksort is one of the fastest and simplest sorting algorithms
Quicksort is one of the fastest and simplest sorting algorithms. It works recursively by a divide-and-conquer strategy:
1) Divide: the problem is decomposed into subproblems
2) Conquer: the subproblems are solved
3) Combine: the solutions of the subproblems are recombined into the solution of the original problem

48 Recombination of the two parts yields the sorted sequence (combine)
First, the sequence a to be sorted is partitioned into two parts, such that all elements of the first part b are less than or equal to all elements of the second part c (divide). Then the two parts are sorted separately by recursive application of the same procedure (conquer). Recombination of the two parts then yields the sorted sequence (combine).
[Diagram: partition(n) splits the unsorted sequence a into parts b <= c; quickSort(k) and quickSort(n - k) sort them into b' and c'; concatenating b' <= c' yields the sorted sequence a']

49 void quickSort(int[] a, int left, int right){
    int temp;
    if (left < right){
        int pivot = a[(left + right) / 2];
        int i = left, j = right;
        while (i < j){
            while (a[i] < pivot){
                i++;    //stop when a[i] is "large" (>= pivot)
            }
            while (a[j] > pivot){
                j--;    //stop when a[j] is "small" (<= pivot)
            }
            if (i <= j){
                //exchange a[i] and a[j]
                temp = a[i];
                a[i] = a[j];
                a[j] = temp;
                i++;
                j--;
            }
        }
        //recursively sort the "small" and "large" subarrays independently
        quickSort(a, left, j);     //sort a[left..j]
        quickSort(a, i, right);    //sort a[i..right]
    }
}
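As with merge sort, callers would normally use a convenience overload that supplies the initial bounds (a sketch, not part of the original slide):

void quickSort(int[] a){
    //hypothetical wrapper: sorts the whole array
    quickSort(a, 0, a.length - 1);
}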

50 The best-case behaviour of the Quicksort algorithm, O(n log n), occurs when each recursion step partitions the array into two parts of equal length. The worst case, O(n²), occurs when each recursion step produces a completely unbalanced partitioning (for example, when the pivot is always the smallest or largest element).
[Figure: recursion trees for (a) balanced and (c) unbalanced partitionings; not reproduced in this transcript]
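A sketch of why the unbalanced case is quadratic, assuming each step splits off only the pivot so a partitioning pass over k elements costs ck:

\[
T(n) = T(n-1) + cn
\;\Longrightarrow\;
T(n) = c\sum_{k=1}^{n} k = \frac{c\,n(n+1)}{2} = O(n^2)
\]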

