Sorting Algorithms (Part II) Slightly modified definition of the sorting problem: input: A collection of n data items where data item a i has a key, k.

Slides:



Advertisements
Similar presentations
BY Lecturer: Aisha Dawood. Heapsort  O(n log n) worst case like merge sort.  Sorts in place like insertion sort.  Combines the best of both algorithms.
Advertisements

Analysis of Algorithms
CS Section 600 CS Section 002 Dr. Angela Guercio Spring 2010.
David Luebke 1 5/20/2015 CS 332: Algorithms Quicksort.
September 19, Algorithms and Data Structures Lecture IV Simonas Šaltenis Nykredit Center for Database Research Aalborg University
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu.
CS 253: Algorithms Chapter 6 Heapsort Appendix B.5 Credit: Dr. George Bebis.
Analysis of Algorithms CS 477/677
More sorting algorithms: Heap sort & Radix sort. Heap Data Structure and Heap Sort (Chapter 7.6)
Comp 122, Spring 2004 Heapsort. heapsort - 2 Lin / Devi Comp 122 Heapsort  Combines the better attributes of merge sort and insertion sort. »Like merge.
Data Structures, Spring 2004 © L. Joskowicz 1 Data Structures – LECTURE 7 Heapsort and priority queues Motivation Heaps Building and maintaining heaps.
Heapsort Chapter 6. Heaps A data structure with  Nearly complete binary tree  Heap property: A[parent(i)] ≥ A[i] eg. Parent(i) { return } Left(i) {
1 Priority Queues A priority queue is an ADT where: –Each element has an associated priority –Efficient extraction of the highest-priority element is supported.
3-Sorting-Intro-Heapsort1 Sorting Dan Barrish-Flood.
UMass Lowell Computer Science Analysis of Algorithms Prof. Karen Daniels Spring, 2007 Heap Lecture Chapter 6 Use NOTES feature to see explanation.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu.
David Luebke 1 7/2/2015 Merge Sort Solving Recurrences The Master Theorem.
HEAPSORT COUNTING SORT RADIX SORT. HEAPSORT O(nlgn) worst case like Merge sort. Like Insertion Sort, but unlike Merge Sort, Heapsort sorts in place: Combines.
PQ, binary heaps G.Kamberova, Algorithms Priority Queue ADT Binary Heaps Gerda Kamberova Department of Computer Science Hofstra University.
UMass Lowell Computer Science Analysis of Algorithms Prof. Karen Daniels Fall, 2001 Heap Lecture 2 Chapter 7 Wed. 10/10/01 Use NOTES feature to.
Heapsort CIS 606 Spring Overview Heapsort – O(n lg n) worst case—like merge sort. – Sorts in place—like insertion sort. – Combines the best of both.
2IL50 Data Structures Spring 2015 Lecture 3: Heaps.
Ch. 6: Heapsort n nodes. Heap -- Nearly binary tree of these n nodes (not just leaves) Heap property If max-heap, the max-heap property is that for every.
David Luebke 1 10/3/2015 CS 332: Algorithms Solving Recurrences Continued The Master Theorem Introduction to heapsort.
Heaps, Heapsort, Priority Queues. Sorting So Far Heap: Data structure and associated algorithms, Not garbage collection context.
Binary Heap.
2IL50 Data Structures Fall 2015 Lecture 3: Heaps.
Chapter 21 Binary Heap.
Computer Sciences Department1. Sorting algorithm 3 Chapter 6 3Computer Sciences Department Sorting algorithm 1  insertion sort Sorting algorithm 2.
David Luebke 1 6/3/2016 CS 332: Algorithms Heapsort Priority Queues Quicksort.
Prepared by- Jatinder Paul, Shraddha Rumade
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 9.
Data Structure & Algorithm Lecture 5 Heap Sort & Binary Tree JJCAO.
CS 2133: Data Structures Quicksort. Review: Heaps l A heap is a “complete” binary tree, usually represented as an array:
Introduction to Algorithms
1 Analysis of Algorithms Chapter - 03 Sorting Algorithms.
1 Algorithms CSCI 235, Fall 2015 Lecture 14 Analysis of Heap Sort.
David Luebke 1 12/23/2015 Heaps & Priority Queues.
Computer Algorithms Lecture 9 Heapsort Ch. 6, App. B.5 Some of these slides are courtesy of D. Plaisted et al, UNC and M. Nicolescu, UNR.
S. Raskhodnikova and A. Smith. Based on slides by C. Leiserson and E. Demaine. 1 Adam Smith L ECTURES Priority Queues and Binary Heaps Algorithms.
CPSC 311 Section 502 Analysis of Algorithm Fall 2002 Department of Computer Science Texas A&M University.
Heapsort. What is a “heap”? Definitions of heap: 1.A large area of memory from which the programmer can allocate blocks as needed, and deallocate them.
CSC 413/513: Intro to Algorithms Solving Recurrences Continued The Master Theorem Introduction to heapsort.
Chapter 6: Heapsort Combines the good qualities of insertion sort (sort in place) and merge sort (speed) Based on a data structure called a “binary heap”
Lecture 8 : Priority Queue Bong-Soo Sohn Assistant Professor School of Computer Science and Engineering Chung-Ang University.
1 Heap Sort. A Heap is a Binary Tree Height of tree = longest path from root to leaf =  (lgn) A heap is a binary tree satisfying the heap condition:
David Luebke 1 2/5/2016 CS 332: Algorithms Introduction to heapsort.
Analysis of Algorithms CS 477/677 Lecture 8 Instructor: Monica Nicolescu.
COSC 3101A - Design and Analysis of Algorithms 3 Recurrences Master’s Method Heapsort and Priority Queue Many of the slides are taken from Monica Nicolescu’s.
Priority Queues, Heaps, and Heapsort CSE 2320 – Algorithms and Data Structures Vassilis Athitsos University of Texas at Arlington 1.
Sept Heapsort What is a heap? Max-heap? Min-heap? Maintenance of Max-heaps -MaxHeapify -BuildMaxHeap Heapsort -Heapsort -Analysis Priority queues.
Heapsort Lecture 4 Asst. Prof. Dr. İlker Kocabaş.
6.Heapsort. Computer Theory Lab. Chapter 6P.2 Why sorting 1. Sometimes the need to sort information is inherent in a application. 2. Algorithms often.
1 Algorithms CSCI 235, Fall 2015 Lecture 13 Heap Sort.
"Teachers open the door, but you must enter by yourself. "
Heaps, Heapsort, and Priority Queues
Heaps, Heap Sort and Priority Queues
Heap Sort Example Qamar Abbas.
Introduction to Algorithms
CS 583 Analysis of Algorithms
Heaps, Heapsort, and Priority Queues
CS200: Algorithm Analysis
Ch 6: Heapsort Ming-Te Chi
Heapsort.
Heap Sort The Heap Data Structure
"Teachers open the door, but you must enter by yourself. "
Topic 5: Heap data structure heap sort Priority queue
HEAPS.
Heapsort Sorting in place
Solving Recurrences Continued The Master Theorem
Presentation transcript:

Sorting Algorithms (Part II) Slightly modified definition of the sorting problem: input: A collection of n data items where data item a i has a key, k i, drawn from a linearly ordered set (e.g., ints, chars) output: A permutation of the input sequence such that k 1  k 2 ...  k n In practice, one usually sorts ‘objects’ according to their key (the non-key data is called satellite data.) If the records are large, we may sort an array of pointers based on some key associated with each record.

Sorting Algorithms A sorting algorithm is comparison-based if the sorting operation depends on a pair-wise comparison of keys. A sorting algorithm is in-place if only a constant number of elements of the input array are ever stored outside the array. Running Time of Comparison-Based Sorting Algorithms Insertion Sort n 2 n 2 n yes Merge Sort nlgn nlgn nlgn no Heap Sort nlgn nlgn nlgn yes Quick Sort n 2 nlgn nlgn yes w-c a-c b-c in place?

Tree Terminology In a complete binary tree, every level, except possibly the last, is completely filled, and all bottom nodes are as far left as possible. Heaps are examples of complete binary trees. A full binary tree (sometimes called a proper binary tree) is a tree in which every node other than the leaves has two children. A perfect binary tree is a proper binary tree in which every level is filled. The height of a tree node i is the number of edges on the longest path from node i to a leaf. The level (depth) of a tree node starts with 0 for the root level and is incremented by 1 for each downward edge.

Heapsort (Ch. 6) heap concept Complete binary tree. Values in the nodes satisfy the “heap properties”. heap as an array implementation store heap as a binary tree in an array heapsize is number of elements in heap length is number of elements in array

Max-Heap Definition A max-heap is a binary tree with one key assigned to each node such that the tree meets the following 2 requirements: 1.Shape requirement: complete binary tree. 2.Parental dominance requirement: the key at each node is greater than or equal to the keys at its children.

Max-Heap In the array representation of a max-heap, the root of the tree is in A[1]. Given the index i of a node, Parent(i) LeftChild(i)RightChild(i) return (  i/2  ) return (2i) return (2i + 1) index keys n = 11 height = 3 A

Min-Heap Definition A min-heap is a binary tree with one key assigned to each node such that the tree meets the following 2 requirements: 1. Shape requirement: complete binary tree. 2. Child dominance requirement: the key at each node is less than or equal to the keys at its children.

Min-Heap Min-heaps are commonly used for priority queues in event-driven simulators. Parent(i) LeftChild(i)RightChild(i) return (  i/2  ) return (2i) return (2i + 1) index keys A

HeapSort HeapSort(A) 1. Build-Heap(A) /* put all elements in heap */ 2. for i = A.length downto 2 3. swap A[1] with A[i] /* puts max in ith array position */ 4. A.heap-size = A.heap-size Max-Heapify(A,1) /* restore heap property */ Input: An n-element array A[1…n]. Output: An n-element array A in sorted order, smallest to largest. Running time: Line 1:c 1 (?????) Need to know running time of Build-Heap Line 2: c 2 n Line 3:c 3 (n - 1) Line 4: c 4 (n - 1) Line 5:c 5 (n - 1) * (????) Need to know running time of Max-Heapify

Max-Heapify: Maintains the Max-Heap Property Max-Heapify(A, i) Assumption: sub-trees rooted at left and right children of A[i] are max- heaps... but sub-tree rooted at A[i] might not be a max-heap. Max-Heapify(A, i) will cause the value at A[i] to “move down” (if its key value is less than one of its children) in the heap so that sub-tree rooted at A[i] becomes a max-heap.

Max-Heapify: Maintains the Max-Heap Property Max-Heapify(A,i) 1. left = 2i; right = 2i + 1 /* indices of left & right children of A[i] */ 2. largest = i 3. if left  heapsize(A) and A[left] > A[i] 4. largest = left 5. if right  heapsize(A) and A[right] > A[largest] 6. largest = right 7. if largest  i 8. swap(A[i], A[largest]) 9. Max-Heapify(A, largest) How many calls does HeapSort make to Max-Heapify? How many swaps are made in lines 7, 8, and 9 (worst-case)? When does the recursion end and why? Input: An unordered array A[1…n] of comparable items and the index i of a node whose subtrees are max-heaps, but i might violate max-heap property. Output: A permutation of A such that i is the root of a max-heap

Max-Heapify: Running Time Running Time of Max-Heapify every line is  (1) time, except the recursive call in worst-case, last row of binary tree is half empty, so the sub-tree rooted at left child has size at most (2/3)n So we get the recurrence which, by case 2 of the master theorem, has the solution (or, Max-Heapify takes O(h) time when node A[i] has height h in the heap) A heap is a complete binary tree, hence must process O(lg n) levels, with constant work at each level.

Heapify: More pseudo-codish version of Max-Heapify Heapify(A,i) 1. if node i is not a leaf and A[i] < A[2i] and/or A[2i+1] 2. Let j be the maximum value child of i 3. Swap A[i] with A[j] 4. Heapify(A, j) Input: An unordered array A[1…n] of comparable items and the index i of a node whose subtrees are max-heaps, but i might violate max-heap property. Output: A permutation of A such that i is the root of a max-heap

Build-Heap Intuition : use Heapify in a bottom-up manner to convert A into a max-heap Build-Heap(A) 1.for i =A.length downto 2 // diff. in book 2. Heapify(A, i) Rough running Time of Build-Heap: About n calls to Heapify (O(n) calls) Simple upper bound: Each call to Heapify takes O(lgn) time  O(nlgn) time total. However, we can prove that Build-Heap runs in O(n) time, meaning that the O(nlgn) bound is correct but is not tight. Input: An unordered array A[1…n] of comparable items Output: A permutation of A such that A is a max-heap

Lemma 1: The number of nodes at level i in a perfect binary tree is 2 i Basis: level= = 1, so the lemma holds because there is only a root. Inductive Step: Assume true for level k and show true for level k + 1. By the IHOP, there are 2 k nodes at level k of a perfect binary tree. By the definition of a perfect binary tree, level k+1 must be full, meaning that every node at level k has 2 children. Therefore, the number of nodes at level k+1 = 2(2 k ) = 2 k+1.

Lemma 2: A perfect binary tree (pbt) of height h contains 2 h+1 -1 nodes. By Lemma 1, there are 2 k nodes at level k of a perfect binary tree. So the number of nodes in a perfect tree of height h is: This is a geometric series. Lemma 1: The number of nodes at level i in a perfect binary tree is 2 i

Notation you have seen before… Floor:  x  = the largest integer ≤ x Ceiling:  x  = the smallest integer ≥ x Geometric series: Harmonic series: Telescoping series:

Build-Heap - Tighter bound Build-Heap(A) 1.for i =A.length downto 2 2. Heapify(A, i) We use the following fact to prove the linear running time of Build-Heap: FACT 1: For any positive integer k, 2 k – 1 = 2 k k-2 + … Let T(n) denote the worst-case complexity of algorithm Build-Heap on any input of size n. We consider a perfect binary tree with height h = lg(n) and look at the total amount of work completed for all nodes at each height. At height 0 (level h) in the tree, no work is done since those nodes are leaves. At height 1 (level h-1), 2 h-1 nodes need to be “fixed” via a call to Heapify. This requires no more than O(1) com- parisons for each of these nodes.

Visual Proof of Build-Heap - Tighter bound At level h-2, O(2) units of work are required for all 2 h-2 nodes and so on. At level 0, O(h) units of work are required for a single node, the root. The amount of work done at level h-1 is denoted by the left-most blue rectangle in the figure below. 2 h-1 2 h-2 2 h-3 … = h

Visual Proof of Build-Heap - Tighter bound By rearranging the blue rectangles, Fact 1 implies that the entire second row can be aligned on top of the rectangle of width 2 h-1, leaving a gap of size one, shown by the darker rectangles below. Likewise, the entire third row can be aligned on top of the rectangle of width 2 h-2, leaving a gap of size one, and so on. The area of the light blue rectangles is O(n), since it is bounded by a rectangle of area because h = lg(n) and 2 lgn = n. This proof shows that the running time of Build-Heap is O(n). 2 h-1 2 h-2 2 h-3 … 1

Inserting Heap Elements Max-Heap-Insert(A, key) 1. heapsize(A) = heapsize(A) i = heapsize(A) 3. while i > 1 and A[parent(i)] < key 4. A[i] = A[parent(i)] 5. i = parent(i) 6. A[i] = key Assume A.length > heapsize (the array is large enough to expand the heap) Increment heapsize and insert new element in the highest numbered position of array. Assume items that were already in the heap form a max-heap. Walk up the tree from new leaf to root. Insert input key when a parent key larger than the input key is found Running time of Max-Heap-Insert: time to traverse leaf to root path = height = O(lgn)

Heapsort Time and Space Usage An array implementation of a heap uses O(n) space -one array element for each node in heap Heapsort is an in-place algorithm Running time is as good as Mergesort, O(nlgn) in worst case. Note that the running time of Heapsort and Max-Heapify dominate the O(n) running time of Build-Heap. Suppose we used a top-down approach to build the heap instead of the bottom-up Build-Heap. How would this algorithm work? What would the running time be?

Priority Queues A priority queue is a data structure for maintaining a set S of elements, each with an associated key such that the element with highest priority is always accessed quickly. A max-priority-queue gives priority to keys with larger values and supports the following operations: 1. insert(S, x) inserts the element x into set S. 2. max(S) returns element of S with largest key. 3. extract-max(S) removes and returns element of S with largest key. 4. increase-key(S,x,k) increases the value of element x's key to new value k (assuming k is at least as large as current key's value).

Priority Queues: Application for Heaps Initialize PQ by running Build-Heap on an array A. A[1] holds the maximum value after this step. Heap-Maximum(A) - returns value of A[1] but leaves heap as-is. Heap-Extract-Max(A) - Saves A[1] and then, like Heap-Sort, puts item in A[heapsize] at A[1], decrements heapsize, and uses Max-Heapify(A, 1) to restore heap property. Heap-Increase-Key(A, i, key) - If key is larger than key at parent(i), floats node with new key up heap until heap property is restored. An application of max-priority queues is to schedule jobs on a shared processor. Need to be able to check current job's priority Heap-Maximum(A) remove job from the queue Heap-Extract-Max(A) insert new jobs into queue Max-Heap-Insert(A, key) increase priority of jobsHeap-Increase-Key(A,i,key)

Min-Heap Application An application for a min-heap priority queue is an event-driven simulator, where the key is an integer representing the number of seconds (or other discrete time unit) from time zero (starting point for simulation).

End of lecture 6