1 Heaps (Priority Queues)
You are given a set of items A[1..N]. We want to find only the smallest or largest (highest-priority) item quickly.
Examples:
– The operating system needs to schedule jobs according to priority
– Doctors in the ER take patients according to severity of injuries
– Event simulation (bank customers arriving and departing, ordered according to arrival time)
We want a data structure that can efficiently perform:
– FindMin (or DeleteMin)
– Insert
2 Binary Heaps (a.k.a. Priority Queues)
A binary heap is a binary tree that is:
– 1. Complete: the tree is completely filled except possibly the bottom level, which is filled from left to right
– 2. Ordered by the heap property: the key stored in every node is smaller than (or equal to) the keys stored in its children
Therefore, the root node always contains the smallest key in a heap.
Which of these is not a heap?
3 Array Implementation of Heaps
Since heaps are complete binary trees, we can avoid pointers and use an array H as follows:
– Root node = H[1]
– Children of H[i] = H[2i], H[2i + 1]
– Keep track of the current size N (number of nodes)
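The index arithmetic above can be sketched in a few lines. This is a minimal illustration, assuming a Python list whose slot 0 is left unused so that the root sits at H[1] as on the slide; the helper names and the sample keys are made up for this sketch.

```python
# 1-based layout as on the slide: H[0] is unused, the root is H[1].
def parent(i):
    return i // 2

def left(i):
    return 2 * i

def right(i):
    return 2 * i + 1

def is_heap(H, N):
    """Check the heap property: every parent's key is <= its child's key."""
    return all(H[parent(i)] <= H[i] for i in range(2, N + 1))

# Illustrative heap with N = 5 keys (values are not taken from the slides).
H = [None, 2, 4, 3, 9, 7]
N = 5
print(left(2), right(2), parent(5))  # children of node 2, parent of node 5
print(is_heap(H, N))
```

No pointers are stored: the position of a node in the array alone determines where its parent and children live.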
4 FindMin
FindMin(H): Easy! Return the root value H[1]
Running time = O(1)
5 DeleteMin – First Try
DeleteMin: Delete (and return) the value at the root node
– We now have a “hole” at the root
– Need to fill the hole with another value
– Replace with the smaller child?
Try replacing 2 with its smaller child, that node with its smaller child, and so on… what happens?
– The heap property is still satisfied in the final tree, i.e., the key stored in every node is smaller than (or equal to) the keys stored in its children
– BUT the resulting tree is NOT necessarily a complete binary tree: the hole can end up anywhere on the bottom level, not just in the last slot
[Figure: the tree after DeleteMin]
6 DeleteMin – Second Try
DeleteMin: Delete (and return) the value at the root node
– We now have a “hole” at the root
– Need to fill the hole with another value
– Since the heap is smaller by one node, we need to empty the last slot
Steps:
– Move the last item to the top; decrease the size by 1
– Push down (Heapify) the top item to its correct position in the heap
[Figure: example heap with N = 11]
7 DeleteMin – Heapify
– Keep comparing the item with its children H[2i] and H[2i + 1]
– Swap it with the smaller child and go down one level
– Done if both children are >= the item or a leaf node has been reached
What is the run time?
8 Running Time Analysis of DeleteMin
Running time is O(height of tree)
What is the height of a complete binary tree of N nodes?
– O(log₂ N)
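The height bound can be worked out in one line. The sketch below assumes the height h is counted in edges on the longest root-to-leaf path: levels 0 through h-1 are full, and the bottom level holds between 1 and 2^h nodes.

```latex
% h = height of a complete binary tree with N nodes (counted in edges)
2^{h} \;\le\; N \;\le\; 2^{h+1} - 1
\quad\Longrightarrow\quad
h \;\le\; \log_2 N \;<\; h + 1
\quad\Longrightarrow\quad
h = \lfloor \log_2 N \rfloor = O(\log N)
```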
9 DeleteMin(H, N)
DeleteMin(H, N) --- Returns the minimum key
    if (N <= 0) return “error”;      // Heap is empty, so return an error code
    minKey = H[1];                   // Save the minimum key
    H[1] = H[N];                     // Move the last key to the root
    N = N - 1;                       // Decrease the # of nodes in the heap
    if (N <= 0) return minKey;       // Empty heap after deletion?
    node = 1;                        // Start from the root and push the key down
    while (1) {
        left = 2*node;               // Left child
        right = 2*node + 1;          // Right child
        smallest = node;             // Assume the current node has the smallest key
        if (left <= N and H[left] < H[smallest]) smallest = left;
        if (right <= N and H[right] < H[smallest]) smallest = right;
        if (smallest == node) return minKey;   // We are done
        tmp = H[node]; H[node] = H[smallest]; H[smallest] = tmp;   // Exchange
        node = smallest;             // Move one level down and repeat
    } //end-while
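A runnable version of the pseudocode above might look like the following. It is a sketch under a few assumptions that are not in the slides: the heap is a Python list with H[0] unused, the new size is returned alongside the minimum key (Python cannot update N in place), and an empty heap raises IndexError instead of returning an error code.

```python
def delete_min(H, N):
    """Remove the smallest key from the 1-based heap H[1..N].

    Returns (min_key, new_size). H[0] is unused.
    """
    if N <= 0:
        raise IndexError("heap is empty")
    min_key = H[1]
    H[1] = H[N]              # move the last key to the root
    N -= 1                   # the heap now has one node fewer
    node = 1
    while True:
        l, r = 2 * node, 2 * node + 1
        smallest = node      # assume the current node has the smallest key
        if l <= N and H[l] < H[smallest]:
            smallest = l
        if r <= N and H[r] < H[smallest]:
            smallest = r
        if smallest == node:
            return min_key, N
        H[node], H[smallest] = H[smallest], H[node]
        node = smallest      # move one level down and repeat

# Illustrative data (not the heap from the slide's figure).
H = [None, 2, 4, 3, 9, 7]
key, N = delete_min(H, 5)
print(key, H[1:N + 1])
```

The slot H[N + 1] still holds a stale copy of the moved key; only positions 1..N are part of the heap.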
10 Insertion into a Heap
How would we insert a key, say 1, into this heap?
[Figure: example heap with N = 10]
11 Insertion to a Heap: Push the key up
– Increase the size of the heap by 1
– Insert the new key in the last location; this preserves the complete-tree property
– Now push the key up to restore the heap property
Insertion to a Heap: Push the key up
– Insert at the last node and keep comparing with the parent H[⌊i/2⌋]
– If the parent is larger, swap with the parent and go up one level
– Done if the key of the parent <= item or the top node H[1] has been reached
Running time? O(height of tree) = O(log₂ N)
[Figure: the heap after 2 steps]
13 InsertKey(H, N, key)
InsertKey(H, N, key) --- Assumes the array has enough room to hold the key
    N = N + 1;                         // Increase the # of nodes in the heap
    H[N] = key;                        // Insert the key in the last location
    node = N;                          // Start from the last node and push the key up
    while (1) {
        parent = node/2;               // Parent of the node (integer division)
        if (parent < 1) return;        // Already at the root? Then done
        if (H[parent] <= H[node]) return;   // Parent key is not larger? Then done
        tmp = H[node];                 // Exchange keys with the parent
        H[node] = H[parent];
        H[parent] = tmp;
        node = parent;                 // Move one level up and repeat
    } //end-while
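The same routine can be written as runnable code. This sketch assumes a Python list with H[0] unused, grows the list when there is no spare slot (the pseudocode instead assumes the array is large enough), and returns the new size because N cannot be updated in place.

```python
def insert_key(H, N, key):
    """Insert key into the 1-based heap H[1..N]; returns the new size."""
    N += 1
    if N == len(H):
        H.append(key)        # no spare slot: grow the list
    else:
        H[N] = key           # reuse a free slot past the old last node
    node = N
    # Push the key up while its parent is strictly larger.
    while node > 1 and H[node // 2] > H[node]:
        H[node], H[node // 2] = H[node // 2], H[node]
        node //= 2
    return N

# Illustrative usage: build a heap by repeated insertion.
H, N = [None], 0
for k in [5, 3, 8, 1]:
    N = insert_key(H, N, k)
print(H[1:N + 1])
```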
14 Insertion to a Heap: Using Sentinel
Every iteration of Insert needs to test:
– 1. if it has reached the top node H[1]
– 2. if parent <= key
We can avoid the first test if H[0] contains a very large negative value (denoted by -∞)
– Then test #2 always stops at the top, because -∞ < key for all keys
Such a data value that serves as a marker is called a sentinel
– Used to improve efficiency and simplify code
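A sentinel-based insert could be sketched as follows. The assumptions here are that H[0] holds `-math.inf`, that all keys are finite numbers, and that the list may grow on demand; the function name is hypothetical. The loop also shifts parents down instead of swapping, which is a small variation on the exchange-based pseudocode.

```python
import math

def insert_with_sentinel(H, N, key):
    """Insert key into H[1..N], relying on H[0] == -inf as a sentinel.

    Returns the new size. There is no explicit 'reached the root?' test:
    at node 1 the parent index is 0, and H[0] > key is always False.
    """
    N += 1
    if N == len(H):
        H.append(key)
    else:
        H[N] = key
    node = N
    while H[node // 2] > key:
        H[node] = H[node // 2]   # shift the larger parent down one level
        node //= 2
    H[node] = key                # drop the key into its final position
    return N

H, N = [-math.inf], 0
for k in [5, 3, 8, 1]:
    N = insert_with_sentinel(H, N, k)
print(H[1:N + 1])
```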
15 Heap Space Analysis
Consider a heap of N nodes
Space needed: O(N)
– Actually O(MaxSize), where MaxSize = size of the array
– One more variable to store the current size N
– With the sentinel, the array-based implementation uses N + 2 space in total
Pointer-based implementation:
– Pointers for the children and the parent
– Space for the key
– Total space = N * (space for one key) + 3N + 1 (3 pointers per node + 1 for the size)
16 Heap Ops Running Time Analysis
Consider a heap of N nodes
– FindMin: O(1) time
– DeleteMin and Insert: O(log N) time
BuildHeap from N inputs: what is the running time?
– Start with an empty heap
– Insert each element
– N Insert operations = O(N log N)
Can we do better?
17 Building a Heap Bottom Up
Treat the input array as a heap and fix it using Heapify:
    for i = ⌊N/2⌋ down to 1 do
        Heapify(i)    // Push the parent key down if necessary
Why N/2?
– Nodes after position ⌊N/2⌋ are leaves!
The above algorithm builds a heap in O(N) time!
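The O(N) claim follows from a standard counting argument, sketched here as an aside to the slide. It assumes that a heap of N nodes has at most ⌈N / 2^(h+1)⌉ nodes at height h (leaves have height 0), and that Heapify on a node of height h costs O(h).

```latex
T(N) \;=\; \sum_{h=0}^{\lfloor \log_2 N \rfloor}
      \left\lceil \frac{N}{2^{h+1}} \right\rceil \cdot O(h)
  \;=\; O\!\left( N \sum_{h=0}^{\infty} \frac{h}{2^{h}} \right)
  \;=\; O(N),
\qquad \text{since } \sum_{h=0}^{\infty} \frac{h}{2^{h}} = 2 .
```

Most nodes sit near the bottom of the tree, where Heapify has little work to do; only the few nodes near the root pay the full O(log N).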
Building a Heap Bottom Up: Example
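Since the slide's figure is not reproduced here, the following sketch walks through bottom-up construction on a made-up array. It assumes the 1-based layout with H[0] unused; the input values are illustrative, not the ones from the original example.

```python
def heapify(H, N, i):
    """Push the key at position i down until both children are >= it."""
    while True:
        l, r = 2 * i, 2 * i + 1
        smallest = i
        if l <= N and H[l] < H[smallest]:
            smallest = l
        if r <= N and H[r] < H[smallest]:
            smallest = r
        if smallest == i:
            return
        H[i], H[smallest] = H[smallest], H[i]
        i = smallest

def build_heap(H, N):
    """Turn the arbitrary array H[1..N] into a min-heap, bottom up."""
    for i in range(N // 2, 0, -1):   # positions after N // 2 are leaves
        heapify(H, N, i)

H = [None, 9, 7, 8, 3, 2, 5, 1]      # arbitrary input, N = 7
build_heap(H, 7)
print(H[1:])                         # the smallest key is now at H[1]
```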
19 One more Operation: DecreaseKey
DecreaseKey(H, P, Delta, N): Decrease the key value of the node at position P by a positive amount “Delta” within heap H with N nodes
– E.g. system administrators can increase the priority of important jobs
How?
– First, subtract “Delta” from the current value at P
– The heap property may be violated
– Push the new key up or down? UP
Running time: O(log₂ N)
[Figure: the heap after DecreaseKey(H, 4, 6)]
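DecreaseKey could be sketched as below, assuming the 1-based layout with H[0] unused. The argument order follows the slide's signature; the bounds check is an addition of this sketch.

```python
def decrease_key(H, P, delta, N):
    """Subtract delta (> 0) from the key at position P, then push it up."""
    if not 1 <= P <= N:
        raise IndexError("position out of range")
    H[P] -= delta
    node = P
    # A smaller key can only violate the heap property with its ancestors.
    while node > 1 and H[node // 2] > H[node]:
        H[node], H[node // 2] = H[node // 2], H[node]
        node //= 2

H = [None, 1, 3, 8, 5]       # illustrative heap, N = 4
decrease_key(H, 4, 5, 4)     # key 5 at position 4 becomes 0
print(H[1:])
```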
20 One more Operation: IncreaseKey
IncreaseKey(H, P, Delta, N): Increase the key value of the node at position P by a positive amount “Delta” within heap H with N nodes
– E.g. schedulers in an OS often decrease the priority of CPU-hogging jobs
How?
– First, add “Delta” to the current value at P
– The heap property may be violated
– Push the new key up or down? DOWN
Running time: O(log₂ N)
[Figure: the heap after IncreaseKey(H, 2, 6)]
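IncreaseKey is the mirror image: the key grows, so it can only conflict with its descendants. A minimal sketch, again assuming the 1-based layout with H[0] unused and an added bounds check:

```python
def increase_key(H, P, delta, N):
    """Add delta (> 0) to the key at position P, then push it down."""
    if not 1 <= P <= N:
        raise IndexError("position out of range")
    H[P] += delta
    node = P
    while True:
        l, r = 2 * node, 2 * node + 1
        smallest = node
        if l <= N and H[l] < H[smallest]:
            smallest = l
        if r <= N and H[r] < H[smallest]:
            smallest = r
        if smallest == node:
            return
        H[node], H[smallest] = H[smallest], H[node]
        node = smallest

H = [None, 1, 3, 8, 5]       # illustrative heap, N = 4
increase_key(H, 1, 9, 4)     # the root key 1 becomes 10
print(H[1:])
```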
21 One more Operation: DeleteKey
DeleteKey(H, P, N): Delete the node at position P within heap H with N nodes
– E.g. delete a job waiting in the queue that has been preemptively terminated by the user (you pressed Ctrl-C)
How?
– First bring the key to the root by doing a DecreaseKey(H, P, ∞, N)
– Then delete the min key using DeleteMin(H, N)
[Figure: the heap after DecreaseKey(H, 2, ∞, 11) and after DeleteMin(H, 11)]
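The two-step recipe can be sketched as follows. Assumptions of this sketch: numeric keys, the 1-based layout with H[0] unused, `-math.inf` standing in for "decrease by ∞", and hypothetical helper names `sift_up` and `sift_down`.

```python
import math

def sift_up(H, node):
    while node > 1 and H[node // 2] > H[node]:
        H[node], H[node // 2] = H[node // 2], H[node]
        node //= 2

def sift_down(H, N, node):
    while True:
        l, r = 2 * node, 2 * node + 1
        smallest = node
        if l <= N and H[l] < H[smallest]:
            smallest = l
        if r <= N and H[r] < H[smallest]:
            smallest = r
        if smallest == node:
            return
        H[node], H[smallest] = H[smallest], H[node]
        node = smallest

def delete_key(H, P, N):
    """Delete the key at position P from H[1..N]; returns the new size."""
    if not 1 <= P <= N:
        raise IndexError("position out of range")
    H[P] = -math.inf         # step 1: DecreaseKey by infinity...
    sift_up(H, P)            # ...which carries the key to the root
    H[1] = H[N]              # step 2: DeleteMin (move last key to the root)
    N -= 1
    sift_down(H, N, 1)
    return N

H = [None, 1, 3, 8, 5, 4]    # illustrative heap, N = 5
N = delete_key(H, 2, 5)      # remove the key 3 stored at position 2
print(H[1:N + 1])
```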
22 Last Operation: Merge
Merge(H1, H2): Merge two heaps H1 and H2 of size O(N). H1 and H2 are stored in two different arrays.
– E.g. combine queues from two different sources to run on one CPU
1. Can do O(N) Insert operations. Running time: O(N log N)
2. Better: copy H2 to the end of H1 and use BuildHeap. Running time: O(N)
3. Can we do better (i.e. merge in O(log N) time)? Yes: Binomial Heaps, Fibonacci Heaps
– Will not be covered in this class
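Option 2 (copy, then BuildHeap) might look like this. The sketch assumes both heaps use the 1-based layout with slot 0 unused and returns a new array instead of extending H1 in place; the sample keys are made up.

```python
def build_heap(H, N):
    """Bottom-up construction of a min-heap over H[1..N]."""
    for i in range(N // 2, 0, -1):
        node = i
        while True:
            l, r = 2 * node, 2 * node + 1
            smallest = node
            if l <= N and H[l] < H[smallest]:
                smallest = l
            if r <= N and H[r] < H[smallest]:
                smallest = r
            if smallest == node:
                break
            H[node], H[smallest] = H[smallest], H[node]
            node = smallest

def merge(H1, N1, H2, N2):
    """Concatenate the keys of both heaps and rebuild in O(N1 + N2)."""
    H = [None] + H1[1:N1 + 1] + H2[1:N2 + 1]
    N = N1 + N2
    build_heap(H, N)
    return H, N

H, N = merge([None, 1, 4, 6], 3, [None, 2, 3, 9], 3)
print(N, H[1])
```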
23 Summary of Heaps (Priority Queues)
– Complete binary trees satisfying the heap property
– Common implementation is a Binary Heap in an array
– FindMin is O(1)
– Insert and DeleteMin are O(log N)
– Merging is inefficient for binary heaps (O(N) time)
– Pointer-based alternatives such as Binomial Heaps allow merging in O(log N) time (not covered in class)
– Heaps (priority queues) are used in applications (such as job schedulers in an OS) where repeated searches are made to find and delete the minimum (highest-priority) items
24 Using a Heap for Sorting
Main idea:
– Build a max-heap
– Do N DeleteMax operations and store each max element in the unused end of the array
[Figure: initial array → Build Max-Heap → repeated DeleteMax; the largest element is in its correct place after the 1st DeleteMax]
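The main idea can be sketched as an in-place sort. This version assumes the 1-based layout with H[0] unused and a max-heap (every parent >= its children); the function names are chosen for this sketch.

```python
def sift_down_max(H, N, i):
    """Push H[i] down in the max-heap H[1..N]."""
    while True:
        l, r = 2 * i, 2 * i + 1
        largest = i
        if l <= N and H[l] > H[largest]:
            largest = l
        if r <= N and H[r] > H[largest]:
            largest = r
        if largest == i:
            return
        H[i], H[largest] = H[largest], H[i]
        i = largest

def heapsort(H, N):
    """Sort H[1..N] in ascending order, in place."""
    for i in range(N // 2, 0, -1):       # build the max-heap bottom up
        sift_down_max(H, N, i)
    for end in range(N, 1, -1):          # N - 1 DeleteMax operations
        H[1], H[end] = H[end], H[1]      # max goes to the unused end
        sift_down_max(H, end - 1, 1)     # the heap shrinks by one

H = [None, 5, 2, 9, 1, 7]
heapsort(H, 5)
print(H[1:])
```

Each DeleteMax frees exactly the slot where the removed maximum belongs, which is why no second array is needed.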
25 Heapsort Analysis
Heapsort is in-place… is it also stable? Exam question?
Running time?
– Time needed for building the max-heap + time for N DeleteMax operations = O(N) + O(N log N) = O(N log N)
– Can also show that the running time is Ω(N log N) for some inputs, so the worst case is Θ(N log N)