Sorting with Heaps Observation: Removal of the largest item from a heap can be performed in O(log n) time Another observation: Nodes are removed in order.

Slides:



Advertisements
Similar presentations
COL 106 Shweta Agrawal and Amit Kumar
Advertisements

Analysis of Algorithms
CS Section 600 CS Section 002 Dr. Angela Guercio Spring 2010.
1 HeapSort CS 3358 Data Structures. 2 Heapsort: Basic Idea Problem: Arrange an array of items into sorted order. 1) Transform the array of items into.
Sorting Algorithms and Average Case Time Complexity
CMPT 225 Priority Queues and Heaps. Priority Queues Items in a priority queue have a priority The priority is usually numerical value Could be lowest.
Binary Heaps CSE 373 Data Structures Lecture 11. 2/5/03Binary Heaps - Lecture 112 Readings Reading ›Sections
Version TCSS 342, Winter 2006 Lecture Notes Priority Queues Heaps.
BST Data Structure A BST node contains: A BST contains
3-Sorting-Intro-Heapsort1 Sorting Dan Barrish-Flood.
Tirgul 7 Heaps & Priority Queues Reminder Examples Hash Tables Reminder Examples.
Tirgul 4 Order Statistics Heaps minimum/maximum Selection Overview
Priority Queues1 Part-D1 Priority Queues. Priority Queues2 Priority Queue ADT (§ 7.1.3) A priority queue stores a collection of entries Each entry is.
PQ, binary heaps G.Kamberova, Algorithms Priority Queue ADT Binary Heaps Gerda Kamberova Department of Computer Science Hofstra University.
Heaps and heapsort COMP171 Fall 2005 Part 2. Sorting III / Slide 2 Heap: array implementation Is it a good idea to store arbitrary.
Heapsort CIS 606 Spring Overview Heapsort – O(n lg n) worst case—like merge sort. – Sorts in place—like insertion sort. – Combines the best of both.
Maps A map is an object that maps keys to values Each key can map to at most one value, and a map cannot contain duplicate keys KeyValue Map Examples Dictionaries:
1 HEAPS & PRIORITY QUEUES Array and Tree implementations.
Compiled by: Dr. Mohammad Alhawarat BST, Priority Queue, Heaps - Heapsort CHAPTER 07.
Computer Science and Software Engineering University of Wisconsin - Platteville 12. Heap Yan Shi CS/SE 2630 Lecture Notes Partially adopted from C++ Plus.
1 Hash Tables  a hash table is an array of size Tsize  has index positions 0.. Tsize-1  two types of hash tables  open hash table  array element type.
9/17/20151 Chapter 12 - Heaps. 9/17/20152 Introduction ► Heaps are largely about priority queues. ► They are an alternative data structure to implementing.
Brought to you by Max (ICQ: TEL: ) February 5, 2005 Advanced Data Structures Introduction.
Merge Sort. What Is Sorting? To arrange a collection of items in some specified order. Numerical order Lexicographical order Input: sequence of numbers.
The Binary Heap. Binary Heap Looks similar to a binary search tree BUT all the values stored in the subtree rooted at a node are greater than or equal.
Data Structure & Algorithm II.  Delete-min  Building a heap in O(n) time  Heap Sort.
September 29, Algorithms and Data Structures Lecture V Simonas Šaltenis Aalborg University
Heapsort By Pedro Oñate CS-146 Dr. Sin-Min Lee. Overview: Uses a heap as its data structure In-place sorting algorithm – memory efficient Time complexity.
Outline Priority Queues Binary Heaps Randomized Mergeable Heaps.
Sorting. Pseudocode of Insertion Sort Insertion Sort To sort array A[0..n-1], sort A[0..n-2] recursively and then insert A[n-1] in its proper place among.
1 Heaps (Priority Queues) You are given a set of items A[1..N] We want to find only the smallest or largest (highest priority) item quickly. Examples:
Chapter 2: Basic Data Structures. Spring 2003CS 3152 Basic Data Structures Stacks Queues Vectors, Linked Lists Trees (Including Balanced Trees) Priority.
COSC2007 Data Structures II Chapter 12 Tables & Priority Queues III.
Priority Queues and Heaps. October 2004John Edgar2  A queue should implement at least the first two of these operations:  insert – insert item at the.
Sorting preparation for searching. Overview  levels of performance  categories of algorithms  Java class Arrays.
PRIORITY QUEUES AND HEAPS Slides of Ken Birman, Cornell University.
David Luebke 1 12/23/2015 Heaps & Priority Queues.
Week 13 - Friday.  What did we talk about last time?  Sorting  Insertion sort  Merge sort  Started quicksort.
Heapsort. What is a “heap”? Definitions of heap: 1.A large area of memory from which the programmer can allocate blocks as needed, and deallocate them.
Week 10 - Friday.  What did we talk about last time?  Graph representations  Adjacency matrix  Adjacency lists  Depth first search.
CSC 413/513: Intro to Algorithms Solving Recurrences Continued The Master Theorem Introduction to heapsort.
Chapter 12 Heaps & HeapSort © John Urrutia 2014, All Rights Reserved1.
Tree Data Structures. Heaps for searching Search in a heap? Search in a heap? Would have to look at root Would have to look at root If search item smaller.
HEAPS. Review: what are the requirements of the abstract data type: priority queue? Quick removal of item with highest priority (highest or lowest key.
FALL 2005CENG 213 Data Structures1 Priority Queues (Heaps) Reference: Chapter 7.
Internal and External Sorting External Searching
David Luebke 1 2/5/2016 CS 332: Algorithms Introduction to heapsort.
Priority Queues, Heaps, and Heapsort CSE 2320 – Algorithms and Data Structures Vassilis Athitsos University of Texas at Arlington 1.
1 Chapter 6 Heapsort. 2 About this lecture Introduce Heap – Shape Property and Heap Property – Heap Operations Heapsort: Use Heap to Sort Fixing heap.
Chapter 4, Part II Sorting Algorithms. 2 Heap Details A heap is a tree structure where for each subtree the value stored at the root is larger than all.
Week 13 - Wednesday.  What did we talk about last time?  NP-completeness.
1 the BSTree class  BSTreeNode has same structure as binary tree nodes  elements stored in a BSTree are a key- value pair  must be a class (or a struct)
Heaps, Heap Sort, and Priority Queues. Background: Binary Trees * Has a root at the topmost level * Each node has zero, one or two children * A node that.
1 Priority Queues (Heaps). 2 Priority Queues Many applications require that we process records with keys in order, but not necessarily in full sorted.
Priority Queues and Heaps. John Edgar  Define the ADT priority queue  Define the partially ordered property  Define a heap  Implement a heap using.
Heaps and Heap Sort. Sorting III / Slide 2 Background: Complete Binary Trees * A complete binary tree is the tree n Where a node can have 0 (for the leaves)
"Teachers open the door, but you must enter by yourself. "
CSC317 Selection problem q p r Randomized‐Select(A,p,r,i)
Heaps, Heapsort, and Priority Queues
Heap Sort Example Qamar Abbas.
Wednesday, April 18, 2018 Announcements… For Today…
Priority Queue & Heap CSCI 3110 Nan Chen.
Heaps, Heapsort, and Priority Queues
Heap Sort The idea: build a heap containing the elements to be sorted, then remove them in order. Let n be the size of the heap, and m be the number of.
Search Sorted Array: Binary Search Linked List: Linear Search
Tree Representation Heap.
Tables and Priority Queues
Priority Queues CSE 373 Data Structures.
B-Trees.
Heaps & Multi-way Search Trees
Presentation transcript:

Sorting with Heaps Observation: Removal of the largest item from a heap can be performed in O(log n) time Another observation: Nodes are removed in order Conclusion: Removing all of the nodes one by one would result in sorted output Analysis: Removal of all the nodes from a heap is a O(n*logn) operation

But … A heap can be used to return sorted data in O(n*log n) time However, we can’t assume that the data to be sorted just happens to be in a heap! Aha! But we can put it in a heap. Inserting an item into a heap is a O(log n) operation so inserting n items is O(n*log n) But we can do better than just repeatedly calling the insertion algorithm

Heapifying Data To create a heap from an unordered array repeatedly call bubbleDown bubbleDown ensures that the heap property is preserved from the start node down to the leaves it assumes that the only place where the heap property can be initially violated is the start node; i.e., left and right subtrees of the start node are heaps Call bubbleDown on the upper half of the array starting with index n/2-1 and working up to index 0 (which will be the root of the heap) bubbleDown does not need to be called on the lower half of the array (the leaves)

BubbleDown algorithm (part of deletion algorithm) public void bubbleDown(int i) // element at position i might not satisfy the heap property // but its subtrees do -> fix it { T item = items[i]; int current = i; // start at root while (left(current) < num_items) { // not a leaf // find a bigger child int child = left(current); if (right(current) < num_items && items[child].getKey() < items[right(current)].getKey()) { child = right(current); } if (item.getKey() < items[child].getKey()) { items[current] = items[child]; // move its value up current = child; } else break; } items[current] = item; }

Heapify Example Assume unsorted input is contained in an array as shown here (indexed from top to bottom and left to right)

Heapify Example n = 12, n-1/2 = 5 bubbleDown(5) bubbleDown(4) bubbleDown(3) bubbleDown(2) bubbleDown(1) bubbleDown(0) note: these changes are made in the underlying array

Heapify algorithm void heapify() { for (int i=num_items/2-1; i>=0; i--) bubbleDown(i); } Why is it enough to start at position num_items/2 – 1?

Cost to Heapify an Array bubbleDown is called on half the array The cost for bubbleDown is O(height) It would appear that heapify cost is O(n*logn) In fact the cost is O(n) The exact analysis is complex (and left for another course)

HeapSort Algorithm Sketch Heapify the array Repeatedly remove the root At the start of each removal swap the root with the last element in the tree The array is divided into a heap part and a sorted part At the end of the sort the array will be sorted (since we have max heap, we put the largest element to the end, etc.)

HeapSort assume BubbleDown is static and it takes the array on which it works and the number of elements of the heap as parameters public static void bubbleDown(KeyedItem ar[],int num_items,int i) // element at position i might not satisfy the heap property // but its subtrees do -> fix it heap sort: public static void HeapSort(KeyedItem ar[]) { // heapify - build heap out of ar int num_items = ar.length; for (int i=num_items/2-1; i>-0; i--) bubbleDown(ar,num_items,i); for (int i=0; i<ar.length-1; i++) {// do it n-1 times // extract the largest element from the heap swap(ar,0,num_items-1); num_items--; // fix the heap property bubbleDown(ar,num_items,0); }

HeapSort Notes The algorithm runs in O(n*log n) time Considerably more efficient than selection sort and insertion sort The same (O) efficiency as mergeSort and quickSort The sort is carried out “in-place” That is, it does not require that a copy of the array to be made (memory efficient!) – quickSort has a similar property, but not mergeSort

CMPT 225 Hash Tables

Is balanced BST efficient enough? What drives the need for hash tables given the existence of balanced binary search trees?: support relatively fast searches (O (log n)), insertion and deletion support range queries (i.e. return information about a range of records, e.g. “find the ages of all customers whose last name begins with ’S’”) are dynamic (i.e. the number of records to be stored is not fixed) But note the “relatively fast searches”. What if we want to make many single searches in a large amount of data? If a BST contains 1,000,000 items then each search requires around log 2 1,000,000 = 20 comparisons. If we had stored the data in an array and could (somehow) know the index (based on the value of the key) then each search would take constant (O(1)) time, a twenty-fold improvement.

Using arrays If the data have conveniently distributed keys that range from 0 to the some value N with no duplicates then we can use an array: An item with key K is stored in the array in the cell with index K. Perfect solution: searching, inserting, deleting in time O(1) Drawback: N is usually huge (sometimes even not bounded) – so it requires a lot of memory. Unfortunately this is often the case. Examples: we want to look people up by their phone numbers, or SINs, or names. Let’s look at these examples.

Using arrays – Example 1: phone numbers as keys For phone numbers we can assume that the values range between and (in Canada). So let’s see how big an array has to be to store all the possible numbers. It’s easy to map phone numbers to integers (keys), just get rid of the ‘-‘s. So we have a range from 0 to 9,999,999,999. So we’d need an array of size 10 billion. There are two problems here: The first is that you won’t fit the array in main memory. A PC with 2GB of RAM can store only 536,870,912 references (assuming each reference takes only 4 bytes) which is clearly insufficient. Plus we have to store actual data somewhere. (We could store the array on the hard drive, but it would require 40GB.) The other problem is that such an array would be horribly wasteful. The population of Canada estimated in July 2004 is 32,507,874, so if we assume that that’s the approx. number of phone numbers, there is a huge amount of wasted space.