David Luebke 1 6/1/2014 CS 332: Algorithms Medians and Order Statistics Structures for Dynamic Sets.

Presentation transcript:

David Luebke 1 6/1/2014 CS 332: Algorithms Medians and Order Statistics Structures for Dynamic Sets

David Luebke 2 6/1/2014 Homework 3 On the web shortly… Due Wednesday at the beginning of class (test)

David Luebke 3 6/1/2014 Review: Radix Sort Radix sort: Assumption: input has d digits, each ranging from 0 to k Basic idea: Sort elements by digit starting with least significant Use a stable sort (like counting sort) for each stage Each pass over n numbers with d digits takes time O(n+k), so total time O(dn+dk) When d is constant and k=O(n), takes O(n) time Fast! Stable! Simple! Doesn't sort in place
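The radix sort reviewed on this slide can be sketched in Python roughly as follows. This is a minimal illustration, not the course's code: the function names, the choice of base 10, and the restriction to non-negative integers are all assumptions made here.

```python
def counting_sort_by_digit(values, exp, base):
    """Stable counting sort of non-negative ints, keyed on the digit at place `exp`."""
    counts = [0] * base
    for v in values:
        counts[(v // exp) % base] += 1
    # Turn the counts into the starting output index of each digit's run.
    start = 0
    for d in range(base):
        counts[d], start = start, start + counts[d]
    out = [0] * len(values)
    for v in values:  # left-to-right scan keeps equal digits in input order (stable)
        d = (v // exp) % base
        out[counts[d]] = v
        counts[d] += 1
    return out


def radix_sort(values, base=10):
    """Sort non-negative ints digit by digit, least significant digit first."""
    result = list(values)  # works on a copy: this version does not sort in place
    if not result:
        return result
    largest = max(result)
    exp = 1
    while largest // exp > 0:  # one stable pass per digit of the largest value
        result = counting_sort_by_digit(result, exp, base)
        exp *= base
    return result


print(radix_sort([329, 457, 657, 839, 436, 720, 355]))
# prints [329, 355, 436, 457, 657, 720, 839]
```

Each call to `counting_sort_by_digit` is one O(n+k) pass with k equal to the base, and the extra `out` list is why the method does not sort in place.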

David Luebke 4 6/1/2014 Review: Bucket Sort Bucket sort Assumption: input is n reals from [0, 1) Basic idea: Create n linked lists (buckets) to divide interval [0,1) into subintervals of size 1/n Add each input element to appropriate bucket and sort buckets with insertion sort Uniform input distribution → O(1) expected bucket size Therefore the expected total time is O(n) These ideas will return when we study hash tables
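A small Python sketch of the bucket sort described here, assuming the input really is a list of floats in [0, 1). Python lists stand in for the linked lists on the slide, and the names are illustrative only.

```python
def insertion_sort(items):
    """Sort a list in place; cheap when the list is short."""
    for j in range(1, len(items)):
        key = items[j]
        i = j - 1
        while i >= 0 and items[i] > key:
            items[i + 1] = items[i]
            i -= 1
        items[i + 1] = key


def bucket_sort(values):
    """Sort floats drawn from [0, 1) using n buckets of width 1/n."""
    n = len(values)
    buckets = [[] for _ in range(n)]
    for v in values:
        # Bucket index is floor(n * v); min() guards against floating-point round-up.
        buckets[min(int(n * v), n - 1)].append(v)
    result = []
    for bucket in buckets:
        insertion_sort(bucket)  # expected O(1) work per bucket for uniform input
        result.extend(bucket)
    return result


data = [0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68]
print(bucket_sort(data))
```

The O(n) bound is an expectation that relies on the uniform-distribution assumption; if all inputs land in one bucket, the insertion sort dominates and the running time degrades to O(n^2).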

David Luebke 5 6/1/2014 Review: Order Statistics The ith order statistic in a set of n elements is the ith smallest element The minimum is thus the 1st order statistic The maximum is (duh) the nth order statistic The median is the n/2 order statistic If n is even, there are 2 medians Could calculate order statistics by sorting Time: O(n lg n) w/ comparison sort We can do better

David Luebke 6 6/1/2014 Review: The Selection Problem The selection problem: find the ith smallest element of a set Two algorithms: A practical randomized algorithm with O(n) expected running time A cool algorithm of theoretical interest only with O(n) worst-case running time

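David Luebke 7 6/1/2014 Review: Randomized Selection Key idea: use partition() from quicksort But, only need to examine one subarray This savings shows up in running time: O(n) [Figure: array A[p..r] partitioned around A[q], with elements ≤ A[q] to the left of q and elements ≥ A[q] to the right]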

David Luebke 8 6/1/2014 Review: Randomized Selection
RandomizedSelect(A, p, r, i)
    if (p == r) then return A[p];
    q = RandomizedPartition(A, p, r)
    k = q - p + 1;
    if (i == k) then return A[q];   // not in book
    if (i < k) then return RandomizedSelect(A, p, q-1, i);
    else return RandomizedSelect(A, q+1, r, i-k);
[Figure: A[p..r] partitioned around A[q]; the part A[p..q] holds k = q - p + 1 elements, all ≤ A[q], and the part to the right of q holds elements ≥ A[q]]
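The pseudocode on this slide translates fairly directly into Python. The sketch below assumes 0-based array indices with a 1-based rank i, and it uses a Lomuto-style partition with a randomly chosen pivot; the slide does not spell out RandomizedPartition, so that part is an assumption.

```python
import random


def randomized_partition(a, p, r):
    """Partition a[p..r] around a random pivot; return the pivot's final index."""
    pivot_index = random.randint(p, r)  # randint includes both endpoints
    a[pivot_index], a[r] = a[r], a[pivot_index]
    pivot = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def randomized_select(a, p, r, i):
    """Return the i-th smallest (1-based) element of a[p..r]; rearranges a."""
    if p == r:
        return a[p]
    q = randomized_partition(a, p, r)
    k = q - p + 1  # rank of the pivot within a[p..r]
    if i == k:
        return a[q]
    if i < k:
        return randomized_select(a, p, q - 1, i)
    return randomized_select(a, q + 1, r, i - k)


data = [3, 1, 8, 2, 6, 7, 5]
print(randomized_select(list(data), 0, len(data) - 1, 4))  # median of 7 elements
# prints 5
```

Only one side of the partition is ever recursed into, which is where the expected O(n) bound comes from.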

David Luebke 9 6/1/2014 Review: Randomized Selection Average case For upper bound, assume ith element always falls in larger side of partition:
  T(n) ≤ (1/n) Σ_{k=0}^{n-1} T(max(k, n-k-1)) + Θ(n)
       ≤ (2/n) Σ_{k=n/2}^{n-1} T(k) + Θ(n)
We then showed that T(n) = O(n) by substitution

David Luebke 10 6/1/2014 Worst-Case Linear-Time Selection Randomized algorithm works well in practice What follows is a worst-case linear time algorithm, really of theoretical interest only Basic idea: Generate a good partitioning element Call this element x

David Luebke 11 6/1/2014 Worst-Case Linear-Time Selection The algorithm in words:
1. Divide n elements into groups of 5
2. Find median of each group (How? How long?)
3. Use Select() recursively to find median x of the n/5 medians
4. Partition the n elements around x. Let k = rank(x)
5. if (i == k) then return x
   if (i < k) use Select() recursively to find ith smallest element in first partition
   else (i > k) use Select() recursively to find (i-k)th smallest element in last partition
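The five steps above can be sketched in Python as follows. This is an illustrative version under some assumptions not on the slide: it builds new lists instead of partitioning in place, it sorts each group of at most 5 to find its median, and it uses a three-way split so that duplicate keys are handled.

```python
def select(items, i):
    """Return the i-th smallest (1-based) element of items in worst-case O(n) time."""
    if len(items) <= 5:
        return sorted(items)[i - 1]  # constant-size base case
    # Steps 1-2: split into groups of 5 and take each group's median.
    groups = [items[j:j + 5] for j in range(0, len(items), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]
    # Step 3: recursively find the median x of the medians.
    x = select(medians, (len(medians) + 1) // 2)
    # Step 4: partition around x (three-way, so copies of x sit in the middle).
    smaller = [v for v in items if v < x]
    equal = [v for v in items if v == x]
    larger = [v for v in items if v > x]
    # Step 5: answer directly or recurse into the one side that holds rank i.
    if i <= len(smaller):
        return select(smaller, i)
    if i <= len(smaller) + len(equal):
        return x
    return select(larger, i - len(smaller) - len(equal))


data = [31, 4, 15, 9, 26, 5, 35, 8, 97, 93, 2, 38, 46, 27, 1]
print(select(data, 8))  # median of 15 distinct values
# prints 15
```

Sorting a group of 5 takes constant time, so steps 1 and 2 together cost O(n); the list copies keep the code short at the price of O(n) extra space.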

David Luebke 12 6/1/2014 Worst-Case Linear-Time Selection (Sketch situation on the board) How many of the 5-element medians are ≤ x? At least 1/2 of the medians = ⌊⌊n/5⌋ / 2⌋ = ⌊n/10⌋ How many elements are ≤ x? At least 3⌊n/10⌋ elements For large n, 3⌊n/10⌋ ≥ n/4 (How large?) So at least n/4 elements ≤ x Similarly: at least n/4 elements ≥ x

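David Luebke 13 6/1/2014 Worst-Case Linear-Time Selection Thus after partitioning around x, step 5 will call Select() on at most 3n/4 elements The recurrence is therefore:
  T(n) ≤ T(⌊n/5⌋) + T(3n/4) + Θ(n)
       ≤ T(n/5) + T(3n/4) + Θ(n)      (⌊n/5⌋ ≤ n/5)
       ≤ cn/5 + 3cn/4 + Θ(n)          (substitute T(n) = cn)
       = 19cn/20 + Θ(n)               (combine fractions)
       = cn - (cn/20 - Θ(n))          (express in desired form)
       ≤ cn if c is big enough        (what we set out to prove)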

David Luebke 14 6/1/2014 Worst-Case Linear-Time Selection Intuitively: Work at each level is at most a constant fraction (19/20) of the work at the level above Geometric progression! Thus the O(n) work at the root dominates

David Luebke 15 6/1/2014 Linear-Time Median Selection Given a black box O(n) median algorithm, what can we do? ith order statistic: Find median x Partition input around x if (i ≤ (n+1)/2) recursively find ith element of first half else find (i - (n+1)/2)th element in second half T(n) = T(n/2) + O(n) = O(n) Can you think of an application to sorting?

David Luebke 16 6/1/2014 Linear-Time Median Selection Worst-case O(n lg n) quicksort Find median x and partition around it Recursively quicksort two halves T(n) = 2T(n/2) + O(n) = O(n lg n)
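The shape of this median-pivot quicksort can be sketched in Python. Note the loudly labeled assumption: the default `median_stand_in` below finds the median by sorting, which is NOT the O(n) black box the slide assumes. It only marks where a linear-time median routine would be plugged in, and all names here are hypothetical.

```python
def median_stand_in(items):
    """Stand-in ONLY: sorting costs O(n lg n), not the O(n) the slide assumes."""
    return sorted(items)[(len(items) - 1) // 2]


def median_quicksort(items, find_median=median_stand_in):
    """Quicksort that always pivots on the median, so both halves have about n/2 elements."""
    if len(items) <= 1:
        return list(items)
    x = find_median(items)
    smaller = [v for v in items if v < x]
    equal = [v for v in items if v == x]  # copies of the pivot need no further sorting
    larger = [v for v in items if v > x]
    return (median_quicksort(smaller, find_median)
            + equal
            + median_quicksort(larger, find_median))


print(median_quicksort([3, 1, 8, 2, 6, 7, 5]))
# prints [1, 2, 3, 5, 6, 7, 8]
```

With a genuinely linear-time `find_median`, each level of recursion does O(n) work on two halves, giving the recurrence T(n) = 2T(n/2) + O(n) from the slide.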

David Luebke 17 6/1/2014 Structures… Done with sorting and order statistics for now Ahead of schedule, so… Next part of class will focus on data structures We will get a couple in before the first exam Yes, these will be on this exam

David Luebke 18 6/1/2014 Dynamic Sets Next few lectures will focus on data structures rather than straight algorithms In particular, structures for dynamic sets Elements have a key and satellite data Dynamic sets support queries such as: Search(S, k), Minimum(S), Maximum(S), Successor(S, x), Predecessor(S, x) They may also support modifying operations like: Insert(S, x), Delete(S, x)

David Luebke 19 6/1/2014 Binary Search Trees Binary Search Trees (BSTs) are an important data structure for dynamic sets In addition to satellite data, elements have: key: an identifying field inducing a total ordering left: pointer to a left child (may be NULL) right: pointer to a right child (may be NULL) p: pointer to a parent node (NULL for root)

David Luebke 20 6/1/2014 Binary Search Trees BST property: key[left(x)] ≤ key[x] ≤ key[right(x)] Example: [Figure: BST with root F; F's children are B and H; B's children are A and D; H's right child is K]

David Luebke 21 6/1/2014 Inorder Tree Walk What does the following code do?
TreeWalk(x)
    if (x == NULL) return;
    TreeWalk(left[x]);
    print(x);
    TreeWalk(right[x]);
A: prints elements in sorted (increasing) order This is called an inorder tree walk Preorder tree walk: print root, then left, then right Postorder tree walk: print left, then right, then root
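A runnable Python version of the inorder walk, as a sketch. The `Node` class and the `visit` callback are illustrative choices, and the tree is built by hand with F at the root, B and H below it, A and D under B, and K as the right child of H, which is the shape the deck's example tree appears to have.

```python
class Node:
    """A BST node holding only a key and two child links (no parent pointer here)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def inorder(x, visit):
    """Visit the left subtree, then x, then the right subtree."""
    if x is None:  # empty subtree: nothing to visit
        return
    inorder(x.left, visit)
    visit(x.key)
    inorder(x.right, visit)


root = Node("F",
            Node("B", Node("A"), Node("D")),
            Node("H", None, Node("K")))
seen = []
inorder(root, seen.append)
print(seen)
# prints ['A', 'B', 'D', 'F', 'H', 'K']
```

Moving the `visit` call before or after the two recursive calls gives the preorder and postorder walks, respectively.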

David Luebke 22 6/1/2014 Inorder Tree Walk Example: [Figure: BST with root F; F's children are B and H; B's children are A and D; H's right child is K] How long will a tree walk take? Prove that inorder walk prints in monotonically increasing order

David Luebke 23 6/1/2014 Operations on BSTs: Search Given a key and a pointer to a node, returns an element with that key or NULL:
TreeSearch(x, k)
    if (x = NULL or k = key[x]) return x;
    if (k < key[x]) return TreeSearch(left[x], k);
    else return TreeSearch(right[x], k);

David Luebke 24 6/1/2014 BST Search: Example Search for D and C: [Figure: BST with root F; F's children are B and H; B's children are A and D; H's right child is K]

David Luebke 25 6/1/2014 Operations on BSTs: Search Here's another function that does the same:
TreeSearch(x, k)
    while (x != NULL and k != key[x])
        if (k < key[x]) x = left[x];
        else x = right[x];
    return x;
Which of these two functions is more efficient?
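Both search variants can be written side by side in Python for comparison. This is a sketch: the `Node` class and function names are illustrative, and the hand-built tree (F at the root, B and H below it, A and D under B, K to the right of H) is the shape assumed for the deck's example.

```python
class Node:
    """A BST node holding a key and two child links."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def tree_search_recursive(x, k):
    """Return the node with key k in the subtree rooted at x, or None."""
    if x is None or k == x.key:
        return x
    if k < x.key:
        return tree_search_recursive(x.left, k)
    return tree_search_recursive(x.right, k)


def tree_search_iterative(x, k):
    """Same search, walking down with a loop instead of recursion."""
    while x is not None and k != x.key:
        x = x.left if k < x.key else x.right
    return x


root = Node("F",
            Node("B", Node("A"), Node("D")),
            Node("H", None, Node("K")))
print(tree_search_iterative(root, "D").key)  # prints D
print(tree_search_recursive(root, "C"))      # prints None
```

Both versions follow a single root-to-leaf path, so they do the same number of key comparisons; the loop simply avoids the function-call overhead and the stack growth of the recursive form.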

David Luebke 26 6/1/2014 Operations on BSTs: Insert Adds an element x to the tree so that the binary search tree property continues to hold The basic algorithm Like the search procedure above Insert x in place of NULL Use a trailing pointer to keep track of where you came from (like inserting into singly linked list)
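The insertion procedure with a trailing pointer might look like this in Python. It is a sketch under a few assumptions: nodes carry the parent pointer `p` from the BST definition, the function returns the (possibly new) root rather than updating a tree object, and keys equal to an existing key are sent to the right.

```python
class Node:
    """A BST node with key, child links, and a parent pointer p."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.p = None  # None for the root


def tree_insert(root, z):
    """Insert node z into the tree rooted at root; return the root afterwards."""
    y = None  # trailing pointer: the last non-empty node seen on the way down
    x = root
    while x is not None:  # walk down exactly as a search for z.key would
        y = x
        x = x.left if z.key < x.key else x.right
    z.p = y
    if y is None:
        return z  # the tree was empty, so z becomes the root
    if z.key < y.key:
        y.left = z
    else:
        y.right = z
    return root


root = None
for key in ["F", "B", "H", "A", "D", "K"]:
    root = tree_insert(root, Node(key))
root = tree_insert(root, Node("C"))
print(root.left.right.left.key)  # C ends up as the left child of D
# prints C
```

The walk down follows one root-to-leaf path, so insertion costs O(h) for a tree of height h.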

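David Luebke 27 6/1/2014 BST Insert: Example Example: Insert C [Figure: BST with root F; F's children are B and H; B's children are A and D; H's right child is K; the new node C becomes the left child of D]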

David Luebke 28 6/1/2014 BST Search/Insert: Running Time What is the running time of TreeSearch() or TreeInsert()? A: O(h), where h = height of tree What is the height of a binary search tree? A: worst case: h = O(n) when tree is just a linear string of left or right children We'll keep all analysis in terms of h for now Later we'll see how to maintain h = O(lg n)

David Luebke 29 6/1/2014 Sorting With Binary Search Trees Informal code for sorting array A of length n:
BSTSort(A)
    for i=1 to n
        TreeInsert(A[i]);
    InorderTreeWalk(root);
Argue that this is Ω(n lg n) What will be the running time in the Worst case? Average case? (hint: remind you of anything?)
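BSTSort can be tried out with a short Python sketch. The names are illustrative, the sample input is just an assumed example array, and this version inserts bare keys without parent pointers to keep it brief.

```python
class Node:
    """A BST node holding a key and two child links."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def tree_insert(root, key):
    """Insert key into the tree rooted at root; return the root afterwards."""
    node = Node(key)
    if root is None:
        return node
    x = root
    while True:
        if key < x.key:
            if x.left is None:
                x.left = node
                return root
            x = x.left
        else:  # keys equal to x.key go to the right
            if x.right is None:
                x.right = node
                return root
            x = x.right


def inorder_keys(x, out):
    """Append the keys of the subtree rooted at x to out in sorted order."""
    if x is not None:
        inorder_keys(x.left, out)
        out.append(x.key)
        inorder_keys(x.right, out)


def bst_sort(values):
    root = None
    for v in values:  # n inserts, each costing O(h)
        root = tree_insert(root, v)
    result = []
    inorder_keys(root, result)  # one O(n) walk
    return result


print(bst_sort([3, 1, 8, 2, 6, 7, 5]))
# prints [1, 2, 3, 5, 6, 7, 8]
```

On already-sorted input every insert walks the full chain of right children, so the tree degenerates into a list and the total cost becomes O(n^2), the same worst case as quicksort with a poor pivot.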

David Luebke 30 6/1/2014 Sorting With BSTs Average case analysis It's a form of quicksort! for i=1 to n TreeInsert(A[i]); InorderTreeWalk(root);

David Luebke 31 6/1/2014 Sorting with BSTs Same partitions are done as with quicksort, but in a different order In previous example Everything was compared to 3 once Then those items < 3 were compared to 1 once Etc. Same comparisons as quicksort, different order! Example: consider inserting 5

David Luebke 32 6/1/2014 Sorting with BSTs Since run time is proportional to the number of comparisons, same time as quicksort: O(n lg n) Which do you think is better, quicksort or BSTsort? Why?

David Luebke 33 6/1/2014 Sorting with BSTs Since run time is proportional to the number of comparisons, same time as quicksort: O(n lg n) Which do you think is better, quicksort or BSTSort? Why? A: quicksort Better constants Sorts in place Doesn't need to build data structure

David Luebke 34 6/1/2014 More BST Operations BSTs are good for more than sorting. For example, can implement a priority queue What operations must a priority queue have? Insert Minimum Extract-Min

David Luebke 35 6/1/2014 BST Operations: Minimum How can we implement a Minimum() query? What is the running time?

David Luebke 36 6/1/2014 BST Operations: Successor For deletion, we will need a Successor() operation Draw Fig 13.2 What is the successor of node 3? Node 15? Node 13? What are the general rules for finding the successor of node x? (hint: two cases)

David Luebke 37 6/1/2014 BST Operations: Successor Two cases: x has a right subtree: successor is minimum node in right subtree x has no right subtree: successor is first ancestor of x whose left child is also an ancestor of x (counting x as its own ancestor) Intuition: As long as you move to the left up the tree, you're visiting smaller nodes. Predecessor: similar algorithm
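The two successor cases, together with the Minimum() query they rely on, can be sketched in Python as below. The node layout with a parent pointer `p` follows the BST definition earlier in the deck; the function names, the `tree_insert` helper, and the integer keys are chosen here purely for illustration.

```python
class Node:
    """A BST node with key, child links, and a parent pointer p."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.p = None


def tree_insert(root, z):
    """Insert node z and return the root (helper used to build the example)."""
    y, x = None, root
    while x is not None:
        y = x
        x = x.left if z.key < x.key else x.right
    z.p = y
    if y is None:
        return z
    if z.key < y.key:
        y.left = z
    else:
        y.right = z
    return root


def tree_minimum(x):
    """Follow left links as far as possible: O(h)."""
    while x.left is not None:
        x = x.left
    return x


def tree_successor(x):
    """Return the node with the next larger key, or None if x holds the maximum."""
    if x.right is not None:  # case 1: minimum of the right subtree
        return tree_minimum(x.right)
    y = x.p                  # case 2: climb while we are a right child
    while y is not None and x is y.right:
        x = y
        y = y.p
    return y


nodes = {k: Node(k) for k in [15, 6, 18, 3, 7, 17, 20, 2, 4, 13, 9]}
root = None
for node in nodes.values():
    root = tree_insert(root, node)
print(tree_successor(nodes[13]).key)  # 13 has no right subtree
# prints 15
```

Both cases follow a single path up or down the tree, so Successor() also runs in O(h).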

David Luebke 38 6/1/2014 BST Operations: Delete Deletion is a bit tricky 3 cases:
1. x has no children: Remove x
2. x has one child: Splice out x
3. x has two children: Swap x with successor Perform case 1 or 2 to delete it
Example: delete K or H or B [Figure: BST with root F; F's children are B and H; B's children are A and D; D's left child is C; H's right child is K]
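One way to code the three deletion cases in Python is sketched below. It makes an assumption the slide leaves open: the swap with the successor is done by copying the successor's key into x and then splicing out the successor node, which is only adequate when nodes carry no satellite data beyond the key. All names are illustrative.

```python
class Node:
    """A BST node with key, child links, and a parent pointer p."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.p = None


def tree_insert(root, z):
    """Insert node z and return the root (helper used to build the example)."""
    y, x = None, root
    while x is not None:
        y = x
        x = x.left if z.key < x.key else x.right
    z.p = y
    if y is None:
        return z
    if z.key < y.key:
        y.left = z
    else:
        y.right = z
    return root


def tree_delete(root, z):
    """Remove node z from the tree rooted at root; return the root afterwards."""
    if z.left is not None and z.right is not None:
        # Two children: the successor is the minimum of the right subtree.
        s = z.right
        while s.left is not None:
            s = s.left
        z.key = s.key  # copy the successor's key into z (key-only "swap")
        z = s          # s has no left child, so it has at most one child
    # Now z has at most one child: splice z out of the tree.
    child = z.left if z.left is not None else z.right
    if child is not None:
        child.p = z.p
    if z.p is None:
        return child   # z was the root
    if z.p.left is z:
        z.p.left = child
    else:
        z.p.right = child
    return root


def build(keys):
    """Build a tree by repeated insertion; return (root, key -> node map)."""
    nodes = {k: Node(k) for k in keys}
    root = None
    for node in nodes.values():
        root = tree_insert(root, node)
    return root, nodes


def inorder_keys(x):
    return [] if x is None else inorder_keys(x.left) + [x.key] + inorder_keys(x.right)


root, nodes = build(["F", "B", "H", "A", "D", "K", "C"])
root = tree_delete(root, nodes["B"])  # B has two children; its successor is C
print(inorder_keys(root))
# prints ['A', 'C', 'D', 'F', 'H', 'K']
```

Deleting K exercises the no-children case and deleting H the one-child case; both go straight to the splice step at the bottom of `tree_delete`.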

David Luebke 39 6/1/2014 BST Operations: Delete Why will the two-children case always reduce to the no-children or one-child case? A: because when x has 2 children, its successor is the minimum in its right subtree, and that minimum has no left child Could we swap x with predecessor instead of successor? A: yes. Would it be a good idea? A: might be good to alternate

David Luebke 40 6/1/2014 The End Up next: guaranteeing an O(lg n) height tree