David Luebke 1 7/2/2015 Medians and Order Statistics Structures for Dynamic Sets.

Slides:



Advertisements
Similar presentations
David Luebke 1 6/1/2014 CS 332: Algorithms Medians and Order Statistics Structures for Dynamic Sets.
Advertisements

Comp 122, Spring 2004 Binary Search Trees. btrees - 2 Comp 122, Spring 2004 Binary Trees  Recursive definition 1.An empty tree is a binary tree 2.A node.
Jan Binary Search Trees What is a search binary tree? Inorder search of a binary search tree Find Min & Max Predecessor and successor BST insertion.
Binary Search Trees Many of the slides are from Prof. Plaisted’s resources at University of North Carolina at Chapel Hill.
Introduction to Algorithms Jiafen Liu Sept
CS 332: Algorithms Binary Search Trees. Review: Dynamic Sets ● Next few lectures will focus on data structures rather than straight algorithms ● In particular,
ALGORITHMS THIRD YEAR BANHA UNIVERSITY FACULTY OF COMPUTERS AND INFORMATIC Lecture six Dr. Hamdy M. Mousa.
The complexity and correctness of algorithms (with binary trees as an example)
UNC Chapel Hill Lin/Foskey/Manocha Binary Search Tree Her bir node u bir object olan bir linked data structure ile temsil edilebilir. Her bir node key,
David Luebke 1 5/4/2015 Binary Search Trees. David Luebke 2 5/4/2015 Dynamic Sets ● Want a data structure for dynamic sets ■ Elements have a key and satellite.
CS Section 600 CS Section 002 Dr. Angela Guercio Spring 2010.
Binary Search Trees Comp 550.
Order Statistics(Selection Problem) A more interesting problem is selection:  finding the i th smallest element of a set We will show: –A practical randomized.
CS 3343: Analysis of Algorithms Lecture 14: Order Statistics.
Binary Search Trees CIS 606 Spring Search trees Data structures that support many dynamic-set operations. – Can be used as both a dictionary and.
Sorting. How fast can we sort? All the sorting algorithms we have seen so far are comparison sorts: only use comparisons to determine the relative order.
Universal Hashing When attempting to foil an malicious adversary, randomize the algorithm Universal hashing: pick a hash function randomly when the algorithm.
BST Data Structure A BST node contains: A BST contains
Lecture 11: Binary Search Trees Shang-Hua Teng. Data Format KeysEntryKeysSatellite data.
Median, order statistics. Problem Find the i-th smallest of n elements.  i=1: minimum  i=n: maximum  i= or i= : median Sol: sort and index the i-th.
1.1 Data Structure and Algorithm Lecture 12 Binary Search Trees Topics Reference: Introduction to Algorithm by Cormen Chapter 13: Binary Search Trees.
Data Structures, Spring 2006 © L. Joskowicz 1 Data Structures – LECTURE Binary search trees Motivation Operations on binary search trees: –Search –Minimum,
David Luebke 1 7/2/2015 ITCS 6114 Binary Search Trees.
David Luebke 1 7/2/2015 Merge Sort Solving Recurrences The Master Theorem.
David Luebke 1 7/2/2015 Linear-Time Sorting Algorithms.
David Luebke 1 8/17/2015 CS 332: Algorithms Linear-Time Sorting Continued Medians and Order Statistics.
David Luebke 1 9/8/2015 CS 332: Algorithms Review for Exam 1.
CS 2133: Data Structures Binary Search Trees.
David Luebke 1 9/18/2015 CS 332: Algorithms Red-Black Trees.
David Luebke 1 10/3/2015 CS 332: Algorithms Solving Recurrences Continued The Master Theorem Introduction to heapsort.
Chapter 12. Binary Search Trees. Search Trees Data structures that support many dynamic-set operations. Can be used both as a dictionary and as a priority.
Order Statistics The ith order statistic in a set of n elements is the ith smallest element The minimum is thus the 1st order statistic The maximum is.
Tonga Institute of Higher Education Design and Analysis of Algorithms IT 254 Lecture 4: Data Structures.
The Selection Problem. 2 Median and Order Statistics In this section, we will study algorithms for finding the i th smallest element in a set of n elements.
2IL50 Data Structures Fall 2015 Lecture 7: Binary Search Trees.
Order Statistics ● The ith order statistic in a set of n elements is the ith smallest element ● The minimum is thus the 1st order statistic ● The maximum.
Binary SearchTrees [CLRS] – Chap 12. What is a binary tree ? A binary tree is a linked data structure in which each node is an object that contains following.
CS 206 Introduction to Computer Science II 10 / 05 / 2009 Instructor: Michael Eckmann.
CS 206 Introduction to Computer Science II 02 / 13 / 2009 Instructor: Michael Eckmann.
October 3, Algorithms and Data Structures Lecture VII Simonas Šaltenis Nykredit Center for Database Research Aalborg University
Preview  Graph  Tree Binary Tree Binary Search Tree Binary Search Tree Property Binary Search Tree functions  In-order walk  Pre-order walk  Post-order.
Lecture 9 Algorithm Analysis Arne Kutzner Hanyang University / Seoul Korea.
Order Statistics(Selection Problem)
1 Algorithms CSCI 235, Fall 2015 Lecture 22 Binary Search Trees.
Red-Black Trees. Review: Binary Search Trees ● Binary Search Trees (BSTs) are an important data structure for dynamic sets ● In addition to satellite.
Mudasser Naseer 1 1/25/2016 CS 332: Algorithms Lecture # 10 Medians and Order Statistics Structures for Dynamic Sets.
David Luebke 1 2/19/2016 Priority Queues Quicksort.
CS6045: Advanced Algorithms Data Structures. Dynamic Sets Next few lectures will focus on data structures rather than straight algorithms In particular,
CS6045: Advanced Algorithms Sorting Algorithms. Sorting So Far Insertion sort: –Easy to code –Fast on small inputs (less than ~50 elements) –Fast on nearly-sorted.
David Luebke 1 6/26/2016 CS 332: Algorithms Linear-Time Sorting Continued Medians and Order Statistics.
David Luebke 1 7/2/2016 CS 332: Algorithms Linear-Time Sorting: Review + Bucket Sort Medians and Order Statistics.
Many slides here are based on E. Demaine , D. Luebke slides
Binary Search Trees What is a binary search tree?
CS 3013 Binary Search Trees.
Binary Search Trees.
CS 332: Algorithms Red-Black Trees David Luebke /20/2018.
Linear-Time Sorting Continued Medians and Order Statistics
Randomized Algorithms
CS200: Algorithms Analysis
CS 3013 Binary Search Trees.
Lecture 7 Algorithm Analysis
Randomized Algorithms
Lecture 7 Algorithm Analysis
Algorithms and Data Structures Lecture VII
CS6045: Advanced Algorithms
Lecture 7 Algorithm Analysis
Binary SearchTrees [CLRS] – Chap 12.
Binary Search Trees Comp 122, Spring 2004.
Chapter 12&13: Binary Search Trees (BSTs)
Presentation transcript:

David Luebke 1 7/2/2015 Medians and Order Statistics Structures for Dynamic Sets

David Luebke 2 7/2/2015 The Selection Problem ● The selection problem: find the ith smallest element of a set ● Two algorithms: ■ A practical randomized algorithm with O(n) expected running time ■ A cool algorithm of theoretical interest only with O(n) worst-case running time

David Luebke 3 7/2/2015 Randomized Selection ● Key idea: use partition() from quicksort ■ But, only need to examine one subarray ■ This savings shows up in running time: O(n)  A[q]  A[q] q pr

David Luebke 4 7/2/2015 Randomized Selection RandomizedSelect(A, p, r, i) if (p == r) then return A[p]; q = RandomizedPartition(A, p, r) k = q - p + 1; if (i == k) then return A[q]; // not in book if (i < k) then return RandomizedSelect(A, p, q-1, i); else return RandomizedSelect(A, q+1, r, i-k);  A[q]  A[q] k q pr

David Luebke 5 7/2/2015 Randomized Selection ● Average case ■ For upper bound, assume ith element always falls in larger side of partition: ■ We then showed that T(n) = O(n) by substitution

David Luebke 6 7/2/2015 Worst-Case Linear-Time Selection ● Randomized algorithm works well in practice ● What follows is a worst-case linear time algorithm, really of theoretical interest only ● Basic idea: ■ Generate a good partitioning element ■ Call this element x

David Luebke 7 7/2/2015 Worst-Case Linear-Time Selection ● The algorithm in words: 1.Divide n elements into groups of 5 2.Find median of each group (How? How long?) 3.Use Select() recursively to find median x of the  n/5  medians 4.Partition the n elements around x. Let k = rank(x) 5.if (i == k) then return x if (i k) use Select() recursively to find (i-k)th smallest element in last partition

David Luebke 8 7/2/2015 Worst-Case Linear-Time Selection ● (Sketch situation on the board) ● How many of the 5-element medians are  x? ■ At least 1/2 of the medians =  n/5  / 2  =  n/10  ● How many elements are  x? ■ At least 3  n/10  elements ● For large n, 3  n/10   n/4 (How large?) ● So at least n/4 elements  x ● Similarly: at least n/4 elements  x

David Luebke 9 7/2/2015 Worst-Case Linear-Time Selection ● Thus after partitioning around x, step 5 will call Select() on at most 3n/4 elements ● The recurrence is therefore: ???  n/5   n/5 Substitute T(n) = cn Combine fractions Express in desired form What we set out to prove

David Luebke 10 7/2/2015 Worst-Case Linear-Time Selection ● Intuitively: ■ Work at each level is a constant fraction (19/20) smaller ○ Geometric progression! ■ Thus the O(n) work at the root dominates

David Luebke 11 7/2/2015 Linear-Time Median Selection ● Given a “black box” O(n) median algorithm, what can we do? ■ ith order statistic: ○ Find median x ○ Partition input around x ○ if (i  (n+1)/2) recursively find ith element of first half ○ else find (i - (n+1)/2)th element in second half ○ T(n) = T(n/2) + O(n) = O(n) ■ Can you think of an application to sorting?

David Luebke 12 7/2/2015 Linear-Time Median Selection ● Worst-case O(n lg n) quicksort ■ Find median x and partition around it ■ Recursively quicksort two halves ■ T(n) = 2T(n/2) + O(n) = O(n lg n)

David Luebke 13 7/2/2015 Dynamic Sets ● Next few lectures will focus on data structures rather than straight algorithms ● In particular, structures for dynamic sets ■ Elements have a key and satellite data ■ Dynamic sets support queries such as: ○ Search(S, k), Minimum(S), Maximum(S), Successor(S, x), Predecessor(S, x) ■ They may also support modifying operations like: ○ Insert(S, x), Delete(S, x)

David Luebke 14 7/2/2015 Binary Search Trees ● Binary Search Trees (BSTs) are an important data structure for dynamic sets ● In addition to satellite data, eleements have: ■ key: an identifying field inducing a total ordering ■ left: pointer to a left child (may be NULL) ■ right: pointer to a right child (may be NULL) ■ p: pointer to a parent node (NULL for root)

David Luebke 15 7/2/2015 Binary Search Trees ● BST property: key[left(x)]  key[x]  key[right(x)] ● Example: F BH KDA

David Luebke 16 7/2/2015 Inorder Tree Walk ● What does the following code do? TreeWalk(x) TreeWalk(left[x]); print(x); TreeWalk(right[x]); ● A: prints elements in sorted (increasing) order ● This is called an inorder tree walk ■ Preorder tree walk: print root, then left, then right ■ Postorder tree walk: print left, then right, then root

David Luebke 17 7/2/2015 Inorder Tree Walk ● Example: ● How long will a tree walk take? ● Prove that inorder walk prints in monotonically increasing order F BH KDA

David Luebke 18 7/2/2015 Operations on BSTs: Search ● Given a key and a pointer to a node, returns an element with that key or NULL: TreeSearch(x, k) if (x = NULL or k = key[x]) return x; if (k < key[x]) return TreeSearch(left[x], k); else return TreeSearch(right[x], k);

David Luebke 19 7/2/2015 BST Search: Example ● Search for D and C: F BH KDA

David Luebke 20 7/2/2015 Operations on BSTs: Search ● Here’s another function that does the same: TreeSearch(x, k) while (x != NULL and k != key[x]) if (k < key[x]) x = left[x]; else x = right[x]; return x; ● Which of these two functions is more efficient?

David Luebke 21 7/2/2015 Operations of BSTs: Insert ● Adds an element x to the tree so that the binary search tree property continues to hold ● The basic algorithm ■ Like the search procedure above ■ Insert x in place of NULL ■ Use a “trailing pointer” to keep track of where you came from (like inserting into singly linked list)

David Luebke 22 7/2/2015 BST Insert: Example ● Example: Insert C F BH KDA C

David Luebke 23 7/2/2015 BST Search/Insert: Running Time ● What is the running time of TreeSearch() or TreeInsert()? ● A: O(h), where h = height of tree ● What is the height of a binary search tree? ● A: worst case: h = O(n) when tree is just a linear string of left or right children ■ We’ll keep all analysis in terms of h for now ■ Later we’ll see how to maintain h = O(lg n)

David Luebke 24 7/2/2015 Sorting With Binary Search Trees ● Informal code for sorting array A of length n: BSTSort(A) for i=1 to n TreeInsert(A[i]); InorderTreeWalk(root); ● Argue that this is  (n lg n) ● What will be the running time in the ■ Worst case? ■ Average case? (hint: remind you of anything?)

David Luebke 25 7/2/2015 Sorting With BSTs ● Average case analysis ■ It’s a form of quicksort! for i=1 to n TreeInsert(A[i]); InorderTreeWalk(root);

David Luebke 26 7/2/2015 Sorting with BSTs ● Same partitions are done as with quicksort, but in a different order ■ In previous example ○ Everything was compared to 3 once ○ Then those items < 3 were compared to 1 once ○ Etc. ■ Same comparisons as quicksort, different order! ○ Example: consider inserting 5

David Luebke 27 7/2/2015 Sorting with BSTs ● Since run time is proportional to the number of comparisons, same time as quicksort: O(n lg n) ● Which do you think is better, quicksort or BSTsort? Why?

David Luebke 28 7/2/2015 Sorting with BSTs ● Since run time is proportional to the number of comparisons, same time as quicksort: O(n lg n) ● Which do you think is better, quicksort or BSTSort? Why? ● A: quicksort ■ Better constants ■ Sorts in place ■ Doesn’t need to build data structure

David Luebke 29 7/2/2015 More BST Operations ● BSTs are good for more than sorting. For example, can implement a priority queue ● What operations must a priority queue have? ■ Insert ■ Minimum ■ Extract-Min

David Luebke 30 7/2/2015 BST Operations: Minimum ● How can we implement a Minimum() query? ● What is the running time?

David Luebke 31 7/2/2015 BST Operations: Successor ● For deletion, we will need a Successor() operation ● Draw Fig 13.2 ● What is the successor of node 3? Node 15? Node 13? ● What are the general rules for finding the successor of node x? (hint: two cases)

David Luebke 32 7/2/2015 BST Operations: Successor ● Two cases: ■ x has a right subtree: successor is minimum node in right subtree ■ x has no right subtree: successor is first ancestor of x whose left child is also ancestor of x ○ Intuition: As long as you move to the left up the tree, you’re visiting smaller nodes. ● Predecessor: similar algorithm

David Luebke 33 7/2/2015 BST Operations: Delete ● Deletion is a bit tricky ● 3 cases: ■ x has no children: ○ Remove x ■ x has one child: ○ Splice out x ■ x has two children: ○ Swap x with successor ○ Perform case 1 or 2 to delete it F BH KDA C Example: delete K or H or B

David Luebke 34 7/2/2015 BST Operations: Delete ● Why will case 2 always go to case 0 or case 1? ● A: because when x has 2 children, its successor is the minimum in its right subtree ● Could we swap x with predecessor instead of successor? ● A: yes. Would it be a good idea? ● A: might be good to alternate