Sorting an almost sorted input

Presentation transcript:

1 Sorting an almost sorted input. Suppose we know that the input is “almost” sorted. Let $I$ be the number of “inversions” in the input: the number of pairs $a_i, a_j$ such that $i < j$ and $a_i > a_j$.

2 Example: 1, 4, 5, 8, 3 has $I = 3$; 8, 7, 5, 3, 1 has $I = 10$.
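The definition of an inversion can be checked directly on the two example inputs. This is a minimal brute-force sketch; the function name `count_inversions` is illustrative and not from the slides, and the quadratic loop is chosen for clarity rather than speed.

```python
def count_inversions(a):
    """Count pairs (i, j) with i < j and a[i] > a[j]."""
    n = len(a)
    return sum(1 for i in range(n) for j in range(i + 1, n) if a[i] > a[j])


print(count_inversions([1, 4, 5, 8, 3]))  # 3
print(count_inversions([8, 7, 5, 3, 1]))  # 10
```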

3 Think of “insertion sort” using a list. When we insert the next item $a_k$, how deep does it get into the list? As deep as the number of inversions $a_i, a_k$ for $i < k$; let's call this $I_k$.

4 Analysis. The running time is: $\sum_{k=1}^{n} O(1 + I_k) = O(n + I)$
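The claim that item $a_k$ travels past exactly $I_k$ items can be observed with a small sketch. It assumes the sorted prefix is scanned from its end; a Python list stands in for the linked list of the slides, so `insert` itself is not constant time here, and only the recorded depths are meaningful. The name `insertion_sort_with_depths` is hypothetical.

```python
def insertion_sort_with_depths(a):
    """Insertion sort that records how deep each item goes from the end."""
    out = []      # sorted prefix
    depths = []   # depths[k] is the number of earlier items larger than a[k]
    for x in a:
        pos = len(out)
        # Walk back from the end of the sorted prefix past every larger item.
        while pos > 0 and out[pos - 1] > x:
            pos -= 1
        depths.append(len(out) - pos)
        out.insert(pos, x)  # stand-in for an O(1) linked-list splice
    return out, depths


out, depths = insertion_sort_with_depths([1, 4, 5, 8, 3])
print(out, depths, sum(depths))
```

The depths sum to the total number of inversions, which is where the $O(n + I)$ bound comes from.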

5 Thoughts. When $I = \Omega(n^2)$ the running time is $\Omega(n^2)$. But we would like it to be $O(n \log n)$ for any input, and faster when $I$ is small.

6 Finger red black trees

7 Finger tree. Take a regular search tree and reverse the direction of the pointers on the rightmost spine. We go up from the last leaf until we find the subtree containing the item, and then we descend into it.

8 Finger trees. Say we search for a position at distance $d$ from the end. Then we go up to height $O(\log d)$. Insertions and deletions still take $O(\log n)$ worst-case time but $O(\log d)$ amortized time. So the search for the $d$-th position takes $O(\log d)$ time.

9 Back to sorting. Suppose we implement the insertion sort using a finger search tree. When we insert item $k$, then $d = O(I_k)$ and it takes $O(\log I_k)$ time.

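10 Analysis. The running time is: $\sum_{j=1}^{n} O(\log(I_j + 2))$ (the $+2$ keeps the logarithm positive when $I_j = 0$). Since $\sum_j I_j = I$, this is at most $O(n \log(I/n + 2))$.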
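The step from the per-insertion costs to a bound in terms of $I$ can be worked out with the concavity of the logarithm (Jensen's inequality). The derivation below is a sketch that assumes inserting item $j$ costs at most a constant times $\log(I_j + 2)$; the $+2$ is an assumption added here so that the logarithm stays positive.

```latex
\begin{align*}
\sum_{j=1}^{n} \log(I_j + 2)
  &\le n \log\!\left( \frac{1}{n} \sum_{j=1}^{n} (I_j + 2) \right)
     && \text{(concavity of } \log \text{)} \\
  &= n \log\!\left( \frac{I}{n} + 2 \right)
     && \text{(since } \textstyle\sum_j I_j = I \text{)}
\end{align*}
```

So the sort takes $O(n)$ time when $I = O(n)$ and never more than $O(n \log n)$, since $I \le n^2$.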

11 Selection. Find the $k$-th smallest element.

12 Randomized selection
Randomized-select(A, p, r, k):
    if p = r then return A[p]
    q ← randomized-partition(A, p, r)
    j ← q - p + 1
    if j = k then return A[q]
    else if k < j then return randomized-select(A, p, q-1, k)
    else return randomized-select(A, q+1, r, k-j)
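The pseudocode for randomized selection translates almost line for line into Python. The slides do not show `randomized-partition`, so the version below is an assumption: a Lomuto-style partition around a uniformly random pivot. Indices `p` and `r` are 0-based and inclusive, while `k` is 1-based within `A[p..r]`.

```python
import random


def randomized_partition(A, p, r):
    """Partition A[p..r] around a random pivot; return the pivot's final index."""
    i = random.randint(p, r)          # inclusive on both ends
    A[i], A[r] = A[r], A[i]
    pivot = A[r]
    store = p
    for t in range(p, r):
        if A[t] < pivot:
            A[store], A[t] = A[t], A[store]
            store += 1
    A[store], A[r] = A[r], A[store]
    return store


def randomized_select(A, p, r, k):
    """Return the k-th smallest element of A[p..r] (k is 1-based)."""
    if p == r:
        return A[p]
    q = randomized_partition(A, p, r)
    j = q - p + 1                     # rank of the pivot within A[p..r]
    if j == k:
        return A[q]
    if k < j:
        return randomized_select(A, p, q - 1, k)
    return randomized_select(A, q + 1, r, k - j)


data = [9, 1, 8, 2, 7, 3, 6]
print(randomized_select(list(data), 0, len(data) - 1, 4))  # 6
```

The function permutes its input, which is why the usage line passes a copy.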

13 $X_k = 1$ iff $A[p..q]$ contains exactly $k$ elements

14 Expected running time. With probability $1/n$, $A[p..q]$ contains exactly $k$ elements, for $k = 1, 2, \dots, n$.

15 Assume n is even

16 In general

17 Solve by “substitution”. Assume $T(k) \le ck$ for $k < n$, and prove $T(n) \le cn$.

18 Solve by “substitution”

19 Choose $c \ge 4a$
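The formulas for the substitution step did not survive in the transcript. The derivation below is a sketch for even $n$, assuming the recurrence is the usual one for randomized selection: the partition step costs at most $an$, and each of $T(n/2), \dots, T(n-1)$ appears twice among the $n$ equally likely outcomes.

```latex
\begin{align*}
T(n) &\le \frac{2}{n} \sum_{k=n/2}^{n-1} T(k) + an \\
     &\le \frac{2c}{n} \sum_{k=n/2}^{n-1} k + an
        && \text{(inductive hypothesis } T(k) \le ck \text{)} \\
     &= \frac{2c}{n} \cdot \frac{3n^2 - 2n}{8} + an \\
     &= \frac{3cn}{4} - \frac{c}{2} + an \\
     &\le cn
        && \text{(whenever } c \ge 4a \text{)}
\end{align*}
```

The last line holds because $an \le cn/4$ when $c \ge 4a$, which matches the choice of $c$ on the slide.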

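20 Expected # of comparisons. Let $z_1, z_2, \dots, z_n$ be the elements in sorted order. Let $X_{ij} = 1$ if $z_i$ is compared to $z_j$ and 0 otherwise. So the total number of comparisons is $X = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}$.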

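21 by linearity of expectation: $E[X] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E[X_{ij}] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \Pr\{z_i \text{ compared to } z_j\}$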

22 by linearity of expectation

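23 Consider $Z_{ij} \equiv \{z_i, z_{i+1}, \dots, z_j\}$. Claim: if $i \le k < j$ then $\Pr\{z_i \text{ compared to } z_j\} = 2/(j-i+1)$. Otherwise we have to pick $z_i$ or $z_j$ first among $z_k, \dots, z_j$ (when $i > k$) or among $z_i, \dots, z_k$ (when $j \le k$), so $\Pr\{z_i \text{ compared to } z_j\} = 2/(j-k+1)$ if $i > k$, and $\Pr\{z_i \text{ compared to } z_j\} = 2/(k-i+1)$ if $j \le k$.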

24 In the first double sum we have at most $m$ terms of the form $2/m$, so it is $O(n)$. Similarly for the other two double sums.

25 Selection in linear worst case time Blum, Floyd, Pratt, Rivest, and Tarjan (1973)

26 5-tuples

27 Sort the tuples

28 Recursively find the median of the medians

29 Recursively find the median of the medians

30 Recursively find the median of the medians

31 Recursively find the median of the medians

32 Partition around the median of the medians. Continue recursively with the side that contains the $k$-th element.

33 Neither side can be large: each side has at most ¾n elements.

34 The reason ≥

35 The reason ≤

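36 Analysis. One recursive call finds the median of the $n/5$ medians and the other continues on a side of size at most ¾n, so $T(n) \le T(n/5) + T(3n/4) + an$. Since $1/5 + 3/4 < 1$, this solves to $T(n) = O(n)$.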
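The steps of the worst-case linear-time algorithm (form 5-tuples, sort each, recursively find the median of the medians, partition, recurse on one side) can be sketched as follows. This is an illustrative version that builds new lists rather than partitioning in place, and it groups elements equal to the pivot separately so duplicates are handled; neither choice comes from the slides.

```python
def select(items, k):
    """Return the k-th smallest element of items (k is 1-based)."""
    if len(items) <= 5:
        return sorted(items)[k - 1]
    # Split into 5-tuples, sort each one, and take its median.
    groups = [sorted(items[i:i + 5]) for i in range(0, len(items), 5)]
    medians = [g[(len(g) - 1) // 2] for g in groups]
    # Recursively find the median of the medians and use it as the pivot.
    pivot = select(medians, (len(medians) + 1) // 2)
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    # Continue with the side that contains the k-th element.
    if k <= len(smaller):
        return select(smaller, k)
    if k <= len(smaller) + len(equal):
        return pivot
    return select(larger, k - len(smaller) - len(equal))


print(select([8, 7, 5, 3, 1, 9, 2, 6, 4], 5))  # 5
```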

37 Order statistics, a dynamic version: rank and select

38 The dictionary ADT Insert(x,D) Delete(x,D) Find(x,D): Returns a pointer to x if x ∊ D, and a pointer to the successor or predecessor of x if x is not in D

39 Suppose we want to add to the dictionary ADT Select(k,D): returns the $k$-th element in the dictionary, that is, an element $x$ such that $k-1$ elements are smaller than $x$.

40 Select(5,D)

41 Select(5,D)

42 Can we still use a red-black tree?

43 For each node v store # of leaves in the subtree of v

44 Select(7,T)

45 Select(7,T) Select(3, )

46 Select(7,T) Select(3, )

47 Select(1,) Select(7,T)

48 Select(i,T)
Select(i,T): Select(i,root(T))
Select(k,v):
    if v is a leaf then return v
    if k ≤ (v.left).size then return Select(k, v.left)
    else return Select(k - (v.left).size, v.right)
O(log n) worst case time
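The Select procedure can be tried on a small tree that keeps items at the leaves and a leaf count in every node. This is a minimal sketch: the `Node` class and the `build` helper are hypothetical, and `build` produces a plain balanced tree from sorted keys rather than a red-black tree, which is enough to exercise the size-guided descent.

```python
class Node:
    """Leaf-oriented tree node; size is the number of leaves below it."""

    def __init__(self, key=None, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.size = 1 if left is None else left.size + right.size


def build(keys):
    """Build a balanced leaf-oriented tree from a non-empty sorted list."""
    if len(keys) == 1:
        return Node(key=keys[0])
    mid = len(keys) // 2
    return Node(left=build(keys[:mid]), right=build(keys[mid:]))


def select(k, v):
    """Return the key of the k-th leaf (1-based) in the subtree of v."""
    while v.left is not None:
        if k <= v.left.size:
            v = v.left
        else:
            k -= v.left.size   # skip all leaves of the left subtree
            v = v.right
    return v.key


T = build([2, 3, 5, 7, 11, 13, 17])
print(select(5, T))  # 11
```

The descent visits one node per level, so on a balanced tree it takes O(log n) time.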

49 Rank(x,T) Return the index of x in T

50 Rank(x,T) Need to return 9

51 Sum up the sizes of the subtrees to the left of the path

52 Rank(x,T) Write the pseudocode
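One possible answer to the Rank exercise is sketched below. It assumes a leaf-oriented tree in which every internal node also stores the largest key of its subtree, used only to route the search; that routing field, the `Node` class, and the `build` helper are assumptions for illustration and do not come from the slides. The rank is one plus the total size of the subtrees hanging to the left of the search path.

```python
class Node:
    """Leaf-oriented node: key is the largest key below, size the leaf count."""

    def __init__(self, key=None, left=None, right=None):
        self.left = left
        self.right = right
        if left is None:
            self.key, self.size = key, 1
        else:
            self.key = right.key
            self.size = left.size + right.size


def build(keys):
    """Build a balanced leaf-oriented tree from a non-empty sorted list."""
    if len(keys) == 1:
        return Node(key=keys[0])
    mid = len(keys) // 2
    return Node(left=build(keys[:mid]), right=build(keys[mid:]))


def rank(x, v):
    """Return the 1-based index of x among the leaves below v."""
    skipped = 0
    while v.left is not None:
        if x <= v.left.key:
            v = v.left
        else:
            skipped += v.left.size   # whole left subtree lies before x
            v = v.right
    if v.key != x:
        raise KeyError(x)
    return skipped + 1


T = build([2, 3, 5, 7, 11, 13, 17])
print(rank(11, T))  # 5
```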

53 Insertion and deletions. Consider insertion; deletion is similar.

54 Insert

55 Insert (cont)

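56 Easy to maintain through rotations. A rotation turns x, with children y and C, where y has children A and B, into y, with children A and x, where x has children B and C. Only two sizes change: size(x) ← size(B) + size(C), then size(y) ← size(A) + size(x)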
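The two size updates can be seen in a short sketch of a single right rotation. The `Node` class and `rotate_right` name are illustrative; the subtrees A, B and C are represented by single nodes carrying an arbitrary size, which is all the update rule looks at.

```python
class Node:
    def __init__(self, name, left=None, right=None, size=1):
        self.name = name
        self.left = left
        self.right = right
        self.size = size


def rotate_right(x):
    """Rotate x(y(A, B), C) into y(A, x(B, C)) and repair the two sizes."""
    y = x.left
    x.left = y.right                      # B moves under x
    y.right = x
    x.size = x.left.size + x.right.size   # size(x) <- size(B) + size(C)
    y.size = y.left.size + x.size         # size(y) <- size(A) + size(x)
    return y                              # y is the new root of this subtree


A, B, C = Node("A", size=2), Node("B", size=3), Node("C", size=4)
y = Node("y", left=A, right=B, size=5)
x = Node("x", left=y, right=C, size=9)
root = rotate_right(x)
print(root.name, root.size, x.size)  # y 9 7
```

The order matters: x must be updated first because the new size of y depends on it.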

57 Summary Insertion and deletion and other dictionary operations still take O(log n) time