1 Sorting an almost sorted input. Suppose we know that the input is “almost” sorted. Let I be the number of “inversions” in the input: the number of pairs a_i, a_j such that i < j and a_i > a_j.

2 Example: 1, 4, 5, 8, 3 has I = 3; 8, 7, 5, 3, 1 has I = 10.
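A minimal sketch of the definition, counting inversions by brute force in O(n^2) time. The function name `count_inversions` is an illustrative choice, not something from the slides.

```python
def count_inversions(a):
    """Count pairs (i, j) with i < j and a[i] > a[j] by checking every pair."""
    n = len(a)
    return sum(1 for i in range(n) for j in range(i + 1, n) if a[i] > a[j])


print(count_inversions([1, 4, 5, 8, 3]))  # 3: the pairs (4,3), (5,3), (8,3)
print(count_inversions([8, 7, 5, 3, 1]))  # 10: every one of the 5*4/2 pairs
```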

3 Think of “insertion sort” using a list. When we insert the next item a_k, how deep does it get into the list? As deep as the number of inversions a_i, a_k for i < k; let's call this I_k.

4 Analysis. Inserting a_k takes O(1 + I_k) time, so the running time is O(n + ∑ I_k) = O(n + I).
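The insertion sort described above can be sketched as follows. Each item is inserted by walking back from the end of the sorted prefix, and the number of steps it takes is recorded as I_k. The name `insertion_sort_from_end` is hypothetical, and a Python list stands in for the linked list, so the `insert` call itself is not O(1) here.

```python
def insertion_sort_from_end(a):
    """Return (sorted copy of a, steps), where steps[k] is how many positions
    the k-th item moved back from the end of the sorted prefix (this is I_k)."""
    out = []
    steps = []
    for x in a:
        pos = len(out)
        # Walk back from the end past every element larger than x.
        while pos > 0 and out[pos - 1] > x:
            pos -= 1
        steps.append(len(out) - pos)
        out.insert(pos, x)  # O(1) in a linked list; linear in a Python list
    return out, steps


result, steps = insertion_sort_from_end([1, 4, 5, 8, 3])
print(result, steps, sum(steps))
```

Summing `steps` gives the total number of inversions I, which is why the scan cost adds up to O(n + I).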

5 Thoughts. When I = Ω(n^2) the running time is Ω(n^2). But we would like it to be O(n log n) for any input, and faster when I is small.

6 Finger red-black trees

7 Finger tree. Take a regular search tree and reverse the direction of the pointers on the rightmost spine. To search, we go up from the last leaf until we find the subtree containing the item, and then we descend into it.

8 Finger trees. Say we search for a position at distance d from the end. Then we go up to height O(log d), so a search for the d-th position from the end takes O(log d) time. Insertions and deletions still take O(log n) worst-case time, but O(log d) amortized time.

9 Back to sorting. Suppose we implement the insertion sort using a finger search tree. When we insert item k, d = O(I_k), and the insertion takes O(log(I_k)) time.

10 Analysis. The running time is O(n + ∑ log(I_j)). Since ∑ I_j = I, this is at most O(n + n log(I/n)).
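One way to see the bound is Jensen's inequality for the concave logarithm. The sketch below writes log(I_j + 1) instead of log(I_j) so that every term is defined when I_j = 0; that "+1" is an assumption added here, not something taken from the slides.

```latex
\sum_{j=1}^{n} \log(I_j + 1)
  \;\le\; n \log\!\left( \frac{1}{n} \sum_{j=1}^{n} (I_j + 1) \right)
  \;=\; n \log\!\left( \frac{I}{n} + 1 \right)
```

So the sort takes O(n) time when I = O(n), and O(n log n) time even when I = Θ(n^2).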

11 Selection: find the k-th element

12 Randomized selection

    Randomized-select(A, p, r, k)
        if p = r then return A[p]
        q ← randomized-partition(A, p, r)
        j ← q − p + 1
        if j = k then return A[q]
        else if k < j then return randomized-select(A, p, q−1, k)
        else return randomized-select(A, q+1, r, k−j)
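A runnable sketch of the pseudocode above. The slides do not spell out randomized-partition, so the version here (a random pivot swapped to the end, followed by a Lomuto-style pass) is an assumption; indices are 0-based and inclusive, while k stays 1-based.

```python
import random


def randomized_partition(A, p, r):
    """Partition A[p..r] (inclusive) around a random pivot; return its final index."""
    s = random.randint(p, r)
    A[s], A[r] = A[r], A[s]          # move the pivot to the end
    pivot = A[r]
    i = p - 1
    for t in range(p, r):
        if A[t] <= pivot:
            i += 1
            A[i], A[t] = A[t], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]  # put the pivot between the two sides
    return i + 1


def randomized_select(A, p, r, k):
    """Return the k-th smallest element (k is 1-based) of A[p..r]; reorders A."""
    if p == r:
        return A[p]
    q = randomized_partition(A, p, r)
    j = q - p + 1                    # rank of the pivot within A[p..r]
    if j == k:
        return A[q]
    elif k < j:
        return randomized_select(A, p, q - 1, k)
    else:
        return randomized_select(A, q + 1, r, k - j)


data = [9, 6, 5, 2, 1, 7, 10, 13, 3, 11]
print(randomized_select(list(data), 0, len(data) - 1, 5))
```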

13 X_k = 1 iff A[p..q] contains exactly k elements

14 Expected running time. With probability 1/n, A[p..q] contains exactly k elements, for k = 1, 2, …, n.

15 Assume n is even

16 In general

17 Solve by “substitution”: assume T(k) ≤ ck for k < n, and prove T(n) ≤ cn

18 Solve by “substitution”

19 Choose c ≥ 4a
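A sketch of how the substitution can go for even n. It assumes that T(n) denotes the expected running time, that the partition step costs at most an, and that the recursion always continues into the larger side; under those assumptions each of T(n/2), …, T(n−1) appears at most twice in the sum over the n equally likely pivot ranks.

```latex
\begin{aligned}
T(n) &\le \frac{2}{n} \sum_{k=n/2}^{n-1} T(k) + an
      \;\le\; \frac{2c}{n} \sum_{k=n/2}^{n-1} k + an \\
     &= \frac{2c}{n} \left( \frac{3n^2}{8} - \frac{n}{4} \right) + an
      \;=\; \frac{3cn}{4} - \frac{c}{2} + an \\
     &\le cn \qquad \text{whenever } c \ge 4a .
\end{aligned}
```

The last step holds because c ≥ 4a gives an ≤ cn/4.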

20 Expected # of comparisons. Let z_1, z_2, …, z_n be the elements in sorted order. Let X_ij = 1 if z_i is compared to z_j and 0 otherwise. So the total number of comparisons is the sum of X_ij over all pairs i < j.

21 by linearity of expectation

22 by linearity of expectation

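23 Consider Z_ij ≡ {z_i, z_{i+1}, …, z_j}. Claim: if i ≤ k and j > k, then Pr{z_i compared to z_j} = 2/(j−i+1), because the two are compared iff z_i or z_j is the first pivot picked from Z_ij. Otherwise z_i or z_j has to be picked first among z_k, …, z_j (if i > k) or among z_i, …, z_k (if j ≤ k), so Pr{z_i compared to z_j} = 2/(j−k+1) if i > k, and Pr{z_i compared to z_j} = 2/(k−i+1) if j ≤ k.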

24 In the first double sum, for each m there are at most m terms of the form 2/m, so the sum is O(n). Similarly for the other two double sums.
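The three cases of the claim collapse into one expression, 2/(max(j,k) − min(i,k) + 1), which makes the expected number of comparisons easy to add up exactly for small n. The sketch below does that with exact fractions; the function names are illustrative, and ranks are 1-based as on the slides.

```python
from fractions import Fraction


def compare_probability(i, j, k):
    """Pr{z_i is compared to z_j} for i < j when selecting the k-th element."""
    lo = min(i, k)   # the relevant range starts at z_i or z_k, whichever is first
    hi = max(j, k)   # and ends at z_j or z_k, whichever is last
    return Fraction(2, hi - lo + 1)


def expected_comparisons(n, k):
    """Sum the pairwise probabilities over all pairs i < j."""
    return sum(compare_probability(i, j, k)
               for i in range(1, n + 1)
               for j in range(i + 1, n + 1))


print(expected_comparisons(3, 2))            # 8/3
print(float(expected_comparisons(100, 50)))  # grows linearly in n
```

For n = 3 and k = 2 the value 8/3 can be checked by hand: with probability 1/3 the first pivot is the median (2 comparisons), otherwise 3 comparisons are made.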

25 Selection in linear worst-case time: Blum, Floyd, Pratt, Rivest, and Tarjan (1973)

26 5-tuples: split the input into groups of five (example tuple from the figure: 6, 2, 9, 5, 1)

27 Sort the tuples (the example tuple becomes 9, 6, 5, 2, 1)

28-31 Recursively find the median of the tuples' medians (the median of the medians)

32 Partition around the median of the medians. Continue recursively with the side that contains the k-th element.
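The steps above (5-tuples, sorting each tuple, recursing on the medians, partitioning, recursing on one side) can be sketched as follows. This is a minimal list-based version rather than an in-place one; the three-way split that handles duplicate keys is an addition not discussed on the slides.

```python
def select(items, k):
    """Return the k-th smallest element (k is 1-based) using median of medians."""
    if len(items) <= 5:
        return sorted(items)[k - 1]
    # Split into 5-tuples and sort each one (the last tuple may be shorter).
    tuples = [sorted(items[i:i + 5]) for i in range(0, len(items), 5)]
    medians = [t[len(t) // 2] for t in tuples]
    # Recursively find the median of the medians.
    pivot = select(medians, (len(medians) + 1) // 2)
    # Partition around the pivot.
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    # Continue with the side that contains the k-th element.
    if k <= len(smaller):
        return select(smaller, k)
    if k <= len(smaller) + len(equal):
        return pivot
    return select(larger, k - len(smaller) - len(equal))


data = [9, 6, 5, 2, 1, 7, 10, 13, 3, 11, 4, 8, 12]
print(select(data, 7))  # the median of 1..13
```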

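33 Neither side can be large: each side has at most about ¾n elements

34 The reason: about half of the tuples have a median ≥ the median of the medians, and in each of those tuples at least three of the five elements are ≥ it

35 The reason (cont): symmetrically, about half of the tuples have a median ≤ the median of the medians, and in each of those tuples at least three of the five elements are ≤ it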

36 Analysis
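A sketch of the recurrence, assuming the ¾n bound on the larger side, n/5 medians in the recursive median-of-medians call, and a cost of at most an for sorting the tuples and partitioning (the constant a is an assumption, and floors and ceilings are ignored). Guessing T(m) ≤ cm for m < n:

```latex
\begin{aligned}
T(n) &\le T\!\left(\frac{n}{5}\right) + T\!\left(\frac{3n}{4}\right) + an \\
     &\le \frac{cn}{5} + \frac{3cn}{4} + an
      \;=\; \frac{19}{20}\,cn + an \\
     &\le cn \qquad \text{whenever } c \ge 20a .
\end{aligned}
```

The argument works because the two recursive calls together cover a fraction 1/5 + 3/4 < 1 of the input.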

37 Order statistics, a dynamic version: rank and select

38 The dictionary ADT. Insert(x,D). Delete(x,D). Find(x,D): returns a pointer to x if x ∈ D, and a pointer to the successor or predecessor of x if x is not in D.

39 Suppose we want to add to the dictionary ADT Select(k,D): returns the k-th element in the dictionary, that is, an element x such that k−1 elements are smaller than x.

40-41 Select(5,D), on a dictionary D holding the keys 90, 89, 77, 73, 70, 67, 34, 26, 21, 20, 19, 4

42 Can we still use a red-black tree?

43 For each node v, store the # of leaves in the subtree of v (figure: a tree over the 12 keys whose internal nodes carry the sizes 2, 4, 8 and 12)

44-47 Select(7,T) descends from the root, making the recursive calls Select(3, ·) and then Select(1, ·) on subtrees

48 Select(i,T)

    Select(i, T): return Select(i, root(T))
    Select(k, v):
        if v is a leaf then return v
        if k ≤ (v.left).size then return Select(k, v.left)
        else return Select(k − (v.left).size, v.right)

A leaf has size 1. O(log n) worst-case time.
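A minimal sketch of Select on a tree that keeps the items in its leaves and the leaf count in every node. The `Node` class and the `build` helper are hypothetical scaffolding: `build` makes a balanced tree from a sorted list and does no red-black balancing, which is enough to exercise the size-guided descent.

```python
class Node:
    """Items live in the leaves; size is the number of leaves below the node."""

    def __init__(self, key=None, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.size = 1 if left is None else left.size + right.size


def build(sorted_keys):
    """Build a balanced tree whose leaves hold sorted_keys from left to right."""
    if len(sorted_keys) == 1:
        return Node(key=sorted_keys[0])
    mid = len(sorted_keys) // 2
    return Node(left=build(sorted_keys[:mid]), right=build(sorted_keys[mid:]))


def select(k, v):
    """Return the leaf holding the k-th smallest key (k is 1-based) below v."""
    while v.left is not None:          # stop once v is a leaf
        if k <= v.left.size:
            v = v.left
        else:
            k -= v.left.size           # skip everything in the left subtree
            v = v.right
    return v


keys = [4, 19, 20, 21, 26, 34, 67, 70, 73, 77, 89, 90]
root = build(keys)
print(select(5, root).key, select(7, root).key)
```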

49 Rank(x,T): return the index of x in T

50 Rank(x,T): for the item x marked in the figure, we need to return 9

51 Sum up the sizes of the subtrees to the left of the path from the root to x

52 Rank(x,T): write the pseudocode
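One possible answer to the exercise, as a sketch. It assumes that x is given as a pointer to a leaf and that every node has a parent pointer; neither assumption is stated on the slides. The `Node` class and the `build` helper (which also fills a key-to-leaf lookup table) are hypothetical scaffolding.

```python
class Node:
    """Leaf-oriented node with a parent pointer and a leaf count."""

    def __init__(self, key=None, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.parent = None
        self.size = 1 if left is None else left.size + right.size
        for child in (left, right):
            if child is not None:
                child.parent = self


def build(sorted_keys, leaves):
    """Build a balanced tree over sorted_keys and record each leaf in `leaves`."""
    if len(sorted_keys) == 1:
        leaf = Node(key=sorted_keys[0])
        leaves[leaf.key] = leaf
        return leaf
    mid = len(sorted_keys) // 2
    return Node(left=build(sorted_keys[:mid], leaves),
                right=build(sorted_keys[mid:], leaves))


def rank(x):
    """Return the 1-based index of leaf x: one plus the sizes of all subtrees
    hanging to the left of the path from x up to the root."""
    r = 1
    while x.parent is not None:
        if x is x.parent.right:
            r += x.parent.left.size
        x = x.parent
    return r


leaves = {}
root = build([4, 19, 20, 21, 26, 34, 67, 70, 73, 77, 89, 90], leaves)
print(rank(leaves[73]))
```

Walking up instead of down visits the same path, so the cost is still proportional to the height of the tree.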

53 Insertions and deletions. Consider insertion; deletion is similar.

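54 Insert (figure: the sizes stored along the search path are 2, 4, 8 and 12, from the bottom up)

55 Insert (cont): every size on the path grows by one, to 3, 5, 9 and 13, and the new internal node gets size 2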

56 Easy to maintain through rotations. After a rotation that leaves y with children A and x, and x with children B and C, only two sizes change: size(x) ← size(B) + size(C), then size(y) ← size(A) + size(x).
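A sketch of the size update during a rotation. The direction is an assumption: before the rotation x is the parent, with left child y (holding subtrees A and B) and right subtree C. The `Node` class is hypothetical, and re-linking the rotated subtree into x's old parent is left to the caller.

```python
class Node:
    """Node with a leaf count; a node without children counts as one leaf."""

    def __init__(self, name, left=None, right=None):
        self.name = name
        self.left = left
        self.right = right
        self.size = 1 if left is None else left.size + right.size


def rotate_right(x):
    """Rotate x's left child y above x and repair the two sizes that change."""
    y = x.left
    x.left = y.right                       # subtree B moves from y to x
    y.right = x
    x.size = x.left.size + x.right.size    # size(x) <- size(B) + size(C)
    y.size = y.left.size + x.size          # size(y) <- size(A) + size(x)
    return y                               # y is the new root of this subtree


A = Node("A")
B = Node("B", Node("b1"), Node("b2"))
C = Node("C")
x = Node("x", Node("y", A, B), C)
top = rotate_right(x)
print(top.name, top.size, top.right.name, top.right.size)
```

The order of the two assignments matters: x must be updated first because the new size of y depends on it.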

57 Summary: insertion, deletion, and the other dictionary operations still take O(log n) time.

