1 Sorting an almost sorted input. Suppose we know that the input is “almost” sorted. Let I be the number of “inversions” in the input: the number of pairs a_i, a_j such that i < j and a_i > a_j.
2 Example: 1, 4, 5, 8, 3 has I = 3; 8, 7, 5, 3, 1 has I = 10.
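The definition can be checked directly on the two examples. The following is a minimal brute-force sketch; the function name `count_inversions` is an illustrative choice, and the quadratic loop is only meant to mirror the definition, not to be efficient.

```python
from itertools import combinations

def count_inversions(a):
    """Count pairs (i, j) with i < j and a[i] > a[j], straight from the definition."""
    # combinations(range(n), 2) yields every index pair with i < j exactly once.
    return sum(1 for i, j in combinations(range(len(a)), 2) if a[i] > a[j])

print(count_inversions([1, 4, 5, 8, 3]))  # the first example sequence
print(count_inversions([8, 7, 5, 3, 1]))  # the second example sequence
```

A sorted input has no inversions, and a reversed input of n items has n(n-1)/2 of them, which is the largest possible value of I.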
3 Think of “insertion sort” using a list. When we insert the next item a_k, how deep does it get into the list? As deep as the number of inversions a_i, a_k with i < k. Let's call this number I_k.
4 Analysis. The running time is $\sum_{k=1}^{n} O(1 + I_k) = O(n + I)$.
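To see why the depths add up to the number of inversions, here is a small sketch of insertion sort that records how far each item travels from the end of the list. The name `insertion_sort_with_depths` is hypothetical, and a Python list stands in for the linked list, so `insert` itself is not constant time here; only the recorded depths matter for the illustration.

```python
def insertion_sort_with_depths(a):
    """Insertion sort that also reports how deep each item goes into the sorted list."""
    out = []      # the sorted list built so far
    depths = []   # depths[k] = number of earlier items greater than a[k]
    for x in a:
        pos = len(out)
        # Walk from the end past every item that is greater than x.
        while pos > 0 and out[pos - 1] > x:
            pos -= 1
        depths.append(len(out) - pos)
        out.insert(pos, x)
    return out, depths

result, depths = insertion_sort_with_depths([1, 4, 5, 8, 3])
print(result, depths, sum(depths))
```

Each item is charged one unit plus its depth, so the total work is n plus the sum of the depths, and that sum is exactly the number of inversions.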
5 Thoughts. When I = Ω(n^2) the running time is Ω(n^2). But we would like it to be O(n log n) for any input, and faster when I is small.
6 Finger red-black trees
7 Finger tree. Take a regular search tree and reverse the direction of the pointers on the rightmost spine. To search, we go up from the last leaf until we find the subtree containing the item, and then we descend into it.
8 Finger trees. Say we search for a position at distance d from the end. Then we go up only to height O(log d), so the search for the d-th position takes O(log d) time. Insertions and deletions still take O(log n) worst-case time, but O(log d) amortized time.
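The O(log d) search cost can be illustrated without building a finger tree. The sketch below is a stand-in, not a finger tree: it searches a sorted Python list from its right end with doubling steps followed by a binary search, which uses O(log d) comparisons for a position at distance d from the end. The function name `search_from_end` is an assumption made for this example.

```python
def search_from_end(items, x):
    """Return (insertion index for x, comparisons used), searching from the right end."""
    comparisons = 0
    hi = len(items)   # invariant: every item in items[hi:] is greater than x
    step = 1
    # Phase 1: move left in doubling steps until an item <= x is found.
    while hi > 0:
        probe = max(hi - step, 0)
        comparisons += 1
        if items[probe] <= x:
            lo = probe + 1   # every item in items[:lo] is <= x
            break
        hi = probe
        step *= 2
    else:
        return 0, comparisons   # x is smaller than every item
    # Phase 2: binary search inside the bracket items[lo:hi].
    while lo < hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if items[mid] <= x:
            lo = mid + 1
        else:
            hi = mid
    return lo, comparisons

items = list(range(0, 2000, 2))
print(search_from_end(items, 1991))  # a position close to the end
print(search_from_end(items, 11))    # a position far from the end
```

The doubling phase plays the role of climbing the reversed rightmost spine, and the binary search plays the role of descending into the subtree that contains the position.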
9 Back to sorting. Suppose we implement the insertion sort using a finger search tree. When we insert item k, d = O(I_k), so the insertion takes O(log(I_k)) time.
10 Analysis. The running time is $\sum_{j=1}^{n} O(1 + \log(1 + I_j))$. Since $\sum_{j} I_j = I$ and log is concave, this is at most $O(n + n \log(1 + I/n))$.
11 Selection: find the k-th element
12 Randomized selection
Randomized-select(A, p, r, k)
    if p = r then return A[p]
    q ← randomized-partition(A, p, r)
    j ← q - p + 1
    if j = k then return A[q]
    else if k < j then return randomized-select(A, p, q-1, k)
    else return randomized-select(A, q+1, r, k-j)
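The pseudocode translates almost line by line into runnable code. The sketch below assumes 0-based array indices and a 1-based k, and fills in `randomized_partition` with one common scheme (random pivot swapped to the end, then a single left-to-right pass); the slides do not say which partition routine is intended.

```python
import random

def randomized_partition(a, p, r):
    """Partition a[p..r] around a random pivot; return the pivot's final index."""
    i = random.randint(p, r)          # randint includes both endpoints
    a[i], a[r] = a[r], a[i]
    pivot = a[r]
    store = p
    for t in range(p, r):
        if a[t] < pivot:
            a[t], a[store] = a[store], a[t]
            store += 1
    a[store], a[r] = a[r], a[store]
    return store

def randomized_select(a, p, r, k):
    """Return the k-th smallest (k is 1-based) item of a[p..r]; reorders a in place."""
    if p == r:
        return a[p]
    q = randomized_partition(a, p, r)
    j = q - p + 1                     # number of items in a[p..q]
    if j == k:
        return a[q]
    if k < j:
        return randomized_select(a, p, q - 1, k)
    return randomized_select(a, q + 1, r, k - j)

data = [8, 7, 5, 3, 1]
print(randomized_select(list(data), 0, len(data) - 1, 2))  # second smallest item
```

Only one side of the partition is searched, which is what separates this from quicksort and brings the expected time down to linear.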
13 X_k = 1 iff A[p..q] contains exactly k elements
14 Expected running time. With probability 1/n, A[p..q] contains exactly k elements, for k = 1, 2, …, n.
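15 Assume n is even: $T(n) \le \frac{2}{n} \sum_{k=n/2}^{n-1} T(k) + an$
16 In general: $T(n) \le \frac{2}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} T(k) + an$
17 Solve by “substitution”: assume T(k) ≤ ck for k < n, and prove T(n) ≤ cn
18 Solve by “substitution” (cont.): $T(n) \le \frac{2c}{n} \sum_{k=\lfloor n/2 \rfloor}^{n-1} k + an \le \frac{2c}{n} \cdot \frac{3n^2}{8} + an = \frac{3}{4}cn + an$
19 Choose c ≥ 4a, so that $\frac{3}{4}cn + an \le cn$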
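20 Expected # of comparisons. Let z_1, z_2, …, z_n be the elements in sorted order. Let X_ij = 1 if z_i is compared to z_j and 0 otherwise. So the total number of comparisons is $X = \sum_{i<j} X_{ij}$,
21 and by linearity of expectation $E[X] = \sum_{i<j} E[X_{ij}] = \sum_{i<j} \Pr\{z_i \text{ compared to } z_j\}$.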
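23 Consider Z_ij ≡ {z_i, z_{i+1}, …, z_j}. Claim: if i ≤ k and j > k then Pr{z_i compared to z_j} = 2/(j-i+1), since z_i and z_j are compared iff one of them is the first pivot picked from Z_ij. Otherwise z_i or z_j has to be picked first among z_k, …, z_j (if i > k) or among z_i, …, z_k (if j ≤ k), so Pr{z_i compared to z_j} = 2/(j-k+1) if i > k, and Pr{z_i compared to z_j} = 2/(k-i+1) if j ≤ k.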
24 In the first double sum, for each m there are at most m terms of the form 2/m, so it is O(n). Similarly for the other two double sums.
25 Selection in linear worst case time Blum, Floyd, Pratt, Rivest, and Tarjan (1973)
26 5-tuples
27 Sort the tuples
28 Recursively find the median of the medians
32 Partition around the median of the medians. Continue recursively with the side that contains the k-th element.
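The steps on the preceding slides (5-tuples, sort the tuples, recursive median of the medians, partition, recurse on one side) can be put together as follows. This is a minimal sketch that builds new lists rather than partitioning in place and that handles repeated items with a three-way split; both are simplifications chosen for the example, and the name `select` is illustrative.

```python
def select(items, k):
    """Return the k-th smallest (k is 1-based) item using median-of-medians pivots."""
    items = list(items)
    if len(items) <= 5:
        return sorted(items)[k - 1]
    # Split into 5-tuples (the last one may be shorter) and sort each tuple.
    groups = [sorted(items[i:i + 5]) for i in range(0, len(items), 5)]
    medians = [g[(len(g) - 1) // 2] for g in groups]
    # Recursively find the median of the medians and use it as the pivot.
    pivot = select(medians, (len(medians) + 1) // 2)
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    # Continue with the side that contains the k-th item.
    if k <= len(smaller):
        return select(smaller, k)
    if k <= len(smaller) + len(equal):
        return pivot
    return select(larger, k - len(smaller) - len(equal))

data = [(i * 37) % 101 for i in range(101)]  # a fixed permutation of 0..100
print(select(data, 51))                      # the median of 0..100
```

Both recursive calls are on strictly shorter lists, so the recursion always terminates; the pivot choice is what keeps the second call from being on almost the whole input.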
33 Neither side can be large: each side has size ≤ ¾n
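34 The reason: at least half of the 5-tuples have a median ≥ the median of the medians, and each of them contains at least 3 items ≥ it, so about 3n/10 items cannot be on the smaller side
35 The reason, symmetrically: at least half of the 5-tuples have a median ≤ the median of the medians, so about 3n/10 items cannot be on the larger side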
36 Analysis
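One way the analysis can go, as a sketch under stated assumptions: the ¾n bound on each side holds, rounding is ignored, and a is a hypothetical constant such that forming the 5-tuples, sorting them, and partitioning cost at most an in total. The recursive call on the medians contributes T(n/5) and the call on one side contributes at most T(3n/4):

```latex
\begin{align*}
T(n) &\le T\!\left(\tfrac{n}{5}\right) + T\!\left(\tfrac{3n}{4}\right) + an \\
\intertext{Assume $T(m) \le cm$ for all $m < n$. Then}
T(n) &\le \tfrac{1}{5}cn + \tfrac{3}{4}cn + an
      = \tfrac{19}{20}cn + an
      \le cn \quad \text{whenever } c \ge 20a .
\end{align*}
```

The argument works because the two subproblem sizes add up to less than n (1/5 + 3/4 < 1), so T(n) = O(n) in the worst case.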
37 Order statistics, a dynamic version: rank and select
38 The dictionary ADT
Insert(x,D)
Delete(x,D)
Find(x,D): returns a pointer to x if x ∊ D, and a pointer to the successor or predecessor of x if x is not in D
39 Suppose we want to add to the dictionary ADT Select(k,D): returns the k-th element in the dictionary, that is, an element x such that k-1 elements are smaller than x
40 Select(5,D)
Can we still use a red-black tree?
43 For each node v store # of leaves in the subtree of v
44 Example: Select(7,T) → Select(3, subtree) → Select(1, subtree)
48 Select(i,T)
Select(i,T): return Select(i, root(T))
Select(k,v):
    if v is a leaf then return v
    if k ≤ (v.left).size then return Select(k, v.left)
    else return Select(k - (v.left).size, v.right)
O(log n) worst case time
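A runnable sketch of the size-guided descent, assuming a leaf-oriented tree in which items live at the leaves and every node stores its number of leaves. For brevity the tree is a static balanced tree built from a sorted list rather than a red-black tree; the names `Node`, `build` and `select` are illustrative.

```python
class Node:
    def __init__(self, key=None, left=None, right=None):
        self.key = key        # only leaves carry an item
        self.left = left
        self.right = right
        # size = number of leaves in the subtree rooted here
        self.size = 1 if left is None else left.size + right.size

def build(sorted_keys):
    """Build a balanced leaf-oriented tree over a non-empty sorted list."""
    if len(sorted_keys) == 1:
        return Node(key=sorted_keys[0])
    mid = len(sorted_keys) // 2
    return Node(left=build(sorted_keys[:mid]), right=build(sorted_keys[mid:]))

def select(k, v):
    """Return the item at the k-th leaf (k is 1-based) of the subtree rooted at v."""
    while v.left is not None:
        if k <= v.left.size:
            v = v.left            # the k-th leaf is in the left subtree
        else:
            k -= v.left.size      # skip all leaves of the left subtree
            v = v.right
    return v.key

tree = build([2, 3, 5, 7, 11, 13, 17, 19, 23])
print(select(7, tree))  # the seventh smallest item
```

The loop follows a single root-to-leaf path, so its cost is proportional to the height of the tree.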
49 Rank(x,T): return the index of x in T
50 Rank(x,T) example: for the item x in the figure we need to return 9
51 Sum up the sizes of the subtrees to the left of the path to x
52 Rank(x,T): write the pseudocode
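One possible answer, as a sketch rather than the intended solution: walk down from the root towards x and, every time the path turns right, add the size of the left subtree that was skipped. The sketch assumes a leaf-oriented tree where each internal node stores the largest key of its left subtree for routing; that routing key, the static `build` helper and the names used are assumptions of this example.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key        # leaf: the item; internal: largest key in the left subtree
        self.left = left
        self.right = right
        self.size = 1 if left is None else left.size + right.size

def build(sorted_keys):
    """Build a balanced leaf-oriented tree over a non-empty sorted list."""
    if len(sorted_keys) == 1:
        return Node(sorted_keys[0])
    mid = len(sorted_keys) // 2
    return Node(sorted_keys[mid - 1], build(sorted_keys[:mid]), build(sorted_keys[mid:]))

def rank(x, root):
    """Return the 1-based index of x among the leaves, or None if x is absent."""
    v, skipped = root, 0
    while v.left is not None:
        if x <= v.key:
            v = v.left
        else:
            skipped += v.left.size   # the whole left subtree lies to the left of the path
            v = v.right
    return skipped + 1 if v.key == x else None

tree = build([2, 3, 5, 7, 11, 13, 17, 19, 23])
print(rank(23, tree))  # index of the largest item
```

Like Select, this follows one root-to-leaf path, so it runs in time proportional to the height of the tree.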
53 Insertions and deletions. Consider insertion; deletion is similar.
54 Insert
55 Insert (cont)
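56 Easy to maintain through rotations. After a rotation that makes y the parent of x, where A is the other child of y and B, C are the children of x, recompute size(x) ← size(B) + size(C) and then size(y) ← size(A) + size(x)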
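The two size updates can be sketched on a bare node structure. The example below assumes the rotation is a right rotation at x (so y starts as the left child of x, with subtrees A and B below y and C as the right child of x); the class and function names are illustrative, and colors and keys are left out because only sizes matter here.

```python
class Node:
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right
        # size = number of leaves in the subtree rooted here
        self.size = 1 if left is None else left.size + right.size

def rotate_right(x):
    """Rotate right at x and return the new subtree root y, with sizes repaired."""
    y = x.left
    a, b, c = y.left, y.right, x.right
    x.left, x.right = b, c
    y.left, y.right = a, x
    # Only x and y changed their sets of descendants; fix x first, then y.
    x.size = b.size + c.size
    y.size = a.size + x.size
    return y

A = Node(Node(), Node())                 # subtree with 2 leaves
B = Node()                               # a single leaf
C = Node(Node(Node(), Node()), Node())   # subtree with 3 leaves
x = Node(Node(A, B), C)
y = rotate_right(x)
print(y.size, x.size)
```

The order of the two assignments matters: the new size of y depends on the already updated size of x. A rotation therefore costs O(1) extra work for the size fields.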
57 Summary. Insertion, deletion, and the other dictionary operations still take O(log n) time.