Published by Ådne Martinsen. Modified over 5 years ago.
1
Sorting
We have already seen two efficient ways to sort:
2
A kind of “insertion” sort
Insert the elements into a red-black tree one by one
Traverse the tree in-order and collect the keys
Takes O(n log n) time
3
Heapsort (Williams, Floyd, 1964)
Put the elements in an array
Make the array into a heap
Repeatedly do a deletemin and put the deleted element at the array position that the shrinking heap has just freed (the last position of the array on the first round)
4
Put the elements in the heap
Q = 79 65 26 24 19 15 29 23 33 40 7
5
Make the elements into a heap
Q = 79 65 26 24 19 15 29 23 33 40 7
6
Make the elements into a heap
Heapify-down(Q,4): Q = 79 65 26 24 19 15 29 23 33 40 7
7
Heapify-down(Q,4): Q = 79 65 26 24 7 15 29 23 33 40 19
8
Heapify-down(Q,3): Q = 79 65 26 24 7 15 29 23 33 40 19
9
Heapify-down(Q,3): Q = 79 65 26 23 7 15 29 24 33 40 19
10
Heapify-down(Q,2): Q = 79 65 26 23 7 15 29 24 33 40 19
11
Heapify-down(Q,2): Q = 79 65 15 23 7 26 29 24 33 40 19
12
Heapify-down(Q,1): Q = 79 65 15 23 7 26 29 24 33 40 19
13
Heapify-down(Q,1): Q = 79 7 15 23 65 26 29 24 33 40 19
14
Heapify-down(Q,1): Q = 79 7 15 23 19 26 29 24 33 40 65
15
Heapify-down(Q,0): Q = 79 7 15 23 19 26 29 24 33 40 65
16
Heapify-down(Q,0): Q = 7 79 15 23 19 26 29 24 33 40 65
17
Heapify-down(Q,0): Q = 7 19 15 23 79 26 29 24 33 40 65
18
Heapify-down(Q,0): Q = 7 19 15 23 40 26 29 24 33 79 65
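The Heapify-down calls traced on these slides can be reproduced with a short Python sketch. It assumes a min-heap stored in a 0-indexed array, where the children of position i sit at 2i+1 and 2i+2. The names `heapify_down` and `make_heap` are illustrative choices, not taken from the slides.

```python
def heapify_down(q, i, size=None):
    """Sift q[i] down until neither child is smaller (min-heap in q[0:size])."""
    n = len(q) if size is None else size
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        smallest = i
        if left < n and q[left] < q[smallest]:
            smallest = left
        if right < n and q[right] < q[smallest]:
            smallest = right
        if smallest == i:
            return
        q[i], q[smallest] = q[smallest], q[i]
        i = smallest


def make_heap(q):
    """Call heapify_down on every internal node, from the last one up to the root."""
    for i in range(len(q) // 2 - 1, -1, -1):
        heapify_down(q, i)


q = [79, 65, 26, 24, 19, 15, 29, 23, 33, 40, 7]
make_heap(q)
print(q)  # [7, 19, 15, 23, 40, 26, 29, 24, 33, 79, 65]
```

For 11 elements the loop starts at index 4, which matches the first call Heapify-down(Q,4) in the trace.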
19
Summary
We can build the heap in linear time (we already did this analysis)
We still have to deletemin the elements one by one in order to sort; that will take O(n log n)
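As a rough illustration of the whole procedure, here is a minimal Python sketch of heapsort with a min-heap. Each deletemin is done by swapping the root with the last heap position, so the array ends up in descending order. The function name `heapsort_desc` and the nested helper are assumptions made for this example.

```python
def heapsort_desc(a):
    """Sort the list a in place into descending order using a min-heap."""
    def sift_down(i, size):
        # Restore the min-heap property below position i within a[0:size].
        while True:
            left, right = 2 * i + 1, 2 * i + 2
            smallest = i
            if left < size and a[left] < a[smallest]:
                smallest = left
            if right < size and a[right] < a[smallest]:
                smallest = right
            if smallest == i:
                return
            a[i], a[smallest] = a[smallest], a[i]
            i = smallest

    n = len(a)
    # Phase 1: build the heap bottom-up (linear time).
    for i in range(n // 2 - 1, -1, -1):
        sift_down(i, n)
    # Phase 2: n - 1 deletemin operations, O(log n) each.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]  # the minimum moves to the freed last slot
        sift_down(0, end)
    return a


print(heapsort_desc([79, 65, 26, 24, 19, 15, 29, 23, 33, 40, 7]))
```

Using a max-heap instead would give ascending order with the same code structure.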
20
Quicksort (Hoare 1961)
21
quicksort
Input: an array A[p, r]
Quicksort (A, p, r)
  if (p < r) then
    q = Partition (A, p, r)   // q is the position of the pivot element
    Quicksort (A, p, q-1)
    Quicksort (A, q+1, r)
22
[Figure: array states during Partition, with pointers p, r, i, j]
2 8 7 1 3 5 6 4
2 8 7 1 3 5 6 4
2 8 7 1 3 5 6 4
2 8 7 1 3 5 6 4
2 1 7 8 3 5 6 4
23
[Figure: array states during Partition, continued, with pointers i, j]
2 1 7 8 3 5 6 4
2 1 3 8 7 5 6 4
2 1 3 8 7 5 6 4
2 1 3 8 7 5 6 4
2 1 3 4 7 5 6 8
24
2 8 7 1 3 5 6 4   (p at the first element, r at the last)
Partition(A, p, r)
  x ← A[r]
  i ← p-1
  for j ← p to r-1 do
    if A[j] ≤ x then
      i ← i+1
      exchange A[i] ↔ A[j]
  exchange A[i+1] ↔ A[r]
  return i+1
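The Quicksort and Partition pseudocode can be transcribed almost line by line into Python. This is a sketch with 0-indexed arrays. The default arguments on `quicksort` are a convenience added here and are not part of the slides.

```python
def partition(a, p, r):
    """Partition a[p..r] around the pivot x = a[r]; return the pivot's final index."""
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def quicksort(a, p=0, r=None):
    """Sort a[p..r] in place."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)  # q is the position of the pivot element
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)


a = [2, 8, 7, 1, 3, 5, 6, 4]
q = partition(a, 0, len(a) - 1)
print(q, a)  # 3 [2, 1, 3, 4, 7, 5, 6, 8]
quicksort(a)
print(a)     # [1, 2, 3, 4, 5, 6, 7, 8]
```

The first printed array matches the final state of the partition trace on the example input.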
25
Analysis
Running time is proportional to the number of comparisons
Each pair is compared at most once, so the running time is O(n²)
In fact, for each n there is an input of size n on which quicksort takes cn² time, so the worst case is Ω(n²)
26
But Assume that the split is even in each iteration
27
T(n) = 2T(n/2) + bn How do we solve linear recurrences like this ? (read Chapter 4)
28
Recurrence tree bn T(n/2) T(n/2)
29
Recurrence tree bn bn/2 bn/2 T(n/4) T(n/4) T(n/4) T(n/4)
30
Recurrence tree bn bn/2 bn/2 logn T(n/4) T(n/4) T(n/4) T(n/4)
In every level we do bn comparisons
So the total number of comparisons is O(n log n)
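The recurrence tree argument can also be written as a direct unrolling of the recurrence. The sketch below assumes that n is a power of 2 and that T(1) is a constant. Neither assumption is stated on the slides.

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + bn \\
     &= 4\,T(n/4) + 2bn \\
     &= \dots = 2^{k}\,T(n/2^{k}) + k\,bn \\
     &= n\,T(1) + bn\log_2 n \qquad (k = \log_2 n) \\
     &= O(n\log n).
\end{align*}
```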
31
Observations We can’t guarantee good splits
But intuitively on random inputs we will get good splits
32
Randomized quicksort
Use randomized-partition rather than partition
Randomized-partition (A, p, r)
  i ← random(p,r)
  exchange A[r] ↔ A[i]
  return partition(A,p,r)
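A possible Python rendering of randomized quicksort is sketched below. It also counts key comparisons, since that is the quantity analysed on the following slides. The function name, the `rng` parameter, and the comparison counter are assumptions of this example.

```python
import random


def randomized_quicksort(a, rng=random):
    """Sort a in place with random pivots; return the number of key comparisons."""
    comparisons = 0

    def partition(p, r):
        nonlocal comparisons
        x = a[r]
        i = p - 1
        for j in range(p, r):
            comparisons += 1  # one comparison of a[j] against the pivot
            if a[j] <= x:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i + 1], a[r] = a[r], a[i + 1]
        return i + 1

    def sort(p, r):
        if p < r:
            i = rng.randint(p, r)    # random(p, r), both ends inclusive
            a[r], a[i] = a[i], a[r]  # move the random pivot to position r
            q = partition(p, r)
            sort(p, q - 1)
            sort(q + 1, r)

    sort(0, len(a) - 1)
    return comparisons


data = list(range(50, 0, -1))
print(randomized_quicksort(data), data == sorted(data))
```

Running it several times on the same input typically prints different comparison counts, while the array is sorted every time.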
33
On the same input we may get a different running time in each run!
Look at the average, for one particular input, of all these running times
34
Expected # of comparisons
Let X be the # of comparisons
This is a random variable
We want to know E(X)
35
Expected # of comparisons
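Let z1, z2, ..., zn be the elements in sorted order
Let Xij = 1 if zi is compared to zj and 0 otherwise
So, $X = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}$
36
by linearity of expectation
$E(X) = E\left(\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}\right) = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E(X_{ij})$
37
by linearity of expectation
$E(X) = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E(X_{ij}) = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \Pr\{z_i \text{ is compared to } z_j\}$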
38
Consider Zij ≡ {zi, zi+1, ..., zj}
Claim: zi and zj are compared if and only if either zi or zj is the first pivot chosen from Zij
Proof: 3 cases for the first pivot chosen from Zij:
  The pivot is zi: compared on this partition, and never again.
  The pivot is zj: the same.
  The pivot is zk with i < k < j: not compared on this partition. Partition separates them, so no future partition uses both.
39
Pr{zi is compared to zj}
= Pr{zi or zj is first pivot chosen from Zij}   (just explained)
= Pr{zi is first pivot chosen from Zij} + Pr{zj is first pivot chosen from Zij}   (mutually exclusive possibilities)
= 1/(j-i+1) + 1/(j-i+1)   (each of the j-i+1 elements of Zij is equally likely to be chosen first)
= 2/(j-i+1)
40
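$E(X) = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j-i+1}$
Simplify with a change of variable, k=j-i+1.
$E(X) = \sum_{i=1}^{n-1} \sum_{k=2}^{n-i+1} \frac{2}{k}$
Simplify and overestimate, by adding terms.
$E(X) < \sum_{i=1}^{n-1} \sum_{k=1}^{n} \frac{2}{k} = \sum_{i=1}^{n-1} O(\log n) = O(n \log n)$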
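As a quick numeric sanity check of this bound, the sum of 2/(j-i+1) over all pairs i < j can be evaluated exactly and compared with the overestimate obtained by adding terms. This is a small sketch. The helper names are illustrative, and the overestimate is taken to be 2(n-1)H_n, where H_n is the nth harmonic number.

```python
from fractions import Fraction


def expected_comparisons(n):
    """Exact value of the sum over 1 <= i < j <= n of 2 / (j - i + 1)."""
    return sum(Fraction(2, j - i + 1)
               for i in range(1, n)
               for j in range(i + 1, n + 1))


def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n, which grows like ln n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


for n in (2, 3, 10, 100):
    exact = expected_comparisons(n)
    bound = 2 * (n - 1) * harmonic(n)  # the overestimate with added terms
    print(n, float(exact), float(bound), exact <= bound)
```

For n = 3 the exact value is 1 + 1 + 2/3 = 8/3: the two adjacent pairs are always compared, and the outer pair is compared with probability 2/3.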
41
Lower bound for sorting in the comparison model
42
A lower bound
Comparison model: we assume that the only operations from which we deduce order among keys are comparisons
Then we prove that we need Ω(n log n) comparisons in the worst case
43
Model the algorithm as a decision tree
[Figure: decision tree whose nodes are comparisons among elements 1, 2 and 3]
44
Important Observations
Every comparison-based algorithm can be represented as a (binary) tree like this
Each path corresponds to a run on some input
The worst case # of comparisons corresponds to the longest path
45
The lower bound
Let d be the length of the longest path
n! ≤ #leaves ≤ 2^d
log2(n!) ≤ d
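The step from this inequality to an Ω(n log n) bound is not spelled out on the slide. One standard way to fill it in, sketched here for even n, keeps only the largest n/2 factors of n!, each of which is at least n/2:

```latex
\begin{align*}
d \;\ge\; \log_2(n!) \;\ge\; \log_2\!\left( \left(\tfrac{n}{2}\right)^{n/2} \right)
  \;=\; \tfrac{n}{2}\,\log_2\tfrac{n}{2} \;=\; \Omega(n \log n).
\end{align*}
```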
46
Lower Bound for Sorting
Any sorting algorithm based on comparisons between elements requires Ω(N log N) comparisons in the worst case.
47
Beating the lower bound
We can beat the lower bound if we can deduce order relations between keys not by comparisons
Examples:
Count sort
Radix sort
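To make the first example concrete, here is a minimal counting sort sketch in Python. It assumes the keys are integers in the range 0..k-1 and uses them as array indices, so it never compares two keys. The function name and signature are choices made for this illustration.

```python
def counting_sort(keys, k):
    """Sort integers drawn from range(k) in O(n + k) time without comparing keys."""
    counts = [0] * k
    for x in keys:
        counts[x] += 1  # the key itself is used as an index
    out = []
    for value, count in enumerate(counts):
        out.extend([value] * count)
    return out


print(counting_sort([3, 0, 2, 3, 1, 0], k=4))  # [0, 0, 1, 2, 3, 3]
```

This version sorts bare keys only. Sorting records by key, as radix sort needs, would use the prefix sums of `counts` to place each record stably.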
48
Linear time sorting
Or assume something about the input: random, “almost sorted”
49
Sorting an almost sorted input
Suppose we know that the input is “almost” sorted
Let I be the number of “inversions” in the input: the number of pairs a_i, a_j such that i < j and a_i > a_j
50
Example
1, 4, 5, 8, 3: I = 3
8, 7, 5, 3, 1: I = 10
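The two example values can be checked with a direct, quadratic-time count of inversions. This sketch follows the definition literally, and the function name is illustrative.

```python
def count_inversions(a):
    """Number of pairs (i, j) with i < j and a[i] > a[j]."""
    n = len(a)
    return sum(1 for i in range(n) for j in range(i + 1, n) if a[i] > a[j])


print(count_inversions([1, 4, 5, 8, 3]))  # 3
print(count_inversions([8, 7, 5, 3, 1]))  # 10
```

In the first example the only inversions are the pairs (4, 3), (5, 3) and (8, 3). The second input is in reverse order, so all 5·4/2 = 10 pairs are inversions.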
51
Think of “insertion sort” using a list
When we insert the next item a_k, how deep does it get into the list?
As deep as the number of inversions (a_i, a_k) with i < k; let's call this I_k
52
Analysis
The running time is: $\sum_{k=1}^{n} (I_k + 1) = I + n$, that is, O(n + I)
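The link between the insertion depth and the inversions can be observed with a small Python sketch of insertion sort that scans from the end of the sorted prefix and counts how far each item moves. A Python list stands in for the linked list here, so `out.insert` is not constant time as it would be in a real list. The step counter is the quantity of interest, and the function name is an assumption.

```python
def insertion_sort_from_end(items):
    """Insert each item by scanning back from the end; return (sorted list, steps)."""
    out = []
    steps = 0
    for x in items:
        pos = len(out)
        # Walk past every earlier item that is larger than x: one step per inversion.
        while pos > 0 and out[pos - 1] > x:
            pos -= 1
            steps += 1
        out.insert(pos, x)
    return out, steps


print(insertion_sort_from_end([1, 4, 5, 8, 3]))  # ([1, 3, 4, 5, 8], 3)
print(insertion_sort_from_end([8, 7, 5, 3, 1]))  # ([1, 3, 5, 7, 8], 10)
```

The step counts 3 and 10 equal the inversion counts I of the two example inputs.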
53
Thoughts
When I = Ω(n²) the running time is Ω(n²)
But we would like it to be O(n log n) for any input, and faster when I is small
54
Finger red black trees
55
Finger tree
Take a regular search tree and reverse the direction of the pointers on the rightmost spine
To search, we go up from the last leaf until we find the subtree containing the item, and then we descend into it
56
Finger trees
Say we search for a position at distance d from the end
Then we go up to height O(log(d))
So the search for the dth position takes O(log(d)) time
Insertions and deletions still take O(log n) worst case time but O(log(d)) amortized time
57
Back to sorting
Suppose we implement the insertion sort using a finger search tree
When we insert item k then d = O(I_k) and it takes O(log(I_k)) time
58
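Analysis
The running time is: $\sum_{j=1}^{n} O(1 + \log(I_j + 1))$
Since ∑I_j = I this is at most $O\left(n + n \log\left(\frac{I}{n} + 1\right)\right)$, by the concavity of the logarithm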
59
Selection Find the kth element
60
Randomized selection
Randomized-select (A, p, r, k)
  if p = r then return A[p]
  q ← randomized-partition(A, p, r)
  j ← q-p+1
  if j = k then return A[q]
  else if k < j then return randomized-select(A, p, q-1, k)
  else return randomized-select(A, q+1, r, k-j)
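A direct Python transcription of Randomized-select might look as follows. This is a sketch: array indices are 0-based while k is 1-based as on the slide, and the `rng` parameter is an addition made for testing.

```python
import random


def partition(a, p, r):
    """Partition a[p..r] around x = a[r]; return the pivot's final index."""
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def randomized_select(a, p, r, k, rng=random):
    """Return the kth smallest element (k = 1 is the minimum) of a[p..r]."""
    if p == r:
        return a[p]
    i = rng.randint(p, r)    # randomized-partition: pick a random pivot
    a[r], a[i] = a[i], a[r]
    q = partition(a, p, r)
    j = q - p + 1            # the pivot is the jth smallest of a[p..r]
    if j == k:
        return a[q]
    if k < j:
        return randomized_select(a, p, q - 1, k, rng)
    return randomized_select(a, q + 1, r, k - j, rng)


data = [79, 65, 26, 24, 19, 15, 29, 23, 33, 40, 7]
print(randomized_select(list(data), 0, len(data) - 1, 6))  # 26, the median
```

Unlike quicksort, only one of the two sides is searched after each partition, which is what brings the expected time down to linear.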
61
Expected running time
With probability 1/n, A[p..q] contains exactly k elements, for k=1,2,…,n
62
Assume n is even
63
In general
64
Solve by “substitution”
Assume T(k) ≤ ck for k < n, and prove T(n) ≤ cn
65
Solve by “substitution”
66
Choose c ≥4a
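The formulas of these slides did not survive in the transcript. The following is a reconstruction of the standard argument, offered as a sketch. It assumes that partitioning costs at most an, that the recursion always continues on the larger side, and, for the arithmetic shown, that n is even.

```latex
\begin{align*}
T(n) &\le \frac{1}{n}\sum_{k=1}^{n} T\bigl(\max(k-1,\,n-k)\bigr) + an
      \;\le\; \frac{2}{n}\sum_{k=n/2}^{n-1} T(k) + an \\
\intertext{Assume $T(k)\le ck$ for $k<n$. Then}
T(n) &\le \frac{2c}{n}\sum_{k=n/2}^{n-1} k + an
      = \frac{2c}{n}\left(\frac{3n^{2}}{8}-\frac{n}{4}\right) + an
      \le \frac{3}{4}\,cn + an \;\le\; cn
      \quad\text{whenever } c \ge 4a.
\end{align*}
```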
67
Selection in linear worst case time
Blum, Floyd, Pratt, Rivest, and Tarjan (1973)
68
5-tuples 6 2 9 5 1
69
Sort the tuples 9 6 5 2 1
70
Recursively find the median of the medians
9 6 5 2 1
71
Recursively find the median of the medians
9 6 5 7 10 1 3 2 11 2 1
72
Recursively find the median of the medians
9 6 5 7 10 1 3 2 11 2 1
73
Partition around the median of the medians
Continue recursively with the side that contains the kth element
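The steps on these slides (split into 5-tuples, sort each tuple, recursively find the median of the medians, partition around it, recurse on one side) can be sketched in Python as below. For brevity this version builds new lists and uses a three-way split to handle equal keys, rather than partitioning in place. Those choices and the name `mom_select` are assumptions of this example.

```python
def mom_select(items, k):
    """Return the kth smallest element (k = 1 is the minimum) of items."""
    if len(items) <= 5:
        return sorted(items)[k - 1]
    # Split into 5-tuples (the last one may be shorter) and sort each tuple.
    groups = [sorted(items[i:i + 5]) for i in range(0, len(items), 5)]
    medians = [g[len(g) // 2] for g in groups]
    # Recursively find the median of the medians.
    pivot = mom_select(medians, (len(medians) + 1) // 2)
    # Partition around the median of the medians.
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    # Continue recursively with the side that contains the kth element.
    if k <= len(smaller):
        return mom_select(smaller, k)
    if k <= len(smaller) + len(equal):
        return pivot
    return mom_select(larger, k - len(smaller) - len(equal))


data = [(i * 37) % 101 for i in range(200)]
print(mom_select(data, 100) == sorted(data)[99])  # True
```

Every recursive call is on a strictly shorter list, since the pivot itself is never passed down, so the recursion terminates even with many duplicate keys.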
74
Neither side can be large
Each side of the median of the medians contains at most ¾n elements
75
The reason ≥ 9 6 1 3 2 5 7 10 11 2 1
76
The reason 9 6 1 3 2 5 7 10 11 2 1 ≤
77
Analysis
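The recurrence on this slide is missing from the transcript. A plausible reconstruction, using the ¾n bound stated on the earlier slide and assuming a linear cost an for forming the tuples and partitioning, is sketched below.

```latex
\begin{align*}
T(n) &\le T\!\left(\tfrac{n}{5}\right) + T\!\left(\tfrac{3n}{4}\right) + an \\
\intertext{Guess $T(n)\le cn$ and substitute:}
T(n) &\le \tfrac{1}{5}\,cn + \tfrac{3}{4}\,cn + an
      = \tfrac{19}{20}\,cn + an \;\le\; cn
      \quad\text{whenever } c \ge 20a.
\end{align*}
```

The argument works because the two recursive calls together cover a fraction 1/5 + 3/4 < 1 of the input.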
78
Order statistics, a dynamic version
rank and select
79
The dictionary ADT
Insert(x,D)
Delete(x,D)
Find(x,D): Returns a pointer to x if x ∊ D, and a pointer to the successor or predecessor of x if x is not in D
80
Suppose we want to add to the dictionary ADT
Select(k,D): Returns the kth element in the dictionary: an element x such that exactly k-1 elements are smaller than x
81
Select(5,D) 89 90 19 20 21 4 26 34 67 70 73 77
82
Select(5,D) 89 90 19 20 21 4 26 34 67 70 73 77
83
Can we still use a red-black tree ?
4 19 20 21 26 34 67 70 73 77 89 90
84
For each node v store # of leaves in the subtree of v
Sizes by level: 12 | 4 8 | 2 2 4 4 | 2 2 2 2
Leaves: 4 19 20 21 26 34 67 70 73 77 89 90
85
Select(7,T)
[Figure: the same tree; the search starts at the root, of size 12]
86
Select(7,T)
Select(3, ·) at the right child of the root, of size 8
87
Select(7,T)
Select(3, ·) at the left child of that node, of size 4
88
Select(7,T)
Select(1, ·) at the right child of that node, of size 2
89
Select(i,T)
Select(i,T): Select(i,root(T))
Select(k,v):
  if v is a leaf then return v
  if k ≤ (v.left).size then return Select(k, v.left)
  else return Select(k – (v.left).size, v.right)
O(log n) worst case time
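A small Python sketch of Select on a tree with the keys at the leaves and a size field in every node is given below. The `build` helper produces a plain balanced tree instead of a red-black tree, so its shape differs from the figure. The class and function names are assumptions of this example.

```python
class Node:
    """Leaf: holds a key, size 1. Internal node: two children, size = # of leaves below."""

    def __init__(self, key=None, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.size = 1 if left is None else left.size + right.size


def build(keys):
    """Build a balanced tree whose leaves hold the sorted keys, left to right."""
    if len(keys) == 1:
        return Node(key=keys[0])
    mid = len(keys) // 2
    return Node(left=build(keys[:mid]), right=build(keys[mid:]))


def select(k, v):
    """Return the kth smallest key (k = 1 is the minimum) in the subtree of v."""
    while v.left is not None:
        if k <= v.left.size:
            v = v.left
        else:
            k -= v.left.size  # skip all the leaves of the left subtree
            v = v.right
    return v.key


tree = build([4, 19, 20, 21, 26, 34, 67, 70, 73, 77, 89, 90])
print(select(7, tree))  # 67
```

The loop follows a single root-to-leaf path, so the running time is proportional to the height of the tree.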
90
Rank(x,T) Return the index of x in T
91
Rank(x,T)
For the leaf marked x in the figure we need to return 9
92
Sum up the sizes of the subtrees to the left of the path
Sizes by level: 12 | 4 8 | 2 2 4 4 | 2 2 2 2
Leaves: 4 19 20 21 26 34 67 70 73 77 89 90
93
Rank(x,T)
Write the pseudocode
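One possible answer, sketched in Python instead of pseudocode: walk from the root to the leaf holding x and add the size of every left subtree that the path skips. To steer the search, this sketch assumes each internal node also stores `split`, the largest key in its left subtree. That field and the `build` helper are assumptions of the example.

```python
class Node:
    """Leaf: holds a key. Internal node: children, size, and a routing key `split`."""

    def __init__(self, key=None, left=None, right=None, split=None):
        self.key, self.left, self.right, self.split = key, left, right, split
        self.size = 1 if left is None else left.size + right.size


def build(keys):
    """Balanced tree over sorted keys; split = largest key in the left subtree."""
    if len(keys) == 1:
        return Node(key=keys[0])
    mid = len(keys) // 2
    return Node(left=build(keys[:mid]), right=build(keys[mid:]), split=keys[mid - 1])


def rank(x, root):
    """Return the 1-based index of x among the keys, or None if x is absent."""
    v, to_the_left = root, 0
    while v.left is not None:
        if x <= v.split:
            v = v.left
        else:
            to_the_left += v.left.size  # this whole subtree lies left of the path
            v = v.right
    return to_the_left + 1 if v.key == x else None


tree = build([4, 19, 20, 21, 26, 34, 67, 70, 73, 77, 89, 90])
print(rank(73, tree))  # 9
```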
94
Insertions and deletions
Consider insertion; deletion is similar
95
Insert 12 8 4 2
96
Insert (cont) 13 9 5 3 2
97
Easy to maintain through rotations
x(y(A, B), C) <===> y(A, x(B, C))
size(x) ← size(B) + size(C)
size(y) ← size(A) + size(x)
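The two size updates can be seen in a short Python sketch of a right rotation. It assumes the leaf-oriented tree used above, where every internal node has two children. The stand-in subtrees A, B and C are given fixed sizes purely for illustration.

```python
class Node:
    def __init__(self, left=None, right=None, size=1):
        self.left, self.right = left, right
        # An internal node derives its size from its children.
        self.size = size if left is None else left.size + right.size


def rotate_right(x):
    """Turn x(y(A, B), C) into y(A, x(B, C)) and return the new subtree root y."""
    y = x.left
    x.left = y.right                       # B moves under x
    y.right = x
    x.size = x.left.size + x.right.size    # size(x) <- size(B) + size(C)
    y.size = y.left.size + x.size          # size(y) <- size(A) + size(x)
    return y


# Stand-ins for the subtrees A, B, C with 2, 3 and 4 leaves.
A, B, C = Node(size=2), Node(size=3), Node(size=4)
x = Node(left=Node(left=A, right=B), right=C)
y = rotate_right(x)
print(y.size, x.size)  # 9 7
```

Only the two rotated nodes need new sizes. The order matters: x must be updated before y, because the new size of y depends on it.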
98
Summary
Insertion, deletion, and the other dictionary operations still take O(log n) time