Presentation on theme: "Medians and Order Statistics"— Presentation transcript:
1 Medians and Order Statistics
  i-th order statistic: the i-th smallest element.
  Median of n elements: for n odd, i = (n+1)/2; for n even, i = n/2 or i = n/2+1.
  Assume distinct numbers.
  Input: A, n, and i with 1 ≤ i ≤ n.
  Output: the element x of A that is larger than exactly i-1 other elements of A.
2 Solutions O(n log n) time based on … O(n) time average. O(n) time worst case.
3 Minimum and Maximum
  How many comparisons? n-1 suffice: examine each element and keep track of the smallest one seen so far.
  n-1 are also necessary for any comparison-based algorithm: each element must be compared, and each must lose at least once (except the winner).
  What about simultaneous min and max?
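4 Min & Max
  Can do with 2n-2 comparisons (find the min and the max independently). Can do better:
  Form pairs of elements.
  Compare the elements in each pair.
  For pair ($a_i$, $a_{i+1}$), assume $a_i < a_{i+1}$; then compare (min, $a_i$) and ($a_{i+1}$, max).
  3 comparisons for each pair, so about 3n/2 in total.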
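The pairing idea above can be sketched in Python as follows. This is a minimal illustration, not code from the slides: the function name `min_max` and the returned comparison counter are assumptions added here to make the 3-comparisons-per-pair count visible.

```python
def min_max(a):
    """Return (minimum, maximum, comparisons) for a non-empty sequence.

    Hypothetical helper: processes elements in pairs, spending
    3 comparisons per pair instead of 2 per element.
    """
    if not a:
        raise ValueError("min_max() needs at least one element")
    n = len(a)
    comparisons = 0
    if n % 2 == 1:
        # Odd length: the first element seeds both min and max for free.
        lo = hi = a[0]
        start = 1
    else:
        # Even length: one comparison seeds min and max from the first pair.
        comparisons += 1
        if a[0] < a[1]:
            lo, hi = a[0], a[1]
        else:
            lo, hi = a[1], a[0]
        start = 2
    for i in range(start, n, 2):
        x, y = a[i], a[i + 1]
        comparisons += 3
        # Comparison 1: order the pair.
        if x < y:
            small, large = x, y
        else:
            small, large = y, x
        # Comparison 2: the smaller element against the current min.
        if small < lo:
            lo = small
        # Comparison 3: the larger element against the current max.
        if large > hi:
            hi = large
    return lo, hi, comparisons
```

For example, `min_max([3, 1, 4, 1, 5, 9, 2, 6])` returns `(1, 9, 10)`: one comparison for the first pair plus three for each of the remaining three pairs, compared with 2n-2 = 14 for two independent scans.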
5 Average Time Median Selection
  Divide-and-conquer (prune-and-search).
  Randomized: behavior is determined by the output of a random number generator.
  Based on QuickSort: partition the input array recursively, but work on only one side!
6 Randomized Selection
  QuickSort(A,p,r)
    If p < r then
      q = Partition(A,p,r)
      QuickSort(A,p,q)
      QuickSort(A,q+1,r)
  First call: QuickSort(A,1,n).
  After Partition(A,p,r): A[i] < A[j] for p ≤ i ≤ q < j ≤ r.
  RandSelect(A,p,r,i)
    If p == r then return A[p]
    q = RandPartition(A,p,r)
    k = q-p+1   /* size of A[p..q] */
    If i ≤ k then return RandSelect(A,p,q,i)
    Else return RandSelect(A,q+1,r,i-k)
  First call: RandSelect(A,1,n,i).
  Returns the i-th smallest element in A[p..r].
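A runnable Python sketch of RandSelect is given below. The slides do not spell out RandPartition, so this sketch assumes a Hoare-style partition around a randomly chosen pivot, which matches the recursion on A[p..q] and A[q+1..r]. Array positions are 0-based here, while the rank `i` stays 1-based as on the slide; the wrapper `select_rank` is a hypothetical convenience name.

```python
import random


def rand_partition(a, p, r):
    """Hoare-style partition of a[p..r] around a random pivot (assumed variant).

    Returns q with p <= q < r such that every element of a[p..q]
    is <= every element of a[q+1..r].
    """
    s = random.randint(p, r)          # inclusive on both ends
    a[p], a[s] = a[s], a[p]           # move the random pivot to the front
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > x:               # scan left for an element <= pivot
            j -= 1
        i += 1
        while a[i] < x:               # scan right for an element >= pivot
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def rand_select(a, p, r, i):
    """Return the i-th smallest (1-based) element of a[p..r]; reorders a."""
    if p == r:
        return a[p]
    q = rand_partition(a, p, r)
    k = q - p + 1                     # size of the low side a[p..q]
    if i <= k:
        return rand_select(a, p, q, i)
    return rand_select(a, q + 1, r, i - k)


def select_rank(values, i):
    """Hypothetical wrapper: copy the input and validate the rank."""
    if not 1 <= i <= len(values):
        raise ValueError("rank i must satisfy 1 <= i <= n")
    work = list(values)
    return rand_select(work, 0, len(work) - 1, i)
```

Calling `select_rank(data, (len(data) + 1) // 2)` returns the (lower) median. The recursion mirrors the slide; an iterative loop would avoid deep recursion on unlucky pivot sequences.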
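7 Selection (cont.)
  RandPartition (see 8.3, 8.4 textbook) gives a partition whose low side has:
  1 element with probability 2/n;
  j elements with probability 1/n, for j = 2, 3, …, n-1.
  Assume the i-th element always falls on the larger side:
  $T(n) \le \frac{1}{n}\Bigl(T(\max(1, n-1)) + \sum_{k=1}^{n-1} T(\max(k, n-k))\Bigr) + O(n)$
  $\le \frac{1}{n}\Bigl(T(n-1) + 2\sum_{k=\lceil n/2 \rceil}^{n-1} T(k)\Bigr) + O(n)$
  $= \frac{2}{n}\sum_{k=\lceil n/2 \rceil}^{n-1} T(k) + O(n)$,
  since $T(n-1) = O(n^2)$, so the term $T(n-1)/n$ is absorbed into O(n).
  Then T(n) = O(n) (proof by substitution).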
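The substitution step can be sketched as follows. This is a worked outline under stated assumptions rather than the textbook's exact proof: the O(n) term is written as $an$ for an assumed constant $a$, the lower summation limit n/2 is read as $\lceil n/2 \rceil$, and the inductive hypothesis is $T(k) \le ck$ for all $k < n$.

```latex
\begin{align*}
\sum_{k=\lceil n/2 \rceil}^{n-1} k
  &= \frac{n(n-1)}{2} - \frac{\lceil n/2 \rceil \bigl(\lceil n/2 \rceil - 1\bigr)}{2}
   \le \frac{n(n-1)}{2} - \frac{n(n-2)}{8}
   = \frac{n(3n-2)}{8}
   \le \frac{3n^2}{8}, \\
T(n)
  &\le \frac{2}{n} \sum_{k=\lceil n/2 \rceil}^{n-1} ck + an
   \le \frac{2c}{n} \cdot \frac{3n^2}{8} + an
   = \frac{3cn}{4} + an
   \le cn \quad \text{whenever } c \ge 4a.
\end{align*}
```

The first line uses $\lceil n/2 \rceil \ge n/2$ for $n \ge 2$; the base cases are covered by choosing $c$ large enough.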
8 Worst Case Linear Time Selection
  O(n) worst-case algorithm.
  Works in a similar way: recursively partition the input array.
  Idea: guarantee a good split.
  E.g., in QuickSort, a 9-to-1 split at every recursion level gives T(n) = T(9n/10) + T(n/10) + O(n), so T(n) = O(n log n).
  Use deterministic partitioning: compute the element to partition around.
9 Steps to find i-th smallest element
  Algorithm Select
  Step 1: Divide the elements into $\lfloor n/5 \rfloor$ groups of 5 elements, plus at most one group with the remaining (n mod 5) elements.
  Step 2: Find the median of each group. Insertion sort takes O(1) time per group (at most 5 elements); take the middle element (the larger one if there are two medians).
  Step 3: Use Select recursively to find the median x of the medians.
10 Algorithm Select (cont.)
  Step 4: Partition the input array around the median-of-medians x. Let k be the number of elements on the low side and n-k the number on the high side:
  $a_1, a_2, \ldots, a_k \mid a_{k+1}, a_{k+2}, \ldots, a_n$, with $a_i < a_j$ for 1 ≤ i ≤ k, k+1 ≤ j ≤ n.
  Step 5: Use Select recursively to find the i-th smallest element on the low side if i ≤ k, or the (i-k)-th smallest on the high side if i > k.
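The five steps of Select can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the slides' exact procedure: it builds new lists instead of partitioning in place, uses `sorted` on each group of at most 5 elements in place of insertion sort, and treats the pivot x as its own case (rank k+1) rather than placing it on one side. It relies on the distinct-elements assumption from the first slide.

```python
def select(a, i):
    """Return the i-th smallest (1-based) element of a list of distinct values."""
    n = len(a)
    if not 1 <= i <= n:
        raise ValueError("rank i must satisfy 1 <= i <= n")
    if n <= 5:
        # Small input: sorting at most 5 elements is O(1).
        return sorted(a)[i - 1]

    # Step 1: groups of 5, plus at most one smaller trailing group.
    groups = [a[j:j + 5] for j in range(0, n, 5)]

    # Step 2: median of each group; index len(g) // 2 picks the larger
    # of the two middle elements when the group has even size.
    medians = [sorted(g)[len(g) // 2] for g in groups]

    # Step 3: recursively find the median x of the medians.
    x = select(medians, (len(medians) + 1) // 2)

    # Step 4: partition around x (x itself is kept out of both sides here).
    low = [v for v in a if v < x]
    high = [v for v in a if v > x]
    k = len(low)

    # Step 5: recurse into the side that contains the i-th smallest element.
    if i <= k:
        return select(low, i)
    if i == k + 1:
        return x
    return select(high, i - k - 1)
```

For instance, `select([12, 3, 44, 8, 27, 19, 5], 4)` returns the median of the seven values. Both recursive calls receive strictly fewer elements than `a`, so the recursion terminates.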
11 Analysis
  Find a lower bound on the number of elements greater than x.
  At least half of the medians found in step 2 are greater than or equal to x. Then at least half of the groups contribute 3 elements that are greater than x, except:
  the last group (if it has fewer than 5 elements);
  x's own group.
  Discard those two groups: the number of elements greater than x is ≥ 3((n/5)/2 - 2) = 3n/10 - 6.
  Similarly, the number of elements smaller than x is ≥ 3n/10 - 6.
  Then, in the worst case, Select is called recursively in step 5 on at most 7n/10 + 6 elements (upper bound).
12 Analysis (cont.)
  Steps 1, 2 and 4: O(n) time.
  Step 3: $T(\lceil n/5 \rceil)$.
  Step 5: at most T(7n/10 + 6); note that 7n/10 + 6 < n for n > 20.
  $T(n) \le T(\lceil n/5 \rceil) + T(7n/10 + 6) + O(n)$, for $n > n_1$.
  Use substitution to solve: assume $T(n) \le cn$ for $n > n_1$; find $n_1$ and c.
13 Analysis (cont.)
  $T(n) \le c\lceil n/5 \rceil + c(7n/10 + 6) + O(n)$
  $\le cn/5 + c + 7cn/10 + 6c + O(n)$
  $= 9cn/10 + 7c + O(n)$
  Want $T(n) \le cn$: pick c such that $c(n/10 - 7) \ge c_1 n$, where $c_1$ is the constant from the O(n) term above ($n_1 = 80$).
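To make the choice of c concrete, the inequality can be solved for c as sketched below. The O(n) term is written as $c_1 n$, as on the slide; the explicit value $80 c_1$ is a worked consequence of taking n ≥ 80, not a figure stated in the slides.

```latex
\begin{align*}
\frac{9cn}{10} + 7c + c_1 n \le cn
  &\iff c\left(\frac{n}{10} - 7\right) \ge c_1 n \\
  &\iff c \ge \frac{10\, c_1\, n}{n - 70} \qquad (n > 70), \\
\frac{n}{n - 70} = 1 + \frac{70}{n - 70} &\le 8 \quad \text{for } n \ge 80,
  \text{ so any } c \ge 80\, c_1 \text{ suffices.}
\end{align*}
```

A larger threshold gives a smaller constant: for n ≥ 140 the ratio n/(n-70) is at most 2, so c ≥ 20 c_1 would be enough.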
14 Questions Why not groups of 7 elements? Why not groups of 3 elements? T(n)=O(?)