Divide and Conquer / Closest Pair of Points Yin Tat Lee


CSE 421: Divide and Conquer / Closest Pair of Points
Yin Tat Lee

Where is the mistake?
Let P(n) = "1 + 1 + ⋯ (n times) ⋯ + 1 = O(1)".
Base case (n = 1): 1 = O(1) is true.
Induction hypothesis: suppose P(k) holds.
Induction step (goal: show P(k+1) holds): by the induction hypothesis, 1 + 1 + ⋯ ((k+1) times) ⋯ + 1 = O(1) + 1 = O(1). Hence P(k+1) holds. This finishes the induction. Now, we "know" that n = O(1)!
Problem: the hidden constant in the O(⋯) grows at every step of the induction, so there is no single fixed constant, and the O(1) claim fails.
Almost all students gave a similar wrong proof in HW3 Q1. For the midterm/final, we won't catch you on such small details. But it is the key point of this problem, so we can't forgive it here.

Homework comments
Avoid writing code in the homework. It is often harder to understand, and you need to write much more detail.
We are unfair: we deduct points for bugs in code, but we do not deduct points for grammar mistakes in pseudocode.
See how pseudocode is done in the slides.

Midterm
Date: next Mon (Oct 29). Time: 1:30–2:20. Location: same. (Admin said there is no room left with large tables.)
Coverage: all material up to this Friday.
Open book, open notes, open printouts; hard copies only.
Half of the score is MC / fill-in-the-blank / simple questions.
This Friday (Oct 19): midterm review. We will go over some sample questions relevant to the midterm.

Divide and Conquer Approach

Divide and Conquer
We reduce a problem to several subproblems. Typically, each subproblem is at most a constant fraction of the size of the original problem.
Recursively solve each subproblem.
Merge the solutions.
Examples: Mergesort, Binary Search, Strassen's Algorithm.
(Figure: a recursion tree of depth log n, with subproblems of size n, n/2, n/4, …)

A Classical Example: Merge Sort
Recursively split the array into halves of size n/2, sort each half, then merge.
(Figure: the array being split recursively and the sorted halves being merged back together.)
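The split-sort-merge pattern above can be sketched in Python (a sketch; the function name is mine, not from the slides):

```python
def merge_sort(a):
    """Sort a list by recursively splitting it in half and merging."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])
    right = merge_sort(a[mid:])
    # Merge the two sorted halves in O(n) time.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]
```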

Why Balanced Partitioning?
An alternative "divide & conquer" algorithm:
Split into n−1 and 1
Sort each subproblem
Merge them
Runtime: T(n) = T(n−1) + T(1) + n
Solution: T(n) = n + T(n−1) + T(1)
= n + (n−1) + T(n−2)
= n + (n−1) + (n−2) + T(n−3)
= n + (n−1) + (n−2) + … + 1 = O(n²)

Reinventing Mergesort
Suppose we've already invented Bubble-Sort, and we know it takes n² time.
Try just one level of divide & conquer:
Bubble-Sort (first n/2 elements)
Bubble-Sort (last n/2 elements)
Merge results
Time: 2T(n/2) + n = 2(n/2)² + n = n²/2 + n ≪ n²
Almost twice as fast! D&C in a nutshell.
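One level of this idea can be sketched directly (a sketch; the helper names are mine): bubble-sort each half, then merge.

```python
def bubble_sort(a):
    """Quadratic-time baseline sort."""
    a = list(a)
    for i in range(len(a)):
        for j in range(len(a) - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a

def one_level_dc_sort(a):
    """One level of divide & conquer: bubble-sort each half, then merge.
    Cost drops from n^2 to roughly n^2/2 + n."""
    mid = len(a) // 2
    left, right = bubble_sort(a[:mid]), bubble_sort(a[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]
```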

Reinventing Mergesort
"The more dividing and conquering, the better."
Two levels of D&C would be almost 4 times faster, 3 levels almost 8, etc., even though the overhead is growing. Best is usually full recursion down to a small constant size (balancing "work" vs. "overhead"). In the limit: you've just rediscovered mergesort!
Even unbalanced partitioning is good, but less good. Bubble-sort improved with a 0.1/0.9 split: (0.1n)² + (0.9n)² + n = 0.82n² + n.
The 18% savings compounds significantly if you carry the recursion to more levels, actually giving O(n log n), but with a bigger constant.
This is why Quicksort with a random splitter is good: badly unbalanced splits are rare, and not instantly fatal.

Finding the Root of a Function

Finding the Root of a Function
Given a continuous function f and two points a < b such that f(a) ≤ 0 and f(b) ≥ 0.
Goal: find a point c where f(c) is close to 0.
f has a root in [a, b] by the intermediate value theorem.
Note that roots of f may be irrational, so we want to approximate the root with arbitrary precision!
(Figure: the graph of f(x) = sin(x) − 100x + x⁴ crossing zero between a and b.)

A Naive Approach
Suppose we want an ε-approximation to a root.
Divide [a, b] into n = (b−a)/ε intervals. For each interval, check whether f(x) ≤ 0 and f(x+ε) ≥ 0.
This runs in time O(n) = O((b−a)/ε).
Can we do faster?

Divide & Conquer (Binary Search)
Bisection(a, b, ε)
  if b − a < ε then
    return a
  else
    m ← (a + b)/2
    if f(m) ≤ 0 then
      return Bisection(m, b, ε)
    else
      return Bisection(a, m, ε)
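The bisection routine above translates almost line-for-line into Python (an iterative sketch; assumes f(a) ≤ 0 ≤ f(b) as in the slide):

```python
def bisection(f, a, b, eps):
    """Return a point within eps of a root of f, given f(a) <= 0 <= f(b)."""
    while b - a >= eps:
        m = (a + b) / 2
        if f(m) <= 0:
            a = m  # a root still lies in [m, b]
        else:
            b = m  # a root still lies in [a, m]
    return a
```

For example, `bisection(lambda x: x * x - 2, 0.0, 2.0, 1e-9)` converges to √2.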

Time Analysis
Let n = (b−a)/ε be the number of intervals, and let m = (a+b)/2 be the midpoint.
Always half of the intervals lie to the left of m and half to the right. So T(n) = T(n/2) + O(1), i.e., T(n) = O(log n) = O(log((b−a)/ε)).
In d dimensions, "binary search" can be used to minimize convex functions. The current best algorithms take O(d³ log^{O(1)}(d/ε)). Unfortunately, the algorithm is unusably complicated.

Fast Exponentiation

Fast Exponentiation
Power(a, n)
Input: integer n ≥ 0 and number a
Output: aⁿ
Obvious algorithm: n − 1 multiplications.
Observation: if n is even, then aⁿ = a^(n/2) · a^(n/2).

Divide & Conquer (Repeated Squaring)
Power(a, n) {
  if (n = 0) return 1
  else if (n is even) return Power(a, n/2) · Power(a, n/2)
  else return Power(a, (n−1)/2) · Power(a, (n−1)/2) · a
}
Is there any problem in the program? Each line recomputes the same half-power twice, so the recursion makes Θ(n) calls. Fix: compute the half-power once and reuse it:
  k = Power(a, n/2); return k·k;
  k = Power(a, (n−1)/2); return k·k·a;
Time (# of multiplications): T(n) ≤ T(⌊n/2⌋) + 2 for n ≥ 1, and T(0) = 0.
Solving it: T(n) ≤ T(⌊n/2⌋) + 2 ≤ T(⌊n/4⌋) + 2 + 2 ≤ ⋯ ≤ T(1) + 2 + ⋯ + 2 ≤ 2 log₂ n   (log₂ n copies of 2).
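The fixed version, with the half-power computed once, looks like this in Python (a sketch of the slide's pseudocode):

```python
def power(a, n):
    """Compute a**n using O(log n) multiplications by repeated squaring."""
    if n == 0:
        return 1
    k = power(a, n // 2)  # one recursive call, reused below
    if n % 2 == 0:
        return k * k
    return k * k * a
```

The single recursive call is the whole point: calling `power` twice per level would blow the cost back up to Θ(n) multiplications.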

Finding the Closest Pair of Points

Closest Pair of Points (general metric)
Given n points and arbitrary distances between them, find the closest pair.
Must look at all (n choose 2) pairwise distances, else any one you didn't check might be the shortest. I.e., you have to read the whole input.
(Figure: a complete graph on the points — … and all the rest of the (n choose 2) edges …)

Closest Pair of Points (1-dimension)
Given n points on the real line, find the closest pair. E.g., given 11, 2, 4, 19, 4.8, 7, 8.2, 16, 11.5, 13, 1, find the closest pair.
Fact: the closest pair is adjacent in the ordered list: 1, 2, 4, 4.8, 7, 8.2, 11, 11.5, 13, 16, 19.
So first sort, then scan adjacent pairs.
Time: O(n log n) to sort, if needed, plus O(n) to scan adjacent pairs.
Key point: we do not need to calculate distances between all pairs; exploit geometry + ordering.
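The sort-then-scan idea is a few lines of Python (a sketch; the function name is mine):

```python
def closest_pair_1d(xs):
    """Closest pair on the real line: sort, then compare adjacent elements."""
    s = sorted(xs)
    # The closest pair must be adjacent in sorted order.
    best = min(range(len(s) - 1), key=lambda i: s[i + 1] - s[i])
    return s[best], s[best + 1]
```

On the slide's example list, this returns the pair (11, 11.5), whose gap 0.5 is the smallest.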

Closest Pair of Points (2-dimensions)
Given n points in the plane, find a pair with the smallest Euclidean distance between them.
Fundamental geometric primitive: graphics, computer vision, geographic information systems, molecular modeling, air traffic control. Special case of nearest neighbor, Euclidean MST, Voronoi.
Brute force: check all pairs of points in Θ(n²) time.
Assumption: no two points have the same x coordinate.

Closest Pair of Points (2-dimensions) No single direction along which one can sort points to guarantee success!

Divide & Conquer
Divide: draw a vertical line L with ≈ n/2 points on each side.
Conquer: find the closest pair on each side, recursively.
Combine to find the closest pair overall. Return the best solution. How?
(Figure: line L with pair distances 8, 21, and 12 marked.)

Why is the strip problem easier?
Suppose δ is the minimum distance over all pairs lying entirely on the left or entirely on the right of L: δ = min(12, 21) = 12.
Key observation: it suffices to consider points within δ of line L.
Almost the 1-D problem again: sort the points in the 2δ-strip by their y coordinate.
(Figure: the 2δ-wide strip around L, with δ = 12.)

Almost 1D Problem
Partition each side of L into δ/2 × δ/2 squares.
Claim: no two points lie in the same δ/2 × δ/2 box.
Proof: such points would be within √((δ/2)² + (δ/2)²) = δ/√2 ≈ 0.7δ < δ of each other, contradicting the definition of δ.
Let sᵢ have the i-th smallest y-coordinate among points in the 2δ-wide strip.
Claim: if |i − j| > 11, then the distance between sᵢ and sⱼ is > δ.
Proof: only 11 boxes lie within δ of y(sᵢ).
(Figure: the strip subdivided into δ/2 boxes, with the rows between sᵢ and sⱼ numbered.)

Closest Pair (2 dimensions)
Closest-Pair(p₁, p₂, ⋯, pₙ) {
  if (n ≤ 2) return |p₁ − p₂|
  Compute separation line L such that half the points are on one side and half on the other side.
  δ₁ = Closest-Pair(left half)
  δ₂ = Closest-Pair(right half)
  δ = min(δ₁, δ₂)
  Delete all points further than δ from separation line L
  Sort remaining points p[1] … p[m] by y-coordinate.   ← Where is the bottleneck?
  for i = 1, 2, ⋯, m
    for k = 1, 2, ⋯, 11
      if i + k ≤ m
        δ = min(δ, distance(p[i], p[i+k]))
  return δ
}

Closest Pair Analysis
Let D(n) be the number of pairwise distance calculations in the Closest-Pair algorithm:
D(n) ≤ 1 if n = 1, and D(n) ≤ 2D(n/2) + 11n otherwise ⇒ D(n) = O(n log n).
BUT, that's only the number of distance calculations. What if we counted running time?
T(n) ≤ 1 if n = 1, and T(n) ≤ 2T(n/2) + O(n log n) otherwise ⇒ T(n) = O(n log² n).

Closest Pair (2 dimensions), Improved
Closest-Pair(p₁, p₂, ⋯, pₙ) {
  if (n ≤ 2) return |p₁ − p₂|
  Compute separation line L such that half the points are on one side and half on the other side.
  (δ₁, P₁) = Closest-Pair(left half)
  (δ₂, P₂) = Closest-Pair(right half)
  δ = min(δ₁, δ₂)
  P_sorted = merge(P₁, P₂)   (merge-sort it by y-coordinate)
  Let q[1] … q[m] be the points (ordered as in P_sorted) within δ of line L.
  for i = 1, 2, ⋯, m
    for k = 1, 2, ⋯, 11
      if i + k ≤ m
        δ = min(δ, distance(q[i], q[i+k]))
  return δ and P_sorted
}
T(n) ≤ 1 if n = 1, and T(n) ≤ 2T(n/2) + O(n) otherwise ⇒ T(n) = O(n log n).
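The improved algorithm can be sketched in Python (a sketch, not the official course code; names and the brute-force base case are mine). The recursion returns the minimum distance together with the points sorted by y, so each level only pays a linear-time merge instead of a fresh sort:

```python
import math

def closest_pair(points):
    """O(n log n) divide & conquer closest pair of 2-D points (tuples)."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    def rec(p):  # p is sorted by x; returns (min distance, p sorted by y)
        if len(p) <= 3:  # base case: brute force over all pairs
            d = min((dist(a, b) for i, a in enumerate(p) for b in p[i + 1:]),
                    default=float('inf'))
            return d, sorted(p, key=lambda t: t[1])
        mid = len(p) // 2
        x_mid = p[mid][0]
        d1, left = rec(p[:mid])
        d2, right = rec(p[mid:])
        d = min(d1, d2)
        # Merge the two y-sorted halves in O(n), as in merge sort.
        by_y, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i][1] <= right[j][1]:
                by_y.append(left[i]); i += 1
            else:
                by_y.append(right[j]); j += 1
        by_y += left[i:] + right[j:]
        # Strip: points within d of the dividing line, already in y order.
        strip = [q for q in by_y if abs(q[0] - x_mid) < d]
        for i, q in enumerate(strip):
            for r in strip[i + 1:i + 12]:  # only the next 11 can be closer
                d = min(d, dist(q, r))
        return d, by_y

    return rec(sorted(points))[0]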