An Approximation Algorithm for Binary Searching in Trees Marco Molinaro Carnegie Mellon University joint work with Eduardo Laber (PUC-Rio)

Presentation transcript:


Searching in sorted lists
Sorted list of numbers, with one marked number m.
Find the marked number using queries ‘x ≤ m?’.

Searching in sorted lists
Search strategy: a procedure that indicates which number should be queried next.
It can be represented by a decision tree (DT).
Number of queries to find m = length of the corresponding path in the DT.
[Figure: decision tree (DT) with branches labeled ≤ and >]
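
As a minimal sketch of such a strategy (not taken from the slides), the Python function below runs plain binary search using only queries of the form ‘x ≤ m?’. The names `find_marked` and `is_le_marked` are hypothetical, and the oracle is assumed to answer truthfully.

```python
def find_marked(sorted_list, is_le_marked):
    """Locate the marked number m using only queries 'x <= m?'.

    `is_le_marked(x)` is a hypothetical oracle returning True iff x <= m.
    Returns the number found and how many queries were asked.
    """
    lo, hi = 0, len(sorted_list) - 1      # m lies in sorted_list[lo..hi]
    queries = 0
    while lo < hi:
        mid = (lo + hi + 1) // 2          # upper middle, so the range always shrinks
        queries += 1
        if is_le_marked(sorted_list[mid]):
            lo = mid                      # x <= m: m is at position mid or to its right
        else:
            hi = mid - 1                  # x > m: m is strictly to the left of mid
    return sorted_list[lo], queries


numbers = [1, 3, 5, 7, 9, 11]
marked = 7
found, asked = find_marked(numbers, lambda x: x <= marked)
```

The sequence of answers traced by one run is exactly one root-to-leaf path of the decision tree, so `asked` is the length of that path.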

Searching in sorted lists
We are given the probability of each number being the marked one.
Expected number of queries of a strategy = expected path length of the corresponding decision tree.
An efficient strategy is one with minimum expected path length.
[Figure: decision tree with branches labeled ≤ and >, annotated with the probability of each number]
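
To make "minimum expected path length" concrete, here is a small sketch that computes the optimal expected number of ‘x ≤ m?’ queries with a straightforward cubic-time dynamic program. It is only an illustration: it is neither Knuth's algorithm nor the linear-time approximation cited under related work, and the function name and the example probabilities are assumptions.

```python
from functools import lru_cache


def optimal_expected_queries(probs):
    """Minimum expected number of 'x <= m?' queries for a sorted list.

    probs[i] is the probability that the i-th number is the marked one.
    Each query splits the current range of candidates into a left and a
    right part; every candidate in the range pays one query for it.
    """
    prefix = [0.0]
    for p in probs:
        prefix.append(prefix[-1] + p)

    @lru_cache(maxsize=None)
    def cost(i, j):
        if i == j:                       # a single candidate: nothing left to ask
            return 0.0
        weight = prefix[j + 1] - prefix[i]
        # try every split into candidates i..k and k+1..j
        return weight + min(cost(i, k) + cost(k + 1, j) for k in range(i, j))

    return cost(0, len(probs) - 1)


skewed = optimal_expected_queries([0.5, 0.25, 0.125, 0.125])
uniform = optimal_expected_queries([0.25, 0.25, 0.25, 0.25])
```

With the skewed probabilities the best strategy isolates the likeliest number first, so its expected cost is lower than that of the balanced tree, which is optimal in the uniform case.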

Searching in trees
Tree with exactly one marked node m.
We can query an arc and find out which endpoint is closer to the marked node.

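Searching in trees
Search strategy: a procedure that indicates which arc should be queried next.
It can be represented by a decision tree.
[Figure: tree on nodes a, b, c, d, f, h and a decision tree (DT) whose internal nodes query the arcs (c,d), (a,b), (f,h), (d,f), with branches labeled ~a, ~b, ~c, ~d, ~f, ~h]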

Searching in trees
Search strategy: a procedure that indicates which arc should be queried next.
It can be represented by a decision tree.
Number of queries to find m = length of the corresponding path in the DT.
[Figure: the same tree and decision tree (DT), with the path (c,d), (f,h), (d,f) leading to leaf f highlighted]

Searching in trees
We are given the probability of each node being the marked one.
The expected number of queries is the expected path length of the corresponding decision tree.
The goal is to find a DT with minimum expected path length.
[Figure: the tree annotated with node probabilities, and the corresponding decision tree]
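
The query model and the cost measure can be simulated in a few lines. The sketch below is illustrative only: the tree, the weights and the naive `first_arc` strategy are made up for the example, and all function names are hypothetical.

```python
def side_of(adj, u, v, allowed):
    """Nodes of `allowed` that stay connected to u once arc (u, v) is cut."""
    seen, stack = {u}, [u]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y in allowed and y not in seen and (x, y) != (u, v):
                seen.add(y)
                stack.append(y)
    return seen


def closer_endpoint(adj, u, v, marked):
    """Oracle: the endpoint of arc (u, v) that is closer to the marked node."""
    return u if marked in side_of(adj, u, v, set(adj)) else v


def search(adj, marked, pick_arc):
    """Run a strategy until one candidate remains; return (node, #queries)."""
    candidates = set(adj)                 # always a connected subtree
    queries = 0
    while len(candidates) > 1:
        u, v = pick_arc(adj, candidates)
        queries += 1
        if closer_endpoint(adj, u, v, marked) == u:
            candidates = side_of(adj, u, v, candidates)
        else:
            candidates = side_of(adj, v, u, candidates)
    (found,) = candidates
    return found, queries


def first_arc(adj, candidates):
    """Naive strategy: query an arc at the alphabetically first candidate."""
    u = min(candidates)
    v = min(y for y in adj[u] if y in candidates)
    return u, v


def expected_queries(adj, weights, pick_arc):
    """Expected path length of the decision tree induced by `pick_arc`."""
    return sum(weights[m] * search(adj, m, pick_arc)[1] for m in adj)


# Hypothetical example tree (not the one from the slides).
adj = {"a": ["b"], "b": ["a", "c", "d"], "c": ["b"], "d": ["b", "e"], "e": ["d"]}
weights = {"a": 0.1, "b": 0.2, "c": 0.3, "d": 0.2, "e": 0.2}
cost = expected_queries(adj, weights, first_arc)
```

Swapping `first_arc` for a smarter arc-selection rule changes the induced decision tree, and `expected_queries` gives the quantity the problem asks to minimize.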

Searching in trees
Def: Given a tree T and weights w, compute a decision tree for searching in T with minimum expected path length from the root to the leaves w.r.t. w.
Motivation
- Generalizes searches in totally ordered structures to (one type of) partially ordered structures
- Application to software testing and filesystem synchronization

Related work
Searching in sorted lists
- Worst case
  - Binary search is optimal
- Average case
  - Knuth [Acta Informatica 71]: O(n^2)
  - de Prisco, de Santis [IPL 93]: good approximation in linear time

Related work
Searching in trees
- Worst case
  - Ben-Asher et al. [SIAM J. Comput. 99]: O(n^4 log^3 n)
  - Onak, Parys [FOCS 06]: O(n^3)
  - Mozes et al. [SODA 08]: O(n)
- Average case
  - Kosaraju et al. [WADS 99]: O(log n)-approximation

Related work
Searching in posets
- Worst case
  - Arkin et al. [Int. J. Comput. Geometry Appl. 98]: O(log n)-approximation
  - Carmo et al. [TCS 04]
    - Finding an optimal strategy is NP-hard
    - Constant-factor approximation for random posets
- Average case
  - Kosaraju et al. [WADS 99]: O(log n)-approximation

Our results
First constant-factor approximation for searching in trees (average-case metric)
Linear running time

Overview
We know how to search in sorted lists with probabilities.
Searching in paths = searching in ordered lists.

Overview Search strategy

Algorithm
1. Find a (heavy) path
2. Compute a decision tree for this path
3. Append decision trees for querying the hanging arcs
4. Recursively find strategies for the hanging subtrees and append them
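
The recursive structure of steps 1 and 4 can be sketched as follows. This is a sketch under assumptions, not the authors' implementation: it assumes the "heavy" path starts at the root and always continues into the child whose subtree has the largest total weight, and it only produces the paths and their hanging arcs. Building the decision trees of steps 2 and 3 is left out, and all names are hypothetical.

```python
def subtree_weights(children, w, root):
    """Total weight of the subtree rooted at each node (iterative, linear time)."""
    order = [root]
    i = 0
    while i < len(order):                      # breadth-first listing of the nodes
        order.extend(children.get(order[i], []))
        i += 1
    total = {}
    for u in reversed(order):                  # children are processed before parents
        total[u] = w[u] + sum(total[c] for c in children.get(u, []))
    return total


def heavy_path_decomposition(children, w, root):
    """Return a list of (path, hanging_arcs) pairs, one per recursive call.

    Assumption: the path follows the heaviest child at every node.
    hanging_arcs lists (path_node, child) arcs leaving the path; at each
    path node they are ordered by decreasing subtree weight.
    """
    total = subtree_weights(children, w, root)
    pieces = []
    pending = [root]
    while pending:
        u = pending.pop()
        path, hanging = [], []
        while True:
            path.append(u)
            kids = children.get(u, [])
            if not kids:
                break
            heavy = max(kids, key=lambda c: total[c])
            rest = sorted((c for c in kids if c != heavy),
                          key=lambda c: total[c], reverse=True)
            hanging.extend((u, c) for c in rest)
            u = heavy
        pieces.append((path, hanging))
        pending.extend(child for _, child in hanging)   # step 4: recurse on hanging subtrees
    return pieces


# Hypothetical rooted tree and weights.
children = {"r": ["a", "b"], "a": ["c", "d"]}
w = {"r": 0.1, "a": 0.2, "b": 0.15, "c": 0.4, "d": 0.15}
pieces = heavy_path_decomposition(children, w, "r")
```

In this example the first piece is the path r, a, c with hanging arcs (r, b) and (a, d); the subtrees rooted at b and d are then handled as separate single-node paths.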

Analysis
T: input tree
w(u): likelihood of node u being the marked one
$w(T') = \sum_{u \in T'} w(u)$
$T_i^j$: hanging subtrees of T
Cost of a decision tree: expected path length
[Figure: input tree T with its hanging subtrees $T_i^j$]

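Analysis – upper bound
ALGO(T) = expected path length of the computed DT = cost(■) + cost(■) + cost(■)
$\mathrm{ALGO}(T) \le H + w(T) + \sum_{i,j} j \cdot w(T_i^j) + \sum_{i,j} \mathrm{ALGO}(T_i^j)$, where H is the entropy of {w(u)}
[Figure: input tree T and the computed decision tree; each ■ marks a colored region of the decision tree]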
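
For a feel of the entropy term H, here is a small numeric sketch. The slide does not spell out the exact definition, so the code assumes base-2 logarithms and the form H = Σ w(u) · log2(w(T) / w(u)), which reduces to the usual Shannon entropy when the weights sum to 1. The function name and the sample weights are illustrative.

```python
import math


def entropy(weights):
    """H = sum of w * log2(total / w) over positive weights (assumed definition)."""
    total = sum(weights)
    return sum(x * math.log2(total / x) for x in weights if x > 0)


h_skewed = entropy([0.5, 0.25, 0.125, 0.125])
h_uniform = entropy([0.25, 0.25, 0.25, 0.25])
```

Skewed weights give a smaller H than uniform ones, which matches the intuition that a likely node can be found with few queries.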

Analysis – lower bounds
When H >> w(T): UB and LB1
When H ≤ w(T): UB and (LB1 + LB2)
LB1 is useful only when H is large
For all H, ALGO(T) ≤ α OPT(T)
[Formulas shown on the slide: UB, LB1, LB2]

Analysis – entropy lower bound
OPT(T) = (cost from root to ■) + (cost from ■ to ■) + (cost from ■ to leaves)
From root to ■:
- Using Shannon's lossless coding theorem, we can lower bound this cost by $H / \log 3 - w(T)$
From ■ to ■:
- There are at most 2 purple nodes per level
From ■ to leaves:
- All queries to arcs in the trees $T_i^j$ are descendants of purple nodes
- Costs at least as much as searching inside the trees $T_i^j$, namely $\sum_{i,j} \mathrm{OPT}(T_i^j)$
D* ≥ [...]
These paths cost [...]
[Figure: decision tree D*]

Analysis – alternative lower bound
OPT(T) ≥ (cost from root to ■) + (cost from ■ to leaves)
From root to ■:
- Cost = $\sum_{i,j} (\text{distance to the } i\text{-th purple node}) \cdot w(T_i^j)$
- At most one purple node can have distance 0
- $w(T_i^j) \le w(T)/2$
- Costs at least $w(T)/2$
From ■ to leaves:
- Costs at least as much as searching inside the trees $T_i^j$, namely $\sum_{i,j} \mathrm{OPT}(T_i^j)$
[Figure: decision tree D*]

Efficient implementation
Most steps take linear time
In order to find a good strategy, the algorithm uses sorting of weights
- Use linear-time approximate sorting
The algorithm can be implemented in linear time

Conclusions
First constant-factor approximation for searching in trees (average case)
Linear running time
Open questions
- Is searching in trees polynomially solvable?
- Improved approximations for more general posets

Thank you!