# Augmenting Data Structures Advanced Algorithms & Data Structures Lecture Theme 07 – Part I Prof. Dr. Th. Ottmann Summer Semester 2006.


2 Augmentation Process

Augmentation is the process of extending a data structure in order to support additional functionality. It consists of four steps:

1. Choose an underlying data structure.
2. Determine the additional information to be maintained in the underlying data structure.
3. Verify that the additional information can be maintained for the basic modifying operations on the underlying data structure.
4. Develop new operations.

3 Examples for Augmenting DS

- Dynamic order statistics: augmenting binary search trees by size information
- D-dimensional range trees: recursive construction of (static) d-dim range trees
- Min-augmented dynamic range trees: augmenting 1-dim range trees by min-information
- Interval trees
- Priority search trees


5 Dynamic Order Statistics

Problem: Given a set S of numbers that changes under insertions and deletions, construct a data structure to store S that can be updated in O(log n) time and that can report the k-th order statistic for any k in O(log n) time.

[Figure: an example set S containing the numbers 51, 85, 13, 34, 22, 7, 14, 48, 5]

7 Binary Search Trees and Order Statistics

[Figure: an example binary search tree]

- Retrieving an element with a given rank: for a given i, find the i-th smallest key in the set.
- Determining the rank of an element: for a given (pointer to a) key k, determine the rank of k in the set of keys.

8 Augmenting the Data Structure

Every node v stores two pieces of information:

- its key
- the number of its descendants, including v itself (the size of the subtree with root v)

[Figure: a binary search tree in which every node is labelled with its key and its subtree size; the root has size 11]

9 How To Determine The Rank of an Element

Find the rank of key x in the tree with root node v:

    Rank(v, x)
      if x = key(v)
        then return 1 + size(left(v))
      if x < key(v)
        then return Rank(left(v), x)
        else return 1 + size(left(v)) + Rank(right(v), x)

[Figure: the size-augmented example tree]
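
The Rank procedure translates almost line by line into Python. The following is a minimal sketch, not the lecture's own code: the `Node` class, the `size` and `make` helpers, and the small example tree are assumptions made for illustration, and the empty subtree is taken to have size 0.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    size: int = 1  # number of nodes in the subtree rooted at this node


def size(v: Optional[Node]) -> int:
    """Subtree size, with the empty tree counting as 0."""
    return v.size if v is not None else 0


def make(key: int, left: Optional[Node] = None, right: Optional[Node] = None) -> Node:
    """Hypothetical helper: build a node and fill in its size field."""
    return Node(key, left, right, 1 + size(left) + size(right))


def rank(v: Optional[Node], x: int) -> int:
    """Rank of key x (1 = smallest) in the subtree rooted at v."""
    if v is None:
        raise KeyError(x)  # x is not stored in the tree
    if x == v.key:
        return 1 + size(v.left)
    if x < v.key:
        return rank(v.left, x)
    # Everything in the left subtree and v itself precede x.
    return 1 + size(v.left) + rank(v.right, x)


tree = make(17, make(5, make(1), make(13)), make(25, make(19), make(37)))
print(rank(tree, 19))  # prints 5: the keys in order are 1, 5, 13, 17, 19, 25, 37
```

Each recursive call descends one level, so the running time is proportional to the height of the tree.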

10 How to Find the k-th Order Statistic

Find (a pointer to) the node containing the k-th smallest key in the subtree rooted at node v:

    Select(v, k)
      if k = size(left(v)) + 1
        then return v
      if k ≤ size(left(v))
        then return Select(left(v), k)
        else return Select(right(v), k – 1 – size(left(v)))

[Figure: the size-augmented example tree]
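
A possible Python rendering of Select is sketched below. The `Node` class and the `subtree_size` helper are illustrative assumptions (the slides only prescribe a key and a size per node), and the bounds check on k is an addition that the pseudocode leaves implicit.

```python
def subtree_size(v):
    """Size of the subtree rooted at v; the empty tree has size 0."""
    return v.size if v is not None else 0


class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.size = 1 + subtree_size(left) + subtree_size(right)


def select(v, k):
    """Return the node holding the k-th smallest key (1-based) below v."""
    if v is None or not 1 <= k <= v.size:
        raise IndexError(k)
    left = subtree_size(v.left)
    if k == left + 1:
        return v
    if k <= left:
        return select(v.left, k)
    # Skip the left subtree and v itself.
    return select(v.right, k - 1 - left)


tree = Node(17, Node(5, Node(1), Node(13)), Node(25, Node(19), Node(37)))
print(select(tree, 5).key)  # prints 19
```

Like Rank, the procedure follows a single root-to-node path and therefore takes time proportional to the height of the tree.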

11 Maintaining Subtree Sizes Under Insertions

Insert operation:

- Insert the node as in a standard binary search tree.
- Add 1 to the subtree size of every ancestor of the new node.

[Figure: key 64 is inserted into the example tree as a new leaf of size 1; the size fields on the path to the root each increase by 1 (the root's from 11 to 12)]

14 Maintaining Subtree Sizes Under Deletions

Delete operation:

- Delete the node as in a standard binary search tree.
- Subtract 1 from the subtree size of every ancestor of the deleted node.

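15 Maintaining Subtree Sizes Under Rotations

[Figure: a rotation shown before and after, with size fields $s_1, \dots, s_5$. The fields $s_1$, $s_3$, $s_4$ and $s_5$ are unchanged; the field $s_2$ is replaced by $s_5 + s_3 + 1$, computed from the sizes of the node's two new subtrees.]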
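
A rotation changes the subtree of only two nodes, so the size fields can be repaired in constant time. The following sketch shows a right rotation; the `Node` class and the function names are assumptions for illustration, and the left rotation is symmetric.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.size = 1 + size(left) + size(right)


def size(v):
    return v.size if v is not None else 0


def rotate_right(y):
    """Rotate the edge between y and its left child x; return the new root x."""
    x = y.left
    y.left = x.right
    x.right = y
    # x now roots exactly the nodes that y rooted before the rotation.
    x.size = y.size
    # y keeps its right subtree and adopts x's former right subtree.
    y.size = 1 + size(y.left) + size(y.right)
    return x


y = Node(4, Node(2, Node(1), Node(3)), Node(5))
x = rotate_right(y)
print(x.key, x.size, x.right.key, x.right.size)  # prints 2 5 4 3
```

All other nodes keep their subtrees, which is why no other size field needs to be touched.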

16 Dynamic Order Statistics: Summary

Theorem: There exists a data structure to represent a dynamically changing set S of numbers with the following properties:

- The data structure can be updated in O(log n) time after every insertion into or deletion from S.
- The data structure allows us to determine the rank of an element or to find the element with a given rank in O(log n) time.
- The data structure occupies O(n) space.


19 4-Sided Range Queries

Goal: Build a static data structure of size $O(n \log n)$ that can answer 4-sided range queries in $O(\log^2 n + k)$ time.

20 Orthogonal d-dimensional Range Search

Build a static data structure for a set $P$ of $n$ points in $d$-space that supports $d$-dim range queries.

$d$-dim range query: Let $R$ be a $d$-dim orthogonal hyperrectangle, given by $d$ ranges $[x_1, x_1'], \dots, [x_d, x_d']$. Find all points $p = (p_1, \dots, p_d) \in P$ such that $x_1 \le p_1 \le x_1', \dots, x_d \le p_d \le x_d'$.

Special cases: the 1-dim range query (an interval $[x_1, x_1']$ on the line) and the 2-dim range query (a rectangle $[x_1, x_1'] \times [x_2, x_2']$ in the plane).

21 1-dim Range Search

Standard binary search trees also support 1-dim range queries.

[Figure: a binary search tree over the keys 12, 18, 21, 23, 30, 37, 42, 49, 55, 61, 68, 74, 80, 81, 90, 99]

22 1-dim Range Search

Leaf-search tree:

[Figure: the same keys stored at the leaves of a leaf-search tree, with routers in the internal nodes and a sentinel ∞]

23 1-dim Range Tree

A 1-dim range tree is a leaf-search tree for the x-values (points on the line).

Internal nodes have routers guiding the search to the leaves: we choose the maximal x-value in the left subtree as router.

Range search: In order to find all points in a given range [l, r], search for the boundary values l and r. The two searches share a path up to the split node and fork there. Report all leaves of the subtrees that lie between the two search paths and whose roots are children of nodes on the search paths.
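
The range search can be sketched on a static leaf-search tree as follows. This is an illustration under assumptions, not the lecture's implementation: the tree is built once from a sorted list of distinct values, the names `Node`, `build` and `range_query` are hypothetical, and the forked search is expressed as a recursion that only enters a child when the router says the range can reach it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    router: float                   # leaf: the stored value; internal: max of left subtree
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(xs: List[float]) -> Node:
    """Build a balanced leaf-search tree from a sorted, non-empty list."""
    if len(xs) == 1:
        return Node(xs[0])
    mid = (len(xs) + 1) // 2
    return Node(xs[mid - 1], build(xs[:mid]), build(xs[mid:]))


def range_query(v: Node, lo: float, hi: float, out: List[float]) -> List[float]:
    """Append all stored values in [lo, hi] to out, in increasing order."""
    if v.left is None:              # leaf
        if lo <= v.router <= hi:
            out.append(v.router)
        return out
    if lo <= v.router:              # the range reaches into the left subtree
        range_query(v.left, lo, hi, out)
    if hi > v.router:               # the range reaches into the right subtree
        range_query(v.right, lo, hi, out)
    return out


xs = sorted([37, 18, 99, 12, 23, 21, 81, 74, 90, 55, 42, 61, 49, 68, 30, 80])
print(range_query(build(xs), 20, 50, []))  # prints [21, 23, 30, 37, 42, 49]
```

Below the split node the recursion visits both children only along the two boundary paths or inside subtrees that lie entirely in the range, which gives the O(log n + k) behaviour for a balanced tree.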

24 The selected subtrees

[Figure: the search paths for l and r fork at the split node; the selected subtrees hang between the two paths]

25 Canonical Subsets

The canonical subset of node v, P(v), is the subset of points of P stored at the leaves of the subtree rooted at v.

- If v is a leaf, P(v) consists of the point stored at this leaf.
- If v is the root, P(v) = P.

A node v is called an umbrella node for the range [l, r] if the x-coordinates of all points in its canonical subset P(v) fall into the range, but this does not hold for the parent of v.

Observations:

- For each query range [l, r], the set of points with x-coordinates falling into this range is the disjoint union of O(log n) canonical subsets of P, namely those of the umbrella nodes.
- All k points stored at the leaves of a tree rooted at node v, i.e. the k points in a canonical subset P(v), can be reported in time O(k).
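
To make the notion of umbrella nodes concrete, the sketch below stores at every node the smallest and largest value of its canonical subset and collects the umbrella nodes for a query range. Storing `lo` and `hi` per node is a simplification chosen for this example (the slides use routers instead), and all names are hypothetical.

```python
class Node:
    def __init__(self, xs):
        """Build a leaf-search tree over the sorted, non-empty list xs."""
        self.lo, self.hi = xs[0], xs[-1]   # extent of the canonical subset P(v)
        self.left = self.right = None
        if len(xs) > 1:
            mid = (len(xs) + 1) // 2
            self.left, self.right = Node(xs[:mid]), Node(xs[mid:])


def umbrella_nodes(v, l, r):
    """Return the umbrella nodes for [l, r] below v, from left to right."""
    if v.hi < l or v.lo > r:        # P(v) lies completely outside the range
        return []
    if l <= v.lo and v.hi <= r:     # P(v) lies completely inside: stop here
        return [v]
    # Partial overlap can only happen at internal nodes, so both children exist.
    return umbrella_nodes(v.left, l, r) + umbrella_nodes(v.right, l, r)


def report(v, out):
    """Append the canonical subset P(v) to out in time linear in its size."""
    if v.left is None:
        out.append(v.lo)
    else:
        report(v.left, out)
        report(v.right, out)
    return out


root = Node(list(range(1, 17)))
nodes = umbrella_nodes(root, 3, 11)
print([(u.lo, u.hi) for u in nodes])  # prints [(3, 4), (5, 8), (9, 10), (11, 11)]
```

The four canonical subsets are disjoint and together contain exactly the values 3 to 11, as the first observation states.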

26 1-dim Range Tree: Summary

Let P be a set of n points in 1-dim space. P can be stored in a balanced binary leaf-search tree such that the following holds:

- Construction time: O(n log n)
- Space requirement: O(n)
- Insertion of a point: O(log n) time
- Deletion of a point: O(log n) time
- 1-dim range query: reporting all k points falling into a given query range can be carried out in time O(log n + k).

The performance of 1-dim range trees does not depend on the chosen balancing scheme!

27 2-dim Range Tree: The Primary Structure

- Static binary leaf-search tree over the x-coordinates of the points.
- Every leaf represents a vertical slab of the plane.
- Every internal node represents a slab that is the union of the slabs of its children.

35 Answering 2-dim Range Queries

- Normalize queries to end on slab boundaries.
- The query decomposes into O(log n) subqueries.
- Every subquery is a 1-dimensional range query on the y-coordinates of all points in the slab of the corresponding node. (The x-coordinates do not matter!)

[Figure: the selected subtrees between the search paths for l and r below the split node]

40 2-dim Range Tree

[Figure: the primary tree $T_x$ with a node $v$, its x-interval $I_x(v)$, and the associated tree $T_y(v)$]

41 2-dim Range Tree

A 2-dimensional range tree for storing a set P of n points in the x-y-plane consists of:

- A 1-dim range tree $T_x$ for the x-coordinates of the points (a leaf-search tree on the x-coordinates).
- For each node v of $T_x$, a pointer to a 1-dim range tree $T_y(v)$ storing all points which fall into the interval $I_x(v)$ (a leaf-search tree on the y-coordinates). That is, $T_y(v)$ is a 1-dim range tree based on the y-coordinates of all points $p \in P$ whose x-coordinate lies in $I_x(v)$.

42 2-dim Range Tree

A 2-dim range tree on a set of n points in the plane requires $O(n \log n)$ space.

A point p is stored in the associated range trees $T_y(v)$ of exactly the nodes v on the search path to $p_x$ in $T_x$. Hence, for each depth d, each point p occurs in only one associated search structure $T_y(v)$ for a node v of depth d in $T_x$. Since $T_x$ has depth O(log n), the associated structures together store $O(n \log n)$ points.

The 2-dim range tree can be constructed in time $O(n \log n)$. (Presort the points on y-coordinates!)

43 The 2-Dimensional Range Tree

- Primary structure: leaf-search tree on the x-coordinates of the points.
- Every node stores a secondary structure: a balanced binary search tree on the y-coordinates of the points in the node's slab.
- Every point is stored in the secondary structures of O(log n) nodes.
- Space: $O(n \log n)$

44 Answering Queries

- Every 2-dimensional range query decomposes into O(log n) 1-dimensional range queries.
- Each such query takes $O(\log n + k')$ time, where $k'$ is the number of points it reports.
- Total query complexity: $O(\log^2 n + k)$

45 2-dim Range Query

Let P be a set of points in the plane stored in a 2-dim range tree, and let a 2-dim range R be given by the two intervals [x, x'] and [y, y']. All k points of P falling into the range R can be reported as follows:

1. Determine the O(log n) umbrella nodes for the range [x, x'], i.e. determine the canonical subsets of P that together contain exactly the points with x-coordinates in the range [x, x']. (This is a 1-dim range query on the x-coordinates.)
2. For each umbrella node v obtained in step 1, use the associated 1-dim range tree $T_y(v)$ to select the points of P(v) with y-coordinates in the range [y, y']. (This is a 1-dim range query for each of the O(log n) canonical subsets obtained in step 1.)

Time to report all k points in the 2-dim range R: $O(\log^2 n + k)$.

The query time can be reduced to $O(\log n + k)$ by a technique known as fractional cascading.
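
The two query steps can be sketched in Python as follows. To keep the example short, each associated structure $T_y(v)$ is replaced by a list sorted on y that is searched with `bisect`; this is a stand-in chosen for this sketch, not the leaf-search trees of the lecture, and it answers the same 1-dim queries in O(log n + k') time. Building every list with `sorted` is also a simplification that costs more than the presorting approach. All class and function names are hypothetical.

```python
from bisect import bisect_left, bisect_right


class Node:
    def __init__(self, pts):
        """pts: non-empty list of (x, y) tuples sorted by x."""
        self.xlo, self.xhi = pts[0][0], pts[-1][0]   # x-extent of the slab
        self.left = self.right = None
        if len(pts) > 1:
            mid = (len(pts) + 1) // 2
            self.left, self.right = Node(pts[:mid]), Node(pts[mid:])
        # Secondary structure: the canonical subset P(v) sorted by y.
        self.by_y = sorted(pts, key=lambda p: p[1])
        self.ys = [p[1] for p in self.by_y]


def query(v, x1, x2, y1, y2, out):
    """Append all points with x1 <= x <= x2 and y1 <= y <= y2 to out."""
    if v.xhi < x1 or v.xlo > x2:
        return out                       # slab disjoint from [x1, x2]
    if x1 <= v.xlo and v.xhi <= x2:      # umbrella node: 1-dim query on y
        i = bisect_left(v.ys, y1)
        j = bisect_right(v.ys, y2)
        out.extend(v.by_y[i:j])
        return out
    query(v.left, x1, x2, y1, y2, out)
    query(v.right, x1, x2, y1, y2, out)
    return out


points = [(2, 12), (3, 4), (4, 11), (5, 3), (8, 5),
          (11, 21), (14, 7), (15, 2), (17, 30), (21, 8)]
root = Node(sorted(points))
print(sorted(query(root, 3, 15, 3, 11, [])))
# prints [(3, 4), (4, 11), (5, 3), (8, 5), (14, 7)]
```

Every point appears in the `by_y` list of each node on its root-to-leaf path, which mirrors the $O(n \log n)$ space bound.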

46 The 3-Dimensional Range Tree

- Primary structure: search tree on the x-coordinates of the points.
- Every node stores a secondary structure: a 2-dimensional range tree on the points in the node's slab.
- Every point is stored in the secondary structures of O(log n) nodes.
- Space: $O(n \log^2 n)$

47 Answering Queries

- Every 3-dimensional range query decomposes into O(log n) 2-dimensional range queries.
- Each such query takes $O(\log^2 n + k')$ time.
- Total query complexity: $O(\log^3 n + k)$

48 d-Dimensional Range Queries

- Primary structure: search tree on the x-coordinates.
- Secondary structures: (d – 1)-dimensional range trees.
- Space requirement: $O(n \log^{d-1} n)$
- Query time: $O(\log^d n + k)$

49 Updates are difficult!

Insertion or deletion of a point p in a 2-dim range tree requires:

1. Insertion or deletion of p in the primary range tree $T_x$ according to the x-coordinate of p.
2. For each node v on the search path to the leaf storing p in $T_x$, insertion or deletion of p in the associated secondary range tree $T_y(v)$.

Keeping the primary range tree balanced is difficult, except for the case d = 1! Rotations in the primary tree may require completely rebuilding the associated range trees along the search path!

50 Range Trees: Summary

Theorem: There exists a data structure to represent a static set S of n points in d dimensions with the following properties:

- The data structure allows us to answer range queries in $O(\log^d n + k)$ time.
- The data structure occupies $O(n \log^{d-1} n)$ space.

Note: The query complexity can be reduced to $O(\log^{d-1} n + k)$, for d ≥ 2, using a very beautiful technique called fractional cascading.
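
Both bounds of the theorem follow from the recursive construction. As a sketch, write $S_d(n)$ for the space and $Q_d(n)$ for the query time without the output term $k$ (both symbols are introduced here for illustration). Each of the $O(\log n)$ levels of the primary tree partitions the points, so the secondary structures of one level together need at most $S_{d-1}(n)$ space, and a query visits $O(\log n)$ umbrella nodes, each of which triggers one $(d-1)$-dimensional query:

$$
S_1(n) = O(n), \qquad S_d(n) = O(\log n) \cdot S_{d-1}(n) \;\Longrightarrow\; S_d(n) = O(n \log^{d-1} n)
$$

$$
Q_1(n) = O(\log n), \qquad Q_d(n) = O(\log n) \cdot Q_{d-1}(n) \;\Longrightarrow\; Q_d(n) = O(\log^{d} n)
$$

Adding the $k$ reported points to $Q_d(n)$ gives the stated query time $O(\log^d n + k)$.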


52 minXinRectangle Queries

Problem: Given a set P of points that changes under insertions and deletions, construct a data structure to store P that can be updated in O(log n) time and that can find, in O(log n) time, the point with minimal x-coordinate among all points whose x-coordinate lies in a given range [l, r] and whose y-coordinate is at most a given threshold $y_0$.

Assumption: All points have pairwise different x-coordinates.

[Figure: the query minXinRectangle(l, r, $y_0$) as a rectangle bounded by l on the left, r on the right and $y_0$ on top]

54 Min-augmented Range Tree

Two data structures in one:

- Leaf-search tree on the x-coordinates of the points
- Min-tournament tree on the y-coordinates of the points

[Figure: min-augmented range tree over the points (2, 12), (3, 4), (4, 11), (5, 3), (8, 5), (11, 21), (14, 7), (15, 2), (17, 30), (21, 8); every internal node carries a router and a min-field]

56 minXinRectangle(l, r, $y_0$)

1. Search for the boundary values l and r.
2. Find the leftmost umbrella node with a min-field ≤ $y_0$.
3. From there, proceed to the left son of the current node if its min-field is ≤ $y_0$, and to the right son otherwise.
4. Return the point at the leaf.

minXinRectangle(l, r, $y_0$) can be found in time O(height of tree).

[Figure: the search paths for l and r below the split node]
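
The query can be sketched as follows on a static min-augmented tree. The construction from a sorted point list, the per-node fields `xlo` and `xhi` (used here in place of routers to recognise umbrella nodes), and all names are assumptions made for this example.

```python
class Node:
    def __init__(self, pts):
        """pts: non-empty list of (x, y) tuples sorted by x, all x distinct."""
        self.xlo, self.xhi = pts[0][0], pts[-1][0]
        self.left = self.right = None
        self.point = None                       # set only at leaves
        if len(pts) == 1:
            self.point = pts[0]
            self.min_y = pts[0][1]
        else:
            mid = (len(pts) + 1) // 2
            self.left, self.right = Node(pts[:mid]), Node(pts[mid:])
            # Min-tournament: the smaller y of the two children wins.
            self.min_y = min(self.left.min_y, self.right.min_y)


def min_x_in_rectangle(v, l, r, y0):
    """Leftmost point with l <= x <= r and y <= y0, or None if there is none."""
    if v.xhi < l or v.xlo > r or v.min_y > y0:
        return None                             # nothing below v can qualify
    if l <= v.xlo and v.xhi <= r:
        # Umbrella node with min-field <= y0: walk down to the leftmost hit.
        while v.left is not None:
            v = v.left if v.left.min_y <= y0 else v.right
        return v.point
    found = min_x_in_rectangle(v.left, l, r, y0)
    if found is not None:
        return found
    return min_x_in_rectangle(v.right, l, r, y0)


points = [(2, 12), (3, 4), (4, 11), (5, 3), (8, 5),
          (11, 21), (14, 7), (15, 2), (17, 30), (21, 8)]
root = Node(sorted(points))
print(min_x_in_rectangle(root, 4, 20, 5))   # prints (5, 3)
print(min_x_in_rectangle(root, 9, 14, 6))   # prints None
```

Umbrella nodes whose min-field exceeds $y_0$ are rejected in constant time, so only one downward walk is ever performed.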

57 Updates

Insert operation:

- Insert the node as in a standard binary leaf-search tree.
- Adjust the min-fields of every ancestor of the new node by playing a min tournament for each node and its sibling along the search path.

Delete operation: similar.

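58 Maintaining min-fields under Rotations

[Figure: a rotation shown before and after, with min-fields $s_1, \dots, s_5$. The fields $s_1$, $s_3$, $s_4$ and $s_5$ are unchanged; the field $s_2$ is replaced by $\min\{s_5, s_3\}$, computed from the min-fields of the node's two new children.]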
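
In code, repairing the min-fields after a rotation takes two assignments. The sketch below shows a right rotation on internal nodes of a leaf-search tree; the minimal `Node` class is an assumption for illustration and omits routers and points.

```python
class Node:
    def __init__(self, min_y, left=None, right=None):
        self.min_y, self.left, self.right = min_y, left, right


def internal(left, right):
    """Hypothetical helper: internal node whose min-field is the tournament winner."""
    return Node(min(left.min_y, right.min_y), left, right)


def rotate_right(y):
    """Rotate the edge between y and its left child x; return the new root x."""
    x = y.left
    y.left, x.right = x.right, y
    x.min_y = y.min_y                              # x now covers the same leaves as y did
    y.min_y = min(y.left.min_y, y.right.min_y)     # replay the tournament at y
    return x


a, b, c = Node(2), Node(7), Node(4)                # three leaves
root = rotate_right(internal(internal(a, b), c))
print(root.min_y, root.right.min_y)                # prints 2 4
```

Only the two rotated nodes are touched, so rebalancing does not affect the O(log n) update time.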

59 Min-augmented Range Trees: Summary

Theorem: There exists a data structure to represent a dynamic set S of n points in the plane with the following properties:

- The data structure supports updates and minXinRectangle(l, r, $y_0$) queries in O(log n) time.
- The data structure occupies O(n) space.

Note: The data structure can be based on an arbitrary scheme of balanced binary leaf-search trees.
