VBI-Tree: A Peer-to-Peer Framework for Supporting Multi-Dimensional Indexing Schemes Presenter: Quang Hieu Vu H.V.Jagadish, Beng Chin Ooi, Quang Hieu Vu,

Presentation transcript:

VBI-Tree: A Peer-to-Peer Framework for Supporting Multi-Dimensional Indexing Schemes. Authors: H. V. Jagadish, Beng Chin Ooi, Quang Hieu Vu, Rong Zhang, Aoying Zhou. Presenter: Quang Hieu Vu.

VBI-Tree (Virtual Binary Index Tree) framework

BATON: BAlanced Tree Overlay Network (VLDB'05). Definition: A tree is balanced if and only if, at every node in the tree, the heights of its two subtrees differ by at most one. [Figure: binary balanced tree index architecture.]
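The balance definition above can be checked mechanically. The following is a minimal sketch, assuming a plain in-memory binary tree; the `Node` class and the function names are hypothetical and are not part of BATON, which maintains balance in a distributed setting.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    # Hypothetical in-memory node, used only to illustrate the definition.
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def check(node: Optional[Node]) -> Tuple[int, bool]:
    """Return (height, balanced) for the subtree rooted at node."""
    if node is None:
        return 0, True  # an empty subtree has height 0 in this sketch
    left_height, left_ok = check(node.left)
    right_height, right_ok = check(node.right)
    balanced = left_ok and right_ok and abs(left_height - right_height) <= 1
    return 1 + max(left_height, right_height), balanced


def is_balanced(root: Optional[Node]) -> bool:
    return check(root)[1]


balanced_tree = Node(Node(), Node(Node()))  # subtree heights 1 and 2
chain = Node(Node(Node()))                  # subtree heights 2 and 0
```

Here `is_balanced(balanced_tree)` holds, while the three-node chain violates the definition at its root.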

Properties. Property 1: A tree is balanced if every node in the tree that has a child also has both its left and right routing tables full. A routing table is full if none of its valid links is NULL. Property 2: If a node x contains a link to another node y in its left or right routing table, the parent of x must also contain a link to the parent of y, unless the same node is the parent of both x and y.

VBI-Tree structure. [Figure: VBI-Tree structure.]

VBI-Tree structure. Differences from BATON: There are two kinds of nodes: routing nodes (internal nodes) and data nodes (leaf nodes). Each peer is in charge of a pair of nodes: a routing node and a data node. Each node has an additional upside path, which keeps information about the regions covered by the node's ancestors. Theorem: In an in-order traversal of the VBI-Tree, data nodes and routing nodes alternate.
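The alternation theorem can be illustrated on a small tree. This is a sketch under the assumption that every routing node has exactly two children and every data node is a leaf; the `VNode` class and the node names are made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VNode:
    # Hypothetical node: a routing node has two children, a data node has none.
    name: str
    left: Optional["VNode"] = None
    right: Optional["VNode"] = None

    @property
    def is_data(self) -> bool:
        return self.left is None and self.right is None


def in_order_kinds(node: Optional[VNode]) -> List[str]:
    """List the kind of each node ("data" or "routing") in in-order sequence."""
    if node is None:
        return []
    kind = "data" if node.is_data else "routing"
    return in_order_kinds(node.left) + [kind] + in_order_kinds(node.right)


def alternates(kinds: List[str]) -> bool:
    """True if no two consecutive entries are of the same kind."""
    return all(a != b for a, b in zip(kinds, kinds[1:]))


# Routing nodes a and b; data nodes b1, b2, a2 (names are illustrative).
tree = VNode("a", VNode("b", VNode("b1"), VNode("b2")), VNode("a2"))
kinds = in_order_kinds(tree)
```

For this tree the traversal visits b1, b, b2, a, a2, so the kinds alternate starting and ending with a data node.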

Node join. Two phases. First phase: determine where the new node should join. This is similar to BATON's join algorithm, except that only routing nodes are considered. Second phase: the node accepting the new node splits its corresponding data node into two parts. It keeps one part, and the new node takes the other.
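The second phase can be sketched as a split of the accepting node's data. The slides do not say how the data node is divided; in practice that depends on the index built on top of the framework. The version below assumes, purely for illustration, a median split along the dimension with the widest spread, and `split_data_node` is a hypothetical name.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def split_data_node(points: List[Point]) -> Tuple[List[Point], List[Point]]:
    """Split a data node's points into (kept, handed_over).

    Assumption: median split along the widest dimension. The real split
    policy is decided by the underlying index scheme, not by this sketch.
    """
    if len(points) < 2:
        return list(points), []
    dims = range(len(points[0]))
    # Pick the dimension in which the points are spread the most.
    axis = max(dims, key=lambda d: max(p[d] for p in points) - min(p[d] for p in points))
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return ordered[:mid], ordered[mid:]


kept, handed_over = split_data_node([(0, 0), (1, 5), (2, 1), (9, 2)])
```

The accepting node keeps the first half and the joining node receives the second half as its own data node.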

Example: new node u joins the network. [Figure: tree diagram of routing nodes and their data nodes; not reproduced in the transcript.]

Node departure. Similar to BATON, except that only routing nodes are considered. Only a leaf routing node whose neighboring routing nodes have no routing nodes as children can leave the network directly. Upon leaving, the child data nodes of the departed routing node are merged, and the new data node is pulled up to replace the departed routing node. Any other routing node has to find a replacement, which must be a routing node of the first kind.

Example: leaf routing node r leaves the network. [Figure: tree diagram of routing nodes and their data nodes; not reproduced in the transcript.]

Index construction. Each internal node manages a region covering all regions managed by its children. Data is stored only at leaf nodes. Discrete data helps to avoid updating upside paths frequently. [Figure: two-dimensional index construction.]
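The covering rule for internal nodes can be sketched with axis-aligned rectangles. Rectangles are an assumption made for this example; the framework is meant to host different index schemes, and an M-tree, for instance, uses a different region shape. The names `cover` and `contains` are hypothetical.

```python
from typing import Tuple

# Assumed region type: (x_min, y_min, x_max, y_max).
Rect = Tuple[float, float, float, float]


def cover(a: Rect, b: Rect) -> Rect:
    """Smallest rectangle that covers both child regions."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def contains(outer: Rect, inner: Rect) -> bool:
    """True if outer fully covers inner."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


left_child: Rect = (0, 0, 2, 2)
right_child: Rect = (1, 1, 5, 3)
parent_region = cover(left_child, right_child)
```

The parent region here is `(0, 0, 5, 3)`, which contains both child regions as the slide requires.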

Range query search.
If the node's region covers or intersects the searched region:
    the query is processed at the node and/or its children.
If any ancestor of the node has a region that intersects the searched region:
    if the node on the other side has not been searched before, the query is forwarded to that node;
    otherwise, the query is forwarded upward to the ancestor.
Note: ancestors whose children cover the whole searched region do not need to be searched.
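The search rule above can be approximated by a centralized simulation. This is only a sketch: the real algorithm runs across peers and uses upside paths rather than parent pointers, and the rectangle regions, the `Node` class, and the function names are assumptions. The early exit interprets the note about covering children and assumes that sibling regions do not overlap.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # assumed: (x_min, y_min, x_max, y_max)


def intersects(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def covers(outer: Rect, inner: Rect) -> bool:
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


@dataclass(eq=False)
class Node:
    name: str
    region: Rect
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self) -> None:
        # Parent pointers stand in for the upside path in this sketch.
        for child in (self.left, self.right):
            if child is not None:
                child.parent = self


def search_down(node: Optional[Node], query: Rect, hits: List[str]) -> None:
    """Collect the names of data (leaf) nodes under node that intersect query."""
    if node is None or not intersects(node.region, query):
        return
    if node.left is None and node.right is None:
        hits.append(node.name)
        return
    search_down(node.left, query, hits)
    search_down(node.right, query, hits)


def range_search(start: Node, query: Rect) -> List[str]:
    hits: List[str] = []
    search_down(start, query, hits)
    child, ancestor = start, start.parent
    while ancestor is not None:
        if covers(child.region, query):
            break  # the side already searched covers the whole query
        if intersects(ancestor.region, query):
            other = ancestor.right if ancestor.left is child else ancestor.left
            search_down(other, query, hits)  # forward to the other side
        child, ancestor = ancestor, ancestor.parent  # then move upward
    return hits


A = Node("A", (0, 0, 1, 1))
B = Node("B", (2, 0, 3, 1))
C = Node("C", (4, 0, 5, 1))
D = Node("D", (6, 0, 7, 1))
root = Node("root", (0, 0, 7, 1), Node("r1", (0, 0, 3, 1), A, B), Node("r2", (4, 0, 7, 1), C, D))
```

Starting at data node A, a query for `(2.5, 0, 4.5, 1)` climbs through both ancestors and reaches B and C, while a query that lies entirely inside A's region stops at A without visiting any ancestor.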

Example: node h wants to search the shaded region. [Figure: tree diagram with the searched region shaded; not reproduced in the transcript.]

Load balancing. Similar to the rotations of an AVL tree. [Figures: LL rotation, LR rotation.]

Load balancing (continued). [Figures: RR rotation, RL rotation.]
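For reference, the four AVL rotation cases named on these slides can be sketched as follows. This is the textbook AVL form on an in-memory tree. The slides only say that VBI-Tree load balancing is similar, so the additional work a peer-to-peer version would need (moving data nodes, updating routing tables and upside paths) is not shown, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    key: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def rotate_right(y: Node) -> Node:
    """LL case: the left child becomes the new subtree root."""
    x = y.left
    y.left = x.right
    x.right = y
    return x


def rotate_left(x: Node) -> Node:
    """RR case: the right child becomes the new subtree root."""
    y = x.right
    x.right = y.left
    y.left = x
    return y


def rotate_left_right(z: Node) -> Node:
    """LR case: rotate the left child left, then rotate z right."""
    z.left = rotate_left(z.left)
    return rotate_right(z)


def rotate_right_left(z: Node) -> Node:
    """RL case: rotate the right child right, then rotate z left."""
    z.right = rotate_right(z.right)
    return rotate_left(z)


def in_order(node: Optional[Node]) -> List[str]:
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)


ll_root = rotate_right(Node("c", Node("b", Node("a"))))
lr_root = rotate_left_right(Node("c", Node("a", None, Node("b"))))
rl_root = rotate_right_left(Node("a", None, Node("c", Node("b"))))
```

In all three examples the new subtree root is `b` and the in-order sequence `a, b, c` is preserved, which is the property that lets a rotation rebalance a tree without reordering its contents.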

Experimental study. Experimental setup: An M-tree is implemented over the VBI-Tree framework. Data objects are inserted into a network of nodes. Exact queries, 1000 range queries, and 1000 kNN queries are executed. CAN is used for comparison.

Performance of point queries. [Figures: average and maximum hops in different dimensions; average hops in different network sizes.]

Performance of range and kNN queries. [Figures: average hops to find range query results; average hops to find kNN query results.]

Cost of updating upside paths vs. cost of search. [Figures: average number of messages for updating upside paths; average number of hops for search queries.]

Workload distribution. [Figure: workload distribution among nodes.]

Effect of load balancing. [Figures: average additional number of messages required in case of skewed data distribution; size of load balancing process.]

Access load. [Figure: access load for nodes at different levels.]

Conclusion. VBI-Tree: A framework capable of supporting a variety of well-tested multi-dimensional indexing methods, such as the R-Tree, X-Tree, SSTree, and M-Tree, in a P2P system. Introduction of discrete data as a means to minimize update costs, together with novel P2P search algorithms that account properly for such discrete data. An AVL-tree-like rotation scheme for rebalancing the virtual binary tree when needed, leading to effective load balance even with highly skewed data.

Thank you! Questions & Answers