P2PR-tree: An R-tree-based Spatial Index for P2P Environments
ANIRBAN MONDAL, YI LIFU, MASARU KITSUREGAWA
University of Tokyo


PRESENTATION OUTLINE
- Motivating Spatial Applications on P2P systems
- Existing Spatial Indexes
- Our proposal: The P2PR-tree
- Performance Analysis
- Conclusion and Future Work

Spatial Applications on P2P systems
- Spatial data occurs in several important and diverse applications:
  - Geographic Information Systems (GIS)
  - Computer-aided design (CAD)
  - Resource management
  - Development planning, emergency planning and scientific research
- Unprecedented growth of available spatial data at geographically distributed locations.
- Trend of increased globalization.
- Popularity of P2P data sharing.
- Efficient global sharing of distributively owned spatial data in P2P systems.

Application example: searching for real estate information in Tokyo
[Figure: a map with a query MBR and the query results.]
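
The window query in this example can be illustrated with a minimal sketch. It assumes that each listing is summarized by an axis-aligned MBR and that a result is any listing whose MBR intersects the query MBR. The class name `MBR`, the function `window_query` and the sample coordinates are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MBR:
    """Axis-aligned minimum bounding rectangle (hypothetical helper type)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def intersects(self, other: "MBR") -> bool:
        # Two rectangles overlap when they overlap on both axes.
        return (self.xmin <= other.xmax and other.xmin <= self.xmax
                and self.ymin <= other.ymax and other.ymin <= self.ymax)


def window_query(query: MBR, objects: Dict[str, MBR]) -> List[str]:
    """Return the names of all objects whose MBR intersects the query MBR."""
    return [name for name, mbr in objects.items() if mbr.intersects(query)]


# Made-up listings, for illustration only.
listings = {
    "apartment_a": MBR(1, 1, 2, 2),
    "house_b": MBR(5, 5, 6, 6),
    "office_c": MBR(1.5, 1.5, 4, 4),
}
results = window_query(MBR(0, 0, 3, 3), listings)
print(results)
```

A linear scan like this is what a spatial index is meant to avoid; the R-tree family prunes whole subtrees whose covering MBR misses the query.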

Existing Spatial Indexes
- Centralized spatial indexes: R-tree, R*-tree, R+-tree
- Distributed spatial indexes: M-Rtree, MC-Rtree

MC-Rtree
- Master client: an R-tree which indexes the covering MBRs of the data stored at the clients.
- Each client has its own R-tree for managing its own data.
- Centralization.
- Designed for clusters.
- Optimizes disk I/Os.

Why can’t we use existing R-tree-based approaches?
- They use centralized mechanisms → not scalable.
  - All updates must pass through the Master Node.
  - All searches need to be routed by the Master Node.
  - → Performance bottleneck at the Master Node.
- They do not optimize communication time.

GRID-Related Projects
- GRID Physics Network and European DataGrid: improve scientific research that requires efficient distributed handling of data in the petabyte range.
- Earth Systems GRID (ESG): aims at facilitating detailed analysis of huge amounts of climate data by a geographically distributed community via high-bandwidth networks.
- NASA Information Power GRID (IPG): improves existing systems in NASA for solving complex scientific problems efficiently.

How does our proposal differ from GRID-related spatial works?
GRID
- Restricts data sharing to scientific and research organizations.
- Individual nodes are usually dedicated and expected to be available most of the time.
- Some amount of centralized control is possible through collaboration between organizations.
Our proposal
- Allows normal users to share/upload data.
- Individual nodes may join/leave at any time.
- Peers are distributively owned, hence centralized control is practically challenging.

Existing Search mechanisms in P2P systems
- Broadcast (Gnutella)
- Centralized (Napster)
- Routing indices (RIs)
- Distributed hash tables (Chord, CAN, Tapestry)
Existing works on P2P systems mostly address file-sharing.

P2PR-tree (Peer-to-Peer R-tree)
- A distributed R-tree-based indexing scheme designed for P2P systems.
- Parts of the distributed indexes are built autonomously by each peer.
- Hierarchical, and performs efficient pruning.
- Completely decentralized.
- Highly scalable.

Dividing the Universe
[Figure: the universe divided into Block 1, Block 2, Block 3 and Block 4, with peers (P1, P2, P4, P5, P6, ...) placed inside the blocks; alongside, the P2PR-tree with Levels 0 to 3 over units B1 to B4, G1 to G4, SG1, SG2 and peers P1, P2, P3, P4, P5, P6, P20.]

Definitions
- Unit: A Block, Group, Subgroup at any level, or a peer.
- UnitMBR: Minimum Bounding Rectangle of a Unit.
- Router: In order to route messages to a Unit X, a peer A needs to know at least one peer (say peer B) which belongs to Unit X. We define peer B as Peer A’s Router to Unit X.
- UnitRouterInfo: The addresses of routers to a Unit.
- UnitInfo: UnitMBR and UnitRouterInfo of a Unit.
- ChildInfo (Level i): UnitInfo of Child Units at Level i+1 in the P2PR-tree.
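
The definitions above can be sketched as plain records. This is only an illustration under assumptions: the field names, the `PeerState` container and its per-level table are hypothetical, and the MBR coordinates and router addresses are made up. The slides do not preserve the exact per-peer structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# (xmin, ymin, xmax, ymax); this representation is an assumption.
Rect = Tuple[float, float, float, float]


@dataclass
class UnitInfo:
    """UnitInfo = UnitMBR + UnitRouterInfo, following the definitions."""
    unit_mbr: Rect
    unit_router_info: List[str] = field(default_factory=list)  # router addresses


@dataclass
class PeerState:
    """Hypothetical per-peer state: UnitInfo of known Units, grouped by level."""
    name: str
    level: int
    info_by_level: Dict[int, Dict[str, UnitInfo]] = field(default_factory=dict)

    def routers_to(self, level: int, unit: str) -> List[str]:
        """Addresses of the peers this peer can use to reach `unit`."""
        return self.info_by_level[level][unit].unit_router_info


# A Level-2 peer that knows Blocks (Level 0), Groups (Level 1) and
# the peers of its own Group (Level 2). All values are illustrative.
p60 = PeerState(
    name="P60",
    level=2,
    info_by_level={
        0: {"B1": UnitInfo((0, 0, 50, 50), ["P5"]),
            "B2": UnitInfo((50, 0, 100, 50), ["P25"])},
        1: {"G1": UnitInfo((50, 50, 75, 75), ["P41"])},
        2: {"P45": UnitInfo((80, 80, 85, 85), ["P45"])},
    },
)
print(p60.routers_to(0, "B2"))
```

Keeping the information grouped by level mirrors the idea that a peer only needs coarse knowledge of distant Units and finer knowledge of the Units it belongs to.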

Data Structure at a peer
A peer of Level L can be specified as [expression not preserved in the transcript]; it maintains the following information [list not preserved in the transcript].

Example of Data Structure
[Figure: P2PR-tree with Level 0 to Level 3 Units (B1 to B4, G1 to G4, SG1, SG2 and peers). P2 can be specified as Peer( ), with the arguments not preserved in the transcript; the remaining labels are G1 to G4, SG1, SG2, P11, P12, P21, P33, P66.]

Maintaining information in the P2PR-tree
[Figure: Block 1 with groups G1 to G4 and peers P1 to P6, P8, P9, P10, beside the P2PR-tree with Levels 0 to 2.]
- BlockMBR information is stored at every peer.
- Maintaining information: Peer Level = 2, (B1,B2,B3,B4) (G1,G2,G3,G4) (P6,P3)

Peer Join operation in P2PR-tree (1)
[Figure: Block 1 with groups G1 to G4, subgroups SG1 and SG2, and peers including P20 and P30, beside the P2PR-tree with Levels 0 to 2.]
- BlockMBR information is stored at every peer.
- Maintaining information: Peer Level = 2, (B1,B2,B3,B4) (G1,G2,G3,G4) (P2,P3,P4)

Peer Join operation in P2PR-tree (2)
[Figure: the same Block 1, beside the P2PR-tree with Levels 0 to 3, now including SG1, SG2, P20 and P30.]
- BlockMBR information is stored at every peer.
- Maintaining information: Peer Level = 3, (B1,B2,B3,B4) (G1,G2,G3,G4), (SG1,SG2), (P2,P20)

Routing Issues
- Assumption: A peer initially knows at least N routers for a Unit.
- Piggybacking to refresh routers for each peer.
- During piggybacking, a peer sends the addresses and reliability information of other peers in its own Unit.
- Each peer maintains the most reliable R routers for Units based on reliability.
- What if all routers that a peer knows in a specific Unit are unavailable?
  - Peer contacts the peers in other blocks to find out new routers for that block.
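
The router-refresh rule above can be sketched as a merge followed by a top-R selection. This is a sketch under assumptions: the slides do not say how reliability is measured, so the numeric scores here are hypothetical, as is the function name `refresh_routers` and the choice that a piggybacked report overwrites an older score for the same address.

```python
import heapq
from typing import Dict


def refresh_routers(known: Dict[str, float],
                    piggybacked: Dict[str, float],
                    r: int) -> Dict[str, float]:
    """Merge piggybacked router reports and keep the R most reliable routers.

    `known` and `piggybacked` map router address -> reliability score
    (higher is better; the scale is an assumption).
    """
    merged = dict(known)
    merged.update(piggybacked)  # assumed: the newer report wins
    best = heapq.nlargest(r, merged.items(), key=lambda item: item[1])
    return dict(best)


# A peer knows two routers to G4 and learns of two more through piggybacking.
known_g4 = {"P9": 0.9, "P15": 0.4}
reported_g4 = {"P10": 0.8, "P12": 0.7}
routers_g4 = refresh_routers(known_g4, reported_g4, r=3)
print(routers_g4)
```

With R = 3 the least reliable address (here P15) is dropped, so the peer's router list stays bounded while still tracking the best candidates it has heard about.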

Example of refreshing routers
[Figure: Block 1 with groups G1 to G4; peers P11, P12 and P15 exchange router lists for G4 (P9,P15→G4 and P10,P12→G4). First frame: N=2, R=4. Second frame: N=2, R=3, with the lists P9,P15→G4, P10,P12→G4, P10→G4 and P9→G4.]

Searching the P2PR-tree
[Figures: Block 1 (groups G1 to G4, subgroups SG1 and SG2, peers P1 to P6, P8, P9, P10, P20, P30) and Block 4 (groups G1 to G4, peers P40 to P46, P48, P49, P60), beside the P2PR-tree with Levels 0 to 3. BlockMBR information is stored at every peer. The query comes to P60.]
- Query Level = 0. Highlighted: B1. Maintaining Information: Peer Level = 2 (P5→B1, P25→B2, P35→B3, B4) (P41→G1, G2, P43→G3, P49→G4) (P45, P46)
- Query Level = 1. Highlighted: G1, P4. Maintaining Information: Peer Level = 2 (B1, P26→B2, P36→B3, P42→B4) (P4→G1, G2, P8→G3, P9→G4) (P6, P30)
- Query Level = 2. Highlighted: SG1, P20. Maintaining Information: Peer Level = 3 (B1, P27→B2, P37→B3, P43→B4) (G1, P6→G2, P8→G3, P10→G4) (P20→SG1, SG2) (P3)
- Query Level = 3. Highlighted: P1, P2. Maintaining Information: Peer Level = 3 (B1, P28→B2, P38→B3, P45→B4) (G1, P30→G2, P8→G3, P9→G4) (SG1, P3→SG2) (P1, P2)
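
The level-by-level search shown in these slides can be approximated by a single-process sketch. It assumes that at each level a query is forwarded only to the child Units whose UnitMBR intersects the query rectangle. In the real system each forwarding step would be a message sent to a Router of that Unit; here the hierarchy is simply walked in memory. The `Unit` class, the `search` function and all coordinates are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), assumed


def intersects(a: Rect, b: Rect) -> bool:
    """True when the two rectangles overlap on both axes."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


@dataclass
class Unit:
    """A Block, Group, Subgroup or peer; a Unit without children is a peer."""
    name: str
    mbr: Rect
    children: List["Unit"] = field(default_factory=list)


def search(unit: Unit, query: Rect, path: Tuple[str, ...] = ()):
    """Return (peer name, route taken) for every peer whose MBR hits the query."""
    if not intersects(unit.mbr, query):
        return []  # prune this whole subtree
    route = path + (unit.name,)
    if not unit.children:
        return [(unit.name, route)]
    hits = []
    for child in unit.children:
        hits.extend(search(child, query, route))
    return hits


# Illustrative hierarchy loosely following the slide labels.
sg1 = Unit("SG1", (0, 0, 12, 25), [Unit("P1", (1, 1, 4, 4)),
                                   Unit("P2", (3, 3, 8, 8)),
                                   Unit("P20", (9, 15, 12, 25))])
sg2 = Unit("SG2", (13, 0, 25, 25), [Unit("P3", (14, 2, 18, 6)),
                                    Unit("P4", (20, 10, 25, 20))])
g1 = Unit("G1", (0, 0, 25, 25), [sg1, sg2])
g2 = Unit("G2", (25, 0, 50, 25), [Unit("P5", (26, 1, 30, 5)),
                                  Unit("P6", (40, 10, 48, 20))])
b1 = Unit("B1", (0, 0, 50, 50), [g1, g2])
b4 = Unit("B4", (50, 50, 100, 100), [Unit("P60", (60, 60, 70, 70))])
universe = Unit("Universe", (0, 0, 100, 100), [b1, b4])

hits = search(universe, (2, 2, 5, 5))
for peer, route in hits:
    print(peer, " <- ", " / ".join(route))
```

In this sample the query rectangle falls inside B1, G1 and SG1 only, so B4, G2 and SG2 are pruned without visiting any of their peers, which is the pruning behaviour the hierarchy is meant to provide.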

Performance Evaluation
- Investigates the following:
  - Effect of variations in workload skew
- Performance metric: Average Response Time
- Comparison with Centralized MC-Rtree
- 1000 data-providing peers

Effect of variations in workload skew when the query interarrival rate was fixed at 20 queries/second

Effect of variations in workload skew when the query interarrival rate was fixed at 100 queries/second

Conclusion
- Investigation of the problem of spatial indexing in P2P environments.
- Proposal of the P2PR-tree (Peer-to-Peer R-tree):
  - Scalable decentralized P2P data structure
  - Efficient routing scheme

Future Scope of Work
- Detailed simulation
- Replication
- Availability
- Load-balancing