Spatial Data Query Support in Peer-to-Peer Systems. Roger Zimmermann, Wei-Shinn Ku, and Haojun Wang. COMPSAC 2004, September 30.

Presentation transcript:

Spatial Data Query Support in Peer-to-Peer Systems
Roger Zimmermann, Wei-Shinn Ku, and Haojun Wang
Computer Science Department, University of Southern California, Los Angeles, CA
COMPSAC 2004, September 30

Outline
– Motivation
– Introduction to DHTs (CAN)
– Technical Approach
– Results
– Conclusions and Future Research

Motivation
– Spatial data sets are used in many applications, e.g., GIS, CAD, …
– P2P systems provide a distributed platform that is very scalable
– Pros: scalability, no central point of failure
– Cons: very dynamic (unreliable), topology maintenance required

Motivation (cont.)
– Question: how can P2P systems be used for spatial data sharing?
– Query challenges:
  – Unstructured P2P systems: querying by flooding is not efficient
  – Structured P2P systems based on DHTs (Chord, CAN): only exact-match queries are supported efficiently
    E.g., searching for files by name or title: put(key, value); get(key) returns value

Motivation (cont.)
– Spatial queries are usually range queries
  – Intersect, overlap
  – Nearest neighbor(s) (kNN)
– DHTs are not suitable without modification

Distributed Hash Tables (DHT)
– DHT systems: Content Addressable Network (CAN), Chord, Pastry, etc.
– A DHT allocates large data sets to many nodes with no central control
– Data objects are distributed nearly uniformly through a hash function, resulting in good scalability and load balance
– Each node maintains only a small routing table that lists its neighbors
– Locating a particular data object requires O(log N) search steps on average

Content Addressable Network (CAN)
– A scalable indexing mechanism in a P2P network
– Creates a logical d-dimensional Cartesian coordinate space
– Divides the space into zones, where each zone is controlled by a node in the system
– Zones are dynamically partitioned or merged as nodes join and leave
– Each zone is addressed with a Virtual Identifier (VID), which is calculated deterministically from the location of the zone

Content Addressable Network (CAN)
Example: a 2-D space partitioned into 7 CAN zones

Content Addressable Network (cont.)
Node operations (e.g., insertion)
1. Find a bootstrap node first

Content Addressable Network (cont.)
2. Randomly choose a point in the CAN plane and route the new node from the bootstrap node to the chosen location

Content Addressable Network (cont.)
3. The new node arrives at the destination zone covering that point. The destination zone is split into two zones, each controlled by one node (old and new)

Content Addressable Network (cont.)
4. Update the routing information of the neighboring zones
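The join steps above can be sketched in a few lines of Python. This is a toy, single-process model under stated assumptions, not the CAN protocol itself: the `Zone` class, the `join` helper, and the rule of halving a zone along its longest side are illustrative choices, a linear scan stands in for routing from the bootstrap node, and the neighbor-table update of step 4 is not modeled.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    lo: tuple   # lower corner (inclusive)
    hi: tuple   # upper corner (exclusive)
    owner: str  # node that controls this zone


def contains(zone, point):
    return all(l <= x < h for l, x, h in zip(zone.lo, point, zone.hi))


def split(zone, new_owner):
    """Halve a zone along its longest side (assumed rule); the old owner keeps the lower half."""
    sides = [h - l for l, h in zip(zone.lo, zone.hi)]
    axis = sides.index(max(sides))
    mid = (zone.lo[axis] + zone.hi[axis]) / 2
    old_hi = tuple(mid if i == axis else h for i, h in enumerate(zone.hi))
    new_lo = tuple(mid if i == axis else l for i, l in enumerate(zone.lo))
    return Zone(zone.lo, old_hi, zone.owner), Zone(new_lo, zone.hi, new_owner)


def join(zones, new_owner, rng):
    """Steps 2-3: pick a random point, find the zone covering it, and split that zone."""
    point = (rng.random(), rng.random())
    idx = next(i for i, z in enumerate(zones) if contains(z, point))  # stands in for routing
    old_half, new_half = split(zones[idx], new_owner)
    return zones[:idx] + [old_half, new_half] + zones[idx + 1:]


rng = random.Random(0)
zones = [Zone((0.0, 0.0), (1.0, 1.0), "n0")]  # the first node owns the whole space
for i in range(1, 7):
    zones = join(zones, f"n{i}", rng)
print(len(zones))  # 7 zones, one per node
```

Every join splits exactly one zone, so the zones always tile the whole space and each node owns exactly one of them.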

Content Addressable Network (CAN)
Data object operations (e.g., insertion)
1. Generate a key based on the object identification and insert the data object as a (key, value) pair
2. Map the key to a point P in the CAN plane by using a uniform hash function
3. Store the pair at the node that owns the zone in which the point P is located
4. To retrieve the value, apply the same hash function to the key to regenerate the point P and find the zone that owns that point; the node that owns the zone returns the value to the client
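The insert and retrieve steps above can be illustrated with a small Python sketch. It is a toy model under assumptions: `key_to_point`, `Zone`, and `ToyCAN` are hypothetical names, SHA-256 stands in for the uniform hash function, and a linear scan over zones replaces hop-by-hop routing.

```python
import hashlib


def key_to_point(key, dims=2):
    """Hash a key to a point in [0, 1)^dims, using 4 digest bytes per coordinate."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return tuple(
        int.from_bytes(digest[4 * i:4 * i + 4], "big") / 2**32 for i in range(dims)
    )


class Zone:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.store = {}  # (key, value) pairs held by the node that owns this zone

    def contains(self, point):
        return all(l <= x < h for l, x, h in zip(self.lo, point, self.hi))


class ToyCAN:
    def __init__(self, zones):
        self.zones = zones

    def owner(self, point):
        # Linear scan stands in for CAN's hop-by-hop routing between neighbors.
        for zone in self.zones:
            if zone.contains(point):
                return zone
        raise LookupError("no zone covers this point")

    def put(self, key, value):
        self.owner(key_to_point(key)).store[key] = value

    def get(self, key):
        # The same hash regenerates the same point, so the same zone is found.
        return self.owner(key_to_point(key)).store[key]


# Four equal quadrants of the unit square.
quadrants = [
    Zone((0.0, 0.0), (0.5, 0.5)), Zone((0.5, 0.0), (1.0, 0.5)),
    Zone((0.0, 0.5), (0.5, 1.0)), Zone((0.5, 0.5), (1.0, 1.0)),
]
can = ToyCAN(quadrants)
can.put("map-tile-17", "payload")
print(can.get("map-tile-17"))  # payload
```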

Storing Spatial Data w/ DHTs
– The hash function distributes data objects evenly within the space to achieve a balanced load
– Spatial locality information needs to be preserved for range queries. Applying a hash function to spatial data destroys locality
– Related work explored storing an R-tree or quadtree based index on a DHT
  – Harwood et al. Hashing Spatial Content over Peer-to-Peer Networks
  – Mondal et al. P2PR-tree: An R-tree-based Spatial Index for Peer-to-Peer Environments
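A short Python sketch can make the locality problem concrete. It assumes SHA-256 as a stand-in for the uniform hash function; the helper name `hash_to_unit` and the two coordinate strings are made up for illustration.

```python
import hashlib


def hash_to_unit(text):
    """Map a string to the unit interval with SHA-256 (stand-in for a uniform hash)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


# Two made-up coordinates that differ only in the last digit of the latitude.
a = "34.0522,-118.2437"
b = "34.0523,-118.2437"
print(hash_to_unit(a), hash_to_unit(b))
```

A uniform hash gives no guarantee that nearby inputs land near each other, so a range query over the original coordinates cannot be answered by visiting one contiguous part of the hashed space.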

Spatial Range Query Design for P2P Systems
Mapping a physical space to a CAN space
– Propose a new hash function that maps spatial data objects onto nodes in a modified CAN system
– Purpose: allow efficient spatial query execution while also considering load balance
– Calculating the location of zones in the logical space
– Virtual Identifier (VID) tree for the mapping

Spatial Range Query Design for P2P Systems
Approach:
– The object key is generated from three components:
  (a) Scatter region address: based on the location of the object; preserves spatial locality
  (b) Zone address: randomized; achieves load balance
  (c) Object identifier (hashed)
– The scatter region size is fixed and predetermined

Spatial Range Query Design for P2P Systems (cont.)
– The value of the zone bit string is chosen randomly, and the object identifier is the hash of the data content
– The VID tree is created with its height determined by the scatter region size
– The maximum number of zones is 2^(a+b)
– The trade-off between data locality and load balance can be set along a spectrum

Spatial Range Query Design for P2P Systems (cont.)
Example: scatter region 11000, with a = 5 bits

Spatial Range Query Design for P2P Systems (cont.)
Example: zones, with b = 4 bits
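The three-part key can be sketched in Python as follows. The slides do not spell out how the scatter region address is derived from an object's location, so the alternating halving of the x and y ranges used here is an assumption, as are the function names, the unit-square coordinates, and the use of SHA-1 for the object identifier. The defaults a = 5 and b = 4 follow the examples above.

```python
import hashlib
import random


def scatter_region_bits(x, y, a):
    """Locality-preserving prefix: alternately halve the x and y ranges (assumed scheme)."""
    lo = [0.0, 0.0]
    hi = [1.0, 1.0]
    point = (x, y)
    bits = []
    for i in range(a):
        axis = i % 2  # even bits refine x, odd bits refine y
        mid = (lo[axis] + hi[axis]) / 2
        if point[axis] >= mid:
            bits.append("1")
            lo[axis] = mid
        else:
            bits.append("0")
            hi[axis] = mid
    return "".join(bits)


def make_key(x, y, content, a=5, b=4, rng=random):
    region = scatter_region_bits(x, y, a)                         # (a) spatial locality
    zone = format(rng.getrandbits(b), f"0{b}b") if b > 0 else ""  # (b) random, for load balance
    object_id = hashlib.sha1(content).hexdigest()                 # (c) hash of the data content
    return region, zone, object_id


k1 = make_key(0.80, 0.30, b"object one")
k2 = make_key(0.81, 0.31, b"object two")
print(k1[0], k2[0])  # 10110 10110: nearby objects share a scatter region
print(2 ** (5 + 4))  # 512: maximum number of zones for a=5, b=4
```

Increasing a makes scatter regions smaller, which favors locality; increasing b spreads each region's objects over more zones, which favors load balance.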

Spatial Range Query Design for P2P Systems (cont.)
System Operation and Spatial Range Query
– Node operation
  – Bootstrap mechanism
  – Node join mechanism
  – Zone split and the search threshold
    – Balance the number of data objects in each zone
    – The zone being selected must be larger than the minimum zone size, 1/2^(a+b)
    – The threshold is the upper bound on the number of search hops to find a zone to split
– Data object insertion
– Data object deletion
– Spatial range query
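One possible reading of the zone split rule is sketched below in Python. The slides give only the constraints (minimum zone size and a hop threshold), so the greedy walk toward the most loaded neighbor, the `Zone` fields, and the function name `pick_zone_to_split` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    name: str
    size: float   # fraction of the whole space covered by this zone
    objects: int  # number of data objects currently stored
    neighbors: list = field(default_factory=list)


def pick_zone_to_split(start, a, b, threshold):
    """Walk at most `threshold` hops and return the most loaded zone that may still be split."""
    min_size = 1 / 2 ** (a + b)  # zones of this size cannot be split further
    best = start if start.size > min_size else None
    current = start
    visited = {start.name}
    for _ in range(threshold):
        candidates = [n for n in current.neighbors if n.name not in visited]
        if not candidates:
            break
        current = max(candidates, key=lambda z: z.objects)  # assumed greedy step
        visited.add(current.name)
        if current.size > min_size and (best is None or current.objects > best.objects):
            best = current
    return best


# A chain of three zones: A - B - C, with a=5 and b=4 (minimum size 1/512).
za = Zone("A", 0.5, 10)
zb = Zone("B", 0.25, 50)
zc = Zone("C", 1 / 512, 100)
za.neighbors = [zb]
zb.neighbors = [za, zc]
zc.neighbors = [zb]
print(pick_zone_to_split(za, a=5, b=4, threshold=2).name)  # B
```

Zone C holds the most objects but is already at the minimum size, so the search settles on B; with a threshold of 0 the joining node would simply split A.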

Spatial Range Query Design for P2P Systems (cont.)
Spatial Range Query
Step 1: The querying node launches a spatial range query.

Spatial Range Query Design for P2P Systems (cont.)
Spatial Range Query
Step 2: The node determines the overlapping scatter regions.

Spatial Range Query Design for P2P Systems (cont.)
Spatial Range Query
Step 3: The node multicasts the query to the overlapping scatter regions.

Spatial Range Query Design for P2P Systems (cont.)
Step 4:
– The range query is multicast within all overlapping scatter regions (M-CAN)
– Recall: data is randomized within each scatter region, so an exhaustive search is necessary
– Choice of scatter region size
  – Large: good load balance; uniform within a scatter region
  – Small: exhaustive search covers less area
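The four query steps can be sketched in Python. This is a simplified, single-process illustration under assumptions: scatter regions are addressed by alternately halving the x and y ranges of the unit square (the slides do not fix the scheme), a dictionary keyed by region address stands in for the zones of each region, and a loop replaces the M-CAN multicast.

```python
from itertools import product


def region_rect(bits):
    """Rectangle (x0, y0, x1, y1) of a scatter region under the assumed halving scheme."""
    lo = [0.0, 0.0]
    hi = [1.0, 1.0]
    for i, bit in enumerate(bits):
        axis = i % 2  # even bits refine x, odd bits refine y
        mid = (lo[axis] + hi[axis]) / 2
        if bit == "1":
            lo[axis] = mid
        else:
            hi[axis] = mid
    return lo[0], lo[1], hi[0], hi[1]


def overlapping_regions(query, a):
    """Step 2: all a-bit scatter regions whose rectangle intersects the query rectangle."""
    qx0, qy0, qx1, qy1 = query
    hits = []
    for combo in product("01", repeat=a):
        bits = "".join(combo)
        x0, y0, x1, y1 = region_rect(bits)
        if x0 < qx1 and qx0 < x1 and y0 < qy1 and qy0 < y1:
            hits.append(bits)
    return hits


def range_query(store, query, a):
    """Steps 3-4: visit each overlapping region and scan everything stored there."""
    qx0, qy0, qx1, qy1 = query
    results = []
    for bits in overlapping_regions(query, a):
        for x, y, obj in store.get(bits, []):  # stands in for the multicast to every zone
            if qx0 <= x <= qx1 and qy0 <= y <= qy1:
                results.append(obj)
    return results


store = {
    "00": [(0.45, 0.2, "A"), (0.1, 0.1, "B")],
    "10": [(0.55, 0.3, "C")],
    "11": [(0.9, 0.9, "D")],
}
print(overlapping_regions((0.4, 0.1, 0.6, 0.4), a=2))  # ['00', '10']
print(range_query(store, (0.4, 0.1, 0.6, 0.4), a=2))   # ['A', 'C']
```

Object B is scanned but rejected, which shows the cost of the exhaustive search: every object in an overlapping region is examined, even when it lies outside the query rectangle.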

Conclusions and Future Research Directions
– We proposed a hash function that preserves spatial locality information while providing constrained load balance
– The proposed mechanism works well with the CAN P2P architecture
– We are currently running simulations to test our approach

Thank you! Questions?