Thomas Zahn, CST. Seminar: Information Management in the Web. Query Processing Over Peer-to-Peer Data Sharing Systems (UC Santa Barbara)


Slide 2: Motivation
- Example: find all objects whose attribute values (NOT hash IDs!) are between 100 and 200.
- DHTs support range queries poorly.
- Due to hashing, semantically adjacent objects can be stored at "opposite" ends of the overlay.
- As a result, a separate lookup needs to be issued for each value in the range.

Slide 3: Overlay Object Placement
- h(15) = d1a08e
- h(16) = 3102ab
- h(17) = d46a1c
[Figure: overlay labeled with the IDs d46a1c, 3102ab, and d1a08e]
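To illustrate why consecutive attribute values end up far apart in the ID space, here is a minimal sketch. The hash function (SHA-1), the truncation to six hex digits, and the helper name `h` are assumptions made for illustration; the slides do not say which hash produced their IDs, so the printed values will differ from the ones shown above.

```python
import hashlib


def h(value: int, digits: int = 6) -> str:
    """Hash an attribute value to a short hex ID.

    SHA-1 and the 6-digit truncation are illustrative assumptions.
    """
    digest = hashlib.sha1(str(value).encode("utf-8")).hexdigest()
    return digest[:digits]


# Consecutive attribute values map to unrelated IDs, so a range scan
# over 15..17 needs one DHT lookup per value.
ids = {value: h(value) for value in (15, 16, 17)}
for value, node_id in ids.items():
    print(f"h({value}) = {node_id}")
```

Sorting the values by their IDs generally gives a different order than sorting by the values themselves, which is the placement problem the slide describes.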

Slide 4: Problem
- For each value in the range, a separate lookup would have to be issued.
- While this is theoretically possible for discrete sets (e.g. [10, 11, 12, …, 50]), it is completely impossible for continuous sets (e.g. [10.0, 50.0]).

Slide 5: General Concept (1)
- Uses a 2-dimensional CAN virtual space.
- The virtual space is partitioned into rectangular zones.
- Each zone is owned by an active node.
- Each node maintains an RT with its neighbors.
[Figure: virtual space with corners (20,20), (80,20), (20,80), (80,80); labels 4 and 7]

Slide 6: General Concept (2)
- A node stores the results of queries whose ranges are hashed to its zone.
- A range query [a, b] is hashed to the target point (a,b), which determines the target zone and the target node.
- The result of the range query is stored at the target node/zone.
- e.g. range query
[Figure: virtual space with corners (20,20), (80,20), (20,80), (80,80); labels 4 and 7]
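The mapping from a range query to its target zone can be sketched as follows. The `Zone` class, the node names, the vertical split at x = 50, and the half-open zone boundaries are all assumptions for illustration; only the corner coordinates (20,20) to (80,80) and the rule "range [a, b] maps to point (a,b)" come from the slides.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Zone:
    """Rectangular zone with lower-left (x1, y1) and upper-right (x2, y2)."""
    x1: float
    y1: float
    x2: float
    y2: float
    owner: str  # hypothetical node name


def target_point(low: float, high: float) -> Tuple[float, float]:
    """Map the range [low, high] to its target point (low, high)."""
    if low > high:
        raise ValueError("range must satisfy low <= high")
    return (low, high)


def owner_of(point: Tuple[float, float], zones: Sequence[Zone]) -> Optional[Zone]:
    """Return the zone containing the point (half-open boundaries assumed)."""
    x, y = point
    for zone in zones:
        if zone.x1 <= x < zone.x2 and zone.y1 <= y < zone.y2:
            return zone
    return None


# Hypothetical partition of the virtual space into two zones.
zones = [Zone(20, 20, 50, 80, "node4"), Zone(50, 20, 80, 80, "node7")]
print(owner_of(target_point(30, 50), zones).owner)  # zone holding (30, 50)
```

With half-open boundaries a point on the outer edge at 80 belongs to no zone; a real system would have to pick a convention for the edges.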

Slide 7: General Concept (3)
- Given two range queries r1: [a1, b1] and r2: [a2, b2]
- with two target points t1 (for r1) and t2 (for r2):
1. If a1 < a2, t1 lies to the left of t2.
2. If b1 < b2, t1 lies below t2.
3. t1 lies to the upper-left of t2 iff range r1 contains range r2.
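The third property can be checked mechanically. In this sketch a range is a `(low, high)` tuple and its target point is the same pair read as `(x, y)`; the function names are hypothetical, and "upper-left" is taken to include points directly above or directly to the left.

```python
from typing import Tuple

Range = Tuple[float, float]   # (low, high)
Point = Tuple[float, float]   # (x, y)


def contains(r1: Range, r2: Range) -> bool:
    """True if range r1 = [a1, b1] contains range r2 = [a2, b2]."""
    return r1[0] <= r2[0] and r2[1] <= r1[1]


def upper_left_of(t1: Point, t2: Point) -> bool:
    """True if t1 is at or left of t2 and at or above t2."""
    return t1[0] <= t2[0] and t1[1] >= t2[1]


# Exhaustive check on a small integer grid of valid ranges.
ranges = [(a, b) for a in range(5) for b in range(a, 5)]
equivalent = all(
    contains(r1, r2) == upper_left_of(r1, r2)
    for r1 in ranges
    for r2 in ranges
)
print(equivalent)
```

Because the target point of [a, b] is simply (a,b), a smaller lower bound moves the point left and a larger upper bound moves it up, which is why containment and the upper-left relation coincide.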

Slide 8: General Concept (4)
- Range query [x, y] is hashed into zone A.
- If any prior range query result containing [x, y] exists, it must have been hashed to a point in the shaded region.
- Any zone intersecting the shaded region can potentially contain such a result.
[Figure: zones A, B, C, D; target point (x,y) in zone A; shaded region]

Slide 9: General Concept (5)
- Two target points t1 (for r1) and t2 (for r2).
- t1 lies to the upper-left of t2 iff range r1 contains range r2.
- Diagonal zone: given zone z with corners (x1,y1),(x2,y2) and zone z' with corners (a1,b1),(a2,b2), z' is a diagonal zone of z if a2 ≤ x1 and b1 ≥ y2.
- Intuitively: z' is diagonally above the upper-left corner of z.
- If only non-empty zones exist, a diagonal zone of z can answer ALL range queries that hash into z.
[Figure: zones z, z', B, C]
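The diagonal-zone condition translates directly into code. This is a sketch using a hypothetical `Zone` tuple whose fields hold the lower-left and upper-right corners; the example coordinates are made up.

```python
from typing import NamedTuple


class Zone(NamedTuple):
    """Rectangle with lower-left (x1, y1) and upper-right (x2, y2)."""
    x1: float
    y1: float
    x2: float
    y2: float


def is_diagonal_zone(z_prime: Zone, z: Zone) -> bool:
    """True if z_prime is a diagonal zone of z.

    With z' = (a1,b1),(a2,b2) and z = (x1,y1),(x2,y2) this is the
    condition a2 <= x1 and b1 >= y2.
    """
    return z_prime.x2 <= z.x1 and z_prime.y1 >= z.y2


z = Zone(50, 20, 80, 50)           # lower-right quadrant
upper_left = Zone(20, 50, 50, 80)  # touches z only at its upper-left corner
above = Zone(50, 50, 80, 80)       # directly above z

print(is_diagonal_zone(upper_left, z))  # True
print(is_diagonal_zone(above, z))       # False
```

Every point (p, q) in a diagonal zone satisfies p ≤ x1 and q ≥ y2, so the range [p, q] contains every range whose target point falls inside z.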

Slide 10: Zone Maintenance
- Initially, the entire hash space is a single zone assigned to one active node.
- Each active node has an RT containing its neighboring active nodes along with their zone coordinates.
- A zone splits when its load (storage and/or processing) is too high.
- The decision is made by the zone owner:
  - the owner contacts a passive node,
  - assigns it a portion of its zone,
  - and transfers the corresponding results and neighbor list.
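A zone split could look like the following sketch. The slides only say that the owner hands over "a portion" of its zone, so halving the zone along its longer side is an assumption, as are the function name and the dictionary of cached results keyed by target point. Transferring the neighbor list is not modeled here.

```python
from typing import Dict, Tuple

Zone = Tuple[float, float, float, float]   # (x1, y1, x2, y2)
Point = Tuple[float, float]


def split_zone(zone: Zone, results: Dict[Point, str]):
    """Halve a zone along its longer side (assumed policy).

    Returns (kept_zone, kept_results, given_zone, given_results), where the
    "given" half and its cached results go to the newly recruited node.
    """
    x1, y1, x2, y2 = zone
    axis = 0 if (x2 - x1) >= (y2 - y1) else 1  # 0: split on x, 1: split on y
    mid = (zone[axis] + zone[axis + 2]) / 2
    if axis == 0:
        kept, given = (x1, y1, mid, y2), (mid, y1, x2, y2)
    else:
        kept, given = (x1, y1, x2, mid), (x1, mid, x2, y2)
    kept_results = {p: r for p, r in results.items() if p[axis] < mid}
    given_results = {p: r for p, r in results.items() if p[axis] >= mid}
    return kept, kept_results, given, given_results


cached = {(30, 50): "rows for [30, 50]", (60, 70): "rows for [60, 70]"}
kept, kept_results, given, given_results = split_zone((20, 20, 80, 80), cached)
print(kept, sorted(kept_results))
print(given, sorted(given_results))
```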

Slide 11: Query Routing (1)
- The result is likely to be cached at the target zone.
- A range query is routed through the virtual space toward its target zone.
- Starting at the requesting zone, each zone passes the query on to a neighboring zone.
- A zone chooses the neighbor zone whose coordinates are closest to the target point.
- The process continues until the target zone is reached.

Slide 12: Query Routing (2)
- Simple way: compute the Euclidean distance between the target point and the center of a zone.
- This might not converge.
[Figure: target point t]

Slide 13: Query Routing (3)
- The distance of target t from a zone Z should be measured as the closest distance of t from the entire zone.
[Figure: zone Z surrounded by regions R1 to R8, laid out as
R5 R1 R6
R3 Z  R4
R7 R2 R8]
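The difference between the two distance measures can be sketched as follows. The zone coordinates, the target point, and the function names are hypothetical; the point-to-rectangle formula is the standard one, not something spelled out in the slides.

```python
import math
from typing import Sequence, Tuple

Zone = Tuple[float, float, float, float]   # (x1, y1, x2, y2)
Point = Tuple[float, float]


def center_distance(t: Point, zone: Zone) -> float:
    """Euclidean distance from t to the center of the zone (the simple way)."""
    x1, y1, x2, y2 = zone
    return math.hypot(t[0] - (x1 + x2) / 2, t[1] - (y1 + y2) / 2)


def distance_to_zone(t: Point, zone: Zone) -> float:
    """Closest Euclidean distance from t to any point of the zone."""
    x1, y1, x2, y2 = zone
    dx = max(x1 - t[0], 0.0, t[0] - x2)  # 0 when t is within the x-extent
    dy = max(y1 - t[1], 0.0, t[1] - y2)  # 0 when t is within the y-extent
    return math.hypot(dx, dy)


def next_hop(t: Point, neighbors: Sequence[Zone]) -> Zone:
    """Greedy choice: the neighbor zone closest to the target point."""
    return min(neighbors, key=lambda zone: distance_to_zone(t, zone))


t = (25, 78)
tall = (20, 20, 30, 80)    # contains t, but its center is far away
small = (30, 70, 40, 80)   # does not contain t, but its center is near

print(center_distance(t, tall), center_distance(t, small))
print(distance_to_zone(t, tall), distance_to_zone(t, small))
print(next_hop(t, [tall, small]))
```

In this example the center-based metric prefers `small` even though `tall` already contains the target, while the zone-based metric gives `tall` a distance of zero.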

Slide 14: Forwarding (1)
- When the query reaches the target zone, the local cache is checked.
- If no result containing the query range is found, the query is forwarded.
- Only zones to the upper left of the target point can have a result containing the given range.
- Forwarding is similar to flooding.
- Forward Limit: 0.0 – 1.0

Slide 15: Forwarding (2)
- Again: diagonal zones are especially interesting.
- They are guaranteed to have a result containing the given range.
- Because: every point in the diagonal zone lies to the upper-left of the target point, so the range of every such point contains the given range.
- BUT: a zone may not have a diagonal zone.

Slide 16: Updates
- Tuple t with range attribute A = k is updated.
- Send an update message to the target zone containing (k,k).
- Tuple t is included in all ranges [a, b] such that a ≤ k ≤ b.
- Forward the update to all zones that lie to the upper left of the target zone.
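Which cached results an update touches can be sketched with a simple filter. The list of cached ranges and the function names are made up for illustration; the sketch only shows that "a ≤ k ≤ b" and "target point lies to the upper-left of (k,k)" select the same ranges.

```python
from typing import List, Tuple

Range = Tuple[float, float]   # (a, b) with a <= b


def affected_ranges(k: float, cached: List[Range]) -> List[Range]:
    """Cached ranges [a, b] that include the updated value k."""
    return [(a, b) for (a, b) in cached if a <= k <= b]


def upper_left_of_kk(point: Range, k: float) -> bool:
    """True if the target point (a, b) is at or upper-left of (k, k)."""
    a, b = point
    return a <= k and b >= k


cached = [(10, 30), (20, 25), (40, 60), (5, 50)]
k = 22
hit = affected_ranges(k, cached)
print(hit)
print(hit == [r for r in cached if upper_left_of_kk(r, k)])
```

This is why the update has to travel to every zone to the upper left of the zone holding (k,k): any of them may cache a range that includes k.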

Slide 17: Conclusion
- Does not assume a naturally uniform distribution of attribute values.
- Efficient average path length (O( )).
- BUT: hot-spot nodes in the upper-left section lead to many splits, heavy partitioning, and longer path lengths.
- Cached results may not reflect the current result.
- Updates and deletions are expensive.

Slide 18: Questions?