Scalable Content-Addressable Network Lintao Liu 2001.11.19.


System Goals
CAN: a distributed infrastructure that provides hash table-like functionality at Internet-like scales. It should be scalable, fault-tolerant, and self-organizing.

Basic Design
Basic idea:
- A virtual d-dimensional coordinate space.
- Each node owns a zone in the virtual space.
- Data is stored as (key, value) pairs.
- Hash(key) maps the key to a point P in the virtual space.
- The (key, value) pair is stored on the node whose zone contains P.
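The mapping from a key to its owning node can be sketched in a few lines. This is a minimal illustration, not the CAN authors' code: the use of SHA-256, the unit space [0, 1)^d, the zone layout, and the names `key_to_point`, `contains`, and `owner` are all assumptions made for the example.

```python
import hashlib

def key_to_point(key: str, d: int = 2) -> tuple:
    """Map a key to a point in the unit space [0, 1)^d (assumed hash: SHA-256)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use 4 bytes of the digest per coordinate, so this sketch supports d <= 8.
    return tuple(
        int.from_bytes(digest[4 * i: 4 * i + 4], "big") / 2**32
        for i in range(d)
    )

def contains(zone, point) -> bool:
    """A zone is a (lo, hi) pair of corner tuples; lo is inclusive, hi exclusive."""
    lo, hi = zone
    return all(l <= x < h for l, x, h in zip(lo, point, hi))

def owner(zones: dict, point):
    """Return the node whose zone contains the point."""
    for node, zone in zones.items():
        if contains(zone, point):
            return node
    raise LookupError("zones do not cover the whole space")

# Hypothetical three-node CAN covering the unit square.
zones = {
    "A": ((0.0, 0.0), (0.5, 1.0)),
    "B": ((0.5, 0.0), (1.0, 0.5)),
    "C": ((0.5, 0.5), (1.0, 1.0)),
}
p = key_to_point("song.mp3")
print(p, "is stored on node", owner(zones, p))
```

The linear scan in `owner` stands in for routing; in a real CAN no node knows every zone, which is why the routing step is needed.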

Basic Design
For routing, each node only needs to maintain information about the nodes whose zones adjoin its own (its neighbors).

Basic Design
Routing uses a greedy algorithm:
- If P is within the zone of the current node, return the (key, value) pair, or failure if there is no such key.
- Otherwise, forward the query to the neighbor with coordinates closest to P.
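The greedy rule can be sketched as follows. The sketch assumes a hypothetical 4 x 4 grid of equal zones with no wraparound at the edges, and it measures "closest" by the Euclidean distance from each neighbor's zone center to P; the names `route`, `center`, and `contains` are illustrative only.

```python
import math

def center(zone):
    lo, hi = zone
    return tuple((l + h) / 2 for l, h in zip(lo, hi))

def contains(zone, point):
    lo, hi = zone
    return all(l <= x < h for l, x, h in zip(lo, point, hi))

def route(zones, neighbors, start, point):
    """Greedy routing: return the nodes visited from start to the owner of point."""
    path = [start]
    current = start
    while not contains(zones[current], point):
        # Forward to the neighbor whose zone center is closest to the target.
        current = min(neighbors[current],
                      key=lambda n: math.dist(center(zones[n]), point))
        if current in path:
            raise RuntimeError("routing loop detected")
        path.append(current)
    return path

# Assumed layout: node (i, j) owns the unit cell [i, i+1) x [j, j+1).
SIDE = 4
zones = {(i, j): ((float(i), float(j)), (float(i + 1), float(j + 1)))
         for i in range(SIDE) for j in range(SIDE)}
neighbors = {
    (i, j): [(i + di, j + dj)
             for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
             if 0 <= i + di < SIDE and 0 <= j + dj < SIDE]
    for (i, j) in zones
}

path = route(zones, neighbors, (0, 0), (3.7, 2.2))
print("hops:", len(path) - 1, "path:", path)
```

Each node consults only its own neighbor list, so the per-node routing state stays small regardless of the total number of nodes.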

Example: Routing
[Figure: routing example in a 2-d coordinate space with corners (0, 0), (4, 0), (0, 4), and (4, 4).]

Basic Design
Node insertion. A new node N1 joins the network as follows:
1. N1 finds a node N2 that is already in the CAN.
2. N1 randomly chooses a point P in the space.
3. N1 sends a JOIN request destined for P, routed through the CAN via N2 (P lies in the zone of some node N3).
4. N3 splits its zone, assigns half of it to N1, and sends N1 the (key, value) pairs that belong to that half.

Basic Design
Node insertion (continued):
5. N1 also gets its neighbor information from N3.
6. N3 notifies all of its neighbors of the reallocation of space.
7. The neighbors update their own neighbor information accordingly.
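The zone split at the heart of the insertion steps can be sketched as follows. This is an illustration under assumptions: the split dimension is passed in explicitly, the store is keyed by the hashed point of each key, and the names `split_zone` and `join` are hypothetical. Neighbor-list updates are left out.

```python
def contains(zone, point):
    lo, hi = zone
    return all(l <= x < h for l, x, h in zip(lo, point, hi))

def split_zone(zone, dim):
    """Split a zone in half along dimension dim; return (kept_half, given_half)."""
    lo, hi = zone
    mid = (lo[dim] + hi[dim]) / 2
    kept_hi = hi[:dim] + (mid,) + hi[dim + 1:]
    given_lo = lo[:dim] + (mid,) + lo[dim + 1:]
    return (lo, kept_hi), (given_lo, hi)

def join(owner_zone, owner_store, dim=0):
    """The current owner splits its zone and hands over the pairs in the new half.

    owner_store maps each stored point to its (key, value) pair.
    """
    kept, given = split_zone(owner_zone, dim)
    moved = {p: kv for p, kv in owner_store.items() if contains(given, p)}
    remaining = {p: kv for p, kv in owner_store.items() if p not in moved}
    return kept, remaining, given, moved

# Hypothetical state of N3 before N1 joins.
n3_zone = ((0.0, 0.0), (1.0, 1.0))
n3_store = {(0.2, 0.3): ("a.txt", "host-1"), (0.8, 0.1): ("b.txt", "host-2")}

n3_zone, n3_store, n1_zone, n1_store = join(n3_zone, n3_store, dim=0)
print("N3 keeps", n3_zone, n3_store)
print("N1 gets ", n1_zone, n1_store)
```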

Example: Insertion

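Basic Design
Node departure
1. Explicit departure: the departing node hands over its zone to a neighbor whose zone can be merged with it to produce a valid single zone. If no such neighbor exists, the zone is handed to the neighbor with the smallest zone.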

Basic Design
2. Node failure
- Neighbors exchange periodic update messages.
- A prolonged absence of update messages from a neighbor indicates that it has failed.
- A takeover mechanism merges the failed node's zone with the smallest adjacent zone.
- A background zone-reassignment algorithm smooths the zone allocation.
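Failure detection and the choice of takeover node can be sketched as below. The sketch is centralized for readability, whereas in a real CAN each neighbor decides locally; the timeout value, the timestamps, and the names `failed_neighbors`, `volume`, and `takeover_node` are assumptions for illustration.

```python
import math

def failed_neighbors(last_seen, now, timeout):
    """Return neighbors whose last update message is older than the timeout."""
    return [n for n, t in last_seen.items() if now - t > timeout]

def volume(zone):
    """Volume of an axis-aligned zone given as (lo, hi) corner tuples."""
    lo, hi = zone
    return math.prod(h - l for l, h in zip(lo, hi))

def takeover_node(failed, zones, neighbors):
    """Pick the live neighbor of the failed node that owns the smallest zone."""
    candidates = [n for n in neighbors[failed] if n in zones]
    return min(candidates, key=lambda n: volume(zones[n]))

# Hypothetical state: zones of the live nodes and who neighbored the failed node C.
live_zones = {
    "A": ((0.0, 0.0), (0.5, 0.5)),   # volume 0.25
    "B": ((0.5, 0.0), (1.0, 1.0)),   # volume 0.5
}
neighbors = {"C": ["A", "B"]}
last_seen = {"A": 100.0, "C": 70.0}  # seconds, as observed by node B

for dead in failed_neighbors(last_seen, now=105.0, timeout=30.0):
    print(dead, "failed; zone taken over by", takeover_node(dead, live_zones, neighbors))
```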

Example: Node Departure

Problems in the Basic Design
Scalable? Fault-tolerant? Self-organizing? Data durable? Efficient? Some others?

Design Improvements
Guidelines: reduce path latency, increase fault tolerance, and increase data availability, without adding much complexity.

Design Improvements
Multi-dimensional coordinate spaces: reduce the average path length.
Increasing the number of dimensions of the virtual space reduces the routing path length and therefore the path latency. The cost is a larger routing table, because each node has more neighbors.
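The trade-off can be made concrete with the average path length quoted in the conclusions, (d/4) n^(1/d) hops for n nodes in d dimensions. The neighbor count of 2d per node is taken from the CAN paper and assumes an evenly partitioned space; the function names and the network size of 2^20 nodes are illustrative.

```python
def avg_path_length(n: int, d: int) -> float:
    """Average routing path length in hops: (d/4) * n^(1/d)."""
    return (d / 4) * n ** (1 / d)

def neighbors_per_node(d: int) -> int:
    """Routing-table size, assuming an evenly partitioned space (2d neighbors)."""
    return 2 * d

n = 2**20  # hypothetical network size
print("d  neighbors  avg hops")
for d in (2, 3, 4, 6, 8):
    print(f"{d:<3}{neighbors_per_node(d):<11}{avg_path_length(n, d):.1f}")
```

Path length falls quickly for the first few added dimensions while the routing table grows only linearly in d.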

Design Improvements
Multiple realities (multiple coordinate spaces): improve data availability, improve routing fault tolerance, and reduce the average path length.
- Multiple coordinate spaces exist at the same time.
- Each space is called a “reality”.
- Each node occupies a zone in each reality.

Design Improvements
Better CAN routing metrics: reduce per-hop latency.
When there is more than one choice for forwarding, choose the neighbor with the lowest RTT.
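A minimal sketch of this rule: among the neighbors that are closer to the target than the current node, pick the one with the lowest measured RTT. Representing each neighbor by a single coordinate, the RTT values, and the name `next_hop` are assumptions for illustration.

```python
import math

def next_hop(current_pos, neighbor_pos, rtt_ms, target):
    """Return the lowest-RTT neighbor that makes progress toward target, or None."""
    here = math.dist(current_pos, target)
    closer = [n for n, pos in neighbor_pos.items()
              if math.dist(pos, target) < here]
    if not closer:
        return None
    return min(closer, key=lambda n: rtt_ms[n])

# Hypothetical neighbors of a node at (0.5, 0.5), with measured RTTs in ms.
neighbor_pos = {"east": (1.5, 0.5), "north": (0.5, 1.5), "west": (-0.5, 0.5)}
rtt_ms = {"east": 80.0, "north": 20.0, "west": 5.0}

print(next_hop((0.5, 0.5), neighbor_pos, rtt_ms, target=(3.5, 2.5)))
```

Here "west" has the lowest RTT but moves away from the target, so it is never considered; "north" wins over "east" on RTT.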

Design Improvements
Overloading coordinate zones: reduce the average path length, reduce the per-hop latency, and improve fault tolerance.
Assign more than one node to share the same zone.

Design Improvements
Topologically sensitive construction of the CAN: reduce the path latency.
- A set of machines acts as landmarks on the Internet.
- Each node measures its RTT to each of these landmarks and orders the landmarks by increasing RTT.
- Physically close nodes are likely to produce the same ordering and will consequently reside in adjacent zones of the coordinate space.
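The landmark ordering can be sketched as follows: each node sorts the landmarks by its measured RTT, and nodes that share an ordering are grouped together. The landmark names, the RTT values, and the function names are made up for the example; how a group is mapped to a region of the coordinate space is not shown.

```python
from collections import defaultdict

def landmark_ordering(rtt_ms):
    """Return the landmark names sorted by increasing RTT."""
    return tuple(sorted(rtt_ms, key=rtt_ms.get))

def group_by_ordering(measurements):
    """Group nodes that observe the same landmark ordering."""
    groups = defaultdict(list)
    for node, rtt_ms in measurements.items():
        groups[landmark_ordering(rtt_ms)].append(node)
    return dict(groups)

# Hypothetical RTT measurements (ms) from three nodes to three landmarks.
measurements = {
    "n1": {"L1": 10.0, "L2": 50.0, "L3": 90.0},
    "n2": {"L1": 12.0, "L2": 45.0, "L3": 95.0},
    "n3": {"L1": 80.0, "L2": 20.0, "L3": 30.0},
}

for ordering, nodes in group_by_ordering(measurements).items():
    print(ordering, "->", nodes)
```

With m landmarks there are m! possible orderings, so a handful of landmarks already separates nodes into many groups.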

Other Design Improvements
- Multiple hash functions: increase data availability and reduce query latency.
- Uniform partitioning: achieve load balance.
- Caching and replication: increase data availability, reduce query latency, and achieve load balance.
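The multiple-hash-function idea can be sketched by deriving k different points for one key, so that the pair is stored on up to k different nodes. Salting a single SHA-256 hash with an index stands in for k independent hash functions here; that choice and the name `replica_points` are assumptions.

```python
import hashlib

def replica_points(key: str, k: int = 3, d: int = 2):
    """Return k points in [0, 1)^d for one key, one per (assumed) hash function."""
    points = []
    for i in range(k):
        # The index prefix makes each of the k hashes behave like a separate function.
        digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
        points.append(tuple(
            int.from_bytes(digest[4 * j: 4 * j + 4], "big") / 2**32
            for j in range(d)
        ))
    return points

for point in replica_points("song.mp3"):
    print(point)
```

A query can be sent toward any of the k points, for example the one closest to the requester, which is how this technique can reduce query latency as well as improve availability.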

Comparisons
Scalability? Construction and storage overhead? Query efficiency? Fault tolerance? Complexity?
Systems compared: Gnutella, Freenet, PAST, Chord.

Conclusions
CAN provides scalable routing and efficient indexing. Given a key, it returns the (key, value) pair with an average path length of $(d/4)\,n^{1/d}$ hops for n nodes in a d-dimensional space, or returns failure if there is no such (key, value) pair. CAN is completely self-organizing, fault-tolerant, and resistant to DoS attacks.