Scalable Content-Addressable Networks
Prepared by Kuhan Paramsothy, March 5, 2007


ECE 1770 – Content-Addressable Networks

High-Level Overview
- Hash tables (which map keys to values) are heavily used in building software applications.
- A Content-Addressable Network (CAN) provides hash table-like functionality at Internet scale.
- CAN is:
  - Scalable
  - Robust and fault-tolerant
  - Self-organizing
  - Low-latency

Hash Tables and CAN
- A hash table is a data structure that efficiently maps keys onto values.
- CANs are a form of distributed, Internet-scale hash table.

What CAN Would Do for Us
- CAN would improve peer-to-peer systems
  - Napster: the process of locating a file is centralized. The central repository is expensive to scale and is a single point of failure.
  - Gnutella: the file location process is decentralized (the network self-organizes into an application-layer mesh). Requests for files are flooded, which does not scale and may fail to find content.
  - Conclusion: P2P systems need a scalable indexing mechanism.
- CAN would improve large data repositories
  - These systems need efficient insertion and retrieval.
- CAN would enable large-scale name resolution services that are not tied to a particular naming scheme (i.e. not DNS)
  - No more location-dependent naming schemes.

Basic Operations Performed on CANs
- Basic operations
  - Insertion (of key,value pairs)
  - Lookup (of key,value pairs)
  - Deletion (of key,value pairs)
- Each CAN node
  1. Stores a piece (called a zone) of the entire hash table
  2. Holds information about a small number of adjacent zones in the table
- Routing in a CAN
  - A request for a key is forwarded by intermediate CAN nodes towards the CAN node whose zone contains that key.
- The CAN design
  - Is distributed (requires no centralized control or coordination)
  - Is scalable (nodes hold only a small amount of state, which does not grow with the size of the network)
  - Is fault-tolerant (nodes can route around failures)
  - Doesn't require a naming hierarchy
  - Is implemented entirely at the application layer

CAN Design
- The design centers around a virtual d-dimensional Cartesian coordinate space on a d-torus.
- At any time, the entire coordinate space is dynamically partitioned among all the nodes in the system.
- Each node owns a distinct zone.

CAN Design (2)
1. To store a pair (K1, V1), key K1 is mapped to a point P in the coordinate space via a uniform hash function.
2. The pair is then stored at the node that owns the zone in which P lies.
3. To retrieve the entry corresponding to K1, any node can apply the same hash function to map K1 to P, and then fetch the value from the node that owns P.
- A node learns and maintains the IP addresses of the nodes that hold the zones adjoining its own.
- Efficient routing is critical to a useful CAN.
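The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SHA-256 stands in for the "uniform hash function", the space is the unit square split into four fixed zones, and the names `key_to_point`, `owner_of`, `put` and `get` are hypothetical, not taken from the CAN paper. A real CAN would route the request to the owner rather than look it up in a local dictionary.

```python
import hashlib

def key_to_point(key, d=2):
    """Deterministically map a key to a point in the unit cube [0, 1)^d."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Assumption: 4 digest bytes per coordinate, so d <= 8 with SHA-256.
    return tuple(
        int.from_bytes(digest[4 * i: 4 * i + 4], "big") / 2**32
        for i in range(d)
    )

def owner_of(point, zones):
    """Return the node whose half-open zone [lo, hi) contains the point."""
    for node, (lo, hi) in zones.items():
        if all(l <= x < h for l, x, h in zip(lo, point, hi)):
            return node
    raise LookupError("zones do not cover the point")

# Hypothetical 2-d space partitioned into four equal zones.
zones = {
    "A": ((0.0, 0.0), (0.5, 0.5)),
    "B": ((0.5, 0.0), (1.0, 0.5)),
    "C": ((0.0, 0.5), (0.5, 1.0)),
    "D": ((0.5, 0.5), (1.0, 1.0)),
}
store = {node: {} for node in zones}  # per-node (key, value) database

def put(key, value):
    """Steps 1 and 2: hash the key to P, store the pair at P's owner."""
    store[owner_of(key_to_point(key), zones)][key] = value

def get(key):
    """Step 3: hash the key to the same P and read from P's owner."""
    return store[owner_of(key_to_point(key), zones)].get(key)

put("song.mp3", "10.0.0.7")
print(get("song.mp3"))
```

Because every node applies the same hash function, any node computes the same point P for a given key, which is what makes lookup possible without a central index.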

Routing in a CAN
- Routing in a Content-Addressable Network works by following the straight-line path through the Cartesian space from source to destination coordinates.
- A CAN node maintains a coordinate routing table that holds the IP address and virtual coordinate zone of each of its immediate neighbors in the coordinate space.
- Average path length = $(d/4)\,n^{1/d}$ hops, for n nodes in d dimensions
- Individual nodes have 2d neighbors
- Average path length grows as $O(n^{1/d})$
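The average path length can be checked on a small example. The sketch below assumes an idealized CAN in which the torus is split into k^d equal zones (so n = k^d), each zone is identified by integer grid coordinates, and a message is greedily forwarded to a neighbor that is one step closer to the destination. It corrects one dimension at a time, which is a simplification of the straight-line intuition but gives the same hop count on a grid. The function names are illustrative.

```python
from itertools import product

def torus_step(a, b, k):
    """Signed unit step (-1, 0 or +1) from a toward b on a ring of size k."""
    if a == b:
        return 0
    forward = (b - a) % k
    return 1 if forward <= k - forward else -1

def route(src, dst, k):
    """Greedy route on a k^d torus grid; returns the zones visited."""
    path = [src]
    cur = list(src)
    while tuple(cur) != dst:
        for i in range(len(cur)):
            step = torus_step(cur[i], dst[i], k)
            if step:
                # Hand the message to the neighbor one step closer in dimension i.
                cur[i] = (cur[i] + step) % k
                break
        path.append(tuple(cur))
    return path

def average_hops(d, k):
    """Average hop count from one zone to every zone (itself included)."""
    src = (0,) * d
    dests = list(product(range(k), repeat=d))
    return sum(len(route(src, t, k)) - 1 for t in dests) / len(dests)

d, k = 2, 4                      # n = k**d = 16 zones
print(average_hops(d, k))        # measured on the grid
print((d / 4) * k)               # (d/4) * n^(1/d), since n^(1/d) = k
```

For even k the measured average matches (d/4) n^(1/d) exactly, because the mean wrap-around distance along one dimension of size k is k/4.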

Construction of a CAN Overlay
- The entire CAN space is divided amongst the nodes currently in the system.
- The incremental construction process takes three steps:
  1. The new node finds a node already in the CAN.
  2. Using the CAN routing mechanisms, it finds a node whose zone will be split.
  3. The neighbors of the split zone are notified so that routing can include the new node.
- Bootstrapping: CAN bootstrap nodes are associated with a DNS domain name.
- Node insertion affects only O(number of dimensions) existing nodes.
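The zone split in step 2 can be sketched as follows. The assumptions here are mine: a zone is a half-open box given as `(lo, hi)` tuples, the split halves the zone along its longest side (the slides do not say which dimension is chosen), and stored pairs are indexed by the point their key hashes to. Neighbor notification (step 3) is left out.

```python
def split_zone(lo, hi):
    """Halve a zone along its longest side; return (kept, handed_over)."""
    sides = [h - l for l, h in zip(lo, hi)]
    axis = sides.index(max(sides))
    mid = (lo[axis] + hi[axis]) / 2
    kept_hi = hi[:axis] + (mid,) + hi[axis + 1:]
    new_lo = lo[:axis] + (mid,) + lo[axis + 1:]
    return (lo, kept_hi), (new_lo, hi)

def contains(zone, point):
    """True if the point lies inside the half-open zone."""
    lo, hi = zone
    return all(l <= x < h for l, x, h in zip(lo, point, hi))

def join(zones, data, occupant, newcomer):
    """Split the occupant's zone and move the affected pairs to the newcomer."""
    kept, given = split_zone(*zones[occupant])
    zones[occupant], zones[newcomer] = kept, given
    old = data[occupant]
    data[newcomer] = {p: v for p, v in old.items() if contains(given, p)}
    data[occupant] = {p: v for p, v in old.items() if contains(kept, p)}

# One node owns the whole space, then two nodes join in turn.
zones = {"A": ((0.0, 0.0), (1.0, 1.0))}
data = {"A": {(0.2, 0.3): "x", (0.8, 0.3): "y", (0.2, 0.9): "z"}}
join(zones, data, "A", "B")
join(zones, data, "A", "C")
print(zones)
print(data)
```

Only the occupant and the newcomer exchange data, which is consistent with the claim that a join touches a small, network-size-independent number of nodes.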

Maintenance of a CAN Overlay
- Graceful node departure: the node explicitly hands over its zone and the associated (key,value) database to one of its neighbors.
- Abrupt node disappearance: an immediate takeover algorithm ensures that one of the failed node's neighbors takes over the zone.
- Under normal conditions, a node sends periodic update messages to each of its neighbors, giving its own zone coordinates and a list of its neighbors and their zone coordinates.
- Prolonged absence of an update message from a neighbor signals its failure.
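Failure detection from missing updates might look like the sketch below. The `NeighborTable` class, the timeout value and the explicit `now` argument are assumptions made for illustration; the slides do not specify the update period or how long an absence counts as "prolonged", and the takeover step itself is not modeled.

```python
class NeighborTable:
    """Tracks when each neighbor last sent a periodic update."""

    def __init__(self, timeout):
        self.timeout = timeout   # assumed threshold for "prolonged absence"
        self.last_seen = {}      # neighbor id -> time of last update
        self.zones = {}          # neighbor id -> advertised zone coordinates

    def on_update(self, neighbor, zone, now):
        """Record a periodic update message from a neighbor."""
        self.last_seen[neighbor] = now
        self.zones[neighbor] = zone

    def failed_neighbors(self, now):
        """Neighbors whose updates have been absent for longer than the timeout."""
        return sorted(
            n for n, t in self.last_seen.items() if now - t > self.timeout
        )

table = NeighborTable(timeout=30)
table.on_update("B", ((0.5, 0.0), (1.0, 0.5)), now=0)
table.on_update("C", ((0.0, 0.5), (0.5, 1.0)), now=0)
table.on_update("B", ((0.5, 0.0), (1.0, 0.5)), now=25)  # B keeps reporting, C goes silent
print(table.failed_neighbors(now=40))
```

Passing the clock in as `now` keeps the sketch deterministic; a running node would read a monotonic clock instead.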

Design Improvements
- The basic CAN algorithm provides
  - Low per-node state ($O(d)$ for a d-dimensional space)
  - Short path lengths ($O(d\,n^{1/d})$ hops for d dimensions and n nodes)
- The problem is that these are application-layer hops, not IP-layer hops.
- The latency of each hop might be substantial.

Design Improvements (2)
- Improvement: Multi-dimensional Coordinate Spaces
  - Increasing the dimensions of the CAN coordinate space reduces the routing path length and path latency, for a small increase in the size of the coordinate routing table.
  - Path length scales as $O(d\,n^{1/d})$
  - Fault-tolerance improves
- Improvement: Multiple Coordinate Spaces (a.k.a. Multiple Realities)
  - Maintain multiple independent coordinate spaces, with each node in the system being assigned a different zone in each coordinate space (each coordinate space is a reality).
  - Fault-tolerance improves
- Low per-node state ($O(d)$ for a d-dimensional space)
- Short path lengths ($O(d\,n^{1/d})$ hops for d dimensions and n nodes)
- Which is better? Increasing the dimensions.
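The trade-off between dimensions and routing-table size can be tabulated directly from the formulas already given: an average path length of (d/4) n^(1/d) hops and 2d neighbors per node. The sketch below assumes equal-sized zones and an illustrative network of n = 4096 nodes; the function names are hypothetical.

```python
def avg_path_length(n, d):
    """Average hops for n equal zones in d dimensions: (d/4) * n^(1/d)."""
    return (d / 4) * n ** (1 / d)

def neighbors_per_node(d):
    """Routing-table entries per node in a d-dimensional CAN."""
    return 2 * d

n = 4096  # illustrative network size
for d in range(2, 7):
    hops = avg_path_length(n, d)
    print(f"d={d}: about {hops:.1f} hops, {neighbors_per_node(d)} neighbors")
```

Over this range, each extra dimension adds two routing-table entries while the average hop count keeps falling, which is the "small increase in state for shorter paths" argument made above.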

Design Improvements (3)
- Improvement: Better CAN Routing Metrics
  - Have each node measure the network-level round-trip time (RTT) to each of its neighbors, then route messages accordingly.
  - This favors lower-latency paths and avoids unnecessarily long hops.
- Improvement: Caching and Replication
  - A CAN node can maintain a cache of the data keys it recently accessed.
  - A CAN node can replicate a data key at each of its neighboring nodes.
  - In both schemes, entries need an associated time-to-live field so that they eventually expire.
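One way to "route accordingly" is to forward to the neighbor that offers the most progress toward the destination per unit of RTT. The sketch below assumes that rule, represents each neighbor by a single point in the unit torus rather than by its full zone, and uses made-up RTT values; `torus_distance` and `next_hop` are illustrative names.

```python
import math

def torus_distance(p, q):
    """Euclidean distance on the unit torus (each axis wraps around at 1.0)."""
    return math.sqrt(
        sum(min(abs(a - b), 1 - abs(a - b)) ** 2 for a, b in zip(p, q))
    )

def next_hop(current, dest, neighbors):
    """Pick the neighbor with the best progress-to-RTT ratio.

    neighbors maps a name to (point, rtt_ms). Neighbors that do not move
    the message closer to the destination are never chosen.
    """
    here = torus_distance(current, dest)
    best, best_ratio = None, 0.0
    for name, (point, rtt) in neighbors.items():
        progress = here - torus_distance(point, dest)
        if progress > 0 and progress / rtt > best_ratio:
            best, best_ratio = name, progress / rtt
    return best

neighbors = {
    "east": ((0.3, 0.1), 80.0),   # makes progress, but over a slow link
    "west": ((0.9, 0.1), 20.0),   # same progress via wrap-around, faster link
    "north": ((0.1, 0.3), 10.0),  # fastest link, but moves away from dest
}
print(next_hop((0.1, 0.1), (0.6, 0.1), neighbors))
```

In this example the two useful neighbors make the same progress, so the lower-RTT one is preferred, while the fastest link is ignored because it makes no progress at all.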

Related Systems
- Domain Name System
  - CANs are more general than the DNS, because DNS closely ties the naming scheme to the manner in which a name is resolved to an IP address.
- Peer-to-Peer
  - A simple example: keys are analogous to URLs.
  - Will improve robustness
  - Key difference: content within the CAN can always be located by any other node, because there is a clear "home" (point) in the CAN for that content, and every other node knows what the home is and how to reach it.

Discussion
- Security: is it better or worse with CAN?
- Are there any other design improvements?
- Is the communication overhead significant?

References
- A Scalable Content-Addressable Network, Ratnasamy, University of California – Berkeley, ratnasamy.pdf