Peer-to-Peer Networks and Distributed Hash Tables


1 Peer-to-Peer Networks and Distributed Hash Tables
2006

2 Peer-to-peer networking
File sharing: files are stored at the end-user machines (peers) rather than at a server (client/server), and files are transferred directly between peers.
Leverage: P2P is a way to leverage vast amounts of computing power, storage, and connectivity from personal computers (PCs) distributed around the world.
Q: What are the new technical challenges?
Q: What new services/applications are enabled?
Q: Is it just "networking at the application level"? Everything old is new again?

3 Napster
Napster -- free music over the Internet.
Key idea: share the content, storage, and bandwidth of individual (home) users.
Model: each user stores a subset of files; each user can access (download) files from all users in the system.
Application-level client-server protocol (index server) over point-to-point TCP.
How it works -- four steps:
1. Connect to the Napster index server.
2. Upload your list of files (push) to the server.
3. Give the server keywords to search the full list with.
4. Select the "best" of the correct answers (e.g. by pinging candidates).
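A minimal sketch of this four-step index-server interaction, in Python. The class and method names are hypothetical (this is not the real Napster protocol); the point is that the index is centralized while file transfer is peer-to-peer.

```python
# Hypothetical sketch of a Napster-style central index: peers register their
# file lists, search by keyword, then download directly from another peer.

class IndexServer:
    def __init__(self):
        self.index = {}                           # filename -> set of peer addresses

    def register(self, peer_addr, filenames):     # step 2: upload file list
        for name in filenames:
            self.index.setdefault(name, set()).add(peer_addr)

    def search(self, keyword):                    # step 3: keyword search
        return {name: peers for name, peers in self.index.items()
                if keyword.lower() in name.lower()}

server = IndexServer()
server.register("10.0.0.5:6699", ["beatles_yesterday.mp3", "queen_bohemian.mp3"])
server.register("10.0.0.7:6699", ["beatles_yesterday.mp3"])

hits = server.search("beatles")
# Step 4: pick the "best" peer (e.g. lowest ping) and fetch the file directly
# over a peer-to-peer TCP connection (not shown).
print(hits)
```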

4 Napster: Example
(Diagram: machines m1-m6 store files A-F respectively; a peer asks the index server "where is E?" and then downloads E directly from m5, the machine storing it.)

5 Napster characteristics
Advantages:
- Simplicity: easy to implement sophisticated search engines on top of the index system.
Disadvantages:
- Centralized index server: a single logical point of failure (load can be balanced among servers using DNS rotation), potential for congestion, and Napster is "in control" (freedom is an illusion).
- No security: passwords in plain text, no authentication, no anonymity.

6 Main Challenge
Find where a particular file is stored.
- Scale: up to hundreds of thousands or millions of machines (7/2001 simultaneous online users: Napster ~160K, Gnutella ~40K, Morpheus ~300K).
- Dynamicity: machines can come and go at any time.
(Diagram: machines A-F; a query "E?" must locate the machine storing E.)

7 Gnutella
Peer-to-peer networking with peer applications.
Focus: a decentralized method of searching for files.
How to find a file: flood the request.
- Send the request to all neighbors.
- Neighbors recursively multicast the request.
- Eventually a machine that has the file receives the request and sends back the answer.
Advantages: totally decentralized, highly robust.
Disadvantages: not scalable; the entire network can be swamped with requests (to alleviate this problem, each request carries a TTL).
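A sketch of this flooded search, assuming a toy in-memory overlay. The `Node` class and the shared `visited` set are illustrative; real Gnutella suppresses duplicates by message ID rather than a shared set.

```python
# Illustrative Gnutella-style flooding: each node forwards a query to all of
# its neighbours, decrementing a TTL to bound the flood.

class Node:
    def __init__(self, node_id, files):
        self.id, self.files, self.neighbours = node_id, set(files), []

def flood_query(node, filename, ttl, visited=None):
    """Return the set of node ids holding `filename`, reachable within `ttl` hops."""
    if visited is None:
        visited = set()
    if node.id in visited or ttl <= 0:
        return set()
    visited.add(node.id)                    # stands in for duplicate detection
    hits = {node.id} if filename in node.files else set()
    for neighbour in node.neighbours:       # recursive multicast to neighbours
        hits |= flood_query(neighbour, filename, ttl - 1, visited)
    return hits

m1, m2, m3 = Node("m1", []), Node("m2", ["A"]), Node("m3", ["E"])
m1.neighbours = [m2, m3]
print(flood_query(m1, "E", ttl=4))          # {'m3'}
```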

8 Gnutella: Example
Assume m1's neighbors are m2 and m3; m3's neighbors are m4 and m5; and so on.
(Diagram: m1 floods the query "E?" to its neighbors, which forward it on until it reaches m5, the machine storing E, and the answer is sent back.)

9 Gnutella
What we care about:
- How much traffic does one query generate?
- How many hosts can it support at once?
- What is the latency associated with querying?
- Is there a bottleneck?
Late 2000: only 10% of downloads succeeded.
2001: more than 25% of downloads were successful (is this success or failure?)

10 BitTorrent
BitTorrent (BT) is a newer-generation P2P system that makes downloads faster.
- The file to be distributed is split into pieces, and an SHA-1 hash is calculated for each piece.
- Swarming: parallel downloads among a mesh of cooperating peers.
- Scalable: capacity increases with the number of peers/downloaders.
- Efficient: it utilises a large amount of the available network bandwidth.
- Tracker: a central server keeping a list of all peers participating in the swarm (handles peer discovery).
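A sketch of the piece-hashing step described above. The piece size is a typical value, not taken from the slides, and the function name is illustrative.

```python
# Split a file into fixed-size pieces and hash each piece with SHA-1, as in
# the piece list of a .torrent file.

import hashlib

PIECE_SIZE = 256 * 1024          # 256 KiB, a common (assumed) piece size

def piece_hashes(path):
    """Return the SHA-1 digest of every piece of the file at `path`."""
    hashes = []
    with open(path, "rb") as f:
        while True:
            piece = f.read(PIECE_SIZE)
            if not piece:
                break
            hashes.append(hashlib.sha1(piece).hexdigest())
    return hashes

# A downloader re-hashes each received piece and compares it against the
# corresponding entry to detect corrupted or malicious data.
```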

11 BitTorrent: a picture
(Diagram: a tracker coordinating the uploaders/downloaders in a swarm.)

12 BitTorrent: a picture (continued)
(Diagram only.)

13 Freenet
Additional goals beyond file location:
- Provide publisher anonymity and security.
- Resistance to attacks: a third party shouldn't be able to deny access to a particular file (data item, object), even if it compromises a large fraction of machines.
Architecture:
- Each file is identified by a unique identifier.
- Each machine stores a set of files and maintains a "routing table" to route individual requests.

14 Data Structure
Each node maintains a common stack of entries (id, next_hop, file):
- id: file identifier.
- next_hop: another node that stores the file id.
- file: the file identified by id, if stored on the local node.
Forwarding:
- Each message contains the file id it refers to.
- If the file id is stored locally, stop; if not, search for the "closest" id in the stack and forward the message to the corresponding next_hop.
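A minimal sketch of this per-node structure in Python. The class and field names mirror the slide but are otherwise hypothetical; `closest_entry` implements the "closest id" lookup used for forwarding.

```python
# Per-node Freenet-style table: each entry maps a file id either to a locally
# stored file or to the next hop believed to be closer to that id.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Entry:
    id: int                        # file identifier
    next_hop: "FreenetNode"        # node that stores (or is closer to) this id
    file: Optional[bytes] = None   # the file itself, if stored locally

class FreenetNode:
    def __init__(self, name):
        self.name = name
        self.table = []            # the "stack" of Entry records on the slide

    def closest_entry(self, file_id):
        """Entry whose id is numerically closest to the requested id."""
        return min(self.table, key=lambda e: abs(e.id - file_id), default=None)
```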

15 Query
API: file = query(id)
Upon receiving a query for document id:
- Check whether the queried file is stored locally.
- If yes, return it.
- If not, forward the query message.
Notes:
- Each query carries a TTL that is decremented each time the query message is forwarded. To obscure the distance to the originator, the TTL can be initialized to a random value within some bounds, and when TTL = 1 the query is forwarded only with a finite probability.
- Each node maintains state for all outstanding queries that have traversed it, which helps avoid cycles.
- When the file is returned, it is cached along the reverse path.
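A sketch of this query rule, assuming the `FreenetNode`/`Entry` classes from the previous sketch. The forwarding probability at TTL = 1 (0.1 here) is an illustrative value, and the per-query state used to detect cycles is omitted.

```python
import random

def query(node, file_id, ttl):
    """Return the file for `file_id`, or None if the query dies out."""
    entry = node.closest_entry(file_id)
    if entry is not None and entry.id == file_id and entry.file is not None:
        return entry.file                       # stored locally: stop here
    if ttl <= 1 and random.random() > 0.1:      # at TTL=1, forward only with a
        return None                             # small probability
    if entry is None:
        return None
    result = query(entry.next_hop, file_id, ttl - 1)
    if result is not None:                      # cache the file on the reverse path
        node.table.append(Entry(id=file_id, next_hop=entry.next_hop, file=result))
    return result
```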

16 Query Example
(Diagram: a query is routed hop by hop using each node's (id, next_hop, file) entries until it reaches the node storing the requested file, and the reply retraces the path.)
Note: the figure doesn't show file caching on the reverse path.

17 Insert
API: insert(id, file)
Two steps:
1. Search for the file to be inserted.
2. If not found, insert the file.
Searching: like a query, but nodes maintain state after a collision is detected and the reply is sent back to the originator.
Insertion:
- Follow the forward path and insert the file at all nodes along the path.
- A node probabilistically replaces the originator with itself, obscuring the true originator.
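A sketch of the insertion walk, again building on the `FreenetNode`/`Entry` classes sketched for slide 14. The replacement probability (0.5) is illustrative; the preceding search step is assumed to have already failed.

```python
import random

def insert(node, file_id, contents, ttl, originator):
    """Walk the forward path, storing the file at every node visited."""
    next_entry = node.closest_entry(file_id)     # choose the next hop first
    # Store the file locally, recording the (claimed) originator as next_hop.
    node.table.append(Entry(id=file_id, next_hop=originator, file=contents))
    # With some probability the node claims to be the originator itself, so
    # downstream nodes cannot identify the true publisher.
    if random.random() < 0.5:
        originator = node
    if ttl > 1 and next_entry is not None:
        insert(next_entry.next_hop, file_id, contents, ttl - 1, originator)
```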

18 Insert Example
Assume the query returned failure along the "blue" path; insert f10 (id 10) along that path.
(Diagram: nodes n1-n5 and their routing tables before the insert.)

19 Insert Example
(Diagram: insert(10, f10) reaches n2, which adds the entry (10, n1, f10) with n1 recorded as the originator.)

20 Insert Example
n2 replaces the originator (n1) with itself.
(Diagram: the insert is forwarded onward with orig = n2.)

21 Insert Example
n2 replaces the originator (n1) with itself.
(Diagram: the insert propagates to n3, n4, and n5; each node adds an entry for id 10 pointing back toward the node it believes to be the originator.)

22 Freenet Summary
Advantages:
- Provides publisher anonymity.
- Totally decentralized architecture, hence robust and scalable.
- Resistant against malicious file deletion.
Disadvantages:
- Does not always guarantee that a file is found, even if the file is in the network.

23 Solutions to the Location Problem
Goal: make sure that an identified item (file) is always found.
Indexing scheme: used to map file names to their location in the system; requires a scalable indexing mechanism.
Abstraction: a distributed hash-table data structure with
  insert(id, item);
  item = query(id);
Note: the item can be anything: a data object, document, file, pointer to a file, ...
Proposals: CAN, Chord, Kademlia, Pastry, Viceroy, Tapestry, etc.
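A minimal sketch of this abstraction: the interface is just insert/query on an opaque id, and the DHT decides which node stores each item. The single-process `ToyDHT` class below is purely illustrative; real DHTs (CAN, Chord, ...) replace the global `_owner` lookup with a routing geometry.

```python
import hashlib

class ToyDHT:
    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}    # node -> local store

    def _owner(self, key):
        # Consistent mapping from an id to a node, using a global view that a
        # real DHT would not have.
        h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
        names = sorted(self.nodes)
        return names[h % len(names)]

    def insert(self, key, item):
        self.nodes[self._owner(key)][key] = item

    def query(self, key):
        return self.nodes[self._owner(key)].get(key)

dht = ToyDHT(["n1", "n2", "n3"])
dht.insert("song.mp3", "pointer-to-file-at-m5")
print(dht.query("song.mp3"))
```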

24 Internet-scale hash tables
Hash tables are an essential building block in software systems.
Are Internet-scale distributed hash tables equally valuable to large-scale distributed systems?
- Peer-to-peer systems: Napster, Gnutella, Groove, FreeNet, MojoNation, ...
- Large-scale storage management systems: Publius, OceanStore, PAST, Farsite, CFS, ...
- Mirroring on the Web.
Content-Addressable Network (CAN): scalable, operationally simple, good performance.

25 Content Addressable Network (CAN): basic idea
Interface:
  insert(key, value)    -- key (id), value (item)
  value = retrieve(key)
(Diagram: an overlay of nodes, each holding (K,V) pairs; insert(K1,V1) is routed to the node responsible for K1.)

26 CAN: basic idea
(Diagram: retrieve(K1) is routed through the overlay to the node storing (K1,V1), which returns the value.)

27 CAN: basic idea
Associate to each node and item a unique id in a d-dimensional Cartesian space: a key (id) maps to a point, and each node owns a zone of the space.
Goals:
- Scales to hundreds of thousands of nodes.
- Handles rapid arrival and failure of nodes.
Properties:
- Routing table size O(d).
- Guarantees that a file is found in at most d*n^(1/d) steps, where n is the total number of nodes.

28 CAN: solution
A virtual d-dimensional Cartesian coordinate space:
- The entire space is partitioned amongst all the nodes; every node "owns" a zone in the overall space.
Abstraction:
- Data can be stored at "points" in the space.
- Routing goes from one "point" to another; a point is the node that owns the enclosing zone.

29 CAN Example: Two-Dimensional Space
- The space is divided between the nodes; together, all nodes cover the entire space.
- Each node covers either a square or a rectangular area with aspect ratio 1:2 or 2:1.
Example: node n1:(1, 2) is the first node to join, so it covers the entire space.
(Diagram: a 7x7 coordinate grid with n1 owning all of it.)

30 CAN Example: Two-Dimensional Space
Node n2:(4, 2) joins, and the space is divided between n1 and n2.
(Diagram: the grid split into two zones.)

31 CAN Example: Two-Dimensional Space
Node n3:(3, 5) joins, and the space is further divided between n1 and n3.
(Diagram: the grid split into three zones.)

32 CAN Example: Two-Dimensional Space
Nodes n4:(5, 5) and n5:(6, 6) join.
(Diagram: the grid split into five zones owned by n1-n5.)

33 Simple example: storing a pair (K1, V1)
- Key K1 is mapped onto a point P in the coordinate space using a uniform hash function.
- The corresponding (key, value) pair is then stored at the node that owns the zone within which point P lies.
- Data stored in the CAN is addressed by name (i.e. key), not location (i.e. IP address).
Node I::insert(K, V):
  (1) a = hx(K); b = hy(K)
  (2) route (K, V) to (a, b)
  (3) the node owning (a, b) stores (K, V)
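A self-contained sketch of step (1), the key-to-point mapping, for the 2-d space used in the figures. The hash construction and the `SPACE` size are assumptions; steps (2) and (3) use the CAN routing described on the following slides.

```python
import hashlib

SPACE = 8.0    # coordinate space [0, SPACE) x [0, SPACE), matching the 2-d figures

def _hash_coord(key, salt):
    """Map a key to one coordinate in [0, SPACE) with a uniform hash."""
    digest = hashlib.sha1((salt + key).encode()).hexdigest()
    return int(digest, 16) % 10**6 / 10**6 * SPACE

def hx(key): return _hash_coord(key, "x")
def hy(key): return _hash_coord(key, "y")

def insert_point(key):
    """Step (1): the point P = (a, b) at which (key, value) will be stored."""
    return hx(key), hy(key)

# Steps (2) and (3) -- routing (key, value) to (a, b) and storing it at the
# owner of that zone -- are performed by the CAN overlay itself.
print(insert_point("f1"))
```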

34 CAN Example: Two-Dimensional Space
Each item is stored by the node that owns the point the item maps to.
Nodes: n1:(1, 2); n2:(4, 2); n3:(3, 5); n4:(5, 5); n5:(6, 6).
Items: f1:(2, 3); f2:(5, 0); f3:(2, 1); f4:(7, 5).
(Diagram: each item placed in the zone of its owning node.)

35 Simple example: retrieving key K1
- Any node can apply the same deterministic hash function to map K1 onto point P and then retrieve the corresponding value from point P.
- If point P is not owned by the requesting node, the request must be routed through the CAN infrastructure until it reaches the node in whose zone P lies.
Node J::retrieve(K):
  (1) a = hx(K); b = hy(K)
  (2) route "retrieve(K)" to (a, b)

36 CAN: Query/Routing Example
- Each node knows its neighbors in the d-dimensional space.
- Forward the query to the neighbor that is closest to the query id.
- Example: assume n1 queries f4.
- A node only maintains state for its immediate neighboring nodes.
- Routing can go around some failures.
(Diagram: the query for f4 hops from n1 through intermediate zones to the node owning f4's point (7, 5).)
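A sketch of this greedy routing in 2-d. The `CanNode` structure is hypothetical, and forwarding toward the neighbour whose zone centre is closest to the target is a simplification of CAN's greedy rule; failure handling is omitted.

```python
import math

class CanNode:
    def __init__(self, name, zone):
        self.name, self.zone = name, zone            # zone = (x0, y0, x1, y1)
        self.neighbours = []                         # nodes with adjoining zones

    def owns(self, p):
        x0, y0, x1, y1 = self.zone
        return x0 <= p[0] < x1 and y0 <= p[1] < y1

    def centre(self):
        x0, y0, x1, y1 = self.zone
        return ((x0 + x1) / 2, (y0 + y1) / 2)

def route(node, target):
    """Greedy routing: return the node whose zone contains `target`.

    Assumes correct, connected neighbour sets."""
    while not node.owns(target):
        node = min(node.neighbours,
                   key=lambda n: math.dist(n.centre(), target))
    return node
```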

37 CAN: node insertion
Inserting a new node affects only a single other node and its immediate neighbors:
1. Discover some node "I" already in the CAN.
2. Pick a random point (p, q) in the space.
3. I routes to (p, q) and discovers node J, the owner of that point.
4. Split J's zone in half; the new node owns one half.
(Diagram: the new node's point (p, q) falls in J's zone, which is split between J and the new node.)
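A sketch of this join procedure, assuming the `CanNode`/`route` sketch above. The neighbour-list repair is simplified; a real CAN also notifies J's old neighbours about the split.

```python
import random

def join(bootstrap, name, space=8.0):
    """Add a node named `name` to the CAN via any existing node `bootstrap`."""
    p = (random.uniform(0, space), random.uniform(0, space))  # (2) pick a random point
    j = route(bootstrap, p)                                   # (3) route to its owner J
    x0, y0, x1, y1 = j.zone                                   # (4) split J's zone in half
    if (x1 - x0) >= (y1 - y0):                                # split along the longer side
        mid = (x0 + x1) / 2
        j.zone, new_zone = (x0, y0, mid, y1), (mid, y0, x1, y1)
    else:
        mid = (y0 + y1) / 2
        j.zone, new_zone = (x0, y0, x1, mid), (x0, mid, x1, y1)
    new = CanNode(name, new_zone)
    new.neighbours = [j]                   # simplified neighbour update; real CAN
    j.neighbours.append(new)               # also informs J's old neighbours
    return new
```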

38 CAN: Node Failure Recovery
Simple failures:
- Know your neighbor's neighbors.
- When a node fails, one of its neighbors takes over its zone.
More complex failure modes:
- Simultaneous failure of multiple adjacent nodes.
- Use scoped flooding to discover neighbors.
- Hopefully a rare event.
Only the failed node's immediate neighbors are required for recovery.

39 Evaluation
- Scalability
- Low latency
- Load balancing
- Robustness

40 CAN: scalability
For a uniformly partitioned space with n nodes and d dimensions:
- Each node has 2d neighbors.
- The average routing path is (d/4) * n^(1/d) hops.
- Simulations show that these results hold in practice.
The network can be scaled without increasing per-node state.
For comparison, Chord/Plaxton/Tapestry/Buzz use log(n) neighbors with log(n) hops.
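A quick back-of-the-envelope check of these scaling expressions for some illustrative values of n and d (the numbers are arithmetic, not simulation data).

```python
import math

n = 1_000_000                        # number of nodes (illustrative)
for d in (2, 4, 10):
    neighbours = 2 * d               # per-node state
    hops = (d / 4) * n ** (1 / d)    # average routing path length
    print(f"d={d:2d}: {neighbours:2d} neighbours, ~{hops:6.1f} hops")

print(f"Chord-style: ~{math.log2(n):.0f} neighbours and ~{math.log2(n):.0f} hops")
```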

41 CAN: low latency
Problem:
- Latency stretch = (CAN routing delay) / (IP routing delay).
- Application-level routing may lead to high stretch.
Solution:
- Increase dimensions or realities (reduces the path length).
- Heuristics (reduce the per-CAN-hop latency): RTT-weighted routing, multiple nodes per zone (peer nodes), deterministically replicated entries.

42 CAN: low latency
(Plot: latency stretch vs. number of nodes (16K, 32K, 65K, 131K) for #dimensions = 2, with and without the heuristics.)

43 CAN: low latency
(Plot: latency stretch vs. number of nodes (16K, 32K, 65K, 131K) for #dimensions = 10, with and without the heuristics.)

44 CAN: load balancing
Two pieces:
Dealing with hot spots:
- Popular (key, value) pairs.
- Nodes cache recently requested entries.
- An overloaded node replicates popular entries at its neighbors.
Uniform coordinate-space partitioning:
- Uniformly spreads (key, value) entries.
- Uniformly spreads out routing load.

45 CAN: Robustness
Completely distributed:
- No single point of failure (though this does not cover the pieces of the database lost when a node fails).
- Database recovery is not explored here (in case there are multiple copies of the database).
Resilience of routing:
- Can route around trouble.

46 Strengths
- More resilient than flooding broadcast networks.
- Efficient at locating information.
- Fault-tolerant routing.
- Node and data high availability (with improvements).
- Manageable routing table size and network traffic.

47 Weaknesses
- Impossible to perform a fuzzy search.
- Susceptible to malicious activity.
- Must maintain coherence of all the indexed data (network overhead, efficient distribution).
- Still relatively high routing latency.
- Poor performance without the improvements.

48 Suggestions
- Catalog and meta-indexes to perform the search function.
- Extensions to handle mutable content efficiently, e.g. for web hosting.
- Security mechanisms to defend against attacks.

49 Ongoing Work
Topologically-sensitive CAN construction: distributed binning.
Goal: bin nodes such that co-located nodes land in the same bin.
Idea:
- Use a well-known set of landmark machines.
- Each CAN node measures its RTT to each landmark and orders the landmarks by increasing RTT.
CAN construction: place nodes from the same bin close together in the CAN.
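A sketch of the binning idea: each node measures its RTT to a well-known set of landmarks and uses the landmark ordering as its bin, so nodes with similar orderings (and thus likely nearby in the network) land in the same bin. The landmark names and RTT values below are hypothetical; a real node would measure them.

```python
LANDMARKS = ["l1", "l2", "l3", "l4"]

def bin_for(rtts):
    """rtts: dict landmark -> measured RTT in ms. Returns the node's bin label."""
    ordering = sorted(LANDMARKS, key=lambda l: rtts[l])
    return tuple(ordering)               # e.g. ('l3', 'l1', 'l4', 'l2')

node_a = bin_for({"l1": 40, "l2": 90, "l3": 15, "l4": 60})
node_b = bin_for({"l1": 42, "l2": 95, "l3": 12, "l4": 55})
print(node_a == node_b)                  # True: likely co-located, same CAN region
```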

50 Distributed Binning
(Plot: latency stretch vs. number of nodes (256, 1K, 4K) for #dimensions = 2 and #dimensions = 4, comparing naive partitioning without binning against construction with binning; 4 landmarks placed 5 hops away from each other.)

51 Ongoing Work (cont'd)
CAN security (Petros Maniatis - Stanford):
- Spectrum of attacks and appropriate counter-measures.
CAN usage:
- Application-level multicast (NGC 2001).
- Grass-roots content distribution.
- Distributed databases using CANs (J. Hellerstein, S. Ratnasamy, S. Shenker, I. Stoica, S. Zhuang).

52 Summary
CAN is an Internet-scale hash table and a potential building block in Internet applications.
Scalability: O(d) per-node state; the average routing path is (d/4) * n^(1/d) hops.
Low-latency routing: simple heuristics help a lot.
Robust: decentralized, can route around trouble.

