Peer-to-Peer Services Lintao Liu 5/26/03. Papers YAPPERS: A Peer-to-Peer Lookup Service over Arbitrary Topology Stanford University Associative Search.

Presentation transcript:

Peer-to-Peer Services Lintao Liu 5/26/03

Papers: (1) YAPPERS: A Peer-to-Peer Lookup Service over Arbitrary Topology (Stanford University). (2) Associative Search in Peer-to-Peer Networks: Harnessing Latent Semantics (AT&T Research Lab & Tel Aviv Univ.). (3) Cooperative Peer Groups in NICE (Univ. of Maryland).

YAPPERS: A P2P Lookup Service over Arbitrary Topology. Motivation: Gnutella-style systems work on arbitrary topology and flood queries; they are robust but inefficient, support partial queries, and work well for popular resources. DHT-based systems offer efficient lookup but expensive maintenance and, by nature, no support for partial queries. Solution: a hybrid system that operates on arbitrary topology and provides DHT-like search efficiency.

Design Goals: Impose no constraints on topology: no underlying structure for the overlay network. Optimize for partial lookups for popular keys (observation: many users are satisfied with a partial lookup). Contact only nodes that can contribute to the search results, with no blind flooding. Minimize the effect of topology changes: maintenance overhead is independent of system size.

Query Model: data is stored as <key, value> pairs. TotalLookup(N, k): return all values associated with k; the query initiates from node N. PartialLookup(N, k, n): return n values associated with k; if fewer than n are available, return all available.
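The two query types can be illustrated with a minimal Python sketch. It assumes a single in-memory dictionary standing in for the whole network, so the initiating node N is left out; the names `total_lookup`, `partial_lookup`, and `Store` are hypothetical and do not come from the paper.

```python
from typing import Dict, List

# Hypothetical stand-in for the network: key -> values stored under that key.
Store = Dict[str, List[str]]


def total_lookup(store: Store, k: str) -> List[str]:
    """Return every value associated with key k."""
    return list(store.get(k, []))


def partial_lookup(store: Store, k: str, n: int) -> List[str]:
    """Return n values for key k, or all of them if fewer than n exist."""
    return store.get(k, [])[:n]
```

A partial lookup can stop as soon as n values are found, which is why it can be much cheaper than a total lookup for popular keys.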

Basic Idea: The keyspace is partitioned into a small number of buckets, and each bucket corresponds to a color, so # of buckets = # of colors. Each node is assigned a color. Each node sends its <key, value> pairs to the node within its Immediate Neighborhood that has the same color as the key. IN(N): all nodes within h hops of node N.

One simple example: 2 colors

Immediate Neighborhood: Different nodes have different IN sets. Two problems. Consistency: the same node should get the same color in every IN set it belongs to. Stability: the same node should always be assigned the same color, even if it leaves and rejoins. Solution: hash the node's IP address to obtain its color.
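Hashing an identifier to a color might look like the sketch below. The choice of SHA-1 and the function name `color_of` are assumptions made for illustration; the slides only say that the IP address is hashed. The same function can map a key to its color.

```python
import hashlib


def color_of(identifier: str, num_colors: int) -> int:
    """Map an IP address (or a key) to a color in [0, num_colors)."""
    # SHA-1 is an assumed choice; any hash shared by all nodes would do.
    digest = hashlib.sha1(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_colors
```

Because the color depends only on the identifier, every neighborhood that contains a node computes the same color for it (consistency), and the node keeps that color when it leaves and rejoins with the same address (stability).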

More … When node X is inserting a <key, value> pair for key k, two questions arise. P1: what if multiple nodes in IN(X) have the same color? P2: what if no node in IN(X) has the same color as key k? Solution: for P1, randomly select one; for P2, use a backup scheme: the node with the next color. Each node therefore has a primary color (unique) and secondary colors (zero or more). Problems with this solution: color assignment is no longer consistent and stable, but the effect is isolated within the immediate neighborhood.
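A possible reading of the selection rule, as a hedged Python sketch: pick randomly among nodes of the key's color, and otherwise fall back to the next color. Interpreting "next color" as `(color + 1) % num_colors`, repeated until some node is found, is an assumption; `pick_storage_node` and its parameters are hypothetical names.

```python
import random
from typing import Dict, Optional


def pick_storage_node(
    neighborhood_colors: Dict[str, int],  # node -> primary color, for nodes in IN(X)
    key_color: int,
    num_colors: int,
    rng: random.Random,
) -> Optional[str]:
    """Choose the node in IN(X) that should store a key of the given color."""
    for offset in range(num_colors):
        # offset 0 is the key's own color; later offsets are the backup colors.
        color = (key_color + offset) % num_colors
        candidates = sorted(
            node for node, c in neighborhood_colors.items() if c == color
        )
        if candidates:
            # P1: several nodes share the color, so pick one at random.
            return rng.choice(candidates)
    return None  # empty neighborhood
```

A node chosen through a non-zero offset is holding the key under a secondary color.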

Extended Neighborhood: IN(A): Immediate Neighborhood. F(A): Frontier of node A, i.e. all nodes that are directly connected to IN(A) but not in IN(A). EN(A): Extended Neighborhood, the union of IN(v) for every v in F(A). In effect, EN(A) covers nodes within 2h + 1 hops. Each node needs to maintain these three sets of nodes for queries.
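The three sets can be computed with a bounded breadth-first search. The sketch below assumes an undirected overlay given as an adjacency dictionary and assumes that IN(A) contains A itself (zero hops); the function names are hypothetical.

```python
from collections import deque
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # node -> directly connected nodes


def immediate_neighborhood(graph: Graph, node: str, h: int) -> Set[str]:
    """IN(node): all nodes within h hops, found by bounded BFS."""
    seen = {node}
    queue = deque([(node, 0)])
    while queue:
        current, dist = queue.popleft()
        if dist == h:
            continue  # do not expand past h hops
        for nb in graph[current]:
            if nb not in seen:
                seen.add(nb)
                queue.append((nb, dist + 1))
    return seen


def frontier(graph: Graph, node: str, h: int) -> Set[str]:
    """F(node): nodes adjacent to IN(node) but not inside it."""
    inside = immediate_neighborhood(graph, node, h)
    return {nb for v in inside for nb in graph[v]} - inside


def extended_neighborhood(graph: Graph, node: str, h: int) -> Set[str]:
    """EN(node): union of IN(v) over every frontier node v."""
    result: Set[str] = set()
    for v in frontier(graph, node, h):
        result |= immediate_neighborhood(graph, v, h)
    return result
```

On the path a-b-c-d-e with h = 1, IN(a) is {a, b}, F(a) is {c}, and EN(a) is {b, c, d}, all of which lie within 2h + 1 = 3 hops of a.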

The network state information for node A (h = 2)

Searching with Extended Neighborhood: When node A wants to look up a key k of color C(k), it picks a node B with color C(k) in IN(A). If there are multiple such nodes, it randomly picks one; if there are none, it picks the backup node. B, using EN(B), sends the request to all nodes of color C(k). The other nodes do the same as B. Duplicate message problem: each node caches the unique query identifier.
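Duplicate suppression by query identifier can be sketched as follows. This assumes each query carries a string ID and that the cache is an unbounded set; the class and method names are hypothetical, and a real node would presumably expire old entries.

```python
from typing import Set


class QueryDeduplicator:
    """Per-node cache of query identifiers that were already handled."""

    def __init__(self) -> None:
        self.seen_queries: Set[str] = set()

    def should_process(self, query_id: str) -> bool:
        """Return True the first time a query ID arrives, False afterwards."""
        if query_id in self.seen_queries:
            return False  # duplicate: drop it instead of forwarding again
        self.seen_queries.add(query_id)
        return True
```

A node would call `should_process` before answering or forwarding, so a query that reaches it along two different paths is handled only once.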

More on Extended Neighborhood: All <key, value> pairs are stored within IN(X) (h hops from node X). Why does each node need to keep EN(X)? Advantage: the forwarding node is chosen based on local knowledge. Completeness: a query message for color C(k) can reach all nodes of color C(k) without touching any nodes of other colors (not counting backup nodes).

Maintaining Topology: Edge deletion (X-Y): the deletion message needs to be propagated to all nodes that have X and Y in their EN sets. Necessary adjustments: change the IN, F, and EN sets, and move <key, value> pairs if X or Y is in IN(A). Edge insertion: the insertion message needs to include the neighbor info so that other nodes can update their IN and EN sets.

Maintaining Topology: Node departure: a node X with w edges leaving is handled just like w edge deletions; the neighbors of X initiate the propagation. Node arrival: when X joins the network, it asks its new neighbors for their current topology view, builds its own extended neighborhood, and inserts w edges.

Problems with the basic design: Fringe nodes: low-connectivity nodes allocate a large number of secondary colors to their high-connectivity neighbors. Large fan-out: the forwarding fan-out degree at A is proportional to the size of F(A); this is desirable for partial lookup but not good for full lookup.

A is overloaded by secondary colors from B, C, D, E

Solutions: Prune fringe nodes: if the degree of a node is too small, find a proxy node. Biased backup node assignment: X assigns a secondary color to Y only when a * |IN(X)| > |IN(Y)|. Reducing forward fan-out: basic idea: try backup nodes, try common nodes.
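The biased assignment rule is a single inequality, shown here as a small Python check. The function name and the example value a = 2 are assumptions for illustration; the slides do not give a value for a.

```python
def may_assign_secondary_color(in_size_x: int, in_size_y: int, a: float) -> bool:
    """X may hand a secondary color to Y only if a * |IN(X)| > |IN(Y)|."""
    return a * in_size_x > in_size_y
```

With a = 2, a fringe node X with |IN(X)| = 3 cannot assign a secondary color to a hub Y with |IN(Y)| = 50, since 6 > 50 is false. Two nodes of similar size, say 30 and 50, pass the check because 60 > 50.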

Experiment: h = 2 (h = 1 is too small; with h > 2 the EN is too large). Topology: Gnutella snapshot. Exp1: Search Efficiency.

Distribution of colors per node

Fan-out:

Num of colors: effect on Search

Num of colors: effect on Fan-out

Conclusion and discussion: Each search disturbs only a small fraction of the nodes in the overlay. No restructuring of the overlay is needed. Each node has only local knowledge, so the system is scalable. Discussion: hybrid (unstructured and local DHT) system …

Associative Search in P2P Networks: Harnessing Latent Semantics. Motivation: similar to the previous paper: avoid blind search on unstructured P2P networks; not sensitive to node join/failure. Proposed system: Associative Overlays, based on an unstructured architecture, which improve the efficiency of locating rare items by “orders of magnitude”.

Basic Idea: Exploit the associations inherent in human selections. Steer the search process to peers that are more likely to answer a query. Guide Rule: a set of peers that satisfy some predicate. A search message is flooded only inside one specific Guide Rule.

Guide Rule

More on Guide Rules: Peers in one Guide Rule contain data items that are semantically similar, e.g. documents on philosophy, songs… Choosing a Guide Rule for a query message involves a “networking” aspect (connectivity properties…) and a “data mining” aspect (rules distill common interests). Maintenance: each peer maintains a small list of other peers. Possession Rules: guide rules automatically extracted from the shared items.

Algorithms: RAPIER Algorithm (Random Possession Rule): if a node is interested in A, B, C, etc., it randomly chooses one and performs a blind search within that rule; this is more efficient than plain blind search. GAS Algorithm (Greedy Guide Rule): decides the order in which to try different rules; each node issues a query in the rules that have been more effective on its past queries.
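The rule-selection step of both algorithms can be sketched in a few lines of Python. This is a simplification under stated assumptions: rules are plain strings, and "effectiveness" for GAS is modeled as a count of past queries each rule answered, which the slides do not specify. The function names are hypothetical.

```python
import random
from typing import Dict, List


def rapier_pick_rule(my_rules: List[str], rng: random.Random) -> str:
    """RAPIER: choose one of the node's own possession rules at random."""
    return rng.choice(my_rules)


def gas_rule_order(past_hits: Dict[str, int]) -> List[str]:
    """GAS: try rules in decreasing order of past effectiveness.

    past_hits maps a rule to how many earlier queries it answered
    (an assumed effectiveness measure); ties are broken by rule name.
    """
    return sorted(past_hits, key=lambda rule: (-past_hits[rule], rule))
```

For example, `gas_rule_order({"jazz": 2, "rock": 5, "folk": 2})` tries "rock" first, then "folk", then "jazz". In both cases the search inside the chosen rule is still a blind search, only over a much smaller set of peers.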

Conclusion and Discussion: It offers orders-of-magnitude improvement in the scalability of locating infrequent items. Discussion: the advantage comes from classifying unstructured peers into loose, so-called “Guide Rules”. I didn’t understand how they clearly define these rules and maintain them. A lot of mathematics and data mining material.

Cooperative Peer Groups in NICE: Trust management in a P2P environment. Design: reputation information is stored in a completely decentralized manner; non-cooperative nodes are identified efficiently. Basic operation: after a transaction between A and B, A signs a cookie and sends it to B, and B does the same. Cookies can be dropped as A or B likes.

Trust Graph: Vertices: nodes. If A has a cookie issued by B, we draw an edge from B to A. The weight of the edge is the value given by B in the cookie (1/0 in this paper). A can calculate how much A should believe B; two different ways are proposed.
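Building the trust graph from cookies, and deriving a trust value from it, might look like the sketch below. The "strongest path" measure used here (the best path's weakest edge) is an assumed example of an inference function, not necessarily one of the two the paper proposes; all names are hypothetical.

```python
from typing import Dict, List, Tuple

Cookie = Tuple[str, str, float]  # (issuer, holder, value given by the issuer)
TrustGraph = Dict[str, Dict[str, float]]


def build_trust_graph(cookies: List[Cookie]) -> TrustGraph:
    """One edge issuer -> holder per cookie, weighted by the cookie's value."""
    graph: TrustGraph = {}
    for issuer, holder, value in cookies:
        graph.setdefault(issuer, {})[holder] = value
        graph.setdefault(holder, {})
    return graph


def strongest_path_trust(graph: TrustGraph, source: str, target: str) -> float:
    """Trust of source in target (source != target): max over paths of the min edge."""
    best = {source: float("inf")}
    stack = [source]
    while stack:
        node = stack.pop()
        for nxt, weight in graph.get(node, {}).items():
            strength = min(best[node], weight)
            if strength > best.get(nxt, 0.0):
                best[nxt] = strength  # found a stronger path to nxt
                stack.append(nxt)
    return best.get(target, 0.0)
```

With the 1/0 weights from the slides, the result is 1.0 exactly when a chain of value-1 cookies leads from the source to the target, and 0.0 otherwise.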

Distributed Trust Inference: A wants to do a transaction with B: A needs to present some cookies to B showing that A is “good”. If A has a cookie issued by B, A just sends it to B. If A doesn’t have a cookie from B, A should collect related cookies to make B believe that it is “good”.

Distributed Trust Inference: Detailed procedure: A initiates a search for B’s cookies, recursively searching its neighbors. This generates a trust graph starting from B and ending at A. A presents all the cookies to B, and B decides whether A is “good” or not. It is A who collects the cookies, to avoid DoS attacks.

Refinements: Efficient searching: each node caches the cookies of its neighbors. When A searches for B’s cookies, A checks whether its neighbors have them. If not, it randomly chooses neighbors to forward to. This is like flooding: if a neighbor fails, try the next. This random walk is limited to a predefined number of steps.
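The bounded search can be sketched as a random walk that checks the neighbors' cookies at every step. This is a simplified single-walker version under stated assumptions: `holdings` stands in for each node's cache of its neighbors' cookies, and all names are hypothetical.

```python
import random
from typing import Dict, List, Optional, Set


def search_for_cookie(
    neighbors: Dict[str, List[str]],  # node -> its neighbors
    holdings: Dict[str, Set[str]],    # node -> issuers whose cookies it holds
    start: str,
    issuer: str,
    max_steps: int,
    rng: random.Random,
) -> Optional[List[str]]:
    """Walk at most max_steps hops looking for a holder of issuer's cookie."""
    path = [start]
    current = start
    for _ in range(max_steps):
        nbs = sorted(neighbors.get(current, []))
        if not nbs:
            return None  # dead end
        for nb in nbs:
            # The current node knows which cookies its neighbors hold.
            if issuer in holdings.get(nb, set()):
                return path + [nb]
        current = rng.choice(nbs)  # no neighbor has it: forward at random
        path.append(current)
    return None  # step budget exhausted
```

The returned path is the chain of nodes from the searcher to a cookie holder; `None` means the walk gave up within the step limit.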

Refinement: Negative cookies: if A has presented B with enough cookies to show that A is a “good” peer, B can search for “negative cookies” of A to make sure. The search follows the high-trust edges out of B. Preference lists: nodes that are highly likely to be “good”; A will try to do transactions with those nodes.

Conclusion and discussion: Scalable. Resistant to some attacks. But many aspects still don’t look very nice yet. Discussion??