Making Gnutella-like P2P Systems Scalable Presented by: Karthik Lakshminarayanan Yatin Chawathe, Sylvia Ratnasamy, Lee Breslau, Nick Lanham, and Scott.

Slides:



Advertisements
Similar presentations
Data and Computer Communications
Advertisements

Clayton Sullivan PEER-TO-PEER NETWORKS. INTRODUCTION What is a Peer-To-Peer Network A Peer Application Overlay Network Network Architecture and System.
Cognitive Publish/Subscribe for Heterogeneous Clouds Šarūnas Girdzijauskas, Swedish Institute of Computer Science (SICS) Joint work with:
GIA: Making Gnutella-like P2P Systems Scalable Yatin Chawathe Intel Research Seattle Sylvia Ratnasamy, Lee Breslau, Scott Shenker, and Nick Lanham.
Ýmir Vigfússon IBM Research Haifa Labs Ken Birman Cornell University Qi Huang Cornell University Deepak Nataraj Cornell University.
Technion –Israel Institute of Technology Computer Networks Laboratory A Comparison of Peer-to-Peer systems by Gomon Dmitri and Kritsmer Ilya under Roi.
1 An Overview of Gnutella. 2 History The Gnutella network is a fully distributed alternative to the centralized Napster. Initial popularity of the network.
Search and Replication in Unstructured Peer-to-Peer Networks Pei Cao, Christine Lv., Edith Cohen, Kai Li and Scott Shenker ICS 2002.
LightFlood: An Optimal Flooding Scheme for File Search in Unstructured P2P Systems Song Jiang, Lei Guo, and Xiaodong Zhang College of William and Mary.
Improving Gnutella Willy Henrique Säuberli Seminar in Distributed Computing, 16. November 2005 Papers: I.Making Gnutella-like P2P Systems Scalable; SIGCOMM.
P2p, Spring 05 1 Topics in Database Systems: Data Management in Peer-to-Peer Systems March 29, 2005.
Technion –Israel Institute of Technology Software Systems Laboratory A Comparison of Peer-to-Peer systems by Gomon Dmitri and Kritsmer Ilya under Roi Melamed.
Network Coding for Large Scale Content Distribution Christos Gkantsidis Georgia Institute of Technology Pablo Rodriguez Microsoft Research IEEE INFOCOM.
A Trust Based Assess Control Framework for P2P File-Sharing System Speaker : Jia-Hui Huang Adviser : Kai-Wei Ke Date : 2004 / 3 / 15.
1 SLIC: A Selfish Link-based Incentive Mechanism for Unstructured P2P Networks Qixiang Sun Hector Garcia-Molina Stanford University.
A Scalable Content-Addressable Network Authors: S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker University of California, Berkeley Presenter:
Exploiting Content Localities for Efficient Search in P2P Systems Lei Guo 1 Song Jiang 2 Li Xiao 3 and Xiaodong Zhang 1 1 College of William and Mary,
1 Maximizing Remote Work in Flooding-based P2P Systems Qixiang Sun Neil Daswani Hector Garcia-Molina Stanford University.
Search and Replication in Unstructured Peer-to-Peer Networks Pei Cao Cisco Systems, Inc. (Joint work with Christine Lv, Edith Cohen, Kai Li and Scott Shenker)
presented by Hasan SÖZER1 Scalable P2P Search Daniel A. Menascé George Mason University.
Chord-over-Chord Overlay Sudhindra Rao Ph.D Qualifier Exam Department of ECECS.
1 CS 194: Distributed Systems Distributed Hash Tables Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer.
CS522: Algorithmic and Economic Aspects of the Internet Instructors: Nicole Immorlica Mohammad Mahdian
1 Routing as a Service Karthik Lakshminarayanan (with Ion Stoica and Scott Shenker) Sahara/i3 retreat, January 2004.
Jennifer Rexford Princeton University MW 11:00am-12:20pm Wide-Area Traffic Management COS 597E: Software Defined Networking.
1CS 6401 Peer-to-Peer Networks Outline Overview Gnutella Structured Overlays BitTorrent.
1 Algorithms for Bandwidth Efficient Multicast Routing in Multi-channel Multi-radio Wireless Mesh Networks Hoang Lan Nguyen and Uyen Trang Nguyen Presenter:
INTRODUCTION TO PEER TO PEER NETWORKS Z.M. Joseph CSE 6392 – DB Exploration Spring 2006 CSE, UT Arlington.
1 Napster & Gnutella An Overview. 2 About Napster Distributed application allowing users to search and exchange MP3 files. Written by Shawn Fanning in.
1 Reading Report 4 Yin Chen 26 Feb 2004 Reference: Peer-to-Peer Architecture Case Study: Gnutella Network, Matei Ruoeanu, In Int. Conf. on Peer-to-Peer.
Link Recommendation In P2P Social Networks Yusuf Aytaş, Hakan Ferhatosmanoğlu, Özgür Ulusoy Bilkent University, Ankara, Turkey.
P2P Group Meeting (ICS/FORTH) Monday, 21 February, 2005 Making Gnutella-like P2P Systems Scalable (Yatin Chawathe, Sylvia Ratnasamy, Lee Breslau, Nick.
09/07/2004Peer-to-Peer Systems in Mobile Ad-hoc Networks 1 Lookup Service for Peer-to-Peer Systems in Mobile Ad-hoc Networks M. Tech Project Presentation.
1 - CS7701 – Fall 2004 Review of: Making Gnutella-like P2P Systems Scalable Paper by: – Yatin Chawathe (AT&T) –Sylvia Ratnasamy (Intel) –Lee Breslau (AT&T)
1 Pertemuan 20 Teknik Routing Matakuliah: H0174/Jaringan Komputer Tahun: 2006 Versi: 1/0.
Peer to Peer Research survey TingYang Chang. Intro. Of P2P Computers of the system was known as peers which sharing data files with each other. Build.
Full-Text Search in P2P Networks Christof Leng Databases and Distributed Systems Group TU Darmstadt.
Replication Strategies in Unstructured Peer-to-Peer Networks Edith CohenScott Shenker Some slides are taken from the authors’ original presentation.
Peer-to-Peer Services Lintao Liu 5/26/03. Papers YAPPERS: A Peer-to-Peer Lookup Service over Arbitrary Topology Stanford University Associative Search.
Super-peer Network. Motivation: Search in P2P Centralised (Napster) Flooding (Gnutella)  Essentially a breadth-first search using TTLs Distributed Hash.
GIA: Making Gnutella-like P2P Systems Scalable Yatin Chawathe Sylvia Ratnasamy, Scott Shenker, Nick Lanham, Lee Breslau (Several slides have been taken.
A Peer-to-Peer Approach to Resource Discovery in Grid Environments (in HPDC’02, by U of Chicago) Gisik Kwon Nov. 18, 2002.
Peer Pressure: Distributed Recovery in Gnutella Pedram Keyani Brian Larson Muthukumar Senthil Computer Science Department Stanford University.
GIA: Making Gnutella-like P2P Systems Scalable Yatin Chawathe Sylvia Ratnasamy, Scott Shenker, Nick Lanham, Lee Breslau Parts of it has been adopted from.
Efficient P2P Search by Exploiting Localities in Peer Community and Individual Peers A DISC’04 paper Lei Guo 1 Song Jiang 2 Li Xiao 3 and Xiaodong Zhang.
On Heterogeneous Overlay Construction and Random Node Selection in Unstructured P2P Networks Presenter: 游創文.
A Utility-based Approach to Scheduling Multimedia Streams in P2P Systems Fang Chen Computer Science Dept. University of California, Riverside
"A Measurement Study of Peer-to-Peer File Sharing Systems" Stefan Saroiu, P. Krishna Gummadi Steven D. Gribble, "A Measurement Study of Peer-to-Peer File.
By Jonathan Drake.  The Gnutella protocol is simply not scalable  This is due to the flooding approach it currently utilizes  As the nodes increase.
An overview of Gnutella
LightFlood: An Efficient Flooding Scheme for File Search in Unstructured P2P Systems Song Jiang, Lei Guo, and Xiaodong Zhang College of William and Mary.
P2p, Fall 06 1 Topics in Database Systems: Data Management in Peer-to-Peer Systems Search in Unstructured P2p.
Computer Networking P2P. Why P2P? Scaling: system scales with number of clients, by definition Eliminate centralization: Eliminate single point.
On Reducing Mesh Delay for Peer- to-Peer Live Streaming Dongni Ren, Y.-T. Hillman Li, S.-H. Gary Chan Department of Computer Science and Engineering The.
Aug 22, 2002Sigcomm 2002 Replication Strategies in Unstructured Peer-to-Peer Networks Edith Cohen AT&T Labs-research Scott Shenker ICIR.
Click to edit Master title style Multi-Destination Routing and the Design of Peer-to-Peer Overlays Authors John Buford Panasonic Princeton Lab, USA. Alan.
CS Spring 2014 CS 414 – Multimedia Systems Design Lecture 37 – Introduction to P2P (Part 1) Klara Nahrstedt.
P2P Search COP P2P Search Techniques Centralized P2P systems  e.g. Napster, Decentralized & unstructured P2P systems  e.g. Gnutella.
CS Spring 2012 CS 414 – Multimedia Systems Design Lecture 37 – Introduction to P2P (Part 1) Klara Nahrstedt.
School of Electrical Engineering &Telecommunications UNSW Cost-effective Broadcast for Fully Decentralized Peer-to-peer Networks Marius Portmann & Aruna.
09/13/04 CDA 6506 Network Architecture and Client/Server Computing Peer-to-Peer Computing and Content Distribution Networks by Zornitza Genova Prodanoff.
Advanced Computer Networks: Part 2 Complex Networks, P2P Networks and Swarm Intelligence on Graphs.
Decentralized Trust Management for Ad-Hoc Peer-to-Peer Networks Thomas Repantis Vana Kalogeraki Department of Computer Science & Engineering University.
Peer-to-Peer File Sharing Systems Group Meeting Speaker: Dr. Xiaowen Chu April 2, 2004 Centre for E-transformation Research Department of Computer Science.
Distributed Caching and Adaptive Search in Multilayer P2P Networks Chen Wang, Li Xiao, Yunhao Liu, Pei Zheng The 24th International Conference on Distributed.
CS Spring 2010 CS 414 – Multimedia Systems Design Lecture 24 – Introduction to Peer-to-Peer (P2P) Systems Klara Nahrstedt (presented by Long Vu)
Unstructured Networks: Search Márk Jelasity. 2 Outline ● Emergence of decentralized networks ● The Gnutella network: how it worked and looked like ● Search.
Impact of Neighbor Selection on Performance and Resilience of Structured P2P Networks Sushma Maramreddy.
GIA: Making Gnutella-like P2P Systems Scalable
Replica Placement Heuristics of Application-level Multicast
Presentation transcript:

Making Gnutella-like P2P Systems Scalable Presented by: Karthik Lakshminarayanan Yatin Chawathe, Sylvia Ratnasamy, Lee Breslau, Nick Lanham, and Scott Shenker

Central philosophy of the work File-sharing is a dominant P2P application DHTs might not be suitable for file-sharing Gnutella’s design –Simplicity –Unscalable (number of queries and system size) Improve Gnutella –Adapt overlay topology and search algorithms to accommodate heterogeneity

Why not DHTs? P2P clients are extremely transient –Can DHTs handle churn as well as unstructured? –How would Bamboo compare with Gnutella? Keyword searches are more prevalent –Inverted indices might not scale –No unambiguous naming convention Most queries are for hay –Well-replicated content is queried for most

Gnutella’s scaling problems Gnutella performs flooding-based search –Find files if they are replicated at small number of nodes –Obvious scaling issues Random walks –Forwarding is oblivious to node contents –Forwarding is oblivious to node load Bias towards high degree –Node capacity still not taken into account

GIA design 1.Dynamic topology adaptation –Nodes are close to high-capacity nodes 2.Active flow control scheme –Avoid overloaded hot-spots –Explicitly handles heterogeneity 3.One-hop replication of pointers to content –Allows high-capacity nodes to answer more queries 4.Search protocol –Based on random walks towards high-capacity nodes Exploit heterogeneity

1. Topology adaptation Goal: Make high-capacity nodes have high degree (i.e., more neighbors) Each node has a level of satisfaction, S –S = 0 if no neighbors (dissatisfied) –S = 1 if enough good neighbors (fully satisfied) –S is a function of capacity, degree, age of neighbors and capacity of node –Improve the neighbor set as long as S < 1

1. Topology adaptation Improving neighbor set –Pick a new neighbor –Decide whether to preempt an existing neighbor Depends on degree, capacity of neighbors Asymmetric links? Issues –Avoid oscillations – use hysteresis –Converge to a stable state

2. Proactive flow control Allocate tokens to neighbors based on processing capability –Cannot perform arbitrary dropping due to random walk mechanism of GIA Allocation is proportional to neighbors’ capacities –Incentive to announce true capacities Uses token assignment based on SFQ

3. One-hop replication Each GIA node maintains index of contents of all neighbors Exchanged during neighbor setup Periodically incrementally updated Flushed on node failures

4. Search protocol Biased random walk –Pick highest capacity node to which it has tokens –If no tokens, queues till tokens arrive TTLs to bound duration of random walks Book-keeping –Maintain list of neighbors to which a query (unique GUID) has been forwarded

Simulation results Compare four systems –FLOOD: TTL-scoped, random topologies –RWRT: Random walks, random topologies –SUPER: Supernode-based search –GIA: search using GIA protocol suite Metric: –Success-rate, Delay, Hop-count Knee/collapse point at a particular query rate –Collapse point: Per-node query rate at the knee Aggregate throughput that the system can sustain

System model Capacities of nodes based on UW study –Separated by 4 orders of magnitude Query generation rate for each node –Limited by node capacity Keyword queries are performed –Files are randomly replicated Control traffic consumes resources Use uniformly random graphs –Prevent bias against FLOOD and RWRT

Questions addressed by simulations What is the relative performance of the four algorithms? Which of the GIA components matters the most? What is the impact of heterogeneity? How does the system behave in the face of transient nodes?

Single search response GIA outperforms SUPER, RWRT & FLOOD by many orders of magnitude in terms of aggregate query load Also scales to very large size network as replication factor determines scalability

Factor Analysis No single component is useful by itself; the combination of all of them is what makes GIA scalable AlgorithmCollapse point RWRT RWRT+OHR RWRT+BIAS RWRT+TADAPT RWRT+FLWCTL AlgorithmCollapse point GIA 7 GIA – OHR GIA – BIAS 6 GIA – TADAPT 0.2 GIA – FLWCTL nodes, 0.1% replication

Impact of Heterogeneity GIA improves under heterogeneity Large CP-HC for GIA under uniform capacities as queries are directed towards high capacity nodes nodes, 0.1% replication

Node failures Even under heavy churn GIA outperforms the other algorithms (under no churn) by many orders of magnitude nodes GIA system

Implementation Capacity settings –Bandwidth, CPU, disk access –Configured by user Satisfaction level –Based on capacity, degree, age of neighbors and capacity of node –Adaptation interval I = T. K -(1-s), K = degree of aggressiveness Query resilience –Keep-alive message periodically sent –Optimizations on adaptation to avoid query dropping

Deployment Ran GIA on 83 nodes of PlanetLab for 15 min Artificially imposed capacities on nodes Progress of topology adaptation shown

Conclusions GIA: scalable Gnutella –3–5 orders of magnitude improvement in system capacity Unstructured approach is good enough! –DHTs may be overkill –Incremental changes to deployed systems Can DHTs be used for file-sharing at all?