Characterizing Overlay Topologies & Dynamics in Peer-to-Peer Networks Daniel Stutzbach, Reza Rejaie University of Oregon Subhabrata Sen AT&T Labs IEEE.

Presentation transcript:

Characterizing Overlay Topologies & Dynamics in Peer-to-Peer Networks
Daniel Stutzbach, Reza Rejaie (University of Oregon)
Subhabrata Sen (AT&T Labs)
IEEE Computer & Communications Workshop, Huntington Beach, October 25th, 2005

Slide 2/18 CCW 2005 http://mirage.cs.uoregon.edu/P2P
Motivation
P2P file-sharing systems are very popular in practice: collectively they have several million simultaneous users and account for 60% of all Internet traffic [CacheLogic Research 2005].
Most use an unstructured overlay.
Understanding overlay properties and dynamics is important for:
  Understanding how existing P2P systems function
  Developing and evaluating new systems
Unstructured overlays are not well understood.
We characterized the overlay topology in Gnutella for three reasons:
  Size: it is one of the largest P2P systems, with more than 1 million users.
  Maturity: it has been in use for several years, so older studies are available for comparison.
  Openness: no reverse-engineering is needed.

Slide 3/18
Defining the Problem
Gnutella uses a two-tier overlay, which improves scalability.
  Ultrapeers form an unstructured mesh (the top-level overlay).
  Leaf peers connect to the ultrapeers.
  eDonkey and FastTrack are similar.
Studying the overlay requires snapshots.
  Snapshots capture the overlay as a graph.
  Individual snapshots reveal graph properties.
  Consecutive snapshots reveal dynamics.
However, capturing accurate snapshots is difficult.
[Figure: two-tier overlay, labeling the top-level overlay, ultrapeers, and leaf peers]

Slide 4/18
Challenges in Capturing Accurate Snapshots
Snapshots are captured iteratively by a crawler.
An ideal snapshot is instantaneous, but the overlay is large and rapidly changing, so captured snapshots are likely to be distorted.
Previous studies captured either:
  Complete snapshots with a slow crawler, which are distorted, or
  Partial snapshots, which are less distorted but unrepresentative.
Some types of analysis require the whole graph.
Increasing crawler speed reduces distortion in captured snapshots.
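
The iterative capture described on this slide can be pictured as a breadth-first traversal of the overlay. The following is a minimal sketch, not Cruiser's actual implementation: `get_neighbors` is a hypothetical callable standing in for "contact a peer and ask for its neighbor list", and the toy dictionary stands in for live peers. Because each peer is contacted at a different moment, a real crawl of a changing overlay would record edges that never coexisted, which is the distortion the slide refers to.

```python
from collections import deque

def crawl(seed_peers, get_neighbors):
    """Breadth-first crawl returning (peers, edges) for one snapshot.

    get_neighbors is a hypothetical callable: it takes a peer ID and returns
    that peer's current neighbor list, or None if the peer is unreachable.
    """
    seen = set(seed_peers)
    edges = set()
    queue = deque(seed_peers)
    while queue:
        peer = queue.popleft()
        neighbors = get_neighbors(peer)
        if neighbors is None:  # peer departed or did not respond
            continue
        for neighbor in neighbors:
            edges.add(frozenset((peer, neighbor)))  # undirected edge
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen, edges

# Toy static overlay used in place of a live network.
toy = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
peers, links = crawl(["a"], toy.get)
```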

Slide 5/18
Cruiser: a Fast Gnutella Crawler
Features:
  Distributed, highly parallelized implementation
  Dynamic adaptation to bandwidth and CPU constraints
Cruiser is orders of magnitude faster than other P2P crawlers:
  Captures one million nodes in around 7 minutes
  140,000 peers/min, compared to 2,500 peers/min [Saroiu 02]
We investigated the effects of speed on distortion: 4% node distortion and 15% edge distortion.
Reference: Daniel Stutzbach and Reza Rejaie, "Capturing Accurate Snapshots of the Gnutella Network," Global Internet Symposium, March 2005.
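
As a quick check on the quoted rates, the crawl duration is simply the overlay size divided by the crawl rate. The sketch below uses the two rates from this slide; applying the 2,500 peers/min rate to a one-million-peer overlay is an illustrative extrapolation, not a measurement from the source.

```python
def crawl_minutes(num_peers, peers_per_minute):
    """Time to contact every peer once at a constant crawl rate."""
    return num_peers / peers_per_minute

overlay_size = 1_000_000  # roughly the size quoted on the slide

fast = crawl_minutes(overlay_size, 140_000)  # Cruiser's quoted rate
slow = crawl_minutes(overlay_size, 2_500)    # rate quoted for [Saroiu 02]

print(f"fast crawl: {fast:.1f} min, slow crawl: {slow:.0f} min")
```

At the faster rate the result is consistent with the "around 7 minutes" figure; at the slower rate the same overlay would take several hours, during which many peers would join and leave.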

Slide 6/18
Data Set
More than 80,000 snapshots, captured over the past year.
To examine static properties, we focus on four snapshots:

Date     | Total Nodes | Leaves  | Ultrapeers | Top-level Edges
9/27/04  | 725,120     | 614,912 | 110,208    | 1,212,772
10/11/04 | 779,535     | 662,568 | 116,967    | 1,244,219
10/18/04 | 806,948     | 686,719 | 120,229    | 1,331,745
2/2/05   | 1,031,471   | 873,130 | 158,345    | 1,964,121

To examine dynamic properties, we use slices:
  Each slice is 2 days of ~500 back-to-back snapshots.
  Slices were captured starting 10/14/04, 10/21/04, 11/25/04, 12/21/04, and 12/27/04.

Slide 7/18
Summary of Characterizations
Graph Properties
  Implementation heterogeneity
  Degree distribution: top-level degree distribution, ultrapeer-leaf connectivity, degree-distance correlation
  Reachability: path lengths, eccentricity, small world properties
  Resiliency
Dynamic Properties
  Existence of stable core: uptime distribution, biased connectivity
  Properties of stable core: largest connected component, path lengths, clustering coefficient

Slide 8/18
Top-level Degree
This is the degree distribution among ultrapeers.
There are obvious peaks at 30 and 70 neighbors.
A substantial number of ultrapeers have fewer than 30 neighbors.
What happened to the power law reported by prior studies?
[Figure annotations: max 30 in most clients; max 75 in some clients; custom]
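
A degree distribution like the one plotted here can be computed directly from a snapshot's edge list. This is a minimal sketch, assuming the snapshot is available as a list of undirected ultrapeer-to-ultrapeer edges; the four-node edge list is a made-up example, not Gnutella data.

```python
from collections import Counter

def degree_distribution(edges):
    """Map each degree value to the number of nodes having that degree."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())

# Hypothetical top-level edges among four ultrapeers.
edges = [("u1", "u2"), ("u1", "u3"), ("u2", "u3"), ("u3", "u4")]
hist = degree_distribution(edges)

for k in sorted(hist):
    print(f"degree {k}: {hist[k]} node(s)")
```

On a real snapshot, peaks in this histogram at client-imposed neighbor limits would show up as large counts at those degree values.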

Slide 9/18
What happened to the power law?
When a crawl is slow, many short-lived peers report long-lived peers as neighbors, but those neighbors are not all present at the same time.
The degree distribution from a slow crawl resembles prior results [Ripeanu 02 ICJ].

Slide 10/18
Shortest-Path Distances
Distribution of distances among ultrapeers (left): 70% of distances are exactly 4 hops.
Distribution of distances among all peers (right): most distances are 5 or 6 hops. This shows the effect of the two-tier topology with multiple parents.
Despite the overlay's large size, pair-wise distances are short.
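
The pair-wise distance distribution can be obtained by running a breadth-first search from every node of an unweighted snapshot graph. The sketch below assumes an adjacency-set representation and comparable node IDs; it visits all pairs, which is only practical for small graphs, so an analysis at Gnutella's scale would presumably need sampling or a more scalable method.

```python
from collections import Counter, deque

def distance_distribution(adj):
    """Histogram of shortest-path hop counts over all unordered node pairs."""
    hist = Counter()
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for target, hops in dist.items():
            if source < target:  # count each unordered pair once
                hist[hops] += 1
    return hist

# Toy path graph a - b - c - d.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
hist = distance_distribution(adj)
```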

Slide 11/18
Is Gnutella a Small World?
Small worlds arise naturally in many places: movie actors, the power grid, co-authors of papers.
Small world graphs have short distances but significant clustering, compared to a similar random graph.
Gnutella is a small world.
Very high clustering adversely affects flooding queries, but Gnutella is not clustered enough to affect performance.
[Table: mean distance and clustering coefficient, Gnutella vs. random graph]
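
One common way to quantify clustering is the mean local clustering coefficient: for each node, the fraction of its neighbor pairs that are themselves connected, averaged over all nodes. The slide does not say which definition the authors used, so the sketch below is an assumption, and nodes with fewer than two neighbors are counted as zero by convention.

```python
from itertools import combinations

def clustering_coefficient(adj):
    """Mean local clustering coefficient of an undirected graph."""
    total = 0.0
    for node, neighbors in adj.items():
        k = len(neighbors)
        if k < 2:
            continue  # no neighbor pairs; contributes 0 by convention
        linked = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
        total += 2 * linked / (k * (k - 1))
    return total / len(adj)

# Toy graph: triangle a-b-c plus a pendant node d attached to c.
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
cc = clustering_coefficient(adj)
```

Comparing this value, together with the mean shortest-path distance, against a random graph of the same size and edge count is the usual small-world test.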

Slide 12/18
Resiliency to Node Failure
Metric: ratio of connected peers after node failure.
The Gnutella topology is extremely resilient to random node failure.
It is resilient even when the highest-degree nodes are removed.
Complex algorithms are not necessary to achieve resiliency.
[Figure legend: random; highest degree first]
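
A resiliency experiment of this kind can be sketched as: pick a removal order (random, or highest degree first), remove a prefix of it, and measure how much of the surviving graph remains connected. The metric below, the fraction of surviving nodes in the largest connected component, is an assumed reading of "ratio of connected peers"; the function names and the star-shaped toy graph are illustrative only.

```python
import random

def largest_component_fraction(adj, removed):
    """Fraction of surviving nodes that lie in the largest connected component."""
    alive = set(adj) - set(removed)
    if not alive:
        return 0.0
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first search restricted to surviving nodes
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v in alive and v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best / len(alive)

def removal_order(adj, strategy, rng=None):
    """Order in which nodes fail: 'random' or 'highest_degree_first'."""
    nodes = list(adj)
    if strategy == "highest_degree_first":
        return sorted(nodes, key=lambda n: len(adj[n]), reverse=True)
    rng = rng or random.Random(0)
    rng.shuffle(nodes)
    return nodes

# Toy star graph: hub h with four spokes (a worst case for targeted removal).
star = {"h": {"a", "b", "c", "d"}, "a": {"h"}, "b": {"h"}, "c": {"h"}, "d": {"h"}}
order = removal_order(star, "highest_degree_first")
after_hub = largest_component_fraction(star, order[:1])
after_leaf = largest_component_fraction(star, ["a"])
```

The star graph falls apart when its hub is removed; the slide's point is that the Gnutella topology does not behave this way.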

Slide 13/18
Dynamic Properties
How does node churn affect overlay dynamics? Are some regions of the overlay more stable? How can we identify such a region?
Methodology:
  Capture a long series of back-to-back snapshots.
  Estimate the uptime of individual peers in the last snapshot.
  Group peers with uptime higher than a threshold.
  Examine biased connectivity within each group.
[Figure: timeline of snapshots showing a newly arrived peer, a departed peer, a peer present for 2 snapshots, and a peer present for 5 snapshots]
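
The first three methodology steps can be sketched as follows. Each snapshot is taken to be a set of peer IDs, and a peer's uptime is estimated as a lower bound: the time spanned by the run of consecutive snapshots, ending at the last one, in which the peer appears. That estimator and the 6-minute snapshot interval are assumptions for illustration (2 days divided by ~500 snapshots is a little under 6 minutes).

```python
def estimate_uptimes(snapshots, interval_minutes):
    """Lower-bound uptime (minutes) for each peer in the last snapshot.

    snapshots: list of sets of peer IDs, oldest first.
    """
    uptimes = {}
    for peer in snapshots[-1]:
        run = 0
        for snapshot in reversed(snapshots):
            if peer not in snapshot:
                break  # peer was absent, so its current session starts later
            run += 1
        uptimes[peer] = (run - 1) * interval_minutes
    return uptimes

def stable_core(uptimes, threshold_minutes):
    """Peers whose estimated uptime exceeds the threshold."""
    return {peer for peer, up in uptimes.items() if up > threshold_minutes}

# Four hypothetical back-to-back snapshots.
snapshots = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"a", "b", "c", "d"}]
uptimes = estimate_uptimes(snapshots, interval_minutes=6)
core = stable_core(uptimes, threshold_minutes=10)
```

Peer "b" illustrates why only the trailing run counts: it appears in the first snapshot, disappears, and returns, so its current session is short.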

Slide 14/18
Stable Core
Most peers have a short uptime, but some peers have been around for a long time.
Stable core: the set of peers with uptime higher than a threshold.
A higher threshold yields a more stable group of peers.
[Figure labels: T > 20 h; T > 10 h]

Slide 15/18
Biased Connectivity
Hypothesis: long-lived nodes tend to be more connected to other long-lived nodes.
Rationale:
  Once connected, they stay connected.
  Long-lived peers have more opportunities to become neighbors.
To quantify bias in the connectivity of the stable core:
  Randomize the edges to create a graph without biased connectivity.
  Compare the edges in the observed stable core with those in the randomized graph.
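
The slide does not specify how the edges are randomized. One standard choice, assumed here, is degree-preserving randomization by repeated double-edge swaps: pick two edges a-b and c-d and rewire them to a-d and c-b, so every node keeps its degree while any uptime-related bias is destroyed. The core edge count of the observed graph can then be compared with that of the randomized graph. The ring graph and the core set below are made-up inputs.

```python
import random

def randomize_edges(edges, swaps, rng):
    """Degree-preserving randomization via repeated double-edge swaps."""
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if len({a, b, c, d}) < 4:
            continue  # a shared endpoint would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue  # rewiring would duplicate an existing edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
    return edges

def core_edge_count(edges, core):
    """Number of edges with both endpoints inside the core."""
    return sum(1 for u, v in edges if u in core and v in core)

# Toy ring of 8 nodes with a hypothetical stable core {0, 1, 2, 3}.
edges = [(i, (i + 1) % 8) for i in range(8)]
core = {0, 1, 2, 3}
observed = core_edge_count(edges, core)
randomized = randomize_edges(edges, swaps=100, rng=random.Random(1))
baseline = core_edge_count(randomized, core)
```

In practice the baseline would be averaged over many randomized graphs before comparing it with the observed count.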

Slide 16/18
Stable Core Edges
The stable core has 20% to 40% more edges than the randomized graph.
Connectivity is biased in an onion-like fashion: peers are more likely to connect to other peers with the same or higher uptime.
We examined other properties of the stable core.
Despite high churn, there is a relatively stable backbone.

Slide 17/18
Summary
Characterizations of the Gnutella overlay based on recent and accurate snapshots.
Graph properties:
  The degree distribution in Gnutella is not a power law.
  Gnutella exhibits small world characteristics.
  Gnutella is resilient.
Dynamic properties:
  There is a stable core within the overlay topology.
  Peer churn causes the stable core to exhibit an onion-like biased connectivity.
  This effect is likely to occur in other unstructured P2P systems.
Reference: Daniel Stutzbach, Reza Rejaie, and Subhabrata Sen, "Characterizing Unstructured Overlay Topologies in Modern P2P File-Sharing Systems," Internet Measurement Conference, Berkeley, 2005.

Slide 18/18
Future Work
Examining the underlying causes of the biased connectivity
Exploring long-term trends in overlay properties
Characterizing churn
Characterizing properties of other widely deployed P2P systems:
  Kad (a DHT with more than 1 million users)
  BitTorrent
Developing sampling techniques for P2P

Slide 19/18
Ultrapeer->Leaf Degree
LimeWire ultrapeers have a limit of 30 leaf peers.
BearShare ultrapeers have a limit of 45 leaf peers.
There are distinct spikes at those points, with an even distribution of fewer leaf peers.
[Figure legend: LimeWire; BearShare; other; custom]