Technion –Israel Institute of Technology Computer Networks Laboratory A Comparison of Peer-to-Peer systems by Gomon Dmitri and Kritsmer Ilya under Roi.


Technion - Israel Institute of Technology, Computer Networks Laboratory
A Comparison of Peer-to-Peer Systems
by Gomon Dmitri and Kritsmer Ilya, supervised by Roi Melamed
Winter Semester 2003

Background
“Peer-to-Peer refers to a class of systems and applications that employ distributed resources (i.e., computing power, networking resources) to perform a task (i.e., content delivery, collaboration, e-commerce) in a decentralized manner on general-purpose operating systems and platforms. Peer-to-Peer (P2P) systems are increasingly becoming popular because they provide opportunities for real-time communication, ad-hoc collaboration and information sharing (e.g., illegal music-swapping in systems like Napster and Gnutella) in a large-scale distributed environment. Much research effort has focused lately on understanding the issues and solving the research problems in the P2P systems. In a dynamic environment, peers frequently join and leave the system.

Motivation P2P computing raises many interesting research problems in distributed systems. In this project we will investigate one of them, the lookup problem. How do you find any given data item in a large P2P system in a scalable manner, without any centralized servers or hierarchy? This problem is at the heart of any P2P system. It is not addressed well by most popular systems currently in use, and it provides a good example of how the challenges of designing P2P systems can be addressed.

The Goal of the Project
- Implementation of the query algorithm based on multiple random walks, as described in the paper “Search and Replication in Unstructured Peer-to-Peer Networks”.
- Execution of simulations with various parameters.
- Analysis of the results.

Introduction
Currently, there are several different architectures for P2P networks:
- Centralized: Napster and other similar systems have a constantly updated directory hosted at central locations (e.g., the Napster web site). Nodes in the P2P network issue queries to the central directory server to find which other nodes hold the desired files.
- Decentralized but Structured: These systems have no central directory server, but they have a significant amount of structure. The files are placed not at random nodes but at specified locations that make subsequent queries easier to satisfy. The Freenet P2P network is an example of such systems.
- Decentralized and Unstructured: These are systems in which there is neither a centralized directory nor any precise control over the network topology or file placement. Gnutella is an example of such designs.

Introduction (cont.)
- We focus on Gnutella-like decentralized, unstructured P2P systems. To find a file, a node queries its neighbors. The most typical query method is flooding, where the query is propagated to all neighbors within a certain radius. These unstructured designs are extremely resilient to nodes entering and leaving the system. However, the current search mechanisms scale very poorly, generating large loads on the network participants.
- We implement more scalable alternatives to existing Gnutella algorithms, focusing on the search and replication aspects. We quantify the poor scaling properties of the flooding search algorithms. We then propose, as an alternative, a k-walker random walk algorithm that greatly reduces the load generated by each query.

Introduction: What does “random walk” mean?
- Random walk is a well-known technique that forwards a query message to a randomly chosen neighbor at each step until the object is found. We call this message a “walker”.
- The standard random walk (which uses only one walker) can cut down the message overhead by an order of magnitude compared to expanding ring across the network topologies. However, there is also an order of magnitude increase in user-perceived delay.
- We implement the one-walker algorithm.
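The one-walker search described above could be sketched roughly as follows. This is only an illustration, not the project's code: the adjacency-dict layout, the `holdings` map, the function name `random_walk`, and the `max_hops` cap are all assumptions introduced here.

```python
import random

def random_walk(graph, holdings, start, resource, max_hops, rng=random):
    """Forward a single walker to a randomly chosen neighbor at each step.

    graph:    dict mapping node -> list of neighbor nodes (assumed layout)
    holdings: dict mapping node -> set of resources stored at that node
    max_hops: hypothetical cap on the walk length, so the walk always ends
    Returns the list of visited nodes ending at the holder, or None.
    """
    path = [start]
    node = start
    for _ in range(max_hops):
        if resource in holdings.get(node, set()):
            return path                      # found: hops = len(path) - 1
        neighbors = graph[node]
        if not neighbors:
            return None                      # dead end, nowhere to forward
        node = rng.choice(neighbors)         # uniform random neighbor
        path.append(node)
    # Check the node reached on the final hop.
    return path if resource in holdings.get(node, set()) else None


# Node A sends a walker to find song.mp3, which is stored on B.
graph = {"A": ["B"], "B": ["A"]}
holdings = {"B": {"song.mp3"}}
found_path = random_walk(graph, holdings, "A", "song.mp3", max_hops=5)
missing_path = random_walk(graph, holdings, "A", "other.mp3", max_hops=5)
```

The number of hops the walker took is `len(found_path) - 1`, which corresponds to the walker's TTL as the term is used in this presentation.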

What does it look like?
“Why shouldn’t I find a song?” A sends a walker to find song.mp3, which is stored on B.

Introduction: Replication
- When a node finds a file, it also gets information about the source of the file, e.g. which node the file was retrieved from. Sharing this information with other nodes in a P2P network should improve the performance of the P2P lookup mechanism.
- Such a sharing mechanism is called “replication”. In certain P2P systems such as Gnutella, only nodes that request an object make copies of the object. Other P2P systems such as Freenet allow for more proactive replication of objects, where an object may be replicated at a node even though the node has not requested the object.

Introduction: Replication strategies
There are various replication strategies. We implement three of them:
- “Owner replication”, where, when a search is successful, the object is stored at the requester node only. One can call this strategy “No Replication”. Owner replication is used in systems such as Gnutella.
- “Path replication”, where, when a search succeeds, the object is stored at all nodes along the path from the requester node to the provider node. Path replication is used in systems such as Freenet.
- “Random replication”, where, when a search succeeds, we count the number of nodes on the path between the requester and the provider, p, then randomly pick p of the nodes that the k walkers visited to replicate the object.
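As a rough sketch, the three strategies differ only in which nodes are chosen to store the object after a successful search. The function name, the strategy labels, and the reading of p as "every path node except the provider" are assumptions made for this illustration.

```python
import random

def replication_targets(strategy, path, visited, rng=random):
    """Pick the nodes that store the object after a successful search.

    path:    nodes from the requester (first) to the provider (last)
    visited: all nodes the walkers visited during the search
    """
    # Nodes on the path, without the provider and without repeats.
    on_path = list(dict.fromkeys(path[:-1]))
    if strategy == "owner":
        return [path[0]]                     # requester only
    if strategy == "path":
        return on_path                       # every node along the path
    if strategy == "random":
        # Assumption: p counts the path nodes excluding the provider.
        candidates = list(dict.fromkeys(visited))
        p = min(len(on_path), len(candidates))
        return rng.sample(candidates, p)     # p distinct visited nodes
    raise ValueError(f"unknown strategy: {strategy}")


path = ["A", "B", "C"]                       # A requested, C provided
visited = ["A", "B", "C", "D", "E"]          # includes other walkers' nodes
owner_nodes = replication_targets("owner", path, visited)
path_nodes = replication_targets("path", path, visited)
random_nodes = replication_targets("random", path, visited)
```

Under this reading, path and random replication create the same number of copies per successful search, so any difference between them comes from where the copies land.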

Animation of Path replication (legend: Resource, Node)

Animation of Random replication (legend: Resource, Node)

Simulation terms and parameters
- Buffer size: the size of the node’s local tables, where it keeps information about its resources. There is the “Local Resources” table, where a node keeps the resources it has, and the “Global Resources” table, where it keeps “pointers” to some resources on the network. The latter table is filled by the replication mechanism. Both tables have the same size.
- Replication Ratio (RR): the number of initial copies of a resource on the network.
- Unique resources: the total number of different resources on the network.
- Replication strategy: No Replication, Path Replication or Random Replication.
- TTL: the number of hops it takes a walker to find the requested resource.
- Success ratio: the percentage of walkers that succeeded in finding the requested resource.

Simulation Methodology
We base our project on already implemented code that generates a k-node network, where each node “knows” a number of its neighbors. We implement a Node, which simulates a single user workstation. As each workstation wakes up, it requests a vector of initial resources. The resources are distributed according to the specific policy defined per simulation. Once the network has been created, random requests begin to be generated. We used a uniform distribution in choosing target nodes as well as resources. When a node accepts a new request, it releases a walker to start looking for the resource. When the walker returns, the node updates its pool of resources as well as the global statistics. Under the Random and Path replication strategies, the node disperses shortcuts to the found resource among other nodes, as described above.

Simulation Methodology - Statistics
- When a walker returns, its node updates the global statistics.
- If the walker has found the resource, the global TTL is updated according to the walker’s TTL field (the number of hops the walker passed until it found the resource); otherwise the global TTL is left unchanged.
- The global success_ratio field is updated according to whether or not the walker found the requested resource.
- We used 20,000 walkers per simulation (on a network graph of 100 or 200 nodes) and averaged the results of 3 simulations.
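One way the bookkeeping above could look is sketched below. The class name `GlobalStats`, its fields, and the choice to report TTL as a mean over successful walkers are assumptions; the slides do not say exactly how the global TTL is combined.

```python
class GlobalStats:
    """Accumulates per-walker outcomes for one simulation run."""

    def __init__(self):
        self.walkers = 0        # walkers that have returned
        self.successes = 0      # walkers that found their resource
        self.total_hops = 0     # hops summed over successful walkers only

    def record(self, found, hops):
        """Update the statistics when a walker returns."""
        self.walkers += 1
        if found:
            self.successes += 1
            self.total_hops += hops   # failed walkers leave TTL unchanged

    @property
    def success_ratio(self):
        """Percentage of walkers that found the requested resource."""
        return 100.0 * self.successes / self.walkers if self.walkers else 0.0

    @property
    def average_ttl(self):
        """Mean hop count over successful walkers (assumed aggregation)."""
        return self.total_hops / self.successes if self.successes else 0.0


def mean_over_runs(values):
    """Average one metric over several simulation runs."""
    return sum(values) / len(values)


stats = GlobalStats()
stats.record(found=True, hops=4)
stats.record(found=True, hops=6)
stats.record(found=False, hops=10)
```

Averaging over three runs would then be `mean_over_runs([run1.average_ttl, run2.average_ttl, run3.average_ttl])`, where each `run` is a separate `GlobalStats` instance.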

Design – class diagram

Results & Analysis
When a replication technique is used, TTL becomes much lower. There is virtually no difference in TTL between Path and Random replication on the 100-node graphs; however, on the 200-node graphs and with larger buffer sizes, Random Replication performs slightly better.

Results & Analysis
Random Replication showed the best results in all simulations. The difference on the 200-node graph was slight compared with the 100-node graph. As expected, No Replication performed worst.

Results & Analysis
- RR=1, 2: no apparent differences between the types of replication.
- RR=5: slight differences begin to emerge. Random Replication shows the best results, followed by Path Replication.
- Reason for such low performance: there is not enough room in the buffers for the initial distribution of the resources, so many initial resources are lost at the beginning.

Results & Analysis
Again, as expected, Random Replication shows the lowest TTL, followed closely by Path Replication. No Replication trails both by a wider gap. There is also a substantial difference between RR=1, 2 and RR=5 (a drop from TTL = ~25 to TTL = ~20).

Results & Analysis
Again, as expected, Random Replication shows the best results in both cases.

Results & Analysis
The success ratio is much lower than in the uniform size distribution case, since the probability that a resource finds no place during the initial distribution is much higher. We changed the initial distribution algorithm (now, if the node a resource is about to be distributed to is already full, the resource is distributed to a node with empty space) and also the number of distributed resources. As a result, the success ratio increased (from ~15% to ~99%).
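The changed initial distribution can be illustrated with a short sketch. The function name `distribute`, the dict-of-lists buffers, and the random choice among nodes with free space are assumptions; the slides only state that a resource headed for a full node goes to a node with empty space instead.

```python
import random

def distribute(resources, nodes, capacity, rng=random):
    """Place each resource copy on a random node, with a fallback.

    If the randomly chosen node's buffer is full, the copy goes to some
    node that still has free space. A copy is lost only when every
    buffer in the network is full.
    Returns (tables, lost): node -> stored resources, and the lost copies.
    """
    tables = {node: [] for node in nodes}
    lost = []
    for resource in resources:
        target = rng.choice(nodes)
        if len(tables[target]) >= capacity:
            free = [n for n in nodes if len(tables[n]) < capacity]
            if not free:
                lost.append(resource)        # no room anywhere
                continue
            target = rng.choice(free)        # assumed fallback policy
        tables[target].append(resource)
    return tables, lost


nodes = ["n0", "n1"]
tables, lost = distribute(["r0", "r1", "r2", "r3"], nodes, capacity=2)
_, lost_overflow = distribute(["r0", "r1", "r2", "r3", "r4"], nodes, capacity=2)
```

Without the fallback, any copy whose first-choice node is full would be dropped, which matches the explanation given above for the low success ratio before the change.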