Slides for Chapter 10: Peer-to-Peer Systems

Slides:



Advertisements
Similar presentations
P2P data retrieval DHT (Distributed Hash Tables) Partially based on Hellerstein’s presentation at VLDB2004.
Advertisements

Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT and Berkeley presented by Daniel Figueiredo Chord: A Scalable Peer-to-peer.
Pastry Peter Druschel, Rice University Antony Rowstron, Microsoft Research UK Some slides are borrowed from the original presentation by the authors.
Peter Druschel, Rice University Antony Rowstron, Microsoft Research UK
Peer-to-Peer Systems Chapter 25. What is Peer-to-Peer (P2P)? Napster? Gnutella? Most people think of P2P as music sharing.
Peer-to-Peer (P2P) Distributed Storage 1Dennis Kafura – CS5204 – Operating Systems.
Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility Antony Rowstron, Peter Druschel Presented by: Cristian Borcea.
Exercises for Chapter 10: Peer-to-Peer Systems
Slides for Chapter 10: Peer-to-Peer Systems From Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edition 4, © Addison-Wesley.
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications Robert Morris Ion Stoica, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT.
Pastry Peter Druschel, Rice University Antony Rowstron, Microsoft Research UK Some slides are borrowed from the original presentation by the authors.
Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems Antony Rowstron and Peter Druschel Proc. of the 18th IFIP/ACM.
FRIENDS: File Retrieval In a dEcentralized Network Distribution System Steven Huang, Kevin Li Computer Science and Engineering University of California,
Spring 2003CS 4611 Peer-to-Peer Networks Outline Survey Self-organizing overlay network File system on top of P2P network Contributions from Peter Druschel.
Peer-To-Peer Systems Chapter 10 B. Ramamurthy. 6/25/2015B.RamamurthyPage 2 Introduction Monolithic application Simple client-server Multi-tier client-server.
Chord-over-Chord Overlay Sudhindra Rao Ph.D Qualifier Exam Department of ECECS.
Topics in Reliable Distributed Systems Fall Dr. Idit Keidar.
Slides for Chapter 1 Characterization of Distributed Systems From Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edition 3,
Wide-area cooperative storage with CFS
1 Peer-to-Peer Networks Outline Survey Self-organizing overlay network File system on top of P2P network Contributions from Peter Druschel.
Middleware for P2P architecture Jikai Yin, Shuai Zhang, Ziwen Zhang.
Storage management and caching in PAST PRESENTED BY BASKAR RETHINASABAPATHI 1.
Introduction to Peer-to-Peer Networks. What is a P2P network Uses the vast resource of the machines at the edge of the Internet to build a network that.
Freenet. Anonymity  Napster, Gnutella, Kazaa do not provide anonymity  Users know who they are downloading from  Others know who sent a query  Freenet.
Introduction Widespread unstructured P2P network
From Coulouris, Dollimore, Kindberg and Blair Distributed Systems: Concepts and Design Edition 5, © Addison-Wesley 2012 Exercises for Chapter 3: Networking.
Tapestry GTK Devaroy (07CS1012) Kintali Bala Kishan (07CS1024) G Rahul (07CS3009)
Distributed Systems Concepts and Design Chapter 10: Peer-to-Peer Systems Bruce Hammer, Steve Wallis, Raymond Ho.
Introduction to Peer-to-Peer Networks. What is a P2P network A P2P network is a large distributed system. It uses the vast resource of PCs distributed.
Exercises for Chapter 2: System models
© 2007 Cisco Systems, Inc. All rights reserved.Cisco Public 1 Version 4.0 Network Services Networking for Home and Small Businesses – Chapter 6.
Bruce Hammer, Steve Wallis, Raymond Ho
Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications Xiaozhou Li COS 461: Computer Networks (precept 04/06/12) Princeton University.
From Coulouris, Dollimore, Kindberg and Blair Distributed Systems: Concepts and Design Edition 5, © Addison-Wesley 2012 Slides for Chapter 10: Peer-to-Peer.
1 Distributed Hash Tables (DHTs) Lars Jørgen Lillehovde Jo Grimstad Bang Distributed Hash Tables (DHTs)
Security Michael Foukarakis – 13/12/2004 A Survey of Peer-to-Peer Security Issues Dan S. Wallach Rice University,
Exercises for Chapter 10: Peer-to-Peer Systems Peer-to-Peer Systems
From Coulouris, Dollimore, Kindberg and Blair Distributed Systems: Concepts and Design Edition 5, © Addison-Wesley 2012 Exercises for Chapter 12: Distributed.
An IP Address Based Caching Scheme for Peer-to-Peer Networks Ronaldo Alves Ferreira Joint work with Ananth Grama and Suresh Jagannathan Department of Computer.
1 MSCS 237 Communication issues. 2 Colouris et al. (2001): Is a system in which hardware or software components located at networked computers communicate.
Paper Survey of DHT Distributed Hash Table. Usages Directory service  Very little amount of information, such as URI, metadata, … Storage  Data, such.
Peer to Peer A Survey and comparison of peer-to-peer overlay network schemes And so on… Chulhyun Park
Peer-to-peer systems Chapter Outline Introduction Napster and its legacy Peer-to-peer middleware Routing overlay Pastry 2.
Peer-to-peer systems Chapter Outline Introduction Napster and its legacy Peer-to-peer middleware Routing overlay Pastry 2.
From Coulouris, Dollimore, Kindberg and Blair Distributed Systems: Concepts and Design Edition 5, © Addison-Wesley 2012 Design of Parallel and Distributed.
Distributed Computing Peer to Peer Computing Chapter 10: PEER TO PEER SYSTEMS.
Peer to Peer Network Design Discovery and Routing algorithms
Algorithms and Techniques in Structured Scalable Peer-to-Peer Networks
Bruce Hammer, Steve Wallis, Raymond Ho
INTERNET TECHNOLOGIES Week 10 Peer to Peer Paradigm 1.
09/13/04 CDA 6506 Network Architecture and Client/Server Computing Peer-to-Peer Computing and Content Distribution Networks by Zornitza Genova Prodanoff.
Large Scale Sharing Marco F. Duarte COMP 520: Distributed Systems September 19, 2004.
1 Tuesday, February 03, 2009 “Work expands to fill the time available for its completion.” - Parkinson’s 1st Law.
Peer-to-peer systems ”Sharing is caring”. Why P2P? Client-server systems limited by management and bandwidth P2P uses network resources at the edges.
Fabián E. Bustamante, Fall 2005 A brief introduction to Pastry Based on: A. Rowstron and P. Druschel, Pastry: Scalable, decentralized object location and.
Chapter 29 Peer-to-Peer Paradigm Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
Distributed Web Systems Peer-to-Peer Systems Lecturer Department University.
Peer-to-Peer Information Systems Week 12: Naming
Instructor Materials Chapter 5 Providing Network Services
WEB SERVICES From Chapter 19 of Distributed Systems Concepts and Design,4th Edition, By G. Coulouris, J. Dollimore and T. Kindberg Published by Addison.
Exercises for Chapter 8: Distributed File Systems
Peer-To-Peer Systems Chapter 10 B. Ramamurthy.
Peer-To-Peer Systems Chapter 10 B. Ramamurthy.
Indirect Communication Paradigms (or Messaging Methods)
Slides for Chapter 1 Characterization of Distributed Systems
Indirect Communication Paradigms (or Messaging Methods)
Peer-To-Peer Systems Chapter 10 B. Ramamurthy.
WEB SERVICES From Chapter 19, Distributed Systems
Peer-to-Peer Information Systems Week 12: Naming
Presentation transcript:

Slides for Chapter 10: Peer-to-Peer Systems From Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edition 4, © Addison-Wesley 2005 Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Introduction Peer-to-peer systems share these characteristics Each system share resources They may unique or duplicate Correct operation does not depends on any server Limited degree Algorithms for placing resources Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Figure 10.1: Distinctions between IP and overlay routing for peer-to-peer applications Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Napster and its legacy Napster is file sharing system which provides a means for users to share files. The first application in which a demand for a globally-scalable information storage and retrieval service emerged was the downloading of digital music files. Napster became very popular for music exchange after its launch in 1999. Several million users were registered and thousands were swapping music files simultaneously. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Napster and its legacy Napster architecture included centralized indexes but users supplied the files, which were stored and accessed on their personnel systems. Napster method of operation as follows in step by step. 1.File location Request 2.List of peers offering the files 3.File request 4.File delivered 5.Index update Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Figure 10.2: Napster: peer-to-peer file sharing with a centralized, replicated index Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Napster and its legacy Napster used a (replicated) unified index of all available music files. Object discovery and addressing is likely to become a bottleneck. Music files are never updated, avoiding any need to make all the replicas of file consistent after updates. No guarantees are require concerning the availability of individual files – if a music file is temporarily unavailable, it can be downloaded later when it available. This reduces the requirement for dependability of individual computers and their connections to the internet. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Peer-to-Peer middleware A key problem in the design of peer-to-peer applications is to provide a mechanism to enable clients to access data resources quickly and dependably wherever they are located throughout the network. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Peer-to-Peer middleware Middleware layers for the application-independent management of distributed resources on a global scale. Several research teams have completed peer-to-peer middleware platforms Example: Pastry, Tapestry, CAN, Chord and Kademlia These platforms are designed to place resources on a set of computers that are widely distributed throughout the internet. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Peer-to-Peer middleware Functional requirements The function of peer-to-peer middleware is to simplify the constructions of services that are implemented across many hosts in a widely distributed network. Other important requirement includes the ability to add new resources and to remove them at will and to add hosts to the service and remove them. Like other middleware, peer-to-peer middleware should offer simple programming interface. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Peer-to-Peer middleware Non-functional requirements To perform effectively, peer-to-peer middleware must also address the following non-functional requirements. 1.Global scalability: peer-to-peer middleware must therefore be designed to support applications that access millions of objects on tens of thousands of hundreds of thousands of hosts. 2.Load balancing: the performance of any system designed to exploit a large number of computers depends upon the balanced distribution of workload across them. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Peer-to-Peer middleware 3.Optimizationf for local interactions between neighboring peers: the middle ware should aim to place resources close to the nodes that access them the most. 4.Accommodating to highly dynamic host availability: Peer-to-peer systems are constructed from host computers that are free to join or leave the system at any time. 5.Security of data in an environment with heterogeneous trust: use of authentication and encryption mechanisms to ensure the integrity and privacy of information. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Peer-to-Peer middleware 6.Anonymity, deniability and resistance to censorship: A related requirement is that the hosts that hold data should be able plausibly to deny responsibility for holding or supplying it. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Peer-to-Peer middleware The design of a middleware layer to support global-scale peer-to-peer systems is therefore a difficult problem. Knowledge of the locations of objects must be partitioned and distributed throughout the network. In a network each node responsible for maintaining detailed knowledge of the locations of nodes and objects in a portion of the name space as well as a general knowledge of the topology of the entire name space. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Figure 10.3: Distribution of information in a routing overlay Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Routing overlays A distributed algorithm known as a routing overlay takes responsibility for locating nodes and objects. The GUID used to identify nodes and objects are an example of the pure name also known as opaque identifiers, since they reveal nothing about the locations of the objects to which they refer. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Routing overlays The main tasks of routing overlay Routs the request to a node at which a replica of the object resides. The routing overlay must also perform some other tasks: To make a new object available to a peer-to-peer service computes a GUID for the object and announces it to the routing overlay. When clients request the removal of objects from the service the routing overlay must make them unavailable. Nodes may join and leave the service. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Deletes all references to GUID and the associated data. Figure 10.4: Basic programming interface for a distributed hash table (DHT) as implemented by the PAST API over Pastry put(GUID, data) The data is stored in replicas at all nodes responsible for the object identified by GUID. remove(GUID) Deletes all references to GUID and the associated data. value = get(GUID) The data associated with GUID is retrieved from one of the nodes responsible it. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Makes the object corresponding to GUID inaccessible. Figure 10.5: Basic programming interface for distributed object location and routing (DOLR) as implemented by Tapestry publish(GUID ) GUID can be computed from the object (or some part of it, e.g. its name). This function makes the node performing a publish operation the host for the object corresponding to GUID. unpublish(GUID) Makes the object corresponding to GUID inaccessible. sendToObj(msg, GUID, [n]) Following the object-oriented paradigm, an invocation message is sent to an object in order to access it. This might be a request to open a TCP connection for data transfer or to return a message containing all or part of the object’s state. The final optional parameter [n], if present, requests the delivery of the same message to n replicas of the object. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Pastry Overlay Case Studies: Pastry, Tapestry The prefix routing approach is adopted by both Pastry and Tapestry. Pastry All the nodes and objects that can be accessed through Pastry are assigned 128-bit GUIDs. For nodes, these are computed by applying a secure hash function (such as SHA-1) to the public key with which each node is provided. For objects such as files, the GUID is computed by applying a secure hash function to the object’s name or to some part of the object’s stored state. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Overlay Case Studies: Pastry, Tapestry It will correctly route a message addressed to any GUID in O(log N) steps. Routing involves the use of an transport protocol (normally UDP) to transfer the message to a Pastry node that is ‘closer’ to its destination. It is fully self-organizing Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Overlay Case Studies: Pastry, Tapestry Routing algorithm · The full routing algorithm involves the use of a routing table at each node to route messages efficiently Stage I: simplified form of algorithm. Stage II: The full Pastry algorithm and shows how efficient routing is achieved with the aid of routing tables. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Figure 10. 6: Circular routing alone is correct but inefficient Figure 10.6: Circular routing alone is correct but inefficient Based on Rowstron and Druschel [2001] The dots depict live nodes. The space is considered as circular: node 0 is adjacent to node (2128-1). The diagram illustrates the routing of a message from node 65A1FC to D46A1C using leaf set information alone, assuming leaf sets of size 8 (l = 4). This is a degenerate type of routing that would scale very poorly; it is not used in practice. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Figure 10.7: First four rows of a Pastry routing table Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Figure 10.8: Pastry routing example Based on Rowstron and Druschel [2001] Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Tapestry Overlay Case Studies: Pastry, Tapestry Tapestry implements a distributed hash table and routes messages to nodes based on GUIDs associated with resources using prefix routing in a manner similar to Pastry. But Tapestry’s API conceals the distributed hash table from applications behind a DOLR interface. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Figure 10.10: Tapestry routing From [Zhao et al. 2004] Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Overlay Case Studies: Pastry, Tapestry Replicated resources are published with the same GUID. In Tapestry 160-bit identifiers are used to refer both to objects and to the nodes that perform routing actions. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Application Case Studies: Squirrel, OceanStore Squirrel web cache The authors of Pastry have developed the Squirrel peer-to-peer web caching service for use in local networks of personal computers Web caching · Web browsers generate HTTP GET requests for Internet objects like HTML pages, images, etc. Squirrel · The Squirrel web caching service performs the same functions using a small part of the resources of each client computer on a local network. The SHA-1 secure hash function is applied to the URL of each cached object to produce a 128-bit Pastry GUID. Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Application Case Studies: Squirrel, OceanStore The reduction in total external bandwidth used: The total external bandwidth is inversely related to the hit ratio, since it is only cache misses that generate requests to external web servers The latency perceived by users for access to web objects: The use of a routing overlay results in several message transfers (routing hops) across the local network to transmit a request from a client to the host responsible for caching the relevant object (the home node). The computational and storage load imposed on client nodes: The average number of cache requests served for other nodes by each node over the whole period of the evaluation was extremely low Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

OceanStore Application Case Studies: Squirrel, OceanStore The developers of Tapestry have designed and built a prototype for a peer-to-peer file store The design includes provision for the replicated storage of both mutable and immutable data objects Storage organization · OceanStore/Pond data objects are analogous to files, with their data stored in a set of blocks. But each object is represented as an ordered sequence of immutable versions that are kept forever. Any update to an object results in the generation of a new version. The versions share any unchanged blocks, following the copy-on-write technique for creating and updating objects Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Figure 10.11: Storage organization of OceanStore objects AGUID VGUID of current certificate version version i+1 VGUID of version i d1 d2 BGUID (copy on write) d3 root block version i indirection blocks data blocks d1 d2 d3 d4 d5 VGUID of version i-1 Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Figure 10.12: Types of identifier used in OceanStore Name Meaning Description BGUID block GUID Secure hash of a data block VGUID version GUID BGUID of the root block of a version AGUID active GUID Uniquely identifies all the versions of an object Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000

Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 © Addison-Wesley Publishers 2000