Peer-To-Peer Systems Chapter 10 B. Ramamurthy.

1 Peer-To-Peer Systems Chapter 10 B. Ramamurthy

2 Different Types of Systems
Monolithic application
Simple client-server
Multi-tier client-server
Request-response
Pull/push mode
Tightly/loosely coupled
Centralized and distributed systems
Master-slave systems (Hadoop and HDFS)
Peer-to-peer systems (BitTorrent)
The concept of overlays
B.Ramamurthy 4/8/2019

3 Introduction
Peer-to-peer systems represent a paradigm for the construction of distributed systems and applications in which data and computational resources are contributed by many hosts on the Internet. They were precipitated by the explosive growth of networking and the need to share resources. Peer-to-peer systems are emerging that have the capacity to share computing resources, storage and data (audio, video, genetic) present at the edges of the Internet on a global scale. They are very effective when used to store very large collections of immutable data, and typically do not work well for data that is frequently updated. Low overhead.

4 Important Characteristics
Their design ensures that each user contributes resources to the system. Nodes may not all contribute resources at the same level. All nodes in a peer-to-peer system have the same functional capabilities and responsibilities. Their correct operation does not depend on the existence of any centrally administered systems. They can be designed to provide limited anonymity to the providers and users of the resources.

5 Key Issues
Algorithms for the placement of data objects across many hosts, and for subsequent access to them in a manner that balances the workload and ensures availability without adding undue overhead. Use of existing naming, routing, data replication and security techniques. Building a reliable resource-sharing layer over an unreliable and untrusted collection of computers and networks.

6 Four Generations of P2P
First generation: the Napster music exchange service.
Second generation: file sharing with greater scalability, anonymity and fault tolerance (Freenet, Gnutella, Kazaa, BitTorrent).
Third generation: characterized by the emergence of middleware for application independence (Pastry, Tapestry, …).
Fourth generation: IPFS, the InterPlanetary File System.

7 Different types of Routing
IP vs. overlay routing

8 IP vs. overlay routing for peer-to-peer applications
Columns compared: IP and application-level routing overlay.

Scale
IP: IPv4 is limited to 2^32 addressable nodes. The IPv6 name space is much more generous (2^128), but addresses in both versions are hierarchically structured and much of the space is pre-allocated according to administrative requirements.
Overlay: Peer-to-peer systems can address more objects. The GUID name space is very large and flat (>2^128), allowing it to be much more fully occupied.

Load balancing
IP: Loads on routers are determined by network topology and associated traffic patterns.
Overlay: Object locations can be randomized and hence traffic patterns are divorced from the network topology.

Network dynamics (addition/deletion of objects/nodes)
IP: IP routing tables are updated asynchronously on a best-efforts basis with time constants on the order of 1 hour.
Overlay: Routing tables can be updated synchronously or asynchronously with fractions of a second delays.

Fault tolerance
IP: Redundancy is designed into the IP network by its managers, ensuring tolerance of a single router or network connectivity failure. n-fold replication is costly.
Overlay: Routes and object references can be replicated n-fold, ensuring tolerance of n failures of nodes or connections.

Target identification
IP: Each IP address maps to exactly one target node.
Overlay: Messages can be routed to the nearest replica of a target object.

Security and anonymity
IP: Addressing is only secure when all nodes are trusted. Anonymity for the owners of addresses is not achievable.
Overlay: Security can be achieved even in environments with limited trust. A limited degree of anonymity can be provided.

9 Curious to know how a GUID is generated?
Determine the values for the UTC-based timestamp and clock sequence to be used in the UUID.
For the purposes of this algorithm, consider the timestamp to be a 60-bit unsigned integer and the clock sequence to be a 14-bit unsigned integer. Sequentially number the bits in a field, starting with zero for the least significant bit.
Set the time_low field equal to the least significant 32 bits (bits zero through 31) of the timestamp in the same order of significance.
Set the time_mid field equal to bits 32 through 47 from the timestamp in the same order of significance.
Set the 12 least significant bits (bits zero through 11) of the time_hi_and_version field equal to bits 48 through 59 from the timestamp in the same order of significance.
Set the four most significant bits (bits 12 through 15) of the time_hi_and_version field to the 4-bit version number corresponding to the UUID version being created, as shown in the table above.
Set the clock_seq_low field to the eight least significant bits (bits zero through 7) of the clock sequence in the same order of significance.
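The bit-slicing steps listed on this slide can be sketched in a few lines of Python. This is only an illustration of the fields the slide names; the function name uuid_time_fields and the dictionary it returns are assumptions made for this example, and the remaining UUID fields (which the slide does not cover) are left out.

```python
def uuid_time_fields(timestamp: int, clock_seq: int, version: int = 1) -> dict:
    """Split a 60-bit timestamp and 14-bit clock sequence into UUID fields.

    Hypothetical helper for illustration: it covers only the fields named
    on the slide, not a complete UUID.
    """
    if not 0 <= timestamp < (1 << 60):
        raise ValueError("timestamp must fit in 60 bits")
    if not 0 <= clock_seq < (1 << 14):
        raise ValueError("clock sequence must fit in 14 bits")
    if not 0 <= version < 16:
        raise ValueError("version must fit in 4 bits")

    time_low = timestamp & 0xFFFFFFFF              # bits 0..31
    time_mid = (timestamp >> 32) & 0xFFFF          # bits 32..47
    time_hi = (timestamp >> 48) & 0x0FFF           # bits 48..59
    time_hi_and_version = time_hi | (version << 12)  # version in bits 12..15
    clock_seq_low = clock_seq & 0xFF               # bits 0..7 of clock sequence

    return {
        "time_low": time_low,
        "time_mid": time_mid,
        "time_hi_and_version": time_hi_and_version,
        "clock_seq_low": clock_seq_low,
    }


fields = uuid_time_fields(0x123456789ABCDEF, 0x3ABC)
print({name: hex(value) for name, value in fields.items()})
```

The sample timestamp is an arbitrary 60-bit value chosen so that each field is easy to read off from its hexadecimal digits.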

10 More on GUID
GUIDs are usually a secure hash of attributes of the resource, so the state of the resource is not expected to change. Hash of global clock + resource state + IP address +
Remember that N-fold replication is possible, so many replicas of an object with a given GUID may be present.
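As a rough illustration of deriving a GUID from a secure hash, the sketch below hashes the bytes of a resource with SHA-1 from Python's standard hashlib module. The choice of SHA-1 and of hashing only the content are assumptions made for this example; the slide does not say which hash function or which attributes a particular system uses. The function name guid_for is hypothetical.

```python
import hashlib


def guid_for(content: bytes) -> str:
    """Return a 160-bit GUID (40 hex digits) for some resource content.

    Assumption for illustration: the GUID is the SHA-1 digest of the
    content alone; a real system may hash other attributes as well.
    """
    return hashlib.sha1(content).hexdigest()


song = b"example immutable music file contents"
print(guid_for(song))
# The same bytes always give the same GUID, which is why this works best
# for resources whose state does not change.
print(guid_for(song) == guid_for(bytes(song)))
```

Because the GUID depends only on the hashed bytes, every replica of an unchanged object carries the same GUID, which is what makes N-fold replication straightforward.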

11 Overlay Routing
Overlay routing is different from IP routing but is strongly influenced by it. Routing tables may be updated synchronously or asynchronously. N-fold replication affects the routing.

12 Napster: peer-to-peer file sharing with a centralized, replicated index

13 Napster (contd.)
Napster demonstrated the feasibility of building a useful large-scale service that depends wholly on individual users' computers. Music files are not updated, so the state of each resource is stable. No guarantees are made about availability, which suited music files well.
Issue 1: one central server for the index.
Issue (?) 2: anonymity of the users.

14 Issue 3: Copyright Issues
The developers of Napster argued that they were not liable for the infringement of the copyrights of the owners because they were not participating in the copying process, which was performed entirely between users' machines. Their argument failed because the index servers were centralized and were an essential part of the process. Since the index servers were located at well-known addresses, their operators were unable to remain anonymous and so could be targeted in lawsuits.

15 Lessons Learned from Napster
Napster demonstrated the feasibility of building a useful large-scale service that depends wholly on data and computing resources owned by ordinary users. Napster used network locality to keep data transfers from swamping the network. It also took advantage of the special characteristics of the application for which it was designed in other ways: music files are not updated, so many replicas are possible, and no guarantees are required about the availability of individual files.

16 Freenet and Gnutella
Napster maintained a unified index of available files (resources). Gnutella and Freenet used partitioned and distributed indexes, with algorithms specific to each system. The "file location" problem: network file system (NFS)-like systems require substantial configuration.

17 Peer-to-Peer Middleware
Functional requirements: enable clients to locate and communicate with any individual resource; add and remove nodes; add and remove resources; offer a simple API.
Non-functional requirements: global scalability, load balancing, accommodating highly dynamic host availability, trust, anonymity, deniability…

18 Routing Overlay
A routing overlay locates nodes and objects. It is a middleware layer responsible for routing requests from clients to the hosts that hold the object to which each request is addressed. The main difference from IP routing is that this routing is implemented in the application layer (in addition to the IP routing at the network layer).

19 Distribution of information in a routing overlay
[Figure: four nodes A, B, C and D, each shown with its own partial routing knowledge (A's, B's, C's and D's routing knowledge); the legend distinguishes objects from nodes.]

20 Basic programming interface for a distributed hash table (DHT)
put(GUID, data): The data is stored in replicas at all nodes responsible for the object identified by GUID.
remove(GUID): Deletes all references to GUID and the associated data.
value = get(GUID): The data associated with GUID is retrieved from one of the nodes responsible for it.
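The three DHT operations can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not how any real DHT is implemented: the class name ToyDHT, the integer node identifiers, the replicas parameter and the rule that the numerically closest nodes are "responsible" for a GUID are all choices made for this example.

```python
class ToyDHT:
    """In-memory stand-in for a DHT, for illustration only.

    Assumptions: node ids and GUIDs are plain integers, and the nodes
    responsible for a GUID are the `replicas` nodes numerically closest to it.
    """

    def __init__(self, node_ids, replicas=2):
        self.replicas = replicas
        # One key/value store per node.
        self.stores = {node_id: {} for node_id in node_ids}

    def _responsible(self, guid):
        # Nodes ordered by numerical distance to the GUID, closest first.
        ranked = sorted(self.stores, key=lambda node_id: abs(node_id - guid))
        return ranked[: self.replicas]

    def put(self, guid, data):
        # Store a replica at every responsible node.
        for node_id in self._responsible(guid):
            self.stores[node_id][guid] = data

    def get(self, guid):
        # Retrieve the data from one of the responsible nodes.
        for node_id in self._responsible(guid):
            if guid in self.stores[node_id]:
                return self.stores[node_id][guid]
        return None

    def remove(self, guid):
        # Delete all references to the GUID and its data.
        for store in self.stores.values():
            store.pop(guid, None)


dht = ToyDHT(node_ids=[10, 50, 90], replicas=2)
dht.put(48, b"song")
print(dht.get(48))
dht.remove(48)
print(dht.get(48))
```

In this model the object with GUID 48 is replicated at nodes 50 and 10, the two closest identifiers, and node 90 never sees it.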

21 Basic programming interface for distributed object location and routing
publish(GUID): GUID can be computed from the object (or some part of it, e.g. its name). This function makes the node performing a publish operation the host for the object corresponding to GUID.
unpublish(GUID): Makes the object corresponding to GUID inaccessible.
sendToObj(msg, GUID, [n]): Following the object-oriented paradigm, an invocation message is sent to an object in order to access it. This might be a request to open a TCP connection for data transfer or to return a message containing all or part of the object's state. The final optional parameter [n], if present, requests the delivery of the same message to n replicas of the object.
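A minimal in-memory model may help show how the three object location and routing operations relate to one another. It is a sketch only: the class ToyDOLR, the snake_case method names and the idea of representing each hosting node by a handler function passed to publish are assumptions for this example, whereas in the interface above the publishing node itself becomes the host.

```python
class ToyDOLR:
    """In-memory stand-in for object location and routing, for illustration.

    Assumption: each host of an object is modelled as a callable that
    receives a message and returns a reply.
    """

    def __init__(self):
        self.hosts = {}  # GUID -> list of host handlers (replicas)

    def publish(self, guid, host):
        # Register `host` as one of the hosts for the object `guid`.
        self.hosts.setdefault(guid, []).append(host)

    def unpublish(self, guid):
        # Make the object corresponding to `guid` inaccessible.
        self.hosts.pop(guid, None)

    def send_to_obj(self, msg, guid, n=1):
        # Deliver `msg` to up to n replicas and collect their replies.
        return [host(msg) for host in self.hosts.get(guid, [])[:n]]


dolr = ToyDOLR()
dolr.publish("guid-1", lambda msg: "replica A got " + msg)
dolr.publish("guid-1", lambda msg: "replica B got " + msg)
print(dolr.send_to_obj("ping", "guid-1"))
print(dolr.send_to_obj("ping", "guid-1", n=2))
dolr.unpublish("guid-1")
print(dolr.send_to_obj("ping", "guid-1"))
```

Unlike the DHT interface, the object stays where it was published; only the mapping from GUID to hosts is managed by the routing layer.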

22 Overlay case studies: Pastry & Tapestry Routing
Prefix routing based on GUIDs. The search for the next node along the route is narrowed by applying a binary mask that selects an increasing number of hexadecimal digits from the destination GUID after each hop.
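The binary mask mentioned here can be illustrated as follows. This is a sketch assuming GUIDs are held as unsigned integers of a fixed width; the names prefix_mask and shares_prefix, and the 16-bit GUIDs used in the demonstration, are chosen for this example only.

```python
GUID_BITS = 128  # assumed GUID width; each hexadecimal digit covers 4 bits


def prefix_mask(digits: int, guid_bits: int = GUID_BITS) -> int:
    """Mask selecting the `digits` most significant hexadecimal digits."""
    return ((1 << (4 * digits)) - 1) << (guid_bits - 4 * digits)


def shares_prefix(a: int, b: int, digits: int, guid_bits: int = GUID_BITS) -> bool:
    """True if GUIDs a and b agree on their first `digits` hex digits."""
    mask = prefix_mask(digits, guid_bits)
    return (a & mask) == (b & mask)


# Short 16-bit GUIDs keep the demonstration readable.
for digits in range(1, 5):
    print(digits, hex(prefix_mask(digits, 16)),
          shares_prefix(0x65A1, 0x65FF, digits, 16))
```

Each hop widens the mask by one hexadecimal digit, so the set of nodes that still match the destination shrinks by a factor of up to 16.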

23 Figure 10.7: First four rows of a Pastry routing table

24 Figure 10.8: Pastry routing example Based on Rowstron and Druschel [2001]

25 Figure 10.9: Pastry’s routing algorithm
To handle a message M addressed to a node D (where R[p,i] is the element at column i of row p of the routing table):
1. If (L_-l < D < L_l) { // the destination is within the leaf set or is the current node
     forward M to the element L_i of the leaf set with GUID closest to D, or to the current node A.
   }
2. Else { // use the routing table to dispatch M to a node with a closer GUID
   2a. Find p, the length of the longest common prefix of D and A, and i, the (p+1)th hexadecimal digit of D.
   2b. If (R[p,i] ≠ null) forward M to R[p,i] // route M to a node with a longer common prefix
   2c. Else { // there is no entry in the routing table
         forward M to any node in L or R with a common prefix of length p but a GUID that is numerically closer.
       }
   }
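The routing steps in Figure 10.9 can be turned into a small runnable sketch. Several simplifications are assumed here and are not part of the figure: GUIDs are equal-length hexadecimal strings, the address space is treated as linear rather than circular, the routing table is a dictionary keyed by (row p, digit i), and the fallback step accepts any known node whose common prefix with D is at least p. The names next_hop and shared_prefix_len are hypothetical.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading hexadecimal digits that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def next_hop(current: str, dest: str, leaf_set: list, routing_table: dict) -> str:
    """Choose the node to which a message for `dest` is forwarded."""
    if dest == current:
        return current
    d = int(dest, 16)

    def distance(guid: str) -> int:
        return abs(int(guid, 16) - d)

    # Step 1: destination lies within the range covered by the leaf set.
    if leaf_set:
        values = [int(g, 16) for g in leaf_set]
        if min(values) <= d <= max(values):
            return min(leaf_set + [current], key=distance)

    # Step 2a: p = common prefix length, i = the (p+1)th digit of dest.
    p = shared_prefix_len(dest, current)
    i = dest[p]

    # Step 2b: use the routing table entry if there is one.
    entry = routing_table.get((p, i))
    if entry is not None:
        return entry

    # Step 2c: fall back to any known node that is numerically closer.
    known = list(leaf_set) + list(routing_table.values())
    closer = [g for g in known
              if shared_prefix_len(g, dest) >= p and distance(g) < distance(current)]
    return min(closer, key=distance) if closer else current


leaf_set = ["6598", "65A0", "65A3", "65B0"]
routing_table = {(0, "D"): "D13D", (1, "4"): "64F2"}
print(next_hop("65A1", "65A4", leaf_set, routing_table))  # leaf set case
print(next_hop("65A1", "D46A", leaf_set, routing_table))  # routing table case
print(next_hop("65A1", "6A00", leaf_set, routing_table))  # fallback case
```

The three calls exercise the leaf set branch, the routing table branch and the fallback branch in turn, using made-up four-digit GUIDs.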

26 Figure 10.10: Tapestry routing From [Zhao et al. 2004]

27 Special Features of Tapestry
Implements a DHT and routes messages based on GUIDs associated with resources, using prefix routing in a manner similar to Pastry.
160-bit identifiers (uniform for resources as well as nodes).
A periodic "publish" call caches the mapping in the routing table.

28 Summary
Napster: central index server; request location, get location, download, update index (why?). The need was for sharing music files (immutable). Study the Napster vs. copyright ownership legal issues (p.403). Index servers are not distributed but replicated. Music files are identified by a GUID, a secure hash code generated from the features of the file.
Next generation, Gnutella and Freenet: distributed index; designed to meet the need for automatic placement and subsequent location of distributed objects; middleware provides an API to the services. A distributed algorithm called a routing overlay takes responsibility for locating nodes and objects.
Pastry routing algorithm, O(log_16 N): see Figure 10.9 and the related explanation.
Host integration, host failure, locality (/redundancy), fault tolerance, dependability.
Circular and linear address space. Compare this with the IP address space?

29 References
http://web.cs.ucla.edu/classes/cs217/05BitTorrent.pdf