P2P Systems - 1 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Information Systems: A Short Overview of Peer-2-Peer Systems.


1 P2P Systems - 1 ©2005, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Information Systems: A Short Overview of Peer-2-Peer Systems Karl Aberer EPFL-IC lsirwww.epfl.ch

2 Overview
1. P2P Systems - Motivation
2. Unstructured P2P Overlay Networks
3. Hierarchical P2P Overlay Networks
4. Structured P2P Overlay Networks
5. Small World Graphs
6. Conclusions

3 Client-Server Information Systems
Web search engine
–Global scale application
Example: Google
–200-300 million searches/day
–4 × 10^9 Web pages
–100000 processors, 261000 disks
[Diagram: (1) the client sends Find "aberer" to the Google server; (2) the server returns the result: home page of Karl Aberer …]
Strengths
–Global ranking
–Fast response time
Weaknesses
–Infrastructure, administration, cost
–A new company for every global application?

4 (Semi-)Decentralized Information Systems
P2P music file sharing
–Global scale application
Example: Napster
–1.57 million users
–10 TeraByte of data (2 million songs, 220 songs per user) (February 2001)
–100 servers
[Diagram: (1) a peer sends Find "brick in the wall" to the Napster server; the diagram labels "pink floyd", "1 MB", "rock" as the schema; (2) the server returns the result: you find f.mp3 at peer X; (3) the peer requests and transfers file f.mp3 from peer X directly]

5 Lessons Learned from Napster
Strengths: Resource Sharing
–Every node “pays” for its participation by providing access to its resources: physical resources (disk, network), knowledge (annotations), ownership (files)
–Every participating node acts as both a client and a server (“servent”): P2P
–Global information system without huge investment
–Decentralization of cost and administration = avoiding resource bottlenecks
Weaknesses: Centralization
–Server is a single point of failure
–Unique entity required for controlling the system = design bottleneck
–Copying copyrighted material made Napster a target of legal attack
[Diagram: spectrum from Centralized System to Decentralized System, with increasing degree of resource sharing and decentralization]

6 2. Unstructured P2P Overlay Networks
P2P file sharing
–Global scale application
Example: Gnutella
–40,000 nodes, 3 million files (August 2000)
–Gnutella: no servers
[Diagram: one peer sends Find "brick in the wall"; another peer answers I have "brick_in_the_wall.mp3" …]
Strengths
–Good response time, scalable
–No infrastructure, no administration
–No single point of failure
Weaknesses
–High network traffic
–No structured search
–Free-riding
Self-organizing System

7 Self-Organization
Self-organized systems are well known from physics, biology, cybernetics
–distribution of control (= decentralization = P2P)
–local interactions, information and decisions
–emergence of global structures
–failure resilience
Self-organization in information systems
–new hot topic in research
–strongly motivated by P2P systems

8 Connectivity in Gnutella
Follows a power-law distribution: P(k) ~ k^(-g)
–k is the number of links a node is connected to, g is a constant (e.g. g = 2)
–distribution is independent of the number of nodes N
–explanation: preferential attachment (self-organization process)
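The preferential-attachment explanation can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the slides: the function names, the parameter m (links per joining node) and the fully connected starting core are all hypothetical, and the resulting exponent is not claimed to match Gnutella's measured g.

```python
import random
from collections import Counter

def preferential_attachment(n, m=2, seed=0):
    """Grow a graph to n nodes; each new node links to m existing nodes
    chosen with probability proportional to their current degree.
    (Illustrative model, not the Gnutella join protocol.)"""
    rng = random.Random(seed)
    # Assumed starting point: a fully connected core of m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Every node appears in `endpoints` once per link it has, so a uniform
    # draw from this list is a degree-proportional draw over nodes.
    endpoints = [v for e in edges for v in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges

def degree_histogram(edges):
    """Map k -> number of nodes that have exactly k links."""
    degree = Counter(v for e in edges for v in e)
    return Counter(degree.values())

if __name__ == "__main__":
    hist = degree_histogram(preferential_attachment(5000))
    for k in sorted(hist)[:10]:
        print(k, hist[k])
```

Plotting the histogram on log-log axes should show the roughly straight line that is typical of a power law: many nodes with few links and a few nodes with very many.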

9 3. Hierarchical P2P Overlay Networks
Dedicated servers provide index information, i.e. they know which peer holds which file (Napster)
Simplest Approach
–one central server
–users register files
–service (file exchange) is organized as P2P architecture
[Diagram: a query k="madonna" sent to the index server]
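A minimal sketch of the central-index idea: peers register what they hold, a query returns the peers to contact, and the file transfer itself happens peer to peer. The class and method names are hypothetical and do not reflect Napster's actual protocol.

```python
from collections import defaultdict

class IndexServer:
    """Hypothetical central index: maps a search key to the peers holding it."""

    def __init__(self):
        self._index = defaultdict(set)  # key -> set of peer ids

    def register(self, peer_id, keys):
        """A peer announces the keys (e.g. file names) it can serve."""
        for key in keys:
            self._index[key].add(peer_id)

    def lookup(self, key):
        """Return the peers to contact; the download bypasses the server."""
        return sorted(self._index.get(key, ()))

if __name__ == "__main__":
    server = IndexServer()
    server.register("peerX", ["madonna"])
    print(server.lookup("madonna"))
```

The server only stores the index, so storage and transfer costs stay with the peers, but every lookup still depends on this one component.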

10 Superpeer Networks
Improvement of Central Index Server (Morpheus, Kazaa)
–multiple index servers build a P2P network
–clients are associated with one (or more) superpeers
–superpeers use message flooding to forward search requests
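Message flooding among superpeers can be sketched as a breadth-first forward with a hop limit. The TTL parameter, the data layout and the function name are assumptions for illustration; the slides do not specify how Morpheus or Kazaa bound the flood.

```python
from collections import deque

def flood_search(neighbors, index, start, key, ttl=3):
    """Flood a query through the superpeer overlay.

    neighbors: superpeer -> list of neighbouring superpeers
    index:     superpeer -> {key: [client peers holding the key]}
    Returns (hits, messages) where messages counts every forwarded copy.
    """
    seen = {start}
    queue = deque([(start, ttl)])
    hits, messages = [], 0
    while queue:
        node, hops_left = queue.popleft()
        hits.extend(index.get(node, {}).get(key, []))
        if hops_left == 0:
            continue  # hop limit reached, do not forward further
        for nxt in neighbors.get(node, []):
            messages += 1            # every forwarded copy costs bandwidth
            if nxt not in seen:      # duplicates are received but dropped
                seen.add(nxt)
                queue.append((nxt, hops_left - 1))
    return hits, messages

if __name__ == "__main__":
    overlay = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}
    idx = {"D": {"madonna": ["peer7"]}}
    print(flood_search(overlay, idx, "A", "madonna", ttl=2))
```

The message counter makes the bandwidth cost visible: the number of forwarded copies grows with the number of overlay links reached, not with the number of matching files.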

11 4. Structured P2P Overlay Networks
Unstructured overlay networks – what we learned
–simplicity (simple protocol)
–robustness (almost impossible to “kill” – no central authority)
Performance
–search latency O(log N)
–update cost low
Drawbacks
–tremendous bandwidth consumption
Can we do better?

12 Search Trees
Search tree: search keys are binary keys
[Diagram: a binary search tree kept in a single index, labelled "Napster" bottleneck
 root:   ???
 level 1: 0?? 1??
 level 2: 00? 01? 10? 11?
 leaves:  000 001 010 011 100 101 110 111
 A query for key 101, issued by one of peers 1 to 4, is resolved step by step down the tree (? ? ? !).]

13 Non-scalable Distribution of Search Tree
Distribute search tree over peers
[Diagram: the same tree (??? / 0?? 1?? / 00? 01? 10? 11? / leaves 000 to 111) with its nodes spread over peer 1, peer 2, peer 3 and peer 4; one part of the tree is labelled "bottleneck"]

14 Scalable Distribution of Search Tree (P-Grid)
Associate each peer with a complete path
[Diagram: the same tree (??? / 0?? 1?? / 00? 01? 10? 11? / leaves 000 to 111); each of peer 1, peer 2, peer 3 and peer 4 is associated with one complete root-to-leaf path]

15 Routing Information
[Diagram: the path ??? / 1?? / 10? with leaves 100 and 101, together with peer 1, peer 2, peer 3 and peer 4; two annotations, each reading "knows more about this part of the tree", point from the path to other peers]

16 Prefix Routing
[Diagram: two peer paths. Peer 4: ??? / 1?? / 11? with leaves 110 and 111, with references to peer 1, peer 2 and peer 3. Peer 3: ??? / 1?? / 10? with leaves 100 and 101. A query for key 101 (101 ? ? ? ? !) arriving at peer 4 is forwarded as a message to peer 3.]
Routing table of peer 4:
 prefix   peer
 0??      peer 1, peer 2
 10?      peer 3
search(p, k)
 find in the routing table of p the peer i with the longest prefix matching k
 if last entry then found
 else search(peer i, k)
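The search procedure on this slide can be sketched in runnable form. This is one interpretation under stated assumptions, not P-Grid's actual implementation: each peer is given a two-bit path (peer 1 = 00, peer 2 = 01, peer 3 = 10, peer 4 = 11), the table keeps a single reference per prefix, and build_tables and search are hypothetical names.

```python
def build_tables(paths):
    """paths: peer -> binary path. For every level of a peer's path, keep a
    reference to one peer whose path agrees up to that level and then differs
    in the next bit (assumption: one reference per prefix)."""
    tables = {}
    for peer, path in paths.items():
        table = {}
        for level in range(len(path)):
            flipped = "1" if path[level] == "0" else "0"
            prefix = path[:level] + flipped
            candidates = sorted(p for p, q in paths.items() if q.startswith(prefix))
            if candidates:
                table[prefix] = candidates[0]
        tables[peer] = table
    return tables

def search(paths, tables, peer, key, route=None):
    """Forward the query until a peer whose path is a prefix of key is reached.
    Returns the list of peers visited."""
    route = (route or []) + [peer]
    if key.startswith(paths[peer]):
        return route  # this peer is responsible for the key
    # Longest routing-table prefix that matches the key.
    matches = [p for p in tables[peer] if key.startswith(p)]
    if not matches:
        raise LookupError(f"no route for key {key} at {peer}")
    best = max(matches, key=len)
    return search(paths, tables, tables[peer][best], key, route)

if __name__ == "__main__":
    paths = {"peer1": "00", "peer2": "01", "peer3": "10", "peer4": "11"}
    tables = build_tables(paths)
    print(tables["peer4"])
    print(search(paths, tables, "peer4", "101"))
```

Each hop extends the prefix that the current peer shares with the key by at least one bit, so the number of hops is bounded by the key length.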

17 Efficient Resource Location
[Figure: approaches compared by search cost, maximal bandwidth and update cost, on a scale from low to high: BROADCAST (e.g. Gnutella), SERVER (e.g. Napster), FULL REPLICATION, STRUCTURED OVERLAY NETWORKS (e.g. prefix routing)]

18 5. Small World Graphs
Each overlay network can be interpreted as a directed graph
–peers correspond to nodes
–routing table entries correspond to directed links
Task
–Find a decentralized algorithm (greedy routing) to route a message from any node A to any other node B with few hops compared to the size of the graph
–Requires the existence of short paths in the graph

19 Milgram’s Experiment
Finding short chains of acquaintances linking pairs of people in the USA who didn’t know each other
–Source person in Nebraska
–Sends message with first name and location
–Target person in Massachusetts
Average length of the chains that were completed was between 5 and 6 steps
“Six degrees of separation” principle
BIG QUESTION:
–WHY should there be short chains of acquaintances linking together arbitrary pairs of strangers?

20 Random Graphs
For many years the typical explanation was random graphs
–Vertices are selected uniformly at random
–Low diameter: expected distance between two nodes is log_k N (where k is the outdegree and N the number of nodes)
But there are some inaccuracies
–If A and B have a common friend C, it is more likely that they themselves will be friends! (clustering)
–Many real-world networks (social networks, biological networks in nature, artificial networks such as the power grid and the WWW) exhibit this clustering property
–Random networks are NOT clustered.
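The log_k N estimate can be checked against a measured value on a small random graph. The sketch below assumes a simple model that the slide does not spell out (every node picks k distinct out-neighbours uniformly at random), and all function names are hypothetical.

```python
import math
import random
from collections import deque

def expected_hops(n, k):
    """The slide's estimate: log base k of N."""
    return math.log(n) / math.log(k)

def random_graph(n, k, seed=0):
    """Assumed model: each node gets k distinct random out-neighbours (no self-links)."""
    rng = random.Random(seed)
    graph = {}
    for u in range(n):
        picks = rng.sample(range(n - 1), k)
        graph[u] = [v if v < u else v + 1 for v in picks]  # skip u itself
    return graph

def mean_distance(graph, source):
    """Average BFS hop count from source to every node it can reach."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return sum(dist.values()) / (len(dist) - 1)

if __name__ == "__main__":
    n, k = 2000, 4
    print("estimate:", expected_hops(n, k))
    print("measured:", mean_distance(random_graph(n, k), 0))
```

The measured average should land close to the estimate, which is the sense in which random graphs have a low diameter.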

21 Clustering
Clustering measures the fraction of neighbors of a node that are connected themselves
Regular graphs have a high clustering coefficient
–but also a high diameter
Random graphs have a low clustering coefficient
–but a low diameter
[Figure, left: Random Graph (k=4). Short path length L ~ log_k N. Almost no clustering, C ~ k/n]
[Figure, right: Regular Graph (k=4). Long paths L ~ n/(2k). Highly clustered, C ~ 3/4]
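The clustering coefficient defined above can be computed directly. The sketch assumes the regular graph is a ring lattice in which every node is linked to its k nearest neighbours; function names are hypothetical. For such a lattice the coefficient is 3(k-2)/(4(k-1)), which gives 1/2 for k = 4 and approaches the slide's 3/4 as k grows.

```python
from itertools import combinations

def ring_lattice(n, k):
    """Regular graph: each node is linked to its k nearest neighbours on a ring (k even)."""
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for step in range(1, k // 2 + 1):
            v = (u + step) % n
            adj[u].add(v)
            adj[v].add(u)
    return adj

def clustering(adj):
    """Average, over all nodes, of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for u, nbrs in adj.items():
        pairs = len(nbrs) * (len(nbrs) - 1) / 2
        if pairs:
            links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
            total += links / pairs
    return total / len(adj)

if __name__ == "__main__":
    for k in (4, 20):
        print(k, clustering(ring_lattice(100, k)))
```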

22 Small-World Networks
Random rewiring of regular graph (by Watts and Strogatz)
–With probability p rewire each link in a regular graph to a randomly selected node
–Resulting graph has high clustering and short path length
BUT! Watts-Strogatz explains the structure of the graph
–existence of short paths, high clustering
It does not explain how the shortest paths are found
–Gnutella networks are also small-world graphs
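The rewiring step can be sketched as follows. This is a simplified reading of the Watts-Strogatz procedure under assumptions of my own (undirected ring lattice as the regular graph, rewired links never duplicate an existing link, hypothetical function names); it only measures the path length, not the clustering.

```python
import random
from collections import deque

def watts_strogatz(n, k, p, seed=0):
    """Start from a ring lattice (k nearest neighbours, k even) and rewire
    each link with probability p to a randomly selected node."""
    rng = random.Random(seed)
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for step in range(1, k // 2 + 1):
            v = (u + step) % n
            adj[u].add(v)
            adj[v].add(u)
    for u in range(n):
        for step in range(1, k // 2 + 1):
            v = (u + step) % n
            if v in adj[u] and rng.random() < p:
                choices = [w for w in range(n) if w != u and w not in adj[u]]
                if choices:
                    w = rng.choice(choices)
                    adj[u].discard(v)
                    adj[v].discard(u)
                    adj[u].add(w)
                    adj[w].add(u)
    return adj

def mean_path_length(adj):
    """Average BFS distance over all ordered pairs that are connected."""
    total = pairs = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

if __name__ == "__main__":
    for p in (0.0, 0.01, 0.1):
        print(p, mean_path_length(watts_strogatz(200, 4, p)))
```

Even a small rewiring probability creates a few shortcuts across the ring, and those are enough to pull the average path length well below that of the regular graph.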

23 P2P Overlay Networks as Graphs
Each structured overlay network can be interpreted as a directed graph …
–peers correspond to nodes
–routing table entries correspond to directed links
… embedded in some space (each node has a coordinate!)
–P-Grid: interval [0,1]
–others: d-dimensional space
–etc.
Task
–Find a decentralized algorithm (greedy routing) to route a message from any node A to any other node B with few hops compared to the size of the graph

24 Kleinberg’s Small-World Model
Kleinberg’s small-world model
–Embed the graph into an r-dimensional grid
–constant number of short-range links (neighborhood)
–q long-range links: choose long-range links such that the probability of having a long-range contact at distance d is proportional to 1/d^r
Importance of r!
–Decentralized (greedy) routing performs best iff r = dimension of space
[Figure: grid example with r = 2]
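The model and greedy routing can be sketched on a small two-dimensional grid. The details below are assumptions made for illustration: lattice (Manhattan) distance, a grid without wrap-around, long-range contacts drawn independently (so a contact may repeat), and hypothetical function names.

```python
import random

def lattice_distance(a, b):
    """Manhattan distance between two grid coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def build_kleinberg_grid(side, r=2.0, q=1, seed=0):
    """Each node gets its grid neighbours plus q long-range contacts,
    chosen with probability proportional to 1 / d**r."""
    rng = random.Random(seed)
    nodes = [(x, y) for x in range(side) for y in range(side)]
    links = {}
    for u in nodes:
        local = [v for v in nodes if lattice_distance(u, v) == 1]
        others = [v for v in nodes if v != u]
        weights = [lattice_distance(u, v) ** -r for v in others]
        links[u] = local + rng.choices(others, weights=weights, k=q)
    return links

def greedy_route(links, source, target):
    """At every step, forward to the known contact closest to the target."""
    path = [source]
    while path[-1] != target:
        here = path[-1]
        path.append(min(links[here], key=lambda v: lattice_distance(v, target)))
    return path

if __name__ == "__main__":
    links = build_kleinberg_grid(20)
    route = greedy_route(links, (0, 0), (19, 19))
    print(len(route) - 1, "hops instead of", lattice_distance((0, 0), (19, 19)))
```

Greedy routing always terminates here because some grid neighbour is one step closer to the target; the long-range contacts only ever shorten the route.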

25 Influence of “r”
A_i consists of all nodes whose distance from u is between 2^i and 2^(i+1), i = 0..log N - 1
Given r = dim, each long-range contact of u is nearly equally likely to belong to any of the sets A_i
When q = log N, on average each node will have a link in each set A_i
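The "nearly equally likely" claim can be checked numerically. The sketch assumes that the number of nodes at distance d grows like d^(dim-1), so a set A_i receives unnormalised probability mass equal to the sum of d^(dim-1) / d^r over distances from 2^i up to, but excluding, 2^(i+1). The function name and this counting approximation are mine, not the slides'.

```python
def ring_mass(i, r, dim=1):
    """Unnormalised probability that a long-range contact falls into A_i,
    assuming about d**(dim - 1) nodes sit at distance d."""
    return sum(d ** (dim - 1) * d ** -r for d in range(2 ** i, 2 ** (i + 1)))

if __name__ == "__main__":
    for r in (0, 1, 2):
        masses = [ring_mass(i, r) for i in range(8)]
        print("r =", r, [round(m, 3) for m in masses])
```

With r equal to the dimension the mass per set settles near ln 2, so every distance scale is about equally likely. With smaller r the far sets dominate, and with larger r the near sets do.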

26 Structured Overlay Networks and Kleinberg's model
[Figure: P-Grid’s model shown next to Kleinberg’s model]

27 6. Conclusions
P2P systems
–started out from some "hacker-type" applications
–initiated lots of original research
–basis for novel, highly scalable information systems

28 Small Guide to Information Systems courses
Introduction to information systems (Sem 6, mandatory)
–Basis of all: relational databases and Web technology, fun project
Middleware (Masters)
–Everything you need in industry
Distributed Information systems (Masters)
–Fun part: Web, P2P, Search Engines, and much more

29 Masters Specialization "Internet Computing"
INF  Natural Language                            Rajman et al   6(4+2)  SS
INF  Advanced Databases                          Spaccapietra   6(3+3)  SS
INF  Multimedia Documents                        Vanoirbeek     6(4+2)  SS
ALG  Distributed Inf. Sys.                       Aberer         6(4+2)  WS
ALG  Intelligent Agents                          Faltings       6(3+3)  WS
ALG  Distributed algorithms: message passing     Schiper        4(2+1)  WS
SYS  Middleware                                  Guerraoui      6(4+2)  SS
SYS  Performance                                 LeBoudec       6(4+2)  SS
SYS  Mobile Networks                             Hubaux         4(2+2)  SS
SYS  Cryptography and Security                   Vaudenay       6(4+2)  WS
HIS  Human-Computer Interaction                  Pu             4(2+1)  SS
HIS  Enterprise Architecture                     Wegmann        6(4+2)  WS
HIS  E-Business                                  Pigneur        6(4+2)  WS
66 Credits Total

