Performance in Decentralized Filesharing Networks
Theodore Hong
Freenet Project
2 Styles of collaboration
- Centralized model
  - e.g. Napster
  - global index held by central authority (single point of failure)
  - direct contact between requestors and providers
- Decentralized model
  - e.g. Freenet, Gnutella
  - no global index: local knowledge only (approximate answers)
  - contact mediated by chain of intermediaries
3 Key questions
- Does it work?
  - can we find the data?
  - query success rates
  - length of query paths
- Does it scale?
  - logarithmic / linear / polynomial
- Is it robust?
  - participants are unreliable
  - different failure modes possible
4 An abstract model
- The network can be modeled as a graph
5 Querying the network
- Answering a query means finding a path
  - source = requestor
  - destination = provider
- A distributed search problem!
  - approximate global solution using local knowledge
  - same problem as IP routing
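To make the path-finding view concrete, here is a minimal sketch that stores a network as an adjacency list and finds a requestor-to-provider path with breadth-first search. The six-node network and the name `shortest_path` are made up for illustration. Note that breadth-first search sees the whole graph, so it gives the global answer that a decentralized search, working from local knowledge only, can merely approximate.

```python
from collections import deque

def shortest_path(graph, source, destination):
    """Breadth-first search over an adjacency-list graph.

    Returns the list of nodes from source to destination, or None
    if the destination cannot be reached.
    """
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == destination:
            # Walk the parent pointers back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None

# Hypothetical network: "A" is the requestor, "F" is the provider.
network = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D", "F"],
    "F": ["E"],
}
path = shortest_path(network, "A", "F")
print(path)  # ['A', 'B', 'D', 'E', 'F']
```

The length of this path (four hops) is the baseline against which the query pathlength of a local-knowledge search can be compared.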
6 The Freenet algorithm
- Graph structure actively evolves over time
  - new links form between nodes
  - files migrate through network
- Adaptive routing
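The slide does not spell out the routing rule, so the following is only a rough sketch in the spirit of the Freenet paper listed under "For more information": each node forwards a request to the neighbor associated with the key closest to the requested one, backtracks on dead ends, and on success caches the file and records a new link. Integer keys, absolute difference as the closeness measure, and the names `Node` and `request` are all assumptions made for illustration, not the real Freenet implementation.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.store = {}   # key -> data held locally
        self.routes = {}  # key -> neighboring Node believed to be near that key

def request(node, key, hops_to_live, visited=None):
    """Depth-first search steered by key closeness.

    Returns (data, holder) on success, or None if the search fails.
    """
    visited = set() if visited is None else visited
    visited.add(node.name)
    if key in node.store:
        return node.store[key], node
    if hops_to_live <= 0:
        return None
    # Try neighbors in order of how close their associated key is to the target.
    for known_key in sorted(node.routes, key=lambda k: abs(k - key)):
        neighbor = node.routes[known_key]
        if neighbor.name in visited:
            continue
        found = request(neighbor, key, hops_to_live - 1, visited)
        if found is not None:
            data, holder = found
            node.store[key] = data     # file migrates: cache a copy on the way back
            node.routes[key] = holder  # new link forms: remember who supplied the key
            return data, holder
    return None  # dead end: the caller backtracks to its next-best neighbor

# Hypothetical three-node network.
a, b, c = Node("a"), Node("b"), Node("c")
a.routes = {10: b, 90: c}  # a believes b is near key 10 and c is near key 90
c.store[12] = "doc"
result = request(a, 12, hops_to_live=5)
```

In this example the request for key 12 first goes to `b` (key 10 is the closest known key), hits a dead end, backtracks, and succeeds at `c`. Afterwards `a` holds its own copy and a direct route for key 12, which is the sense in which the graph evolves.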
7 Initial simulations
- Ring topology, 1000 nodes
8 Initial simulations (contd)
9 Why does it work?
- The small-world model
  - Milgram: six degrees of separation
  - Watts: between order and randomness
    - short-distance clustering + long-distance shortcuts
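The effect of combining short-distance clustering with a few long-distance shortcuts can be sketched numerically. The code below builds a ring lattice, measures average pathlength and clustering coefficient, then adds random shortcuts and measures again. It adds shortcuts instead of rewiring existing links (the Newman-Watts variant of the small-world model) so the graph stays connected. The sizes (200 nodes, 2 neighbors per side, 20 shortcuts) and all function names are illustrative assumptions.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Each node links to its k nearest neighbors on either side."""
    return {i: {(i + d) % n for d in range(-k, k + 1) if d != 0} for i in range(n)}

def add_shortcuts(graph, count, rng):
    """Add `count` random long-range links between nodes not yet connected."""
    n = len(graph)
    added = 0
    while added < count:
        a, b = rng.randrange(n), rng.randrange(n)
        if a != b and b not in graph[a]:
            graph[a].add(b)
            graph[b].add(a)
            added += 1

def average_path_length(graph):
    """Mean shortest-path length over all reachable ordered pairs (BFS from each node)."""
    total = pairs = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nb in graph[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def clustering_coefficient(graph):
    """Average fraction of a node's neighbor pairs that are themselves linked."""
    scores = []
    for nbrs in graph.values():
        nbrs = list(nbrs)
        if len(nbrs) < 2:
            scores.append(0.0)
            continue
        links = sum(1 for i, a in enumerate(nbrs) for b in nbrs[i + 1:] if b in graph[a])
        scores.append(2 * links / (len(nbrs) * (len(nbrs) - 1)))
    return sum(scores) / len(scores)

graph = ring_lattice(200, 2)
ring_len, ring_cc = average_path_length(graph), clustering_coefficient(graph)
add_shortcuts(graph, 20, random.Random(0))
small_len, small_cc = average_path_length(graph), clustering_coefficient(graph)
print(f"ring:        pathlength {ring_len:.1f}, clustering {ring_cc:.2f}")
print(f"small world: pathlength {small_len:.1f}, clustering {small_cc:.2f}")
```

The pure ring has high clustering but long paths. A handful of shortcuts shortens the paths considerably while leaving most of the local clustering intact, which is the regime "between order and randomness".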
10 Links in the small world
$P(n) \sim 1/n^{1.5}$
- Scale-free link distribution
  - $P(n) \propto 1/n^{k}$
  - most nodes have only a few connections
  - some have a lot of links
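As a quick check on what a power-law link distribution implies, the sketch below normalizes P(n) proportional to 1/n^1.5 over a finite range and compares the share of lightly and heavily linked nodes. The cutoff of 1000 links per node and the name `power_law` are assumptions chosen only to make the sum finite.

```python
def power_law(max_links, exponent):
    """Normalized distribution with P(n) proportional to n**-exponent for n = 1..max_links."""
    weights = [n ** -exponent for n in range(1, max_links + 1)]
    total = sum(weights)
    return [w / total for w in weights]

dist = power_law(max_links=1000, exponent=1.5)
share_few = sum(dist[:3])    # nodes with 1 to 3 links
share_many = sum(dist[99:])  # nodes with 100 or more links
print(f"1-3 links: {share_few:.1%}, 100+ links: {share_many:.1%}")
```

Under these assumptions more than half of the nodes have three links or fewer, while a small minority carry a hundred or more, matching the "few connections for most, many for some" picture.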
11 Small-world links (contd)
- Real-world examples
  - movie actors (Kevin Bacon game)
  - world-wide web
  - nervous system of worm C. elegans
12 The importance of routing
- Existence of short paths is not enough: they must be found
- Adaptivity helps Freenet find good paths
- Compare: a random-routing network
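The gap between a short path existing and a short path being found can be illustrated by comparing the true shortest path with a walk that picks a random neighbor at every hop. This is a toy comparison on a plain ring lattice of 100 nodes (an assumed setup), not a reproduction of the simulations on the slides. All names are hypothetical.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Each node links to its k nearest neighbors on either side."""
    return {i: [(i + d) % n for d in range(-k, k + 1) if d != 0] for i in range(n)}

def shortest_hops(graph, source, target):
    """Hop count of the shortest path, found by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return dist[node]
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return None

def random_walk_hops(graph, source, target, rng):
    """Hop count of a walk that forwards to a uniformly random neighbor each step."""
    node, hops = source, 0
    while node != target:
        node = rng.choice(graph[node])
        hops += 1
    return hops

rng = random.Random(42)
graph = ring_lattice(100, 2)
best = shortest_hops(graph, 0, 50)
walks = [random_walk_hops(graph, 0, 50, rng) for _ in range(20)]
mean_walk = sum(walks) / len(walks)
print(f"shortest path: {best} hops, random routing: {mean_walk:.0f} hops on average")
```

Random routing eventually reaches the provider, but typically after far more hops than the shortest path requires, which is why routing quality matters as much as graph structure.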
13 Scalability
- Real-world networks are much larger
  - nearly 400,000 downloads of Freenet
  - 50 million Napster users
- How well does Freenet scale?
14 Fault-tolerance
- Unreliability is normal in peer-to-peer
- Two types of failure:
  - random failure
  - targeted attack
15 Random failure
16 Targeted attack
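The difference between the two failure types can be sketched on a synthetic network with a skewed link distribution. The code below grows a graph by preferential attachment (an assumed stand-in for a scale-free network, not the Freenet simulator), removes 15% of the nodes either at random or in order of highest degree, and reports the size of the largest connected component that survives. All names and parameters are illustrative.

```python
import random

def preferential_attachment(n, m, rng):
    """Grow a graph where each new node links to m existing nodes,
    chosen with probability proportional to their current degree."""
    graph = {i: set() for i in range(n)}
    endpoints = []  # every node appears once per link it has
    for i in range(m + 1):  # seed: a small fully connected core
        for j in range(i):
            graph[i].add(j)
            graph[j].add(i)
            endpoints += [i, j]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            graph[new].add(t)
            graph[t].add(new)
            endpoints += [new, t]
    return graph

def largest_component(graph, removed):
    """Size of the largest connected component once `removed` nodes are deleted."""
    seen = set(removed)
    best = 0
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for nb in graph[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        best = max(best, size)
    return best

rng = random.Random(1)
graph = preferential_attachment(500, 2, rng)
count = 75  # 15% of the nodes
random_victims = rng.sample(sorted(graph), count)
hubs = sorted(graph, key=lambda v: len(graph[v]), reverse=True)[:count]
after_random = largest_component(graph, random_victims)
after_attack = largest_component(graph, hubs)
print(f"largest component: {after_random} after random failure, {after_attack} after attack")
```

Random failure mostly hits lightly linked nodes and leaves the network largely connected. Removing the best-connected nodes fragments it much faster, since those few hubs carry a disproportionate share of the links.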
17 To do
- Variable disk/bandwidth capacity
  - if you build it, will they come?
- Participants leaving and re-entering
- File lifetimes
  - lifetime is relative
  - relationship between ease of retrieval and popularity, size
  - impact of splitting and combining
18 Conclusions
- Local approximations can be good enough
- Small-world model provides useful framework
- Metrics to consider:
  - query pathlength
  - clustering coefficient
  - link distribution
- Issues to consider:
  - scalability
  - fault tolerance under various scenarios
19 For more information
- Performance chapter in Peer-to-Peer
- I. Clarke, O. Sandberg, B. Wiley, and T.W. Hong, "Freenet: A Distributed Anonymous Information Storage and Retrieval System," in Workshop on Design Issues in Anonymity and Unobservability, ed. H. Federrath. Springer: New York (2001).