Download presentation
Presentation is loading. Please wait.
1
FRIENDS: File Retrieval In a dEcentralized Network Distribution System Steven Huang, Kevin Li Computer Science and Engineering University of California, San Diego
2
FRIENDS Motivation Related works Initial approaches Implementation Future work
3
Peer-to-peer File Sharing Illegal file sharing has drawn negative publicity P2P systems have nice features Low barrier of entry Aggregates computation and storage on large scale Robust, secure
4
Early P2P File Distribution Systems Napster (http://www.napster.com) Uses central database reliability issues Gnutella (http://www.gnutella.com) Uses broadcast Not scalable Kazaa, Grokster, Morpheus Hierarchy of nodes like DNS Bandwidth is not well distributed Reliability issues
5
More P2P File Distribution Systems Freenet [Clarke, 2000] Focus on anonymity, files get lost BitTorrent (http://www.bittorrent.com) Focus on distributing file to many peers Not fully decentralized, requires tracker
6
Related Works Recent popularity has resulted in lots of research in P2P Focus is on scalability, balance, flexibility Overlay Networks CAN [RATNASAMY, 2001] Chord [STOICA, 2001] Pastry [ROWSTRON, 2001] Tapestry [HILDRUM, 2002] Object Integrity unchecked
7
Related Works MD5 Replication Reputation Management Distributed EigenTrust [Kamvar, 2003] Distributed Hash Tables
8
Research Question How can a fully decentralized P2P file distribution network support the verification of object integrity in a hostile environment?
9
Initial Approach Global table of file names with hash values Not decentralized, not scalable Use time of object entry as indicator of file validity Malicious nodes can tamper with time Bootstrap to physically close node using IP address Avoid multiple malicious nodes connecting to each other Reduces overall bandwidth Scalability issues, limited view of network Use a single MD5 value per file Waste a lot of bandwidth downloading 99% of an invalid file
10
FRIENDS System Implementation Built using MACEDON MACEDON hides low level details allowing us to focus on higher level research questions and reducing code Built protocol on top of Chord Pros: Scalable, de-centralized, supports distributed index Cons: Overlay network lacks bandwidth optimizations Focus is on “proof of concept” rather than performance
11
FRIENDS System Details Verification needs to be done client size since anyone can be an adversary Each unique object in the system has a set of MD5 hash value associated with it Files are broken up into 512kB sized chunks, with an associated MD5 value There is also an MD5 of all the MD5s which serves to reduce bootstrapping costs After downloading a portion of a file, run the object through the given hash function to verify the data is correct Reputation system limits downloads from malicious nodes
12
Distributed Hash Table Use a Distributed Hash Table to store the global table of object to hash value. Upon entering the system, the node will calculate the hash values of any new objects it wishes to add to the system Hashing “greenday basketcase” sends file information to node for “greenday” and node for “basketcase”
13
Can I trust you? Malicious users can target the system in a number of ways Incorrect routing – checks can be made to ensure progress Hosting invalid files – MD5 hashes ensure that the file being retrieved is the file desired Corrupted Hash Table – solved with replication across multiple nodes Malicious nodes can still affect a cooperative node by wasting its time and bandwidth
14
Blacklisting Keep a rating of nodes that have hosted downloaded objects. Each successfully hashed object increases that node’s rating (+1) Failed hash decreases it (-5) Filter potential sources against personal reputation records Friends trust friends Increase rating of nodes trusted by trusted nodes Exponential back-off
15
Future Work Replication Square-root replication [Lv, 2002]: Optimal number of replications = 1/ρ (Σ√q i ) 2 ; where q i is the request frequency of object I and ρ is the average number of replicas per site Bandwidth comparison tests using ModelNET Improve file name indexing Exploit locality to optimize bandwidth
16
Questions?
Similar presentations
© 2024 SlidePlayer.com Inc.
All rights reserved.