Peer-To-Peer Networks BitTorrent vs. FreeNet Presentation By Bishara Khleif Comp 529.

Slides:



Advertisements
Similar presentations
Pastry Peter Druschel, Rice University Antony Rowstron, Microsoft Research UK Some slides are borrowed from the original presentation by the authors.
Advertisements

Scalable Content-Addressable Network Lintao Liu
The BitTorrent Protocol. What is BitTorrent?  Efficient content distribution system using file swarming. Does not perform all the functions of a typical.
Incentives Build Robustness in BitTorrent Bram Cohen.
Clayton Sullivan PEER-TO-PEER NETWORKS. INTRODUCTION What is a Peer-To-Peer Network A Peer Application Overlay Network Network Architecture and System.
Modelling and Analysing of Security Protocol: Lecture 10 Anonymity: Systems.
Denial-of-Service Resilience in Peer-to-Peer Systems D. Dumitriu, E. Knightly, A. Kuzmanovic, I. Stoica and W. Zwaenepoel Presenter: Yan Gao.
CompSci 356: Computer Network Architectures Lecture 21: Content Distribution Chapter 9.4 Xiaowei Yang
Freenet A Distributed Anonymous Information Storage and Retrieval System Ian Clarke Oskar Sandberg Brandon Wiley Theodore W.Hong.
Internet Networking Spring 2006 Tutorial 12 Web Caching Protocols ICP, CARP.
Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.
Paul Solomine Security of P2P Systems. P2P Systems Used to download copyrighted files illegally. The RIAA is watching you… Spyware! General users become.
A Scalable Content-Addressable Network Authors: S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker University of California, Berkeley Presenter:
1 Spring Semester 2007, Dept. of Computer Science, Technion Internet Networking recitation #13 Web Caching Protocols ICP, CARP.
Protecting Free Expression Online with Freenet Presented by Ho Tsz Kin I. Clarke, T. W. Hong, S. G. Miller, O. Sandberg, and B. Wiley 14/08/2003.
Gnutella, Freenet and Peer to Peer Networks By Norman Eng Steven Hnatko George Papadopoulos.
presented by Hasan SÖZER1 Scalable P2P Search Daniel A. Menascé George Mason University.
Freenet A Distributed Anonymous Information Storage and Retrieval System I Clarke O Sandberg I Clarke O Sandberg B WileyT W Hong.
Wide-area cooperative storage with CFS
Improving Data Access in P2P Systems Karl Aberer and Magdalena Punceva Swiss Federal Institute of Technology Manfred Hauswirth and Roman Schmidt Technical.
1CS 6401 Peer-to-Peer Networks Outline Overview Gnutella Structured Overlays BitTorrent.
Freenet: A Distributed Anonymous Information Storage and Retrieval System Presentation by Theodore Mao CS294-4: Peer-to-peer Systems August 27, 2003.
Mobile Ad-hoc Pastry (MADPastry) Niloy Ganguly. Problem of normal DHT in MANET No co-relation between overlay logical hop and physical hop – Low bandwidth,
Freenet. Anonymity  Napster, Gnutella, Kazaa do not provide anonymity  Users know who they are downloading from  Others know who sent a query  Freenet.
Peer-to-Peer Computing CS587x Lecture Department of Computer Science Iowa State University.
1 Napster & Gnutella An Overview. 2 About Napster Distributed application allowing users to search and exchange MP3 files. Written by Shawn Fanning in.
Introduction Widespread unstructured P2P network
Privacy-Preserving P2P Data Sharing with OneSwarm -Piggy.
1 Reading Report 4 Yin Chen 26 Feb 2004 Reference: Peer-to-Peer Architecture Case Study: Gnutella Network, Matei Ruoeanu, In Int. Conf. on Peer-to-Peer.
BitTorrent Presentation by: NANO Surmi Chatterjee Nagakalyani Padakanti Sajitha Iqbal Reetu Sinha Fatemeh Marashi.
By Shobana Padmanabhan Sep 12, 2007 CSE 473 Class #4: P2P Section 2.6 of textbook (some pictures here are from the book)
BitTorrent How it applies to networking. What is BitTorrent P2P file sharing protocol Allows users to distribute large amounts of data without placing.
1 BitTorrent System Efrat Oune Bar-Ilan What is BitTorrent? BitTorrent is a peer-to-peer file distribution system (built for intensive daily use.
Content Overlays (Nick Feamster). 2 Content Overlays Distributed content storage and retrieval Two primary approaches: –Structured overlay –Unstructured.
CH2 System models.
An efficient secure distributed anonymous routing protocol for mobile and wireless ad hoc networks Authors: A. Boukerche, K. El-Khatib, L. Xu, L. Korba.
Peer to Peer Research survey TingYang Chang. Intro. Of P2P Computers of the system was known as peers which sharing data files with each other. Build.
A P2P file distribution system ——BitTorrent Pegasus Team CMPE 208.
Jonathan Walpole CSE515 - Distributed Computing Systems 1 Teaching Assistant for CSE515 Rahul Dubey.
Freenet: A Distributed Anonymous Information Storage and Retrieval System Presenter: Chris Grier ECE 598nb Spring 2006.
Freenet: A Distributed Anonymous Information Storage and Retrieval System Ian Clarke, Oskar Sandberg, Brandon Wiley,Theodore W. Hong Presented by Zhengxiang.
Freenet File sharing for a political world. Freenet: A Distributed Anonymous Information Storage and Retrieval System I. Clarke, O. Sandberg, B. Wiley,
Using the Small-World Model to Improve Freenet Performance Hui Zhang Ashish Goel Ramesh Govindan USC.
1 Distributed Hash Tables (DHTs) Lars Jørgen Lillehovde Jo Grimstad Bang Distributed Hash Tables (DHTs)
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications.
Peer-to-Peer Network Tzu-Wei Kuo. Outline What is Peer-to-Peer(P2P)? P2P Architecture Applications Advantages and Weaknesses Security Controversy.
1 Peer-to-Peer Technologies Seminar by: Kunal Goswami (05IT6006) School of Information Technology Guided by: Prof. C.R.Mandal, School of Information Technology.
P2PComputing/Scalab 1 Gnutella and Freenet Ramaswamy N.Vadivelu Scalab.
Freenet “…an adaptive peer-to-peer network application that permits the publication, replication, and retrieval of data while protecting the anonymity.
1 University of California, Irvine Done By : Ala Khalifeh (Note : Not Presented)
1 Secure Peer-to-Peer File Sharing Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, Hari Balakrishnan MIT Laboratory.
Computer Networking P2P. Why P2P? Scaling: system scales with number of clients, by definition Eliminate centralization: Eliminate single point.
Peer to Peer Network Design Discovery and Routing algorithms
Algorithms and Techniques in Structured Scalable Peer-to-Peer Networks
Freenet: Anonymous Storage and Retrieval of Information
Company LOGO Freenet By Yogesh Kalyani. OUTLINE  Introducing P2P  Overview of Freenet  Key’s  Protocol  Security Analysis  Improvement  Strength.
INTERNET TECHNOLOGIES Week 10 Peer to Peer Paradigm 1.
P2P Search COP6731 Advanced Database Systems. P2P Computing  Powerful personal computer Share computing resources P2P Computing  Advantages: Shared.
P2P Search COP P2P Search Techniques Centralized P2P systems  e.g. Napster, Decentralized & unstructured P2P systems  e.g. Gnutella.
09/13/04 CDA 6506 Network Architecture and Client/Server Computing Peer-to-Peer Computing and Content Distribution Networks by Zornitza Genova Prodanoff.
Large Scale Sharing Marco F. Duarte COMP 520: Distributed Systems September 19, 2004.
Netprog: Chat1 Chat Issues and Ideas for Service Design Refs: RFC 1459 (IRC)
P2P Networking: Freenet Adriane Lau November 9, 2004 MIE456F.
Chapter 29 Peer-to-Peer Paradigm Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
An example of peer-to-peer application
Introduction to BitTorrent
BitTorrent Vs Gnutella.
CHAPTER 3 Architectures for Distributed Systems
Presentation by Theodore Mao CS294-4: Peer-to-peer Systems
The BitTorrent Protocol
Presentation transcript:

Peer-To-Peer Networks BitTorrent vs. FreeNet Presentation By Bishara Khleif Comp 529

OVERVIEW P2P NETWORKS INTRODUCTION BITTORRENT FREENET

INTRODUCTION TO P2P NETWORKS  Peer To Peer distributed file system allows peers to communicate through its routing protocol over the application layer.  There are two designation of nodes: Client and Server ►Client node searches the database for a specific file to download cost free, such as mp3. ►Server acts as an up loader when the it has the requested file.  Nodes have four generally distinguishable stages ►Newly joining node exchanges basic information with other nodes and declares its available files. (neighbor finding) ►P2P location protocol helps locate queried files and routing protocol route its messages to the appropriate nodes. (location and routing protocol) ►Upon searching success, it begins to download the requested file. ►When done the node announces departure and logs off.

P2P GRAPH

P2P PERFORMANCE  Performance is divided according to three categorize: 1) Resilience is measured in reference to query success to failure ratio. The query fails if the node operator is down or the link to it fails. 2) Query efficiency is measured in reference to the average of query numbers and average query path length. And depending on which message forwarding system used the message number, for flooding based forwarding would be bigger than single route message forwarding. 3) scalability is measured in how adaptable and expandable the system to new nodes.

PERFORMANCE GRAPH

P2P ARCHITECTURE  There are three type of P2P architecture: 1) Centralized structure such as Napster where all objects are indexed in a central server. Peers looking for certain file will find its location or information about it from the central server, suffers from single point failure. 2) Decentralized but unstructured such as Gnutella has a random topology and queries are executed hop by hop till success of failure. 3) Decentralized and structured system use Distributed Hashing Table (DHT) to locate the queried object.

P2P COMPARISON GRAPH

SEARCH ALGORITHM  Object searching algorithm divides into two section: Random and Deterministic object search.  When using random search, queries would be forwarded to randomly selected neighboring nodes. The query keep on hoping from one node to its next randomly selected neighboring node. There are two kinds of random selection 1)Uniform random selects its neighboring node arbitrary from within its neighboring group. 2)Proportional random selects the next neighboring hop in proportion to its connection degree. So the bigger the connection degree the larger the query hit ratio would be.  When using Non-Random query search, the query’s next hop is structury determined first.

BittTorrent Definition  Webster dictionary defines a Torrent as a violently flowing stream of water/lava  Concatenating Bit to Torrent one would get a violently flowing stream of bits and that, I think, why the founder decided on naming it this way since its main purpose is to exchange rapidly large files between other peers at minimal cost.

Popularity  It’s popular because it feeds off the users files and storage. ►Traditional p2p file distribution system such as Napster had a file server that catered to its users that was costly and when failed service was disrupted. ► Popularly requested files congested the traffic and thus lacked effectiveness.

SOLUTION  Create a system that duplicates a large file among its users and have them upload & download to each other simultaneously and freely.  Make it effective so it can carry on a large amount of users to facilitate a rapid exchange of shared files.  Make it robust so that it does not have a single point of failure and disrupt its operation.  Make it user friendly so it can be adopted quickly by all level users.  Make it perform over an existing hardware that is easily accessible to everyone

PEERS CLASSIFICATION  Peers are categorized into two types ► Seeds which are peers that have already completed downloading a copy of the file. Now they can help replicate and distribute it among other active users. ► Leechers which are in the process of leeching the file off the system.

BitTorrent Example

ARCHITECTURAL FEATURES  An efficient swarming technique that BitTorrent algorithm dictates manages to balance the rate of exchange between peers. ► the file is divided into equal chunks and every user declares which chunk he has. ► the chunks are downloaded in parallel at the same time from different users. ► uses rarest first chunk so that chunks that do not have high request rate get downloaded first to keep users in exchanging mode.

► File chunks are further broken down into smaller pieces to avoid transmission delay over TCP connection and thus insuring proper sequence of delivery. ► Further, leechers will not get to download a new chunk unless the whole sub-pieces of the previous chunk has been received as to avoid arbitrary and unorganized download of bits and pieces. ► If for some reason two peers experience slow upload they can instead upload to each other while getting a better downloading performance.

► enforces tit-for-tat (selflessness) policy where information exchange is encouraged. ► if a user download without uploading he will temporary be chocked, or denied uploading. ► The back bone of the system is based on a Tracker that keeps track of all the active peers and their activities without involvement in the file distribution process.

► seeds usually limit simultaneous leechers service to only 4 but rotate every 10 seconds when needed. ► To further enhance downloading ability, seeds download to leechers with high downloading rate so that they in turn finish and become seeds to alleviate the system load. ► The older peer have a higher priority for downloading since he would have downloaded more chunks of the file than the newly comer and thus would have better chance in completing the download.

PERFORMANCE  The result of a five month study shows that on average down loaders achieved above 500kb/s which provided a clear understanding that most users have high bandwidth connection such as ADSL.  Although The chocking algorithm that advocates cooperation between peers is not part of the technical support of BitTorrent, it insures better reciprocating performance for all users having either high or low bandwidth.

GRAPH RESULT

EFFICIENCY  When BitTorrent realizes that one peer is downloading while not uploading, it will choke him and deny him service to insure efficiency of the system.  The goal of the algorithm is to have several active peers working together in a unidirectional way replicating and transferring files among themselves to insure rapid completeness.  The algorithm also adheres to globally optimize reciprocating peers connection as a whole.  It also allows users to suspend and resume downloading either by choice or by necessity. Its done by the tracker log that identifies the user by its session which consists of hashing his IP address with current time of download.

BItTORRENT INTERFACE  To start using bitTorrent you would download the latest version from a web sites that advocates it (lots).  Start the tracker and surf the web looking for a.torrent file to leech and maybe also seed.  The tracker exchange information between users over regular HTTP or HTTPS.  The tracker gets updated by each active user of its up or down loading progress.  The bitTorrent connections protocol between users is done over TCP.  Usually the one having the original copy is uploading only.

BitTorrent Coding  BitTorrent protocol uses bee encoding format between the Tracker and Metainfo file correspondence.  Message transfer that are be-encoded consist of nested dictionaries and lists that include strings and integers. ► Strings are be-encoded with length-prefixed base ten followed by a colon and the string for example 4:spam means simply spam ► Integers are be-encoded by representing them with an ‘i’ followed by a base ten number and an ‘e’. There is no size limitation on integers. for example i4e corresponds to number 4. ► Lists are be-encoded as an ‘l’ followed by their bencoded elements and an ‘e’. For example 14:spam4:eggse=spam,eggs ► Dictionaries are bencoded by ‘d’ followed by a list of alternating keys, their values and an ‘e’. For example, d3:cow3:moo4:spam4:eggse translates to {cow, moo, spam, eggse}

FREENET INTRODUCTION  FreeNet is another distributed P2P network that offers its users a cheap interface to exchange information.  However FreeNet advocates secrecy for it’s users that is beyond the censoring agency's reach.  Its protocol uses GUID (globally unique identifier) keys to disguise the file location and the requester/provider identity.  The use of these keys also prevent the storer of the file from accessing its content.

PRIVACY DRIVEN  Governments around the world have been eavesdropping and intercepting internet users messages. They justified these action to investigate criminals behaviors.  They campaign to limit access to internet content they do not approve off regardless of personal preference.  The founders of this network argued that the postal mail (anthrax) and/or payphones is another communication tool without censorship.  They also argued that the essence of any democratic society is its freedom of speech, good or bad.

DESIGN PURPOSE  To facilitate information exchange medium between information producing peers, information consuming peers and information holding peers without the prowling eyes of any governmental or other nosy agency regulation.  To decentralize the traditional one point failure central server for reliability and availability.  To advocate adaptability, scalability, efficiency, security and integrity in comparison to other P2P networks.

ACHITECTURE  The network information is stored on its members allocated storage devices.  Information storage is maintained by a Least Recent Requested file. ► Old files get replaced by new ones based on their frequency of use. If the allocated storage capacity of the information storer reaches its limit the least recently requested files get bumped for the new one. However if for some reason the file is suddenly being needed, the original provider could another copy. ► More popular nodes would have more copies than the newly added ones that are located further down the tree like structure because they frequented more often and their routing algorithm is better trained.  There two factors that influence file distribution:

► The first one is called tree pruning which like a garden tree with dying trunks get removed to enhance the growth of the rest of the tree. ► The second one is called tree growth where a popularly requested file gets copied to the specific area of which improves query response time and prevents overloading other nodes of that area.

GUIDE KEYS  There are two types of GUID keys CHK and SSK ► Content-Hash key (CHK) is created when hashing the content of the file to be stored. So when users look for a specific file the key gets the same hashed value an d a reference to its exact location. ► Signed-subspace key (SSK)- the subspace is created by first generating a private-public key pair to identify it and then hashing the public half of the subspace key and the descriptive string independently before concatenating then and hashing again. To insure integrity, the file is signed with its private key so that other nodes verifies its signature before accepting it. ► Only the creator of that particular file can modify it, the rest can only read it which insures trustworthy virtual space.

PRIVACY  Variation of Chaum’s mix-net scheme for private information exchange is used to guaranty anonymity of users. ► Messages hop from one node to the other while encrypting its link to avoid destiny detection. ► No one node can have information on all of the nodes but its immediate neighbor. ► Attackers would be prevented from corrupting the file from its keepers.

ROUNTING ALGORITHM  FreeNet message routing algorithm implements steepest-ascent hill-climbing search to avoid problematic centralized design used by other P2P like Napster Or the one used by Gnutella that broadcast its queries to every one connected. ►Nodes forward queries to others that have table entry in close proximity to the files assigned key. ►Queries bounce back & forth between nodes till it homes in the right one ►Nodes get trained and expand by collecting information from others querying it. ►Nodes also store cluster of files having similar keys because they search by specific key and not topic. And since topics are assigned hash keys, topic similarities will be dispersed among many nodes insuring against single point failures and thus system robustness. Study shows a median path length of just 8 hops within a network of 10,000 nodes.

FREENET FUNCTIONALITY  Each user maintains information about other users routing tables and possible addresses they have.  Every time a user receives a request for a file the users data base is checked first. If the file is not found, the user forwards the query to the next node having the closest possibility of having a similar key.  Users avoid identifying their addresses by faking a reply to themselves to deceive attackers.  Queries are given Time-To-Live (TTL) so it does not look forever and tie up system resources.  New files are inserted according to their GUID key values ► The receiving node checks to see if it already have and if not, it will facilitate storage for it and create a table entry for it.

 Data Encryption is not done automatically by the freeNet network. It is the responsibility of information publisher to do it.  Information publishers provide GUID keys to end user only to protect the information from being modified by information holders that would not be able to descramble the keys.  Newly added nodes are identified by their public-private keys. They get connected to other nodes by sending an announcement message.  The receiving node while adding it to its routing table propagate the information to the next node and so on till TTL expires. At that time the nodes new group assigns it a unique identifying GUID randomly and region.

PERFORMANCE  FreeNet is modeled after small world network that exhibit scalability and fault tolerance where significantly few nodes have huge connection to other nodes that enhances performance and create short path connection due to their vast connection.  The network eliminates poorly connected nodes that fail to respond to queries.  Robustness is majored against failure and unless the well connected nodes are taken out first the network will recover from malicious attacks from within the users or from an outsiders.

PERFORMANCE GRAPHS

FREENET SECURITY  Files to be stored on the system need to be protected just as the participant should be. If one impostor node can intercept and modify any files’ data, the system should be able to weed it out and restore the files’ data.  There are two levels of anonymity the system advocates 1) Senders and receivers should not be able to determine the sending node identity or the receiving node identity 2) Ranges of anonymity the participants would use to achieve their goal; absolute privacy – exposed.  All sent Messages have: 1) A randomly generated 64bit transaction ID that is uniquely generated and have a low collision rate over its live time. 2) A hops-to-live limit that is set by the sender and gets decremented with each passing node. The message does not stop when reaching its destination so to confuse a would be attacker and disguise the receivers identity. 3) A depth counter gets incremented at each hop and is initialized to small value to hide the messages’ true destination.

 There is no message protection between a sender and the first node contacted, that might be an eavesdropper. The best way around this problem is to own the first node on the edge of the network to be used as an entry point to it.  Messages can be repeatedly encrypted by its public key that specifies the path to take. And when it reaches its destination it get injected to the net as if it was coming from the receiver end.  Data originators have their identity hidden by the frequent resetting of the data source field in the routing table replies to requests. There would be no way to find out if the data holder is the originator or simply a forwarding node.

 To defend against file grabbing by an attacker, content-hash keys and subspace keys (CHK,SSK) should be used because unless the attacker can decipher the signature he will have a slim chance access it.  Denial-of-service attacks (DoS) can happen but will not have a significant effect because a scheme such as Hash Cash can counter it.  The network can also protect against such and attack by subdividing the data base into two section where new insertion do not get mingled with the old files.

QUESTIONS?