Introduction of P2P systems

Slides:



Advertisements
Similar presentations
Peer-to-Peer and Social Networks An overview of Gnutella.
Advertisements

P2P data retrieval DHT (Distributed Hash Tables) Partially based on Hellerstein’s presentation at VLDB2004.
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT and Berkeley presented by Daniel Figueiredo Chord: A Scalable Peer-to-peer.
Peer to Peer and Distributed Hash Tables
Peer-to-Peer (P2P) Distributed Storage 1Dennis Kafura – CS5204 – Operating Systems.
Chapter 2 Application Layer Computer Networking: A Top Down Approach, 5 th edition. Jim Kurose, Keith Ross Addison-Wesley, April A note on the use.
1 An Overview of Gnutella. 2 History The Gnutella network is a fully distributed alternative to the centralized Napster. Initial popularity of the network.
Denial-of-Service Resilience in Peer-to-Peer Systems D. Dumitriu, E. Knightly, A. Kuzmanovic, I. Stoica and W. Zwaenepoel Presenter: Yan Gao.
Common approach 1. Define space: assign random ID (160-bit) to each node and key 2. Define a metric topology in this space,  that is, the space of keys.
No Class on Friday There will be NO class on: FRIDAY 1/30/15.
Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.
Cis e-commerce -- lecture #6: Content Distribution Networks and P2P (based on notes from Dr Peter McBurney © )
Topics in Reliable Distributed Systems Lecture 2, Fall Dr. Idit Keidar.
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek and Hari alakrishnan.
Chord-over-Chord Overlay Sudhindra Rao Ph.D Qualifier Exam Department of ECECS.
Topics in Reliable Distributed Systems Fall Dr. Idit Keidar.
Peer To Peer Distributed Systems Pete Keleher. Why Distributed Systems? l Aggregate resources! –memory –disk –CPU cycles l Proximity to physical stuff.
File Sharing : Hash/Lookup Yossi Shasho (HW in last slide) Based on Chord: A Scalable Peer-to-peer Lookup Service for Internet ApplicationsChord: A Scalable.
1CS 6401 Peer-to-Peer Networks Outline Overview Gnutella Structured Overlays BitTorrent.
An Introduction to Peer-to-Peer System
Introduction to Peer-to-Peer Networks. What is a P2P network Uses the vast resource of the machines at the edge of the Internet to build a network that.
P2P File Sharing Systems
INTRODUCTION TO PEER TO PEER NETWORKS Z.M. Joseph CSE 6392 – DB Exploration Spring 2006 CSE, UT Arlington.
Roger ZimmermannCOMPSAC 2004, September 30 Spatial Data Query Support in Peer-to-Peer Systems Roger Zimmermann, Wei-Shinn Ku, and Haojun Wang Computer.
1 Napster & Gnutella An Overview. 2 About Napster Distributed application allowing users to search and exchange MP3 files. Written by Shawn Fanning in.
Introduction Widespread unstructured P2P network
By Shobana Padmanabhan Sep 12, 2007 CSE 473 Class #4: P2P Section 2.6 of textbook (some pictures here are from the book)
Application Layer – Peer-to-peer UIUC CS438: Communication Networks Summer 2014 Fred Douglas Slides: Fred, Kurose&Ross (sometimes edited)
Peer-to-Peer Overlay Networks. Outline Overview of P2P overlay networks Applications of overlay networks Classification of overlay networks – Structured.
1 Telematica di Base Applicazioni P2P. 2 The Peer-to-Peer System Architecture  peer-to-peer is a network architecture where computer resources and services.
1 P2P Computing. 2 What is P2P? Server-Client model.
Introduction to Peer-to-Peer Networks. What is a P2P network A P2P network is a large distributed system. It uses the vast resource of PCs distributed.
Content Overlays (Nick Feamster). 2 Content Overlays Distributed content storage and retrieval Two primary approaches: –Structured overlay –Unstructured.
Chapter 2: Application layer
2: Application Layer1 Chapter 2: Application layer r 2.1 Principles of network applications r 2.2 Web and HTTP r 2.3 FTP r 2.4 Electronic Mail  SMTP,
2: Application Layer1 Chapter 2 outline r 2.1 Principles of app layer protocols r 2.2 Web and HTTP r 2.3 FTP r 2.4 Electronic Mail r 2.5 DNS r 2.6 Socket.
Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications Xiaozhou Li COS 461: Computer Networks (precept 04/06/12) Princeton University.
1 Distributed Hash Tables (DHTs) Lars Jørgen Lillehovde Jo Grimstad Bang Distributed Hash Tables (DHTs)
1 Slides from Richard Yang with minor modification Peer-to-Peer Systems: DHT and Swarming.
A Scalable Content-Addressable Network (CAN) Seminar “Peer-to-peer Information Systems” Speaker Vladimir Eske Advisor Dr. Ralf Schenkel November 2003.
1 Peer-to-Peer Systems r Application-layer architectures r Case study: BitTorrent r P2P Search and Distributed Hash Table (DHT)
2: Application Layer1 Chapter 2: Application layer r 2.1 Principles of network applications  app architectures  app requirements r 2.2 Web and HTTP r.
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan Presented.
SIGCOMM 2001 Lecture slides by Dr. Yingwu Zhu Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications.
Lecture 2 Distributed Hash Table
1 Peer-to-Peer Technologies Seminar by: Kunal Goswami (05IT6006) School of Information Technology Guided by: Prof. C.R.Mandal, School of Information Technology.
Peer to Peer A Survey and comparison of peer-to-peer overlay network schemes And so on… Chulhyun Park
2: Application Layer1 Chapter 2 Application Layer Computer Networking: A Top Down Approach 6 th edition Jim Kurose, Keith Ross Addison-Wesley March 2012.
1 Secure Peer-to-Peer File Sharing Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, Hari Balakrishnan MIT Laboratory.
ADVANCED COMPUTER NETWORKS Peer-Peer (P2P) Networks 1.
Algorithms and Techniques in Structured Scalable Peer-to-Peer Networks
CS Spring 2014 CS 414 – Multimedia Systems Design Lecture 37 – Introduction to P2P (Part 1) Klara Nahrstedt.
Two Peer-to-Peer Networking Approaches Ken Calvert Net Seminar, 23 October 2001 Note: Many slides “borrowed” from S. Ratnasamy’s Qualifying Exam talk.
INTERNET TECHNOLOGIES Week 10 Peer to Peer Paradigm 1.
P2P Search COP6731 Advanced Database Systems. P2P Computing  Powerful personal computer Share computing resources P2P Computing  Advantages: Shared.
P2P Search COP P2P Search Techniques Centralized P2P systems  e.g. Napster, Decentralized & unstructured P2P systems  e.g. Gnutella.
Chapter 2 Application Layer Computer Networking: A Top Down Approach, 4 th edition. Jim Kurose, Keith Ross Addison-Wesley, July 2007.
CS Spring 2012 CS 414 – Multimedia Systems Design Lecture 37 – Introduction to P2P (Part 1) Klara Nahrstedt.
PEAR TO PEAR PROTOCOL. Pure P2P architecture no always-on server arbitrary end systems directly communicate peers are intermittently connected and change.
Malugo – a scalable peer-to-peer storage system..
CS 425 / ECE 428 Distributed Systems Fall 2015 Indranil Gupta (Indy) Peer-to-peer Systems All slides © IG.
Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications * CS587x Lecture Department of Computer Science Iowa State University *I. Stoica,
CS Spring 2010 CS 414 – Multimedia Systems Design Lecture 24 – Introduction to Peer-to-Peer (P2P) Systems Klara Nahrstedt (presented by Long Vu)
A Survey of Peer-to-Peer Content Distribution Technologies Stephanos Androutsellis-Theotokis and Diomidis Spinellis ACM Computing Surveys, December 2004.
Peer-to-Peer Data Management
(slides by Nick Feamster)
EE 122: Peer-to-Peer (P2P) Networks
COSC 4213: Computer Networks II
#02 Peer to Peer Networking
Presentation transcript:

Introduction of P2P systems

What is P2P Systems Definition: Significantly autonomous from a centralized authority. Each node can act as a Client as well as a Server. Use the vast resources of machines at the edge of the internet. Storage, content, CPU power, Human presence. Resources at edge have intermittent connectivity, being added & removed. Infrastructure is untrusted and the components are unreliable.

Overlay Network A P2P network is an overlay network. Each link between peers consists of one or more IP links.

Overlay Graph Virtual edge Overlay maintenance TCP connection or simply a pointer to an IP address Overlay maintenance Periodically ping to make sure neighbor is still alive Or verify aliveness while messaging If neighbor goes down, may want to establish new edge New node needs to bootstrap Could be a challenge under high churn rate Churn --

More about Overlays Tremendous design flexibility Topology, maintenance Message types Protocol Messaging over TCP or UDP Underlying physical net is transparent to developer Unstructured overlays e.g., new node randomly chooses three existing nodes as neighbors Structured overlays e.g., edges arranged in restrictive structure

P2P Applications P2P File Sharing P2P Communications Napster, Gnutella, Kazaa, eDonkey, BitTorrent Chord, CAN, Pastry/Tapestry, Kademlia P2P Communications MSN, Skype, Social Networking Apps P2P Distributed Computing Seti@home

P2P File Sharing Alice runs P2P client application on her notebook computer Intermittently connects to Internet Gets new IP address for each connection Asks for “Hey Jude” Application displays other peers that have a copy of Hey Jude. Alice chooses one of the peers, Bob. File is copied from Bob’s PC to Alice’s notebook While Alice downloads, other users may upload from Alice. P2P P2P

P2P Communication Instant Messaging Skype is a VoIP P2P system Alice runs IM client application on her notebook computer Intermittently connects to Internet Gets new IP address for each connection Register herself with “system” Alice initiates direct TCP connection with Bob, then chats Learns from “system” that Bob in her buddy list is active P2P

P2P Distributed Computing seti@home Search for ET intelligence Central site collects radio telescope data Data is divided into work chunks of 300 Kbytes User obtains client, which runs in background Peer sets up TCP connection to central computer, downloads chunk Peer does FFT on chunk, uploads results, gets new chunk Not P2P communication, but exploit Peer computing power

Promising properties of P2P Massive scalability Autonomy : non single point of failure Resilience to Denial of Service Load distribution Resistance to censorship

What can be Issue? Management Lookup Throughput How to maintain the P2P system under high churn efficiently? Lookup How to find out the appropriate content/resource what a user wants? Throughput How to copy the content fast and efficiently?

Management Issue A P2P network must be self-organizing. Join and leave operations must be self-managed. The infrastructure is untrusted and the components are unreliable. The number of faulty nodes grows linearly with system size. Tolerance to failures and churn Efficient routing even if the structure of the network is unpredictable. Dealing with freeriders Load balancing

Napster Centralized Lookup Centralized directory services Step Connect to Napster server. Upload list of files to server. Give server keywords to search the full list with. Select “best” of correct answers. (ping) Bottleneck of the performance Lookup is centralized, but files are copied in P2P manner

Gnutella Fully decentralized lookup for files Unstructured P2P Flooding based lookup Obviously inefficient lookup in terms of scalability and bandwidth

Gnutella Scenario Step 0: Join the network Step 1: Determining who is on the network "Ping" packet is used to announce your presence on the network. Other peers respond with a "Pong" packet. Also forwards your Ping to other connected peers A Pong packet also contains: an IP address port number amount of data that peer is sharing Pong packets come back via same route Step 2: Searching Gnutella "Query" ask other peers if they have the file you desire A Query packet might ask, "Do you have any content that matches the string ‘Hey Jude"? Peers check to see if they have matches & respond (if they have any matches) & send packet to connected peers Continues for TTL (how many hops a packet can go before it dies ) Step 3: Downloading Peers respond with a “QueryHit” (contains contact info) File transfers use direct connection using HTTP protocol’s GET method

KaZaA Hierarchical approach between Gnutella and Napster Powerful nodes (supernodes) act as local index servers, and client queries are propagated to other supernodes. Two-layered architecture. Each supernode manages around 30-50 nodes More efficient lookup than Gnutella and more scalable than Napster

BitTorrent Sharing large volume of files faster and more efficiently Maximizing the utilization of bandwidth

BitTorrent : Pieces File is broken into pieces Piece selection Typically piece is 256 KBytes Upload pieces while downloading pieces Piece selection Select rarest piece Except at beginning, select random pieces Tit-for-tat Bit-torrent uploads to at most four peers Among the uploaders, upload to the four that are downloading to you at the highest rates A little randomness too, for probing MORE DETAIL…..EXAMPLES….

Structured P2P Peer-to-peer hash lookup: Node ID(Key) , Object ID(Key) Lookup(key)  IP address How does these route lookups? How does these maintain routing tables? insert (K1,V1) K V retrieve (K1) Chord, Pastry, Tepastry, Can, Kademlia, etc

Consistency Hashing Overlay network often has a known structure (ring) Each node has randomly chosen id Keys in same id space Node’s successor in circle is node with next largest id Each node knows IP address of its successor Object Key is stored in closest successor

Chord K5 N105 K20 Circular 7-bit ID space N32 N90 K80 Key 5 K5 Node 105 N105 K20 Circular 7-bit ID space N32 N90 K80 A key is stored at its successor: node with next higher ID

Chord N120 ½ ¼ N80 Finger i points to successor 112 N5 N120 N10 N110 K19 ¼ ½ N20 N99 N32 1/8 Lookup(K19) 1/16 1/32 1/64 N80 1/128 N80 N60 Finger i points to successor of n+2i Lookups take O(log(N)) hops

CAN 1 2 3 4 1 2 3 1 1 2 Virtual Cartesian coordinate space entire space is partitioned amongst all the nodes every node “owns” a zone in the overall space abstraction can store data at “points” in the space can route from one “point” to another point = node that owns the enclosing zone 1 2 3 4 1 2 3 1 1 2

CAN A node only maintains state for its immediate neighboring nodes O(n1/d) (a,b) (x,y)