MSCS6060 Parallel and Distributed Systems Peer-to-Peer Computing Rong Ge Some slides and figures are from www.list.gmu.edu and HPL survey.



Outline
What are P2P technologies?
Taxonomy
P2P applications and services
Research issues

Mainframe → Client-Server → P2P
Mainframe era:
  – 1970s
  – Dumb terminals connected to a big mainframe
  – Mainframes possibly networked together
Client-server:
  – Late 1980s
  – Many clients, one user per client
  – Dedicated servers
  – A single client can access multiple servers
  – Significant computing resources on the client
Peer-to-Peer (P2P):
  – Late 1990s
  – Each computer is both a client and a server
  – Takes on whatever role is appropriate for a given task at a given time
  – Harnesses the computing and communication power of the entire network
What do you think makes P2P increasingly common?

Figure: P2P versus Client-Server: Idealized View (from Peer-to-Peer Computing, Milojicic et al, HP Laboratories, HPL , March 8th, 2002)

Figure: No Clear Border (from Peer-to-Peer Computing, Milojicic et al, HP Laboratories, HPL , March 8th, 2002)

Figure: Hybrid P2P Systems (from Peer-to-Peer Computing, Milojicic et al, HP Laboratories, HPL , March 8th, 2002)

Peer-to-Peer Computing
The individual nodes have symmetric roles: each node may act as both a client and a server. (IRTF P2P research group)
The participants share a part of their own hardware resources (processing power, storage capacity, network link capacity, printers, …).
Individual computers communicate directly over the Internet without central entities.
The participants are resource (service and content) providers as well as resource (service and content) requestors (the "servent" concept).
R. Schollmeier, “A definition of peer-to-peer networking for the classification of peer-to-peer architectures and applications,” in Proc. of P2P’01, pp , Aug

Peer-to-Peer Is Not New
P2P networking is not a new idea
  – Telephone
  – Usenet News in 1979
  – DNS
P2P is mostly known under the brand of Napster, the first widely popular P2P file-sharing service

Figure: Napster (from The Future of Peer-to-Peer Computing, Loo, CACM, Sept 2003)

P2P Application Examples
Napster
  – Music sharing
Information (file) sharing
  – KaZaA, Gnutella
  – Morpheus, Freenet, Grokster, …
Distributed data processing
  – – – Popular Power
Distributed applications
  – Distributed file system
  – DDoS

P2P Dominates Internet Traffic
P2P has dominated Internet traffic (source: CacheLogic).
In 2006, P2P accounted for more than 60% of Internet traffic.
Because YouTube is delivered over HTTP, Web traffic grew in 2007.

Statistics of P2P Traffic

Some Statistics about P2P Systems
More than 663 million users registered with Skype, with around 10 million users online (2010)
Around 4.7M hosts participate (2006)
BitTorrent (BT) accounts for 1/3 of Internet traffic (2007)
More than 200,000 simultaneous online users on PPLive (2007)
More than 3,000,000 users downloaded PPStream (2008)

Figure: Taxonomy of Computer Systems (from Peer-to-Peer Computing, Milojicic et al, HP Laboratories, HPL , March 8th, 2002)

Why P2P?
Get rid of servers
  – Single point of failure, centralized control and management, access fees and management fees, …
Clients are not so dumb
  – Billions of MHz of CPU, many terabytes of disk, millions of gigabits of network bandwidth, …
P2P is about resource sharing
  – Flexible, efficient information sharing
P2P changes the way the Web (Internet) works

Figure: Taxonomy of P2P Systems (from Peer-to-Peer Computing, Milojicic et al, HP Laboratories, HPL , March 8th, 2002)

Figure: Classification of P2P Systems (from Peer-to-Peer Computing, Milojicic et al, HP Laboratories, HPL , March 8th, 2002)

Figure: Taxonomy of P2P Applications (from Peer-to-Peer Computing, Milojicic et al, HP Laboratories, HPL , March 8th, 2002)

Figure: Taxonomy of P2P Markets (from Peer-to-Peer Computing, Milojicic et al, HP Laboratories, HPL , March 8th, 2002)

Figure: P2P Markets versus P2P Applications (from Peer-to-Peer Computing, Milojicic et al, HP Laboratories, HPL , March 8th, 2002)

Figure: P2P System Architecture (from Peer-to-Peer Computing, Milojicic et al, HP Laboratories, HPL , March 8th, 2002)

Summary
Distributed computing
  – Server/client
  – P2P
P2P computing
  – Participants are both servers and clients
Taxonomy of P2P computing
  – Systems
  – Applications
  – Architecture

MSCS6060 Parallel and Distributed Systems Peer-to-Peer Computing Cont’d Rong Ge Some slides and figures are from and HPL survey

Outline
P2P system models and operations
Challenges and issues in P2P

P2P Operations
Operations in P2P systems consist of three phases
  – Peer discovery (bootstrap)
      Well-known nodes, cached peers, broadcasting, …
  – Resource discovery (search)
      Locate a resource given its identifier
      Central servers maintain an index of all information
      Unstructured P2P networks use flooding
      Structured P2P networks use a distributed hash table (DHT)
  – Communication or data transfer
      Direct communication, NAT/firewall traversal

P2P System Models
Centralized
  – Central indexing servers maintain a directory of shared data
  – Napster, Kuro, etc.
Decentralized unstructured
  – Neither a central directory server nor any precise control over network topology or data placement
  – Gnutella, Kazaa, etc.
Decentralized structured
  – No centralized directory, but the placement of shared data and the network topology are tightly controlled using a Distributed Hash Table (DHT)
  – CAN, Chord, Pastry, Tapestry, etc.
Hierarchical
Hybrid

Centralized P2P
Utilizes a central directory for object location
For file-sharing P2P, a peer asks the central servers where a file is located and then downloads it directly from other peers
Benefits
  – Simplicity
  – Limited bandwidth usage
Drawbacks
  – Unreliable (single point of failure), performance bottleneck, and scalability limits
  – Vulnerable to DoS attacks
  – Copyright infringement
Figure: peers upload indexes to the centralized server; 1. query, 2. response, 3. transfer
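The query/response/transfer steps can be illustrated with a minimal sketch of a central directory. The class name `CentralIndex`, its method names, and the peer address strings are hypothetical and chosen only for illustration; this is not Napster's actual protocol. The point it shows is that the server stores only who holds which file, while the transfer itself would happen directly between peers.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical central directory: stores who has which file, never the file itself."""

    def __init__(self):
        self._holders = defaultdict(set)  # file name -> set of peer addresses

    def upload_index(self, peer, file_names):
        # A peer publishes the list of files it is willing to share.
        for name in file_names:
            self._holders[name].add(peer)

    def query(self, file_name):
        # Steps 1-2: a peer asks where a file lives; the server answers with peer addresses.
        return sorted(self._holders.get(file_name, set()))

    def remove_peer(self, peer):
        # Drop a departed peer from every entry so stale addresses are not returned.
        for holders in self._holders.values():
            holders.discard(peer)


index = CentralIndex()
index.upload_index("peerA:6699", ["song.mp3", "talk.ogg"])
index.upload_index("peerB:6699", ["song.mp3"])
holders = index.query("song.mp3")  # step 3 (transfer) would then go peer to peer
```

Every lookup goes through this one object, which is why the slide lists a single point of failure and a performance bottleneck among the drawbacks.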

Unstructured P2P (1/2)
Each request is flooded to directly connected peers, which then flood their neighbors
  – Until the request is answered or a certain scope (TTL limit) is reached
Can be hierarchical
  – A supernode acts as a local central index for files shared by local peers and forwards queries to other supernodes
Benefits
  – Decentralized, reliable, fault-tolerant, …
Drawbacks
  – Excessive query traffic
  – Not scalable
  – Most critically, a search can fail to find content that is actually in the system
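A rough, back-of-the-envelope bound makes the "excessive query traffic" drawback concrete. Assume (purely for illustration) that every peer has exactly d neighbors and that the overlay contains no cycles within the TTL radius, so each peer forwards a Query to its d − 1 other neighbors. The symbols d, T, and M are introduced here and do not come from the slides.

```latex
M(d, T) \;=\; \sum_{i=1}^{T} d\,(d-1)^{\,i-1}
        \;=\; d\,\frac{(d-1)^{T} - 1}{d - 2}, \qquad d > 2 .

\text{Example: } d = 4,\; T = 7:\quad
M = 4 \cdot \frac{3^{7} - 1}{2} = 4 \cdot 1093 = 4372 \text{ Query messages for a single search.}
```

Under these assumptions the message count grows exponentially with the TTL, while lowering the TTL shrinks the search scope, which is one way content that exists in the system can go unfound.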

Unstructured P2P (2/2)
Figure: search and transfer among peer nodes and supernodes: (1) query flooded to connected peers; (2) query flooded between supernodes

File Exchange over Gnutella
Search
  – To search for a file, a node, say n, sends a search Query message to its neighbor nodes
  – On receiving a search Query, nodes look for a match in their local data set
  – If a match is found, a Hit message is generated and sent back along the same path by which the Query message reached the node
  – The Query message is forwarded further if its TTL is not zero
Download
  – On receiving Hit messages, node n selects a node from which to download the file
  – Downloads happen over an HTTP connection

Search and Download
Figure: Peers A, B, C, and D exchange (1) Query, (2) Query, (3) Query, (4) Hit, (5) Hit, (6) Hit, (7) Download
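The Query/Hit exchange can be sketched as a TTL-limited flood over an in-memory overlay. This is a simplified single-process model, not the Gnutella wire protocol: the function name `flood_query`, the dictionary-based overlay, and the chain topology A–B–C–D are assumptions made for illustration, and duplicate suppression is modeled with a simple `seen` set.

```python
from collections import deque


def flood_query(neighbors, files, origin, wanted, ttl):
    """Flood a Query from `origin`; return a list of (holder, reverse_path) Hits.

    neighbors: dict node -> list of directly connected peers (the overlay)
    files:     dict node -> set of file names stored locally
    """
    hits = []
    seen = {origin}  # nodes that already handled this Query
    queue = deque([(origin, [origin], ttl)])
    while queue:
        node, path, remaining = queue.popleft()
        if node != origin and wanted in files.get(node, set()):
            # A Hit travels back along the path the Query arrived on.
            hits.append((node, list(reversed(path))))
        if remaining == 0:
            continue  # TTL exhausted: do not forward further
        for peer in neighbors.get(node, []):
            if peer not in seen:  # drop duplicate Queries
                seen.add(peer)
                queue.append((peer, path + [peer], remaining - 1))
    return hits


# Assumed chain overlay: A - B - C - D, with the file stored only on D.
overlay = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
stored = {"D": {"paper.pdf"}}
found = flood_query(overlay, stored, "A", "paper.pdf", ttl=3)
missed = flood_query(overlay, stored, "A", "paper.pdf", ttl=2)
```

With `ttl=3` the Query reaches D and the Hit carries the reverse path D, C, B, A; with `ttl=2` the flood stops at C and the file is not found even though it exists in the system.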

Structured P2P
Each peer is assigned an ID and knows a given number of other peers
Each shared resource is assigned a hashed ID
A request is directed to the peer whose ID is most similar to the resource ID, using a Distributed Hash Table (DHT)
Benefits
  – Scalable
  – More efficient searching
Drawbacks
  – Routing table maintenance
  – Exact-match search only
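One way to picture "the peer with the ID most similar to the resource ID" is sketched below. It assumes a Chord-style rule in which the responsible peer is the first peer ID at or after the resource ID on a circular ID space; other DHTs (CAN, Pastry, Tapestry) define closeness differently. The helper names `hashed_id` and `responsible_peer` and the 16-bit ID space are hypothetical.

```python
import hashlib
from bisect import bisect_left

ID_BITS = 16  # assumed small ID space, for readability only


def hashed_id(name):
    """Map a peer address or resource name to an ID in [0, 2**ID_BITS)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** ID_BITS)


def responsible_peer(peer_ids, resource_id):
    """Return the first peer ID at or after resource_id, wrapping around the ring."""
    ring = sorted(peer_ids)
    pos = bisect_left(ring, resource_id)
    return ring[pos % len(ring)]


peers = [10, 200, 4000]  # example peer IDs, picked by hand
owner = responsible_peer(peers, hashed_id("song.mp3"))
```

Because the owner is computed from the hash of the exact name, a lookup for a slightly different name lands on an unrelated peer, which is why the slide lists exact-match search as a drawback.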

Distributed Hash Tables
Hash table: (key, value) pairs
Responsibility for maintaining the mapping is distributed among the nodes
Scalable; able to handle continual node arrivals, departures, and failures
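The following toy model suggests how responsibility for (key, value) pairs can be spread over nodes and how arrivals and departures might be absorbed. It is a single-process sketch under stated assumptions: the class `ToyDHT`, the successor-on-a-ring ownership rule, and the hand-picked node IDs are illustrative, and a real DHT would do this over the network with replication to survive failures.

```python
import hashlib
from bisect import bisect_left


def key_id(key, bits=16):
    """Hash a key into an assumed 16-bit ID space."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


class ToyDHT:
    """Hypothetical in-memory model of a DHT; not any specific system's design."""

    def __init__(self):
        self.ring = []   # sorted node IDs
        self.store = {}  # node ID -> {key: value} held by that node

    def _owner(self, kid):
        pos = bisect_left(self.ring, kid)
        return self.ring[pos % len(self.ring)]

    def join(self, node_id):
        # Assumes node_id is not already present.
        self.ring.insert(bisect_left(self.ring, node_id), node_id)
        self.store[node_id] = {}
        moved = 0
        for other in self.ring:
            if other == node_id:
                continue
            # Only keys that now map to the newcomer are handed over.
            for key in [k for k in self.store[other] if self._owner(key_id(k)) == node_id]:
                self.store[node_id][key] = self.store[other].pop(key)
                moved += 1
        return moved

    def leave(self, node_id):
        # Graceful departure; assumes at least one node remains afterwards.
        orphaned = self.store.pop(node_id)
        self.ring.remove(node_id)
        for key, value in orphaned.items():
            self.put(key, value)

    def put(self, key, value):
        self.store[self._owner(key_id(key))][key] = value

    def get(self, key):
        return self.store[self._owner(key_id(key))].get(key)


dht = ToyDHT()
for n in (1000, 30000, 50000):
    dht.join(n)
for i in range(200):
    dht.put(f"file{i}", i)
moved = dht.join(40000)  # only keys in the newcomer's arc change hands
```

In this model a join or leave touches only the keys of one arc of the ring rather than rehashing everything, which is the property behind the slide's claim that DHTs handle continual arrivals and departures.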

BitTorrent
Figure labels: seed, peer
BitTorrent tracker (uses DHT): a server assisting in the communication between peers
BitTorrent index: a list of .torrent files including descriptions

Hierarchical P2P
Peers can have different roles in groups: superpeers and peers
The first c peers to join become the superpeers of the group
A peer must contact one superpeer when joining a group
The superpeers form an overlay network
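The join rule can be sketched as follows. The class name `Group` and the choice of attaching each later peer to the least-loaded superpeer are assumptions added for illustration; the slide only says that a joining peer contacts one superpeer, not which one.

```python
class Group:
    """Hypothetical group in a hierarchical overlay with at most `c` superpeers (c >= 1)."""

    def __init__(self, c):
        self.c = c
        self.superpeers = []  # the first c peers to join
        self.attached = {}    # ordinary peer -> the superpeer it contacted

    def join(self, peer):
        if len(self.superpeers) < self.c:
            self.superpeers.append(peer)  # early joiners become superpeers
            return "superpeer"
        # Assumption: later peers attach to the least-loaded superpeer.
        load = {sp: 0 for sp in self.superpeers}
        for sp in self.attached.values():
            load[sp] += 1
        self.attached[peer] = min(self.superpeers, key=lambda sp: load[sp])
        return "peer"


group = Group(c=2)
roles = [group.join(p) for p in ["a", "b", "c", "d", "e"]]
```

Here `a` and `b` become superpeers and the remaining peers are spread across them; the superpeers would then maintain the overlay among themselves.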

Issues of P2P
Search
  – Full index, partial index, semantic search
Flash crowd
Free riding
Topological awareness
NAT traversal
Fault resilience
Security
  – Spurious content
  – Anonymity
  – Trust, reputation
Non-technical issues
  – Copyright infringement, intellectual piracy

P2P Search Algorithms
How to search for a resource?
  – Centralized index model
  – Decentralized unstructured
      Flooded requests model
      Hierarchical model (supernode)
  – Decentralized structured
      Document routing model, DHT-based routing
Advanced issues
  – Keyword search
  – Semantic context search

Flash Crowd
Definition
  – A sudden, unanticipated growth in demand for a particular object
  – The object may have been cold previously or newly released
Issues
  – Overhead: how many query messages are generated?
  – Speed: how long does it take to find and download the object?

Free Riding
Peers share little or no data in P2P file-sharing systems
Measurement
  – Nearly 70% of Gnutella users share no files
  – Nearly 50% of all responses are returned by the top 1% of sharing hosts
Incentive mechanisms are needed to encourage user cooperation

Topological Awareness
When peers choose neighbors without any knowledge of the underlying physical topology, the P2P logical overlay network can become seriously mismatched with the underlying physical network

Lessons for P2P System Designers
Take the heterogeneity of the peers into account
  – Different peers should be delegated different responsibilities
Measure the performance of peers online
  – Adapt to changes in peer status
Fairness (incentives)
  – Encourage server-like peers and discourage client-like peers (free riders) with resource management mechanisms
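As one possible reading of "resource management mechanisms", the sketch below orders pending download requests by each requester's contribution ratio, so server-like peers are served before free riders. The function `serve_order`, the byte counters, and the ratio formula are hypothetical; the slides do not prescribe a particular incentive scheme.

```python
def serve_order(requesters, uploaded, downloaded):
    """Order pending requesters so that peers who contribute more are served first.

    uploaded / downloaded: dict peer -> units transferred so far (assumed bookkeeping)
    """
    def ratio(peer):
        # The +1 avoids division by zero for peers that have downloaded nothing yet.
        return uploaded.get(peer, 0) / (downloaded.get(peer, 0) + 1)

    return sorted(requesters, key=ratio, reverse=True)


order = serve_order(
    ["free_rider", "fair_peer", "seed_like"],
    uploaded={"fair_peer": 50, "seed_like": 400},
    downloaded={"free_rider": 300, "fair_peer": 50, "seed_like": 10},
)
```

A scheme like this only reorders service; whether free riders are merely delayed or refused outright is a separate policy decision.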

Conclusion
P2P may change the way the Web/Internet works
Lots of creative applications remain to be developed
Expect rapid growth in Internet traffic
Still lots of problems
  – Illegal copies (copyright problem)
  – Security
  – Undesired traffic
  – …

References
[1] R. Schollmeier, “A definition of peer-to-peer networking for the classification of peer-to-peer architectures and applications,” in Proc. of P2P’01, pp , Aug
[2] A. Crespo and H. Garcia-Molina, “Routing indices for peer-to-peer systems,” in Proc. of 22nd Int’l Conf. on Distributed Computing Systems (ICDCS’02), pp , July 2002
[3] V. Kalogeraki, D. Gunopulos, and D. Zeinalipour-Yazti, “A local search mechanism for peer-to-peer networks,” in Proc. of 11th Int’l Conf. on Information and Knowledge Management (CIKM’02), pp. 300–307, 2002
[4] Q. Lv, P. Cao, E. Cohen, K. Li, and S. Shenker, “Search and replication in unstructured peer-to-peer networks,” in Proc. of 16th ACM Int’l Conf. on Supercomputing (ICS’02), pp , New York, June 2002
[5] D. Tsoumakos and N. Roussopoulos, “Adaptive probabilistic search for peer-to-peer networks,” in Proc. of 3rd Int’l Conf. on Peer-to-Peer Computing (P2P’03), pp , 1-3 Sept
[6] B. Yang and H. Garcia-Molina, “Improving search in peer-to-peer networks,” in Proc. of 22nd Int’l Conf. on Distributed Computing Systems (ICDCS’02), pp. 5-14, 2002
[7] D. Zeinalipour-Yazti, V. Kalogeraki, and D. Gunopulos, “Information retrieval techniques for peer-to-peer networks,” Computing in Science & Engineering [see also IEEE Computational Science and Engineering], vol. 06, no. 4, pp , July-Aug 2004
[8] Yunhao Liu, Xiaomei Liu, Li Xiao, Lionel M. Ni, and Xiaodong Zhang, “Location-Aware Topology Matching in P2P Systems,” The 23rd Conference of the IEEE Computer and Communications Societies (INFOCOM’04), vol. 4, pp , 7-11 March 2004.

Search for ET intelligence
Central site collects radio telescope data
Data is divided into work chunks of 300 Kbytes
User obtains the client, which runs in the background
Peer sets up a TCP connection to the central computer and downloads a chunk
Peer does an FFT on the chunk, uploads the results, and gets a new chunk
According to statistics from 2004
  – Nearly 5 million participants in 226 countries
  – Nearly 2 million CPU years of work
  – Over 1.3 billion results received
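The chunk-and-process cycle can be sketched in a few lines. This is only an illustration of the workflow: the names `split_into_chunks` and `run_peer` are hypothetical, 1 Kbyte is assumed to be 1024 bytes, the network transfer is replaced by a local list, and the `analyze` callback stands in for the FFT step rather than implementing it.

```python
CHUNK_BYTES = 300 * 1024  # assumes 1 Kbyte = 1024 bytes; the slide only says "300 Kbytes"


def split_into_chunks(data, chunk_bytes=CHUNK_BYTES):
    """Central site: cut the recorded data into fixed-size work chunks."""
    return [data[i:i + chunk_bytes] for i in range(0, len(data), chunk_bytes)]


def run_peer(pending, analyze):
    """Peer loop: fetch a chunk, analyze it, report the result, ask for the next one."""
    results = []
    while pending:
        chunk = pending.pop(0)           # stands in for downloading over TCP
        results.append(analyze(chunk))   # stands in for the FFT and the upload
    return results


recording = bytes(1000 * 1024)           # dummy telescope data, all zero bytes
chunks = split_into_chunks(recording)
results = run_peer(list(chunks), analyze=len)  # placeholder analysis: chunk size
```

With 1000 Kbytes of input this yields three full chunks and one partial chunk; in the real system many peers would drain the pending list in parallel.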