P2P Networks Introduction

Slides:



Advertisements
Similar presentations
P2P data retrieval DHT (Distributed Hash Tables) Partially based on Hellerstein’s presentation at VLDB2004.
Advertisements

Peer to Peer and Distributed Hash Tables
Pastry Peter Druschel, Rice University Antony Rowstron, Microsoft Research UK Some slides are borrowed from the original presentation by the authors.
Peer-to-Peer Systems Chapter 25. What is Peer-to-Peer (P2P)? Napster? Gnutella? Most people think of P2P as music sharing.
Clayton Sullivan PEER-TO-PEER NETWORKS. INTRODUCTION What is a Peer-To-Peer Network A Peer Application Overlay Network Network Architecture and System.
CHORD – peer to peer lookup protocol Shankar Karthik Vaithianathan & Aravind Sivaraman University of Central Florida.
Chord: A scalable peer-to- peer lookup service for Internet applications Ion Stoica, Robert Morris, David Karger, M. Frans Kaashock, Hari Balakrishnan.
Common approach 1. Define space: assign random ID (160-bit) to each node and key 2. Define a metric topology in this space,  that is, the space of keys.
An Overview of Peer-to-Peer Networking CPSC 441 (with thanks to Sami Rollins, UCSB)
Peer-to-Peer Networks João Guerreiro Truong Cong Thanh Department of Information Technology Uppsala University.
Application Layer Overlays IS250 Spring 2010 John Chuang.
Peer to Peer File Sharing Huseyin Ozgur TAN. What is Peer-to-Peer?  Every node is designed to(but may not by user choice) provide some service that helps.
Topics in Reliable Distributed Systems Lecture 2, Fall Dr. Idit Keidar.
Introduction to Peer-to-Peer (P2P) Systems Gabi Kliot - Computer Science Department, Technion Concurrent and Distributed Computing Course 28/06/2006 The.
Efficient Content Location Using Interest-based Locality in Peer-to-Peer Systems Presented by: Lin Wing Kai.
A. Frank 1 Internet Resources Discovery (IRD) Peer-to-Peer (P2P) Technology (1) Thanks to Carmit Valit and Olga Gamayunov.
Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 2: Peer-to-Peer.
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek and Hari alakrishnan.
presented by Hasan SÖZER1 Scalable P2P Search Daniel A. Menascé George Mason University.
Chord-over-Chord Overlay Sudhindra Rao Ph.D Qualifier Exam Department of ECECS.
Topics in Reliable Distributed Systems Fall Dr. Idit Keidar.
1 CS 194: Distributed Systems Distributed Hash Tables Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer.
1 Seminar: Information Management in the Web Gnutella, Freenet and more: an overview of file sharing architectures Thomas Zahn.
Peer To Peer Distributed Systems Pete Keleher. Why Distributed Systems? l Aggregate resources! –memory –disk –CPU cycles l Proximity to physical stuff.
Peer-to-peer file-sharing over mobile ad hoc networks Gang Ding and Bharat Bhargava Department of Computer Sciences Purdue University Pervasive Computing.
A Study on Mobile P2P Systems Hongyu Li. Outline  Introduction  Characteristics of P2P  Architecture  Mobile P2P Applications  Conclusion.
1CS 6401 Peer-to-Peer Networks Outline Overview Gnutella Structured Overlays BitTorrent.
Middleware for P2P architecture Jikai Yin, Shuai Zhang, Ziwen Zhang.
CSE 461 University of Washington1 Topic Peer-to-peer content delivery – Runs without dedicated infrastructure – BitTorrent as an example Peer.
Introduction to Peer-to-Peer Networks. What is a P2P network Uses the vast resource of the machines at the edge of the Internet to build a network that.
Peer-to-Peer Overlay Networks. Outline Overview of P2P overlay networks Applications of overlay networks Classification of overlay networks – Structured.
Distributed Systems Concepts and Design Chapter 10: Peer-to-Peer Systems Bruce Hammer, Steve Wallis, Raymond Ho.
1 P2P Computing. 2 What is P2P? Server-Client model.
Introduction to Peer-to-Peer Networks. What is a P2P network A P2P network is a large distributed system. It uses the vast resource of PCs distributed.
Content Overlays (Nick Feamster). 2 Content Overlays Distributed content storage and retrieval Two primary approaches: –Structured overlay –Unstructured.
Introduction of P2P systems
Peer to Peer Research survey TingYang Chang. Intro. Of P2P Computers of the system was known as peers which sharing data files with each other. Build.
Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications Xiaozhou Li COS 461: Computer Networks (precept 04/06/12) Princeton University.
Peer-to-Pee Computing HP Technical Report Chin-Yi Tsai.
1 Distributed Hash Tables (DHTs) Lars Jørgen Lillehovde Jo Grimstad Bang Distributed Hash Tables (DHTs)
Hongil Kim E. Chan-Tin, P. Wang, J. Tyra, T. Malchow, D. Foo Kune, N. Hopper, Y. Kim, "Attacking the Kad Network - Real World Evaluation and High.
Network Computing Laboratory Scalable File Sharing System Using Distributed Hash Table Idea Proposal April 14, 2005 Presentation by Jaesun Han.
P2P Network Structured Networks (IV) Distributed Hash Tables Pedro García López Universitat Rovira I Virgili
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications.
Peer-to-Peer Network Tzu-Wei Kuo. Outline What is Peer-to-Peer(P2P)? P2P Architecture Applications Advantages and Weaknesses Security Controversy.
P2P Network Structured Networks: Distributed Hash Tables Pedro García López Universitat Rovira I Virgili
1 Peer-to-Peer Technologies Seminar by: Kunal Goswami (05IT6006) School of Information Technology Guided by: Prof. C.R.Mandal, School of Information Technology.
Peer to Peer A Survey and comparison of peer-to-peer overlay network schemes And so on… Chulhyun Park
Peer-to-Peer and Collective Intelligence A platform for collaboration Andrew Roczniak Collective Intelligence Lab Multimedia Communications Research Lab.
1 Secure Peer-to-Peer File Sharing Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, Hari Balakrishnan MIT Laboratory.
Peer to Peer Network Design Discovery and Routing algorithms
P2P Networks Introduction Pedro García López Universitat Rovira I Virgili
Algorithms and Techniques in Structured Scalable Peer-to-Peer Networks
Peer-to-Peer Systems: An Overview Hongyu Li. Outline  Introduction  Characteristics of P2P  Algorithms  P2P Applications  Conclusion.
LOOKING UP DATA IN P2P SYSTEMS Hari Balakrishnan M. Frans Kaashoek David Karger Robert Morris Ion Stoica MIT LCS.
Bruce Hammer, Steve Wallis, Raymond Ho
Two Peer-to-Peer Networking Approaches Ken Calvert Net Seminar, 23 October 2001 Note: Many slides “borrowed” from S. Ratnasamy’s Qualifying Exam talk.
INTERNET TECHNOLOGIES Week 10 Peer to Peer Paradigm 1.
P2P Search COP6731 Advanced Database Systems. P2P Computing  Powerful personal computer Share computing resources P2P Computing  Advantages: Shared.
P2P Search COP P2P Search Techniques Centralized P2P systems  e.g. Napster, Decentralized & unstructured P2P systems  e.g. Gnutella.
09/13/04 CDA 6506 Network Architecture and Client/Server Computing Peer-to-Peer Computing and Content Distribution Networks by Zornitza Genova Prodanoff.
Malugo – a scalable peer-to-peer storage system..
Peer-to-Peer File Sharing Systems Group Meeting Speaker: Dr. Xiaowen Chu April 2, 2004 Centre for E-transformation Research Department of Computer Science.
CS Spring 2010 CS 414 – Multimedia Systems Design Lecture 24 – Introduction to Peer-to-Peer (P2P) Systems Klara Nahrstedt (presented by Long Vu)
Peer-to-Peer Information Systems Week 12: Naming
Peer-to-Peer Data Management
CHAPTER 3 Architectures for Distributed Systems
Early Measurements of a Cluster-based Architecture for P2P Systems
Distributed Hash Tables
Peer-to-Peer Information Systems Week 12: Naming
Presentation transcript:

P2P Networks Introduction Pedro García López Universitat Rovira I Virgili Pedro.garcia@urv.net

Index Goals Professor Information Course Outline Evaluation Introduction to P2P Networks

Goals It is an optional course focused on recent and ongoing research work in P2P networks. It is a seminar course and active class-room and laboratory participation is expected. Each student is required to read and present research papers. The course is designed to encourage everyone to actively learn advanced concepts, to independently think over research and development issues, to pro- actively relate what we learn to the real problems in practice, to stimulate and brain-storm new ideas, to intelligently solve pressing problems in various phases of P2P technologies. A key goal is to understand key P2P algorithms and infrastructures, and to test them in simulation (PlanetSim Overlay simulator) and real network settings (PlanetLab).

Professor Information Pedro García Lopez, PhD Office 238, ETSE, DEIM Mail: pedro.garcia@urv.net Web: http://www.etse.urv.es/~pgarcia/ MSN: tutorias_etse@hotmail.com Marc Sanchez Artigas, PhD Student Lab 144 Mail: marc.sanchez@urv.net Web: http://ants.etse.urv.es/marc/

Professor timetable Office 238 MSN Monday: 16:00 -19:00 Tuesday: 12:00 -13:00 & 16:00 – 19:00 MSN tutorias_etse@hotmail.com

Evaluation No final exam !! Task 1: Simulator problem assigment Task 2: 20 minutes presentation and study of a selected research paper Task 3: Final assigment to select between Development of a P2P application (proposed by professor or students) Research assignment: algorithm study in the simulator P2P technology evaluation in the PlanetLab network (Emule, Gnutella, Bittorrent) Misc: Two or three Moodle tests Note: Many lab hours will be available for developing the assigments. Note: Attending to classroom is not mandatory. Everything will be in MOODLE (distance learning).

Course Outline Introduction to P2P Networks Unstructured and Hybrid Networks Structured Networks (1) Structured Networks (2) Hierarchical Networks and Topology aware overlays P2P content Search and retrieval Security in P2P networks Applications and P2P Middleware platforms

Course Planning Introduction to P2P Networks Unstructured and Hybrid Networks Structured Networks (1) Structured Networks (2) (Task 1) Hierarchical Networks and Topology aware overlays P2P content Search and retrieval Security in P2P networks Applications and P2P Middleware platforms -> 15 Student presentations (Task2) Task 3: End of the course

Tools: PlanetSim Overlay network simulator Implemented in Java Documentation and tutorials in Labs. Layered architecture (decouple p2p protocols from applications) Advanced visualization output (Pajek, GML)

PlanetLab optional task 3 PlanetLab Worldwide Network 513 nodes over 240 sites Linux machines

Task 3: P2P Applications Possible Platforms: Java, FreePastry Implementation Java JXTA Java Gnutella Java Bamboo C++ Symphony Implementation C# Chord Implementation Python Chord Implementation [Your preferred language] [Your protocol]

Task 3 Research Task Supervised research tasks (simulation): Ant algorithms Unstructured networks Less structured networks CAST protocols Range queries Protocol comparisons Protocol Implementation Simulator Improvements Proximity network overlays Hierarchical networks

Bibliography Peer-to-peer-Systems and Applications. Ralf Steinmetz. Springer LNCS 3485. Peer-to-peer, Harnessing the Benefits of disruptive technology. Andy Oram. O’Reilly. Sistemas Distribuidos. Conceptos y Diseño. George Colouris. Addison Wesley

Interesting links P2P Readings DHT links IPTPS Conferences IEEE P2P Conferences IEEE ICDCS Conferences ACM SIGCOMM Conferences Other conferences: SIGCOM, INFOCOM, NSDI, …

Introduction to P2P Student A: OOH, but I thought Xarxes P2P Course was about downloading files from Emule !! Sorry, this course is about distributed systems ;-)

What is Peer 2 Peer ? A distributed system architecture Decentralized control Typically many nodes, but unreliable and heterogeneous Nodes are symmetric in function Take advantage of distributed, shared resources (bandwidth, CPU, storage) on peer-nodes Fault-tolerant, self-organizing Operate in dynamic environment, frequent join and leave is the norm

P2P Characteristics Decentralization Scalability -> avoid hotspots and bottlenecks! Ad-hoc connectivity Self-organization Local information Fault-resilience And also: anonimity, security, trust, replication, caching, coordination, …

What is NOT P2P ? A centralized system Problems: Client/Server Architecture Master/Slave models Dumb terminals / Active broadcaster Problems: Scalability Bottleneck (hotspot) Cost

Relevance of P2P According to several internet service providers, more than 50% of Internet traffic is due to P2P applications, sometimes even as much as 75%. It is a paradigm shift from coordination to cooperation, from centralization to decentralization, and from control to incentives

Why P2P is important (for engineers)? Workstations have considerable computing power (Moore's Law) Workstations provide huge data storage capacity The available bandwith is increasing steadily Edge computing: benefit from the power at the edges of Internet Decentralized applications can scale better (millions of users) Nirvana: Low cost multiuser applications with endless resources

Why P2P is important (for the society)? Organizational shift from hierarchical / centralized models to decentralized / network models “The Rise of the Network Society”, Manuel Castells P2P can enable the construction of worldwide communities Example 1. Teaching: classical lecture/exam model Vs collaborative learning Example 2. Mass media: big and few broadcasters (TV, radio) Vs public user publishing (Web blogs, net radio, net TV)

P2P Samples P2P killer applications: Buzzword: Grid Computing USENET (NNTP) Instant Messaging (?) Napster, Gnutella, eMule, Filetopia, Blubster (MP2P) SETI@HOME, LHC@home (BOINC) Groove (Collaboration), Ray Ozzie Buzzword: Grid Computing OGSA(Open Grid Services Architecture) Like P2P, it aims to leverage peer computing resources Widespread acceptance in many companies

P2P samples Problems ! Good samples [Slashdot 19/01/05] A Califorian bill introduced last week would, if passed, expose file-swapping software developers to fines of up to $2,500 per charge, or a year in jail, if they don't take 'reasonable care' to prevent their software from being used to commit crime P2P Worms and viruses Good samples Skype ! BitTorrent Peercast (P2P Radio) Share It! by Bringing P2P into the TV-Domain

Ideas !! P2P technologies will enable a next generation of innovative killer applications. P2P and collaboration technologies will promote the creation of worldwide user communities (social networks) P2P will permit low cost sophisticated content publishing and distribution tools (net radio, net tv)

History of P2P: prehistory 1969- ARPANET First killer applications like FTP and Telnet were Client/server 1979- USENET UUCP, NNTP Decentralized control, avoid network flood, path header, … DNS – 1983 Scalability !!! Hierarchical Design Distributed load Caching Query delegation

Internet Explosion (1995-1999) The switch to client/server HTTP, Chat, Mail, IM Breakdown of cooperation Spam, TCP rate equation, congestion Firewalls, dynamic IP, NATs Asymetric Bandwidth

History of P2P

P2P Infrastructures Overlay Network: Why Overlay Networks ? A network on top of a network (IP) Application level routers working in top of an IP network infrastructure Redundacy ? Why Overlay Networks ? Focused on application-level services that underlying layers do not offer One-to-Many & Many-to-Many apps, IP Multicast is not supported by most Internet routers IP substrate is difficult to change (IPv6 transition)

P2P Architectural models Centralized (Napster) Decentralized Unstructured (Gnutella) Structured (Chord) Hierarchical (MBone) Hybrid (EDonkey)

Unstructured P2P Networks Gnutella, JxTA, FreeNet Techniques: Flooding Replication & Caching Time To Live (TTL) Epidemics & Gossiping protocols Super-Peers Random Walkers & Probabilistic algorithms

Structured P2P Networks Nodes form a regular topology: Chord (Ring) Kademlia (Tree) CAN (Hypercube) Assure routing performance to locate contents O (log n) Every node has a unique id Distance metric over Ids Every node has a routing table with links to other nodes They are also called Distributed Hash Tables (DHTs)

Distributed Hash Tables A hash table links data with keys The key is hashed to find its associated bucket in the hash table Each bucket expects to have #elems/#buckets elements In a Distributed Hash Table (DHT), physical nodes are the hash buckets The key is hashed to find its responsible node Data and load are balanced among nodes key pos hash function 1 2 N-1 3 ... x y z lookup (key) → data insert (key, data) “Star Trek” Hash Table hash bucket h(key)%N 1 2 ... node key pos hash function lookup (key) → data insert (key, data) “Star Trek” h(key)%N N-1

Case Study (MIT’s Chord) Circular m-bit ID space for both keys and nodes Node ID = SHA-1(IP address) Key ID = SHA-1(key) A key is mapped to the first node whose ID is equal to or follows the key ID Each node is responsible for O(K/N) keys O(K/N) keys move when a node joins or leaves m=6 2m-1 0 N1 N8 N14 N32 N21 N38 N42 N48 N51 N56 K30 K24 K10 K38 K54

Chord’s State and lookup Finger table Each node knows m other nodes on the ring Successors: finger i of n points to node at n+2i (or successor) Predecessor (for ring management) O(log N) state per node Lookup is achieved by following closest preceding fingers, then successor O(log N) hops N8+1 N14 N8+2 N14 m=6 N8+4 N14 2m-1 0 N8+8 N21 N1 N8+16 N32 N8+32 N42 N56 lookup(K54) N8 K54 +1 +2 N51 +4 N14 N48 +8 +16 +32 N42 N21 N38 N32

DHTs problems Routing state maintenance algorithms (overhead) They do not work well when lots of nodes join and leave frequently (churn) Their structured topology and routing algorithms make them frail and vulnerable to security attacks. Network locality problem: Nodes numerically- close are not topologically-close Commercial implementations are slowly arising (it is still a research problem) Security !

DHT applications File sharing [CFS, OceanStore, PAST, …] Web cache [Squirrel, Coral] Censor-resistant stores [Eternity, FreeNet, …] Application-layer multicast [Narada, …] Event notification [Scribe] Streaming (SplitStream) Naming systems [ChordDNS, INS, …] Query and indexing [Kademlia, …] Communication primitives [I3, …] Backup store [HiveNet] Web archive [Herodotus] Suggestions ?

Conclusions Structured topologies and DHTs provide interesting infrastructures for the Next Generation of P2P Killer apps Chord’s commercial implementations are tough, but other DHTs can make it: Kademlia, Symphony, .. But, more research is needed to devise more robust, secure and efficient systems.

P2P Problems Tragedy of the commons Incentives !!! Tit-for-tat Heterogeneity Network proximity (hops !) Churn Security

Conclusions P2P killer application can emerge in the next years thanks to improvements in connectivity and computing power Success stories: Kazaa, Bittorrent, Skype, Groove But also local: MP2P, Filetopia Mobile P2P & Pervasive computing ?? Google is hiring P2P researchers !! New Ideas ??