Peer-to-peer and agent-based computing Scalability in P2P Networks.

Slides:



Advertisements
Similar presentations
Peer-to-Peer Infrastructure and Applications Andrew Herbert Microsoft Research, Cambridge
Advertisements

Distributed Systems Architectures
Copyright © 2003 Pearson Education, Inc. Slide 1 Computer Systems Organization & Architecture Chapters 8-12 John D. Carpinelli.
SkipNet: A Scalable Overlay Network with Practical Locality Properties Nick Harvey, Mike Jones, Stefan Saroiu, Marvin Theimer, Alec Wolman Microsoft Research.
Performance in Decentralized Filesharing Networks Theodore Hong Freenet Project.
Scalable Routing In Delay Tolerant Networks
1 Term 2, 2004, Lecture 9, Distributed DatabasesMarian Ursu, Department of Computing, Goldsmiths College Distributed databases 3.
Peer-to-peer and agent-based computing P2P Algorithms.
Peer-to-peer and agent-based computing Freenet. peer-to-peer and agent-based computing 2 Plan of lecture Freenet Architecture –Goals and Properties Searching.
Peer-to-peer and agent-based computing P2P Algorithms & Issues.
Peer-to-peer and agent-based computing Peer-to-Peer Computing: Introduction.
George Anadiotis, Spyros Kotoulas and Ronny Siebes VU University Amsterdam.
Peer-to-Peer and Social Networks An overview of Gnutella.
CP2073 Networking Lecture 5.
Chapter 1: Introduction to Scaling Networks
The Platform as a Service Model for Networking Eric Keller, Jennifer Rexford Princeton University INM/WREN 2010.
Distributed Hash Tables
EIS Bridge Tool and Staging Tables September 1, 2009 Instructor: Way Poteat Slide: 1.
Countering DoS Attacks with Stateless Multipath Overlays Presented by Yan Zhang.
Scale Free Networks.
Differential Forms for Target Tracking and Aggregate Queries in Distributed Networks Rik Sarkar Jie Gao Stony Brook University 1.
Peter R. Pietzuch Peer-to-Peer Computing – or how to make your BitTorrent downloads go faster... Peter Pietzuch Large-Scale Distributed.
Copyright © 2006 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill Technology Education Introduction to Computer Administration Introduction.
25 seconds left…...
Chapter 10: The Traditional Approach to Design
©Brooks/Cole, 2001 Chapter 12 Derived Types-- Enumerated, Structure and Union.
PSSA Preparation.
VPN AND REMOTE ACCESS Mohammad S. Hasan 1 VPN and Remote Access.
The Small World Phenomenon: An Algorithmic Perspective Speaker: Bradford Greening, Jr. Rutgers University – Camden.
New Opportunities for Load Balancing in Network-Wide Intrusion Detection Systems Victor Heorhiadi, Michael K. Reiter, Vyas Sekar UNC Chapel Hill UNC Chapel.
P2P data retrieval DHT (Distributed Hash Tables) Partially based on Hellerstein’s presentation at VLDB2004.
2/66 GET /index.html HTTP/1.0 HTTP/ OK... Clients Server.
INF 123 SW ARCH, DIST SYS & INTEROP LECTURE 12 Prof. Crista Lopes.
Clayton Sullivan PEER-TO-PEER NETWORKS. INTRODUCTION What is a Peer-To-Peer Network A Peer Application Overlay Network Network Architecture and System.
P2P Topologies Centralized Ring Hierarchical Decentralized Hybrid.
Peer-to-Peer is Not Always Decentralized …when Centralization is Good Nelson Minar
1.Optimizing P2P Networks: Lessons learned from social networking a)Social Networks b)Lessons Learned c)Are P2P Networks Social?? d)Organizing P2P Networks.
Denial-of-Service Resilience in Peer-to-Peer Systems D. Dumitriu, E. Knightly, A. Kuzmanovic, I. Stoica and W. Zwaenepoel Presenter: Yan Gao.
Expediting Searching Processes via Long Paths in P2P Systems 05/30 IDEA Lab.
An Overview of Peer-to-Peer Networking CPSC 441 (with thanks to Sami Rollins, UCSB)
Peer-to-Peer Networks as a Distribution and Publishing Model Jorn De Boever (june 14, 2007)
Efficient Content Location Using Interest-based Locality in Peer-to-Peer Systems Presented by: Lin Wing Kai.
presented by Hasan SÖZER1 Scalable P2P Search Daniel A. Menascé George Mason University.
1 Client-Server versus P2P  Client-server Computing  Purpose, definition, characteristics  Relationship to the GRID  Research issues  P2P Computing.
Object Naming & Content based Object Search 2/3/2003.
Chord-over-Chord Overlay Sudhindra Rao Ph.D Qualifier Exam Department of ECECS.
Topics in Reliable Distributed Systems Fall Dr. Idit Keidar.
1 Seminar: Information Management in the Web Gnutella, Freenet and more: an overview of file sharing architectures Thomas Zahn.
Improving Data Access in P2P Systems Karl Aberer and Magdalena Punceva Swiss Federal Institute of Technology Manfred Hauswirth and Roman Schmidt Technical.
Peer-to-peer: an overview Selo TE P2P is not a new concept P2P is not a new technology P2P is not a new technology Oct : first transmission.
INTRODUCTION TO PEER TO PEER NETWORKS Z.M. Joseph CSE 6392 – DB Exploration Spring 2006 CSE, UT Arlington.
Freenet. Anonymity  Napster, Gnutella, Kazaa do not provide anonymity  Users know who they are downloading from  Others know who sent a query  Freenet.
1 Napster & Gnutella An Overview. 2 About Napster Distributed application allowing users to search and exchange MP3 files. Written by Shawn Fanning in.
Introduction Widespread unstructured P2P network
PSI Peer Search Infrastructure. Introduction What are P2P Networks? The term "peer-to-peer" refers to a class of systems and applications that employ.
FastTrack Network & Applications (KaZaA & Morpheus)
1 Peer-to-Peer Technologies Seminar by: Kunal Goswami (05IT6006) School of Information Technology Guided by: Prof. C.R.Mandal, School of Information Technology.
Peer to Peer A Survey and comparison of peer-to-peer overlay network schemes And so on… Chulhyun Park
Peer to Peer Network Design Discovery and Routing algorithms
Algorithms and Techniques in Structured Scalable Peer-to-Peer Networks
INTERNET TECHNOLOGIES Week 10 Peer to Peer Paradigm 1.
P2P Search COP6731 Advanced Database Systems. P2P Computing  Powerful personal computer Share computing resources P2P Computing  Advantages: Shared.
P2P Search COP P2P Search Techniques Centralized P2P systems  e.g. Napster, Decentralized & unstructured P2P systems  e.g. Gnutella.
Advanced Computer Networks: Part 2 Complex Networks, P2P Networks and Swarm Intelligence on Graphs.
Topologies and behavioral properties of the network Yvon Kermarrec Based on tml.
Distributed Web Systems Peer-to-Peer Systems Lecturer Department University.
BitTorrent Vs Gnutella.
Peer-to-Peer Data Management
CHAPTER 3 Architectures for Distributed Systems
Peer-to-Peer Information Systems Week 6: Performance
Presentation transcript:

peer-to-peer and agent-based computing Scalability in P2P Networks

Plan of lecture Optimising P2P Networks: –Social networks and lessons learned –Are P2P networks social? –Organising P2P networks Peer Topologies –Centralised, Ring, Hierarchical & Decentralized –Hybrid: Centralised/Ring Centralised/Centralised Centralised/Decentralised –Reflector Nodes Gnutella Case Studies peer-to-peer and agent-based computing – wamberto vasconcelos 2

Social networks peer-to-peer and agent-based computing – wamberto vasconcelos 33 Stanley Milgram (Harvard professor ) – 1967 social networking experiment: How many hops would it take for messages to traverse US pop? Posted 160 letters to randomly chosen people in Omaha, Nebraska Boston Omaha Asked them to try to pass these letters to a stockbroker working in Boston, Massachusetts Rules: 1.use intermediaries they know on a first name basis, chosen intelligently! 2.make a note at each hop 42 letters made it !! Average of 5.5 hops Demonstrated the small world effect Proved that the social network of the United States is indeed connected with a path-length (number of hops) of around 6 – the 6 degrees of separation! Does this mean that it takes 6 hops to traverse 200 million people?

Lessons learned from experiment Social circles are highly clustered Few members have wide-ranging connections –Form a bridge between far-flung social clusters –Bridging plays critical role in bringing the network closer together –For example: 25% of all letters passed through a local shopkeeper 50% mediated by just 3 people Lessons learned –These people acted as gateways or hubs between the source and the wider world –A small number of bridges dramatically reduces the number of hops peer-to-peer and agent-based computing – wamberto vasconcelos 4

From social to computer networks Similarities between social/computer networks –People = peers –Intermediaries = hubs, gateways –Number of intermediaries = number of hops Are P2P networks special then? –P2P networks more like social networks than other types of computer networks because: Self-organising Ad-Hoc Employ clustering techniques based on prior interactions (like we form relationships) Decentralised discovery & communication (similar to neighbourhoods, villages, etc) peer-to-peer and agent-based computing – wamberto vasconcelos 5

P2P: whats the problem? How to organise peers in ad-hoc, multi-hop pervasive P2P networks? –Network of self-organising peers organised in a decentralised fashion –Networks may expand rapidly from few hundred to several thousand (or even millions) of peers peer-to-peer and agent-based computing – wamberto vasconcelos 6

P2P: whats the problem? (Contd) P2P Environments: –Unreliable –Peers connect/disconnect – network failures –Random failures (e.g. power outages, DSL failure) –Personal machines more vulnerable than servers –Algorithms must cope with continuous restructuring of the network core P2P systems –Must treat failures as normal occurrences not freak exceptions –Must be designed to promote redundancy Tradeoff with degradation of performance peer-to-peer and agent-based computing – wamberto vasconcelos 7

How do we organise P2P networks? Organisation aimed at optimum performance NOT abstract numerical benchmarks!! –How many milliseconds will it take to compute this many millions of calculations? Rather, it means asking questions like: –How long will it take to retrieve this particular file? –How much bandwidth will this query consume? –How many hops will it take for my package to get to a peer on the far side of the network? –If I add/remove a peer to the network will the network still be fault tolerant? –Does the network scale as we add more peers? peer-to-peer and agent-based computing – wamberto vasconcelos 8

Performance issues in P2P networks 3 main features that make P2P networks more sensitive to performance issues 1.Communication –Fundamental necessity –Users connected via different connections speeds –Multi-hop 2.Searching –No central control so more effort needed –Each hop adds to total bandwidth (w/ time outs!) 3.Equal Peers –Freeriders – unbalance in harmony of network –Degrades performance for others –Need to get this right to adjust accordingly peer-to-peer and agent-based computing – wamberto vasconcelos 9

P2P topologies Core Centralised Ring Hierarchical Decentralised Hybrid Centralised-Ring Centralised-Centralised Centralised-Decentralised peer-to-peer and agent-based computing – wamberto vasconcelos 10

Centralised Client/server Web servers Databases Napster search Instant Messaging peer-to-peer and agent-based computing – wamberto vasconcelos 11

Ring Fail-over clusters Simple load balancing Assumption –Single owner peer-to-peer and agent-based computing – wamberto vasconcelos 12

Hierarchical Tree structure DNS Usenet (sort of) peer-to-peer and agent-based computing – wamberto vasconcelos 13

Decentralised Gnutella Freenet Internet routing peer-to-peer and agent-based computing – wamberto vasconcelos 14

Centralised + Ring Robust web applications High availability of servers peer-to-peer and agent-based computing – wamberto vasconcelos 15

Centralised + Centralised N-tier apps Database heavy systems Web services gateways Google.com uses this topology to deliver their services peer-to-peer and agent-based computing – wamberto vasconcelos 16

Centralised + Decentralised New generation of P2P Clip2 Gnutella Reflector FastTrack –KaZaA –Morpheus Possibly like social networks? peer-to-peer and agent-based computing – wamberto vasconcelos 17

Reflector nodes Known as super peers –AKA rendezvous peers (JXTA) Cache file list of connected users –Maintain an index When a query is issued, reflector node –Does NOT retransmit it –Answers the query using its own information peer-to-peer and agent-based computing – wamberto vasconcelos 18 F1.mp3 – ID0:F1.mp3 … CF1.mp3 F2.mp3 F3.mp

Napster = Gnutella? peer-to-peer and agent-based computing – wamberto vasconcelos 19 Napster Gnutella User Napster.com Gnutella Super Peers: 1. Natural?? 2. Reflector =? User Napster N2 N3 Napster Duplicated Servers

Gnutella network today Topology of a Gnutella network –From LimeWire site (popular Gnutella client) Notice power-law or centralised-decentralised structure peer-to-peer and agent-based computing – wamberto vasconcelos 20

Another view of Gnutella peer-to-peer and agent-based computing – wamberto vasconcelos 21

Gnutella studies 1: free riding Two types of free-riding: 1.Download files but never provide any files for others to download 2.Users that have undesirable content peer-to-peer and agent-based computing – wamberto vasconcelos 22

Gnutella studies 1: free riding (Contd) Article: Free Riding on Gnutella, E. Adar and B.A. Huberman (2000): –22,084 of the 33,335 peers in the network (66%) of the peers share no files –24,347 or 73% share ten or fewer files –Top 1% (333 hosts) represent 37% of total files shared –20% (6,667 hosts) sharing 98% of files Findings: –Even without Gnutella reflector nodes, the Gnutella network naturally converges into a centralised + decentralised topology with the top 20% of nodes acting as super peers peer-to-peer and agent-based computing – wamberto vasconcelos 23

Gnutella studies 2: equal peers Studied Gnutella for one month –Noted an apparent scalability barrier when query rates went above 10 per second Why? –Gnutella query: 560 bits long Queries make up approximately ¼ of traffic –Each peer is connect to three peers: 560 × 10 × 3 = 16,800 bytes/sec ¼ of traffic, so total traffic 67,200 bytes/sec –56-K link cannot keep up with amount of traffic One node connected in the incorrect place can grind the whole network to a halt –This is why P2P networks place slower nodes at the edges peer-to-peer and agent-based computing – wamberto vasconcelos 24

Gnutella studies 3: communication Article: Peer-to-Peer Architecture Case Study: Gnutella Network, Matei Ripeanu –Studied topology of Gnutella over several months Two important findings Two suggestions peer-to-peer and agent-based computing – wamberto vasconcelos 25

Gnutella studies 3: communication Findings: 1.Gnutella network shares the benefits and drawbacks of a power-law structure –Networks that organise themselves so that most nodes have few links & small number of nodes have many links –Unexpected degree of robustness when facing random node failures. –Vulnerable to attacks: removing a few super nodes can have massive effect on function of network as a whole. 2.Gnutella network topology does not match well with underlying Internet topology –Leads to inefficient use of network bandwidth peer-to-peer and agent-based computing – wamberto vasconcelos 26

Gnutella studies 3: communication Suggestions: 1.Use a software agent to monitor network and intervene by asking servents to drop/add links to keep the topology optimal. 2.Replace Gnutella flooding mechanism with a smarter routing and group communication mechanism peer-to-peer and agent-based computing – wamberto vasconcelos 27

What about other topologies? Centralised + Hierarchical? –Back end tree of information –Caching architectures Decentralised + Ring? –P2P network of fail-over clusters What else? peer-to-peer and agent-based computing – wamberto vasconcelos 28

Further reading Chapter of textbook Free Riding on Gnutella, E. Adar and B.A. Huberman (2000), First Monday 5(10) Peer-to-Peer Architecture Case Study: Gnutella Network, Matei Ripeanu Distributed Hash table models –Pastry: –Chord: peer-to-peer and agent-based computing – wamberto vasconcelos 29