
1 School of Computing Science Simon Fraser University CMPT 880: Peer-to-Peer Systems Mohamed Hefeeda 10 January 2005.


1 1 School of Computing Science Simon Fraser University CMPT 880: Peer-to-Peer Systems Mohamed Hefeeda 10 January 2005

2 2 Course Logistics  Time & place -MW 3:00 – 4:20 PM, SUR 15-300  Instructor -Mohamed Hefeeda -Office: SUR 15-260 -Office hours: MW 4:30 – 5:30 or by appointment -mhefeeda@cs.sfu.ca  Web page -www.cs.sfu.ca/~mhefeeda/Courses/05/P2P/

3 3 Course Objectives  In-depth study of the peer-to-peer computing paradigm, a very active research area in networking and distributed systems

4 4 Course Objectives (cont’d)  Learn how to effectively read, criticize, discuss, and present research papers  Learn how to search for and develop new research ideas  Learn how to write and defend your own work  Hopefully, you will find interesting ideas that either strengthen your current research, or help you jump start a new research path

5 5 Course Prerequisites  Enthusiasm! -To read and explore new research ideas -To actively participate in the discussion  Some Computer Networks background -E.g., undergraduate course in networks or distributed systems -We will present the necessary concepts throughout the course

6 6 Course Load and Policy  Paper presentations 25% -One or more  Paper critique 15% -For each presented paper, write a one-page critique summarizing: (1) contributions; (2) weaknesses, concerns, and flaws; and (3) suggestions for improvements -Due BEFORE the presentation -You may skip up to 3 paper reviews  Class participation 10%

7 7 Course Load and Policy (cont’d)  Project 50% -On something related to P2P systems: theoretical (e.g., new algorithm), measurement study, performance analysis, comparative study, implementation and experimentation, or survey  We will discuss possible projects -Propose your own project and get bonus points!

8 8 Research Facilities  Access to a wide area test bed composed of 400+ nodes distributed all over the world (PlanetLab)  Local area test bed (e.g., cluster of nodes, small LAN, …) can be arranged as well  Access to traffic logs and statistics from SFU and other institutions may be arranged (subject to use and privacy policies)  Most importantly, constructive critique and suggestions from your fellow students and the instructor

9 9 Course Schedule  Weeks 1-2 -Introduction to P2P systems (instructor) -End of week 2, discussion of possible projects  Weeks 3-11 -Paper presentations (students) -End of week 3, 1-2 page project proposal is due -In week 7, students will present the status of their projects (5-7 min talk for each student)  Weeks 12-13 -Project presentations and discussions -In week 12, project final report is due

10 10 Course Topics (tentative)  Introduction  Basic algorithms -Routing, overlay management, replication, …  Modeling and Analysis -Peer characteristics -Traffic analysis -System modeling  Security -Possible attacks -Trust management -Anonymity and Privacy

11 11 Course Topics (cont’d)  Rationality -Definitions -Incentive mechanisms to combat rationality -Designing incentive-compatible P2P protocols  Current and Potential Applications -File sharing -Storage and file systems -Distributed cycle sharing -Streaming and content distribution

12 12 Advice on Reading and Writing Papers  Jamin, Paper Reading and Writing Checklists  Hanson and McNamee, Efficient Reading of Papers in Science and Technology

13 13 Introduction to Peer-to-Peer Systems

14 14 P2P Computing: Definitions  Peers cooperate to achieve desired functions -Peers: End-systems (typically, user machines) Interconnected through an overlay network Peer ≡ Like the others (similar or behave in similar manner) -Cooperate: Share resources, e.g., data, CPU cycles, storage, bandwidth Participate in protocols, e.g., routing, replication, … -Functions: File-sharing, distributed computing, communications, content distribution, …  Note: the P2P concept is much wider than file sharing

15 15 Overlay Network

16 16 When Did P2P Start?  Napster (Late 1990’s) -Court shut Napster down in 2001  Gnutella (2000)  Then the killer FastTrack (Kazaa,...)  BitTorrent, and many others  Accompanied by significant research interest  Claim -P2P is much older than Napster!  Proof -The original Internet! -Remember UUCP (unix-to-unix copy)?

17 17 What IS and IS NOT New in P2P?  What is not new -Concepts!  What is new -The term P2P (maybe!) -New characteristics of the nodes, and of the system that we build from them

18 18 What IS NOT New in P2P?  Distributed architectures  Distributed resource sharing  Node management (join/leave/fail)  Group communications  Distributed state management  ….

19 19 What IS New in P2P?  Nodes (Peers) -Quite heterogeneous: several orders of magnitude difference in resources (compare the bandwidth of a dial-up peer with that of a high-speed LAN peer) -Unreliable: failure is the norm! -Offer limited capacity: load sharing and balancing are critical -Autonomous: rational, i.e., they maximize their own benefit! Incentives should be provided so that peers cooperate in a way that optimizes system performance

20 20 What IS New in P2P? (cont’d)  System -Scale: very large number of peers (millions) -Structure and topology: ad hoc (no control over peers joining/leaving), highly dynamic -Membership/participation: typically open -More security concerns: trust, privacy, data integrity, … -Cost of building and running: a small fraction of that of same-scale centralized systems. How much would it cost to build and run a supercomputer with the processing power of the 3 million SETI@Home PCs?

21 21 What IS New in P2P? (cont’d)  So what?  We need to design new lighter-weight algorithms and protocols to scale to millions (or billions!) of nodes given the new characteristics  Question: why now, not two decades ago? -We did not have such abundant (and underutilized) computing resources back then! -And, network connectivity was very limited

22 22 Why is it Important to Study P2P?  P2P traffic is a major portion of Internet traffic (50+%), current killer app  P2P traffic has exceeded web traffic (former killer app)!  Direct implications on the design, administration, and use of computer networks and network resources -Think of ISP designers or campus network administrators  Many potential distributed applications

23 23 Sample P2P Applications  File sharing -Gnutella, Kazaa, Napster, …  Distributed cycle sharing -SETI@home, Genome@home, …  File and storage systems -OceanStore, CFS, Freenet, Farsite, …  Media streaming and content distribution -PROMISE -SplitStream, CoopNet, PeerCast, Bullet, Zigzag, NICE, …

24 24 P2P vs its Cousin (Grid Computing)  Common Goal: -Aggregate resources (e.g., storage, CPU cycles, and data) into a common pool and provide efficient access to them  Differences along five axes [Foster & Iamnitchi 03] -Target communities and applications -Type of shared resources -Scalability of the system -Services provided -Software required

25 25 P2P vs Grid Computing (cont’d)
Issue | Grid | P2P
Communities and Applications | Established communities, e.g., scientific institutions; computationally intensive problems | Grass-root communities (anonymous); mostly file swapping
Resources Shared | Powerful and reliable machines, clusters; high-speed connectivity; specialized instruments | PCs with limited capacity and connectivity; unreliable; very diverse

26 26 P2P vs Grid Computing (cont’d)
Issue | Grid | P2P
System Scalability | Hundreds to thousands of nodes | Hundreds of thousands to millions of nodes
Services Provided | Sophisticated services: authentication, resource discovery, scheduling, access control, and membership control; members usually trust each other | Limited services: resource discovery; limited trust among peers
Software Required | Sophisticated suite, e.g., Globus, Condor | Simple (screen saver), e.g., Kazaa, SETI@Home

27 27 P2P vs Grid Computing: Discussion  The differences mentioned are based on the traditional view of each paradigm -In the future, both paradigms are expected to converge and complement each other [e.g., Butt et al. 03]  Target communities and applications -Grid: is becoming more open  Type of shared resources -P2P: is to include more varied and more powerful resources  Scalability of the system -Grid: is to increase the number of nodes  Services provided -P2P: is to provide authentication, data integrity, trust management, …

28 28 P2P Systems: Simple Model  Software architecture model on a peer (figure; layers from top to bottom): P2P Application, Middleware, P2P Substrate, Operating System, Hardware  System architecture: Peers form an overlay according to the P2P Substrate

29 29 Overlay Network  An abstract layer built on top of the physical network  Neighbors in the overlay can be several hops away in the physical network  Why do we need overlays? -Flexibility in: choosing neighbors; forming and customizing the topology to fit application needs (e.g., short delay, reliability, high BW, …); designing communication protocols among nodes -Get around limitations in legacy networks -Enable new (and old!) network services
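The point that overlay neighbors can be several physical hops apart can be illustrated with a small sketch. The topology, the host names (A, B, D), and the router names (r1 to r3) below are hypothetical and chosen only for illustration; the code counts hops with a plain breadth-first search.

```python
from collections import deque

# Hypothetical physical network: end hosts A, B, D attached to routers r1..r3.
PHYSICAL = {
    "A": ["r1"],
    "r1": ["A", "r2"],
    "r2": ["r1", "r3", "B"],
    "r3": ["r2", "D"],
    "B": ["r2"],
    "D": ["r3"],
}

# Hypothetical overlay: only end hosts, with neighbors chosen by the application.
OVERLAY = {
    "A": ["D"],
    "B": ["D"],
    "D": ["A", "B"],
}


def hop_count(graph, src, dst):
    """Return the number of links on a shortest path from src to dst, or None."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def overlay_link_costs(overlay, physical):
    """Map each overlay link (peer, neighbor) to its physical hop count."""
    return {
        (peer, n): hop_count(physical, peer, n)
        for peer, neighbors in overlay.items()
        for n in neighbors
    }
```

In this toy topology, A and D are direct neighbors in the overlay (one overlay hop), yet the packets between them cross four physical links.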

30 30 Overlay Network (cont’d)

31 31 Overlay Network (cont’d)  Some applications that use overlays -Application level multicast, e.g., ESM, Zigzag, NICE, … -Reliable inter-domain routing, e.g., RON -Content Distribution Networks (CDN) -Peer-to-peer file sharing  Overlay design issues -Select neighbors -Handle node arrivals, departures -Detect and handle failures (nodes, links) -Monitor and adapt to network dynamics
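The overlay design issues listed above (selecting neighbors, handling arrivals and departures, detecting failures) can be sketched as a small neighbor table. This is a minimal sketch under stated assumptions, not the mechanism of any particular system: the class name `NeighborTable`, the neighbor limit, and the heartbeat timeout are all hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```python
class NeighborTable:
    """Toy neighbor set with a size limit and timeout-based failure detection."""

    def __init__(self, max_neighbors=4, timeout=30.0):
        self.max_neighbors = max_neighbors  # assumed cap on overlay degree
        self.timeout = timeout              # assumed silence period before a peer is declared failed
        self.last_seen = {}                 # peer id -> time of last heartbeat

    def join(self, peer, now):
        """Accept an arriving peer if there is room; return True on success."""
        if peer in self.last_seen or len(self.last_seen) < self.max_neighbors:
            self.last_seen[peer] = now
            return True
        return False

    def leave(self, peer):
        """Handle a graceful departure."""
        self.last_seen.pop(peer, None)

    def heartbeat(self, peer, now):
        """Record that a known neighbor is still alive."""
        if peer in self.last_seen:
            self.last_seen[peer] = now

    def expire(self, now):
        """Drop neighbors that have been silent for longer than the timeout."""
        dead = [p for p, t in self.last_seen.items() if now - t > self.timeout]
        for p in dead:
            del self.last_seen[p]
        return sorted(dead)
```

A real overlay would also replace failed neighbors and adapt to measured delay or bandwidth; the sketch only covers join, leave, and failure detection.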

32 32 Overlay Network (cont’d) IP Multicast

33 33 Overlay Network (cont’d) Application Level Multicast (ALM)

34 34 Peer Software Architecture Model  A software client installed on each peer  Three components: -P2P Substrate -Middleware -P2P Application  Figure: software architecture model on a peer (layers from top to bottom: P2P Application, Middleware, P2P Substrate, Operating System, Hardware)

35 35 Peer Software Architecture Model (cont’d)  P2P Substrate (key component) -Overlay management Construction Maintenance (peer join/leave/fail and network dynamics) -Resource management Allocation (storage) Discovery (routing and lookup)  Can be classified according to the flexibility of placing objects at peers

36 36 P2P Substrates: Classification  Structured (or tightly controlled, DHT) −Objects are rigidly assigned to specific peers −Looks like a Distributed Hash Table (DHT) −Efficient search & guarantee of finding −Lack of partial name and keyword queries −Maintenance overhead −Ex: Chord, CAN, Pastry, Tapestry, Kademlia (Overnet)  Unstructured (or loosely controlled) −Objects can be anywhere −Support partial name and keyword queries −Inefficient search & no guarantee of finding −Some heuristics exist to enhance performance −Ex: Gnutella, Kazaa (super node), GIA [Chawathe et al. 03]
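The contrast between the two classes can be sketched in a few lines. This is an illustrative sketch only: `responsible_peer` uses a generic hash-ring assignment (each key goes to the first peer at or after its hash on a ring), which is one common way to assign objects rigidly to peers and is not the exact rule of any of the systems named above; `flood_search` is a simplified TTL-limited flood. All names, the 16-bit identifier space, and the example graph are assumptions.

```python
import hashlib
from bisect import bisect_left


def ring_id(name, bits=16):
    """Hash a peer or object name onto a ring of size 2**bits."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


def responsible_peer(key, peers):
    """Structured: the key is assigned to the first peer at or after hash(key)."""
    ring = sorted((ring_id(p), p) for p in peers)
    ids = [i for i, _ in ring]
    idx = bisect_left(ids, ring_id(key)) % len(ring)  # wrap around the ring
    return ring[idx][1]


def flood_search(graph, holders, start, ttl):
    """Unstructured: forward the query to all neighbors for at most ttl hops.

    Returns the set of peers holding the object that the query reached.
    """
    holders = set(holders)
    visited = {start}
    frontier = [start]
    found = {start} & holders
    for _ in range(ttl):
        next_frontier = []
        for peer in frontier:
            for n in graph[peer]:
                if n not in visited:
                    visited.add(n)
                    next_frontier.append(n)
                    if n in holders:
                        found.add(n)
        frontier = next_frontier
    return found
```

With the hash ring, every peer computes the same owner for a given name, which is why lookups can be guaranteed but only for exact names. With flooding, the object can sit anywhere, and a query whose TTL is too small simply misses it.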

37 37 Peer Software Architecture Model (cont’d)  Middleware -Provides auxiliary services to the P2P application, e.g., peer selection, trust management, data integrity validation, authentication and authorization, membership management, accounting (economics and rationality), … -Ex: CollectCast, EigenTrust, micropayments

38 38 Peer Software Architecture Model (cont’d)  P2P Application -Potentially, there could be multiple applications running on top of a single P2P substrate -Applications include File sharing File and storage systems Distributed cycle sharing Content distribution -This layer provides some functions and bookkeeping relevant to the target application File assembly (file sharing) Buffering and rate smoothing (streaming)  Ex: Promise, Bullet, CFS, Gnutella, Kazaa
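The three-component model (substrate, middleware, application) can be sketched as three small classes stacked on one another. Every class and method name here is hypothetical and invented for illustration; the substrate is reduced to an in-memory index, and the middleware offers only peer selection by an assumed trust score.

```python
class Substrate:
    """P2P substrate: resource allocation and discovery (toy in-memory index)."""

    def __init__(self):
        self.index = {}  # object name -> list of peers holding it

    def publish(self, name, peer):
        self.index.setdefault(name, []).append(peer)

    def lookup(self, name):
        return list(self.index.get(name, []))


class Middleware:
    """Auxiliary services; here only peer selection by a hypothetical trust score."""

    def __init__(self, trust):
        self.trust = trust  # peer id -> score in [0, 1] (assumed scale)

    def select_peer(self, candidates):
        if not candidates:
            return None
        return max(candidates, key=lambda p: self.trust.get(p, 0.0))


class FileSharingApp:
    """P2P application: uses the substrate to find sources and the middleware to pick one."""

    def __init__(self, substrate, middleware):
        self.substrate = substrate
        self.middleware = middleware

    def choose_source(self, name):
        return self.middleware.select_peer(self.substrate.lookup(name))
```

Because the application talks only to `lookup` and `select_peer`, a second application (for example, a streaming client) could in principle share the same substrate instance, which is the point of the layering.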

39 39 Outline of the Rest of the Introduction  P2P Substrates -Structured (DHT) Example: CAN -Unstructured Example 1: Gnutella Example 2: Kazaa  Middleware and P2P Application -Example: CollectCast and Promise  Course Roadmap: -Papers flash overview (1-2 min each!)  Project discussion

40 40 Summary  In P2P computing paradigm: -Peers cooperate to achieve desired functions  Started (or re-discovered) with Napster ’98  Old, well-researched distributed concepts  BUT, with new characteristics (e.g., heterogeneity, unreliability, rationality, scale, ad hoc), new and lighter-weight algorithms are needed  Simple model for P2P systems: -Peers form an abstract layer called overlay -A peer software client may have three components P2P substrate, middleware, and P2P application Borders between components may be blurred  Next lecture: Structured P2P substrates (DHTs)

