Data Management in Peer-to-Peer Systems. Qi Sun, Beverly Yang.


1 Data Management in Peer-to-Peer Systems Qi Sun, Beverly Yang

2 2 Introduction What is P2P? Distributed nodes Equal roles and functionality Providing/exchanging resources Why now? PCs are becoming valuable resources! Computing devices becoming pervasive

3 3 Many Applications Grid computing e.g., SETI@home Ubiquitous computing Cell phones, wireless devices, handhelds Cars, refrigerators, microwaves Preservation/Archival systems File-sharing

4 4 File-sharing model Data: (Title string, File blob) Query: “Find songs by Madonna” Result: 63.274.18.3: Madonna – “Vogue” 63.274.18.3: Madonna – “Beautiful Stranger” 27.48.3.124: Madonna – “Like a Prayer” 17.64.75.18: Madanna – “Vogue” How is this “search” implemented?

5 5 Many Approaches Napster Gnutella KaZaA OverNet BitTorrent

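6 6 Napster “Hybrid” P2P system [Figure: peers A through F around a central server that holds the index; a peer sends its query (“?”) to the server, which answers with the matching peers C, E, F]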

7 7 Napster Benefits Efficient Comprehensive Can handle complex queries Disadvantages Server is single point of failure Server is performance bottleneck Server costs money to maintain!!!
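
The hybrid model above can be sketched as a single in-memory index that peers register with and query. This is only an illustration of the idea, not Napster's actual protocol; the class name `CentralIndex` and its methods are hypothetical, and substring matching stands in for whatever query language a real server would support.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical central server index: maps each peer to the titles it shares."""

    def __init__(self):
        self._titles_by_peer = defaultdict(set)

    def register(self, peer, titles):
        # A peer announces the titles it is willing to serve.
        self._titles_by_peer[peer].update(titles)

    def unregister(self, peer):
        # Called when a peer leaves; its entries disappear from the index.
        self._titles_by_peer.pop(peer, None)

    def search(self, keyword):
        # The server answers from its own index; the file transfer itself
        # would then happen directly between peers.
        needle = keyword.lower()
        return sorted(
            (peer, title)
            for peer, titles in self._titles_by_peer.items()
            for title in titles
            if needle in title.lower()
        )
```

Every query touches only the server, which is why the approach is efficient and comprehensive, and also why the server is both the bottleneck and the single point of failure.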

8 8 Gnutella “Pure” P2P system TCP “Overlay network”

9 9 Gnutella [Figure: query flooding over the overlay network. Legend: forward query, processed query, source, found result, forward response]

10 10 Gnutella Benefits No server needed (cost) Robust (nodes can come and go) Can handle complex queries per node Disadvantages Not comprehensive (can miss results) Inefficient! (many messages)
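
A minimal sketch of query flooding, assuming a time-to-live (TTL) hop limit, shows where both disadvantages come from. The function name `flood_query`, the adjacency-list `graph`, and the `files` mapping are illustrative assumptions, not part of the Gnutella specification.

```python
from collections import deque


def flood_query(graph, files, source, keyword, ttl):
    """Flood a keyword query from `source`; return (hits, messages_sent)."""
    seen = {source}
    frontier = deque([(source, ttl)])
    hits = []
    messages = 0
    needle = keyword.lower()
    while frontier:
        node, remaining = frontier.popleft()
        # Each node evaluates the query against its own local files.
        for title in files.get(node, []):
            if needle in title.lower():
                hits.append((node, title))
        if remaining == 0:
            continue  # TTL exhausted: nodes beyond this point are never asked
        for neighbor in graph.get(node, []):
            messages += 1  # every forward costs a message, even a duplicate
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, remaining - 1))
    return hits, messages
```

With a small TTL a matching file a few hops away is simply missed (not comprehensive); raising the TTL finds it but the message count grows with every extra hop (inefficient).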

11 11 KaZaA “Super-peer” P2P system Index

12 12 KaZaA “Super-peer” P2P system Index ? Like Napster Like Gnutella

13 13 KaZaA Change the ratio of clients to super-peers Napster: everyone (minus one) is a client Gnutella: no one is a client Combines strengths of hybrid and pure systems Leverages heterogeneity of peers e.g., bandwidth, memory, processing power

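14 14 OverNet Uses all peers to build a distributed index [Figure: the hash key space is split into ranges such as 0 to 10^6, 10^6 to 2x10^6, 3x10^6 to 4x10^6 and 7x10^6 to 8x10^6, each held by one of the peers W, X, Y, Z, ...; the title ABC hashes to Hash(ABC) = 3561246, which falls in the range 3x10^6 to 4x10^6]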

15 15 OverNet: Searching Given key k, which peer has the index? Distributed Hash Table (DHT) [Figure: ring of peers with IDs 0, 1, 2, 4, 8, 16, 24, 25, 31; peer 0 looking for k=25]
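
The question "which peer has the index for key k?" can be illustrated with a small sketch. It assumes a 5-bit identifier ring and a successor rule (the first peer ID at or after the key, wrapping around); the slide does not state which distance rule OverNet actually uses, so both the rule and the helper names `key_for` and `responsible_peer` are assumptions made for illustration. Real DHTs also resolve this in a few routing hops rather than with global knowledge of all peers.

```python
import bisect
import hashlib


def key_for(title, bits=5):
    """Hash a title into the identifier space [0, 2**bits)."""
    digest = hashlib.sha1(title.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


def responsible_peer(peer_ids, key, bits=5):
    """Return the peer that stores the index entry for `key`.

    Assumed rule: the first peer ID >= key, wrapping to the smallest ID.
    """
    ring = sorted(peer_ids)
    i = bisect.bisect_left(ring, key % (2 ** bits))
    return ring[i % len(ring)]
```

With the peer IDs shown on the slide, `responsible_peer([0, 1, 2, 4, 8, 16, 24, 25, 31], 25)` returns peer 25 itself, while a key of 26 would be stored at peer 31.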

16 16 BitTorrent Downloading of a single file Tracker Blk1 Blk2 Blk3... Blk n Peers 2, 3, 6

17 17 BitTorrent: Downloading Tit-for-Tat strategy Choking Mechanism Periodic un-choke Rare blocks first A: 1,2,3,4 B: 3,5 C: 2,3,4 A: 1,2,3,4 B: 3 C: 4
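
The "rare blocks first" rule can be sketched as follows: a peer counts how many of its neighbors hold each block and requests the least replicated missing block first. The function name `rarest_first` and the tie-break by block number are illustrative assumptions; tit-for-tat and choking are not modeled here.

```python
from collections import Counter


def rarest_first(my_blocks, neighbor_blocks):
    """Order the blocks this peer is missing, least replicated first.

    my_blocks: set of block numbers this peer already has.
    neighbor_blocks: dict mapping neighbor name -> set of block numbers.
    """
    # How many neighbors hold each block.
    counts = Counter(b for blocks in neighbor_blocks.values() for b in blocks)
    missing = [b for b in counts if b not in my_blocks]
    # Fewest holders first; ties broken by block number (an arbitrary choice).
    return sorted(missing, key=lambda b: (counts[b], b))
```

Using the holdings from the slide, peer B (blocks 3 and 5) with neighbors A (1, 2, 3, 4) and C (2, 3, 4) would request block 1 first, since only A has it: `rarest_first({3, 5}, {"A": {1, 2, 3, 4}, "C": {2, 3, 4}})` gives `[1, 2, 4]`.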

18 18 Challenges Performance, Performance, Performance! Find rare/popular files quickly Minimize maintenance cost Spread workload evenly Etc. Zillions of heuristics/variants

19 19 Challenges (2) Participation: Peers are selfish! Do not want to “donate” bandwidth Do not want to share their files Do not care about others Need some incentive mechanism!!

20 20 Challenges (3) Authenticity of data How do you know you have the right file? Bogus copies Corrupt copies Need detection/correction mechanisms

21 21 Techniques Performance Routing Indices Network Awareness Participation SLIC Micropayments Correctness DoS Prevention Reputation Systems

22 22 Routing Indices ?

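23 23 Routing Indices (2) [Figure: example network of numbered nodes (1 through 13) holding documents on the topics DB, OS, AI and EE; each node keeps a routing index listing, per topic, the neighbors through which matching documents can be reached, e.g. at node 1: DB via 2,4; OS via 2; AI via 2,3,4; EE via 3. A “DB?” query is routed using these entries]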

24 24 Routing Indices (3) Benefits Potentially reduce # messages Drawbacks Update cost (any time you have state) Size of index
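
A rough sketch of how a routing index reduces messages: instead of flooding every neighbor, a node forwards a query only to the neighbors its index ranks best for the query topic. The per-neighbor document counts, the `fanout` parameter, and the function name `pick_neighbors` are assumptions for illustration; the slides do not give the exact index layout.

```python
def pick_neighbors(routing_index, topic, fanout=1):
    """Choose up to `fanout` neighbors to forward a query on `topic` to.

    routing_index: dict mapping neighbor -> {topic: number of reachable documents}.
    Neighbors with no documents on the topic are never chosen.
    """
    ranked = sorted(
        ((counts.get(topic, 0), neighbor) for neighbor, counts in routing_index.items()),
        key=lambda pair: (-pair[0], pair[1]),  # most documents first, then by name
    )
    return [neighbor for count, neighbor in ranked if count > 0][:fanout]
```

The drawbacks on the slide show up directly: the counts must be refreshed whenever content behind a neighbor changes, and the table grows with the number of neighbors and topics.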

25 25 Reputation Systems Alice Bob Who has file X? I do! File Y

26 26 Reputation Systems Have an “opinion list” Based on personal experience? Problem: sparse [Figure: opinion list covering Node 0 through Node 8, with most entries unknown (“?”)]

27 27 Reputation Systems Have a “trust list” (Node 4, Node 1, Node 2, Node 6) Based on personal experience? Problem: sparse Ask friends Efficient Automatic
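
One way to picture "ask friends" is the sketch below: a node uses its own opinion when it has one, and otherwise averages the opinions of the nodes on its trust list. The averaging rule, the 0-to-1 score range, the neutral default, and the name `estimate_trust` are all assumptions for illustration; the slides do not specify how opinions are combined.

```python
def estimate_trust(own_opinions, friend_opinions, target, default=0.5):
    """Estimate how trustworthy `target` is, on an assumed 0.0 to 1.0 scale.

    own_opinions: dict mapping node -> score from personal experience.
    friend_opinions: dict mapping friend -> that friend's own opinion dict.
    """
    if target in own_opinions:
        return own_opinions[target]  # personal experience wins when available
    votes = [ops[target] for ops in friend_opinions.values() if target in ops]
    if not votes:
        return default  # nobody knows the target: fall back to a neutral score
    return sum(votes) / len(votes)
```

Pooling opinions this way addresses sparsity, since a node can rate a peer it has never interacted with, at the cost of having to trust what its friends report.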

28 28 Micropayments Only if you have money, will people do things for you! Like a vending machine Goods are cheap Security can’t be too expensive Micropayments

29 29 $ Micropayments Server is needed… Handle accounts Distribute and cash coins Security Scalability and performance bottleneck

30 30 Micropayments Peers can do work too! Challenge: SECURITY $

31 31 SLIC: Link-based Incentive Use quality of service as incentive [Figure: Fragment A and Fragment B of the network joined by the link between nodes A and B] They need each other to reach more nodes. Can retaliate

32 32 SLIC (2) Adjust weights, and use them to reward good neighbors and to penalize bad ones [Figure: node A keeps weights W(A,B), W(A,C), W(A,D) for its neighbors B, C, D]

33 33 Network Awareness Overlay network can be poor! San Francisco Palo Alto Timbuktu Mali, Africa

34 34 Network Awareness (2) Form only “good” links Probe a few and pick the best San Francisco Palo Alto Timbuktu Mali, Africa
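
"Probe a few and pick the best" can be sketched as sampling some candidate peers, measuring each, and keeping the closest. The function name `pick_good_links`, the `measure_latency` callback, and the use of latency as the quality metric are assumptions for illustration.

```python
import random


def pick_good_links(candidates, measure_latency, sample_size, keep, rng=random):
    """Probe a random sample of candidate peers and keep the lowest-latency ones.

    candidates: list of peer identifiers.
    measure_latency: callable returning a latency estimate for a peer
                     (in practice this would be a ping or similar probe).
    """
    probed = rng.sample(candidates, min(sample_size, len(candidates)))
    return sorted(probed, key=measure_latency)[:keep]
```

For a peer in San Francisco with made-up latencies of 5 ms to Palo Alto and 300 ms to Timbuktu, probing both and keeping one link would choose Palo Alto.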

35 35 Network Awareness (3) “Swap” peers around San Francisco Palo Alto Timbuktu Mali, Africa

36 36 Denial of Service Malicious peers can flood queries on unstructured networks Rate limit Incentive Micro-payment
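
The rate-limit defense can be illustrated with a per-neighbor token bucket: each neighbor may forward queries at a steady rate with a small burst allowance, and anything beyond that is dropped. The slide only says "rate limit", so the token-bucket choice, the class name `QueryRateLimiter`, and its parameters are assumptions.

```python
class QueryRateLimiter:
    """Token bucket per neighbor; time is passed in explicitly (seconds)."""

    def __init__(self, rate, burst):
        self.rate = rate    # tokens (queries) replenished per second
        self.burst = burst  # maximum tokens a neighbor can accumulate
        self._state = {}    # neighbor -> (tokens, time of last update)

    def allow(self, neighbor, now):
        tokens, last = self._state.get(neighbor, (self.burst, now))
        # Refill for the time elapsed since the last query, capped at burst.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self._state[neighbor] = (tokens - 1, now)
            return True   # accept and forward the query
        self._state[neighbor] = (tokens, now)
        return False      # over the limit: drop the query
```

A flooding neighbor exhausts its own bucket quickly, while well-behaved neighbors are unaffected because each one is tracked separately.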

37 37 Denial of Service Malicious peers can drop queries and indices in structured networks Tracing/Audit Reorganization Alternate path

38 38 Concluding Remarks P2P provides a cheap infrastructure for leveraging the capacities of the masses. P2P’s “openness” is both its strength and its weakness.

