Reliability and Relay Selection in Peer-to-Peer Communication Systems Salman A. Baset and Henning Schulzrinne Internet Real-time Laboratory Department.


1 Reliability and Relay Selection in Peer-to-Peer Communication Systems
Salman A. Baset and Henning Schulzrinne
Internet Real-time Laboratory, Department of Computer Science, Columbia University
August 3rd, 2010

2 Background

3 Peer-to-peer communication system
[Figure: overlay with nodes A, B, C, D, E, a P2P / PSTN gateway, a NAT / firewall, and a media relay (or relay). Labeled steps for one call: (1), (2) "network address of node B?", (3) signaling, (4) media. Labeled steps for a second call: (1) "network address of node E?", (2) signaling, (3) media.]
node = user agent
nodes form an overlay
share responsibilities for message routing, signaling, media relaying
super nodes, ordinary nodes
Reliability of p2p comm. systems?
Relay selection techniques?

4 Outline
Reliability and Relay Selection
Motivation
Reliability framework
–Sources of unreliability in p2p comm. systems?
Model for relayed calls
–How many relays per call to achieve 99.9% success rate?
Improving reliability of relayed calls
–How to improve the reliability of relayed calls?
User annoyance
–How to quantify the interference of relayed calls with other applications?
Relay selection
–How to find a relay in O(1) hop that minimizes latency and user annoyance?

5 Reliability framework
Reliability = proportion of completed calls (99.9%)
Goal
–understand reasons for call failure
–devise techniques to address them
Reasons for call failure
–(1) distributed search fails to find online callee
  DHT lookup
–(2) distributed search fails to find a suitable relay
  DHT lookup or any appropriate relay selection scheme
–(3) relay fails during voice/video session
This talk
–understand and improve reliability for relayed calls
–devise techniques for finding a relay

6 Outline
Reliability and Relay Selection
Motivation
Reliability framework
Model for relayed calls
–How many relays per call to achieve 99.9% success rate?

7 Understanding reliability of relayed calls
Percentage of VoIP calls that need relaying
–the provider knows
–15-20% of calls for a commercial client-server IM / VoIP application
–341 relays in 20 days for Skype [Suh05Infocom]
  17 per day for a super node (~50K super nodes)
–some client-server providers relay all calls
–NAT studies

8 Understanding reliability of relayed calls
For desired reliability, minimum relays per call?
–let X_i and R_i denote the lifetime and residual lifetime of a relay candidate (i.i.d.)
–let D denote the call duration
–when the i-th relay fails, the call is switched to the (i+1)-st relay, which is instantly selected from the global pool of all relays
Smallest k such that the call completion probability is greater than or equal to the desired reliability (99.9%)
k depends on the relationship b/w node lifetime and call duration
[Figure: timeline of a call of duration D served in turn by relays 1, 2, ..., k-1, k with residual lifetimes R_1, ..., R_{k-1}, R_k]
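The search for the smallest k can be sketched numerically. This is only an illustration under assumptions that are not stated on the slide: node lifetimes are exponential (so residual lifetimes are exponential with the same mean), the call duration D is a fixed value, and the call completes when R_1 + ... + R_k >= D. The function names are hypothetical.

```python
import math


def completion_prob(k, mean_lifetime, duration):
    """P(R_1 + ... + R_k >= duration) for i.i.d. exponential residual lifetimes.

    The sum of k exponentials is Erlang(k); its survival function at
    `duration` equals the Poisson CDF at k-1 with mean duration/mean_lifetime.
    """
    a = duration / mean_lifetime
    return math.exp(-a) * sum(a ** j / math.factorial(j) for j in range(k))


def min_relays(mean_lifetime, duration, target=0.999, k_max=1000):
    """Smallest k whose completion probability reaches `target`."""
    for k in range(1, k_max + 1):
        if completion_prob(k, mean_lifetime, duration) >= target:
            return k
    raise ValueError("target reliability not reached within k_max relays")


if __name__ == "__main__":
    # Assumed inputs: 12-hour mean node lifetime, 1-hour call.
    print(min_relays(mean_lifetime=12.0, duration=1.0))
```

With heavier-tailed lifetimes (for example the Pareto approximation mentioned for Skype), the closed form above no longer applies and the probability would have to be estimated differently, for instance by simulation.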

9 Understanding reliability of relayed calls
[Plots: minimum number of relays k, one panel for exponential node lifetimes and one for Skype node lifetimes (lifetimes approximated as Pareto); axis labels: mean node lifetime, mean call duration]
Skype node lifetimes: 12 hours (mean), 4 hours (median)
–3 relays (mean call holding time = one hour)
95% of Skype relayed call durations – minimum of 3 relays to maintain 99.9% success rate
95% of Skype relay calls last less than 60 mins
What if the system does not have enough relays?

10 Outline
Reliability and Relay Selection
Motivation
Reliability framework
Model for relayed calls
Improving reliability of relayed calls
–How to improve the reliability of relayed calls?

11 Improving reliability of relayed calls
Approach 1 -- no-replacement
–select k relays in the beginning of a call
–do not replace failed relays
Approach 2 -- with-replacement
–select k relays in the beginning of a call
–replace failed relays after μ
–no failure during switch over
–Skype uses 2-relay with-replacement scheme
[Figure: Markov chain over states 2, 1, 0, labeled "pure death process"; transitions 2→1 with 2λ, 1→0 with λ, 1→2 with μ; self-loops 1-2λ and 1-(λ + μ)] [Bir04]
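The two approaches can be compared with a small calculation. The sketch below assumes a continuous-time reading of the chain (the slide's self-loop labels suggest a discrete-time version), exponential relay lifetimes with failure rate `lam`, exponential replacement time with rate `mu`, and a fixed call duration `t`. The function names are hypothetical and this is not claimed to be the authors' computation.

```python
import math


def fail_no_replacement(k, lam, t):
    """P(all k relays chosen at call start have failed by time t)."""
    return (1.0 - math.exp(-lam * t)) ** k


def fail_two_relays_with_replacement(lam, mu, t):
    """P(reaching state 0 by time t) starting from state 2.

    States count working relays: 2 -> 1 at rate 2*lam, 1 -> 0 at rate lam,
    1 -> 2 at rate mu. The survival function is
        (s1*exp(s2*t) - s2*exp(s1*t)) / (s1 - s2),
    where s1, s2 are the roots of s^2 + (3*lam + mu)*s + 2*lam^2 = 0.
    """
    b = 3.0 * lam + mu
    disc = math.sqrt(b * b - 8.0 * lam * lam)
    s2 = -(b + disc) / 2.0
    s1 = 2.0 * lam * lam / s2  # product of roots is 2*lam^2; avoids cancellation
    survival = (s1 * math.exp(s2 * t) - s2 * math.exp(s1 * t)) / (s1 - s2)
    return 1.0 - survival


if __name__ == "__main__":
    lam = 1.0 / 12.0  # assumed: 12-hour mean relay lifetime, rates per hour
    mu = 60.0         # assumed: 60-second mean replacement time
    print(fail_no_replacement(2, lam, 1.0))
    print(fail_two_relays_with_replacement(lam, mu, 1.0))
```

Setting `mu` to 0 reduces the with-replacement case to the 2-relay no-replacement case, which is a quick sanity check on the formula.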

12 Improving reliability of relayed calls
No-replacement – add more relays?
–diminishing returns
–1 vs. 2 vs. 3 vs. 4 relays, MTTF: 50%, 22%, 13% (exp)
No-replacement (NR) vs. with-replacement (WR)
–depends on mean lifetime, call duration, repair time
[Plot: Skype node lifetimes (mean = 12 hours, median = 4 hours), 2-relay with-replacement, search time = 60 s]
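One reading of the 50%, 22%, 13% figures is the relative MTTF gain from each extra relay. Assuming exponential lifetimes and a call that fails only when all k relays chosen at the start have failed, the MTTF is the mean lifetime times the harmonic number H_k = 1 + 1/2 + ... + 1/k. The sketch below computes the gains exactly; the function names are hypothetical.

```python
from fractions import Fraction


def mttf_parallel(k, mean_lifetime=Fraction(1)):
    """MTTF of k relays used without replacement (exponential lifetimes)."""
    return mean_lifetime * sum(Fraction(1, i) for i in range(1, k + 1))


def relative_gain(k):
    """Relative MTTF increase when going from k to k + 1 relays."""
    return mttf_parallel(k + 1) / mttf_parallel(k) - 1


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, "->", k + 1, float(relative_gain(k)))
```

The gains are 1/2, 2/9 and 3/22, that is about 50%, 22% and 13.6%, which is consistent with the diminishing returns listed on the slide.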

13 Outline
Reliability and Relay Selection
Motivation
Reliability framework
Model for relayed calls
Improving reliability of relayed calls
User annoyance
–How to quantify the interference of the relayed call with other applications?

14 User annoyance
Interference of relayed call with other applications running on the relay machine
File sharing = mutually beneficial (tit-for-tat)
Relaying = altruistic
Provide incentives or minimize user annoyance
How to quantify user annoyance?
–automatically?
–spare network capacity
Issues in measuring spare capacity?
–bandwidth tests, ALTO

15 Outline
Reliability and Relay Selection
Motivation
Reliability framework
Model for relayed calls
Improving reliability of relayed calls
User annoyance
Relay selection
–How to find a relay in O(1) hop that minimizes latency and user annoyance?

16 Distributed relay selection
Goal
–O(1) hop
2-level hierarchical network
1-relay close-by
[Figure: nodes behind a NAT; two tables with columns IP address | RTT | Bandwidth; exchange "Give me a relay" / "Here is a randomly selected relay"]
local-random scheme
search performance, dropped calls

17 Distributed relay selection
Delay
User annoyance
–interference with user applications
–file sharing (draft idle peers)
–spare capacity
random
mindelay
–select relay with minimum delay
netmax
–select relay with maximum spare bw
threshold
–select relays with delay < 150 ms and maximum spare capacity
Results
–strategies perform similarly near the system collapse point
–minimizing latency increases annoyance and the number of jobs per relay, and vice versa
–threshold approach performs reasonably well
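The four strategies can be sketched as selection functions over a candidate table with the address, RTT and bandwidth columns shown on the previous slide. The `RelayCandidate` type, its field names, the function names and the sample addresses are all hypothetical. Returning `None` when no candidate meets the 150 ms threshold is an assumption, since the slides do not say what happens in that case.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class RelayCandidate:
    address: str
    rtt_ms: float         # measured round-trip time to the candidate
    spare_bw_kbps: float  # estimated spare network capacity


def select_random(candidates, rng=None):
    """random: pick any candidate uniformly."""
    return (rng or random).choice(list(candidates)) if candidates else None


def select_mindelay(candidates):
    """mindelay: pick the candidate with minimum delay."""
    return min(candidates, key=lambda c: c.rtt_ms, default=None)


def select_netmax(candidates):
    """netmax: pick the candidate with maximum spare bandwidth."""
    return max(candidates, key=lambda c: c.spare_bw_kbps, default=None)


def select_threshold(candidates, max_rtt_ms=150.0):
    """threshold: among candidates under the delay bound, maximize spare capacity."""
    eligible = [c for c in candidates if c.rtt_ms < max_rtt_ms]
    return max(eligible, key=lambda c: c.spare_bw_kbps, default=None)


# Illustrative candidate table (documentation-range addresses, made-up values).
CANDIDATES = [
    RelayCandidate("192.0.2.1", rtt_ms=40.0, spare_bw_kbps=200.0),
    RelayCandidate("192.0.2.2", rtt_ms=120.0, spare_bw_kbps=900.0),
    RelayCandidate("192.0.2.3", rtt_ms=300.0, spare_bw_kbps=5000.0),
]

if __name__ == "__main__":
    print(select_mindelay(CANDIDATES).address)
    print(select_netmax(CANDIDATES).address)
    print(select_threshold(CANDIDATES).address)
```

In this sample table mindelay and netmax pick different relays, while threshold picks a third one that trades a little delay for more spare capacity, which illustrates the latency versus annoyance trade-off listed under Results.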

18 Related work
Modeling
–On lifetime-based node failure and stochastic resilience of decentralized peer-to-peer networks [Leonard09ToN]
Minimizing churn
–Minimizing churn in distributed systems [Godfrey06Sigcom]
Relay selection
–ASAP: an AS-aware peer relay protocol for high quality VoIP [Ren06ICDCS]
$ diff this related_work
–focus on node isolation
–minimizing churn is not sufficient
–reliability, relay selection, user annoyance

19 Conclusion
Framework for analyzing reliability in p2p communication systems
A model for reliability of relayed calls
Reliability improvement schemes
User annoyance
Distributed relay selection

