1 Wide-Area Service Composition: Availability, Performance, and Scalability
Bhaskaran Raman, SAHARA, EECS, U.C. Berkeley
SAHARA Retreat, Jan 2002

2 Service Composition: Motivation
[Figure: example service-level paths composed across providers (Providers A, B, Q, R), with replicated service instances: an email repository reaching a cellular phone via a text-to-speech service, and a video-on-demand server reaching a thin client via a transcoder.]
Other examples: ICEBERG, IETF OPES'00

3 In this work: Problem Statement and Goals
Problem statement:
– A service-level path could stretch across multiple service providers and multiple network domains
– Inter-domain Internet paths have poor availability [Labovitz'99] and poor time-to-recovery [Labovitz'00]
– Take advantage of service replicas
Goals:
– Performance: choose a good set of service instances
– Availability: detect and handle failures quickly
– Scalability: Internet-scale operation

4 In this work: Assumptions and Non-goals
Operational model:
– Service providers deploy different services at various network locations
– Next-generation portals compose services
– Code is NOT mobile (service providers are mutually untrusting)
We do not address the service interface issue
We assume that service instances have no persistent state
– Not very restrictive [OPES'00]

5 Related Work
Other efforts have addressed:
– Semantics and interface definitions: OPES (IETF), COTS (Stanford)
– Fault-tolerant composition within a single cluster: TACC (Berkeley)
– Performance-constrained choice of a service, but not for composed services: SPAND (Berkeley), Harvest (Colorado), Tapestry/CAN (Berkeley), RON (MIT)
None address wide-area network performance or failure issues for long-lived composed sessions

6 Solution: Requirements
– Failure detection / liveness tracking: server and network failures
– Performance information collection: load, network characteristics
– Service location
Global information is required; a hop-by-hop approach will not work

7 Design challenges
– Scalability vs. global information: information about all service instances, and the network paths in between, should be known
– Quick failure detection and recovery: Internet dynamics cause intermittent congestion

8 Is “quick” failure detection possible?
What is a “failure” on an Internet path?
– Outage periods occur with varying durations
Study outage periods using traces (gap detection is sketched below):
– 12 pairs of hosts
– Periodic UDP heartbeat, every 300 ms
– Study the “gaps” between receive times
Main results:
– A short outage (1.2-1.8 sec) often implies a long outage (> 30 sec); sometimes this is true over 50% of the time
– False positives are rare: O(once an hour) at most
– So it is okay to react to short outage periods by switching the service-level path
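The heartbeat-and-gap methodology above can be illustrated with a minimal monitoring loop. This is a sketch only, assuming the 300 ms heartbeat interval and a 1.8 s detection threshold from the trace study; the port number and socket details are arbitrary, not taken from the prototype.

```python
import socket
import time

HEARTBEAT_INTERVAL = 0.3   # 300 ms, as in the trace study
FAILURE_TIMEOUT = 1.8      # declare a failure if no heartbeat for 1.8 s

def monitor_peer(listen_port=12345):
    """Receive UDP heartbeats and return the gap length once it exceeds
    FAILURE_TIMEOUT; the caller can then switch the service-level path."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", listen_port))
    sock.settimeout(HEARTBEAT_INTERVAL)
    last_seen = time.time()
    while True:
        try:
            sock.recvfrom(1024)        # heartbeat payload itself is irrelevant
            last_seen = time.time()
        except socket.timeout:
            pass                       # no heartbeat in this interval
        gap = time.time() - last_seen
        if gap > FAILURE_TIMEOUT:
            return gap                 # treat as an outage; react by re-routing
```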

9 Towards an Architecture
Service execution platforms:
– For providers to deploy services
– First-party or third-party service platforms
Overlay network of such execution platforms:
– Collect performance information
– Exploit redundancy in Internet paths

10 Architecture
[Figure: layered view of the architecture: the Internet as the hardware platform; service clusters on top of it; peering relations between clusters forming the overlay network (logical platform); and composed services between a source and a destination at the application plane.]
– Service cluster: a compute cluster capable of running services
– Peering: clusters exchange performance information
Overlay size: how many nodes? For comparison, Akamai runs O(10,000) nodes
Cluster: process/machine failures are handled within the cluster

11 Key Design Points
Overlay size:
– Could grow much more slowly than the number of services or clients
– How many nodes? For comparison, Akamai runs O(10,000) cache servers for Internet-wide operation
The overlay network is virtual-circuit based:
– “Switching state” at each node, e.g., the source/destination of the RTP stream at a transcoder (see the sketch below)
– Failure information need not propagate for recovery
The problem of service location is separated from that of performance and liveness
Cluster: process/machine failures are handled within the cluster
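As an illustration of the per-node “switching state”, one could keep a small per-session table at each cluster-manager, roughly as below. The field names and helper functions here are hypothetical, not the prototype's actual data structures.

```python
from dataclasses import dataclass

@dataclass
class CircuitEntry:
    upstream: str     # previous cluster on the service-level path
    downstream: str   # next cluster on the service-level path
    service: str      # local service instance to apply, if any (e.g. "transcoder")

# Switching state at one cluster-manager: session id -> circuit entry.
switching_table = {
    "rtp-session-42": CircuitEntry("cluster-A", "cluster-C", "transcoder"),
}

def apply_service(service, packet):
    return packet                         # placeholder for invoking the local service

def send_to(cluster, session_id, packet):
    print(f"-> {cluster}: {session_id}")  # placeholder for the overlay send

def forward(session_id, packet):
    """Forward a data packet along the virtual circuit.  On a failure, only
    this local state needs to be re-installed on an alternate path, which is
    why failure information does not have to propagate network-wide."""
    entry = switching_table[session_id]
    send_to(entry.downstream, session_id, apply_service(entry.service, packet))
```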

12 Software Architecture
[Figure: functionalities at the cluster-manager, arranged in three layers (bottom to top): the Peer-Peer Layer, the Link-State Layer, and the Service-Composition Layer. The functionalities include liveness detection, performance measurement, at-least-once UDP, link-state propagation, finding overlay entry/exit, location of service replicas, and service-level path creation, maintenance, and recovery.]

13 Layers of Functionality
Why link-state?
– Need full graph information
– Also allows quick propagation of failure information
– But what about link-state flood overheads?
Service-Composition layer:
– Algorithm for service composition: a modified version of Dijkstra's (sketched below)
– What are the computational overheads?
– Signaling for path creation and recovery runs from downstream to upstream
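The slides only say that the composition algorithm is “a modified version of Dijkstra's”, with O(k*E*log(N)) cost quoted on slide 27. One common way to obtain exactly that bound, shown here purely as an assumed sketch rather than the prototype's algorithm, is to run Dijkstra over k+1 layered copies of the overlay graph, crossing into the next layer whenever the next required service is applied.

```python
import heapq

def compose_path(graph, hosts, src, dst, services):
    """Cheapest service-level path from src to dst that applies the given
    services in order.

    graph:    {node: {neighbor: edge_cost}}   -- overlay graph
    hosts:    {service_name: set of nodes running an instance of it}
    services: ordered list of the k services to compose

    Dijkstra runs over (node, i) states, where i services have been applied
    so far, giving O(k * E * log N) time as quoted on slide 27.
    """
    k = len(services)
    start, goal = (src, 0), (dst, k)
    dist, prev = {start: 0.0}, {}
    heap = [(0.0, start)]
    while heap:
        d, state = heapq.heappop(heap)
        if state == goal:
            break
        if d > dist.get(state, float("inf")):
            continue
        node, i = state
        # Move along an overlay edge, keeping the same number of services applied.
        moves = [((nbr, i), d + cost) for nbr, cost in graph.get(node, {}).items()]
        # Or apply the next service here, if this node hosts an instance of it.
        if i < k and node in hosts.get(services[i], set()):
            moves.append(((node, i + 1), d))
        for nxt, nd in moves:
            if nd < dist.get(nxt, float("inf")):
                dist[nxt], prev[nxt] = nd, state
                heapq.heappush(heap, (nd, nxt))
    # Walk back from the goal; nodes repeat where a service is applied in place.
    path, state = [], goal
    while state in prev:
        path.append(state[0])
        state = prev[state]
    path.append(src)
    return list(reversed(path)), dist.get(goal, float("inf"))
```

For example, `compose_path({'A': {'B': 1}, 'B': {'C': 1}, 'C': {}}, {'transcoder': {'B'}}, 'A', 'C', ['transcoder'])` returns the path A-B-B-C (the repeated B marks where the transcoder is applied) with cost 2 under this sketch.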

14 Evaluation
Questions:
– What is the effect of the recovery mechanism on the application?
– What is the scaling bottleneck?
Overheads to measure: signaling messages during path recovery, link-state floods, graph computations
Testbed: an emulation platform on the Millennium cluster of workstations

15 Evaluation: Emulation Testbed
Idea: use the real implementation, but emulate the wide-area network behavior (NistNET)
Opportunity: the Millennium cluster
[Figure: nodes 1-4, each running the application and library, connected through an emulator that applies a per-node-pair rule (e.g., a rule for 1→2, 1→3, 3→4, 4→3) to the traffic between them.]
The per-pair rule idea is illustrated below.
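A rough illustration of per-pair emulation rules, with made-up delay and loss values; this is not NistNET's actual interface, just a sketch of the kind of rule the emulator applies to each node pair.

```python
import random

# Hypothetical per-node-pair rules: one-way delay (ms) and loss probability.
rules = {
    (1, 2): {"delay_ms": 40, "loss": 0.00},
    (1, 3): {"delay_ms": 80, "loss": 0.01},
    (3, 4): {"delay_ms": 25, "loss": 0.00},
    (4, 3): {"delay_ms": 25, "loss": 0.00},
}

def emulate(src, dst, send_time_ms):
    """Return the emulated delivery time of a packet, or None if it is dropped."""
    rule = rules.get((src, dst))
    if rule is None or random.random() < rule["loss"]:
        return None
    return send_time_ms + rule["delay_ms"]
```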

16 Evaluation: Recovery of Application Session
Text-to-speech application, with two possible places of failure along the path.
[Figure: service-level path from the text source, across leg-1 to the text-to-audio service, and across leg-2 to the end client. The path carries a request-response protocol and data (text, or RTP audio), along with keep-alive soft-state refreshes; application soft-state allows restart on failure.]
The keep-alive soft-state idea is sketched below.
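A minimal sketch of soft-state kept alive by periodic refreshes; the TTL value and the structure are assumptions for illustration, not the prototype's protocol.

```python
import time

SOFT_STATE_TTL = 2.0   # assumed: discard state if no keep-alive arrives for 2 s

class SoftStateStore:
    """Per-session application state that lives only while keep-alives arrive;
    after a failure, the restarted instance rebuilds it from the next refresh."""
    def __init__(self):
        self.sessions = {}   # session id -> (state, time of last refresh)

    def refresh(self, session_id, state):
        self.sessions[session_id] = (state, time.time())

    def expire_stale(self):
        now = time.time()
        stale = [s for s, (_, t) in self.sessions.items() if now - t > SOFT_STATE_TTL]
        for s in stale:
            del self.sessions[s]
```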

17 Evaluation: Recovery of Application Session
Setup:
– 20-node overlay network: generate a 6,510-node physical network using GT-ITM, then choose 20 nodes at random
– Latency variation: base value of one-way latency from the edge weights, with variation in accordance with the observation that RTT spikes are isolated [Acharya'96] (a sketch of this model follows)
– Failures: a deterministic failure of 10 sec during the session
Application metric: the gap between arrivals of successive audio packets at the client
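The latency model could be sketched as follows; the spike probability and magnitude are illustrative assumptions, not the values used in the experiments.

```python
import random

def one_way_latency(base_ms, spike_prob=0.01, spike_factor=4.0):
    """Sample a one-way latency: usually the base edge weight, occasionally an
    isolated spike, reflecting the observation that RTT spikes are isolated
    [Acharya'96].  Spikes are drawn independently for each sample."""
    return base_ms * spike_factor if random.random() < spike_prob else base_ms
```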

18 Recovery of application: Results

Setup                              Gap seen at application
Failure of leg-2; with recovery    2,963 ms
Failure of leg-2; no recovery      10,000 ms

19 Recovery of Application Session: CDF of gaps > 100 ms
[Figure: CDF of inter-packet gaps larger than 100 ms, with three annotated cases: recovery time of 822 ms after a leg-1 failure (quicker than leg-2 due to the buffer at the text-to-audio service), 2,963 ms after a leg-2 failure with recovery, and 10,000 ms without recovery. The jump at 350-400 ms is due to synchronous text-to-audio processing, an implementation artefact.]

20 Discussion
Recovery after a failure of leg-2:
– Breakup: 2,963 ms = 1,800 + O(700) + O(450) (a quick check of this sum follows)
– 1,800 ms: timeout to conclude the failure
– 700 ms: signaling to set up the alternate path
– 450 ms: recovery of application soft-state (re-processing the current sentence)
– Without the recovery algorithm, the gap lasts as long as the failure itself
O(3 sec) recovery:
– Can be completely masked with buffering
– For interactive apps, this is still much better than no recovery
Why is quick recovery possible?
– Failure information does not have to propagate across the network
– The overlay network is virtual-circuit based
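For completeness, the components quoted above sum to roughly the observed gap:

```python
timeout_ms   = 1800   # timeout to conclude the failure (the 1.8 s detection threshold)
signaling_ms = 700    # approximate signaling to set up the alternate path
softstate_ms = 450    # approximate recovery of application soft-state
print(timeout_ms + signaling_ms + softstate_ms)   # 2950 ms, close to the 2,963 ms observed
```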

21 Evaluation: Scaling
Scaling bottleneck: the simultaneous recovery of all client sessions on a failed overlay link
Setup:
– 20-node overlay network
– 5,000 service-level paths
– Latency variation: same as earlier
– Deterministic failure of 12 different links (12 data points on the graph)
Metric: average time-to-recovery of all failed paths

22 Average Time-to-Recovery vs. Instantaneous Load
[Figure: average time-to-recovery plotted against instantaneous load. At a load of 1,480 paths on the failed link, the average path recovery time is 614 ms. The variance is high; why?]

23 CDF of recovery times of all failed paths
[Figure: CDF of recovery times across all failed paths. The flat regions are due to UDP retransmits: the emulator was losing packets, as it is limited to 20,000 pkts/sec. We are working on removing this bottleneck.]

24 Percentage of paths above a threshold recovery time
[Figure: fraction of failed paths whose recovery time exceeds a given threshold.]

25 Scaling: Discussion
We can recover at least 1,500 paths without hitting bottlenecks
– How many client sessions per cluster-manager does this allow? Compute using the number of nodes and edges in the graph
– It translates to about 700 simultaneous client sessions per cluster-manager
– In comparison, our text-to-speech implementation can support O(15) clients per machine
– So only minimal additional provisioning is needed for the cluster-manager

26 Time-to-recovery, with varying outage periods
Run: 5,000 paths over 15 minutes, giving 24,181 path recoveries
Outages observed: 87 longer than 1.8 sec, 67 longer than 5 sec, 34 longer than 10 sec, 23 longer than 25 sec
85% of paths recovered within 1.5 sec

27 Other Scaling Bottlenecks?
Link-state floods:
– Two floods for each failure
– For a 1,000-node graph, estimate 10,000 edges
– Failures (>1.8 sec outages): O(once an hour) in the worst case
– That works out to only about 6 floods/second in the entire network (reproduced below)
Graph computation:
– O(k*E*log(N)) computation time, where k is the number of services composed
– For the 6,510-node network, this takes 50 ms
– A sizeable overhead, but path caching helps
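The flood-rate estimate can be reproduced from the slide's numbers, assuming the once-an-hour worst case applies per overlay link:

```python
edges = 10_000          # estimated edges in a 1,000-node overlay graph
failures_per_hour = 1   # worst-case outage rate (>1.8 s), read as per link
floods_per_failure = 2  # two link-state floods for each failure, as stated above
floods_per_sec = edges * failures_per_hour * floods_per_failure / 3600.0
print(round(floods_per_sec, 1))   # 5.6, i.e. "about 6 floods/second in the entire network"
```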

28 Summary
Service composition enables flexible service creation; we address its performance, availability, and scalability
Initial analysis: for failure detection, it is meaningful to time out in O(1.2-1.8 sec)
Design: an overlay network of service clusters
Evaluation, results so far:
– Good recovery time for real-time applications: O(3 sec)
– Good scalability: minimal additional provisioning for cluster-managers
Ongoing work:
– Overlay topology issues: how many nodes, and which peering relations
– Stability issues
Feedback, questions?

29 References
[OPES'00] A. Beck et al., "Example Services for Network Edge Proxies", Internet Draft, draft-beck-opes-esfnep-01.txt, Nov 2000
[Labovitz'99] C. Labovitz, A. Ahuja, and F. Jahanian, "Experimental Study of Internet Stability and Wide-Area Network Failures", Proc. FTCS'99
[Labovitz'00] C. Labovitz, A. Ahuja, A. Bose, and F. Jahanian, "Delayed Internet Routing Convergence", Proc. SIGCOMM'00
[Acharya'96] A. Acharya and J. Saltz, "A Study of Internet Round-Trip Delay", Technical Report CS-TR-3736, U. of Maryland
[Yajnik'99] M. Yajnik, S. Moon, J. Kurose, and D. Towsley, "Measurement and Modeling of the Temporal Dependence in Packet Loss", Proc. INFOCOM'99
[Balakrishnan'97] H. Balakrishnan, S. Seshan, M. Stemm, and R. H. Katz, "Analyzing Stability in Wide-Area Network Performance", Proc. SIGMETRICS'97

