Presentation transcript: "Hotspot Detection in a Service Oriented Architecture"

1 Hotspot Detection in a Service Oriented Architecture
Pranay Anchuri, anchupa@cs.rpi.edu, http://cs.rpi.edu/~anchupa, Rensselaer Polytechnic Institute, Troy, NY
Roshan Sumbaly, roshan@coursera.org, Coursera, Mountain View, CA
Sam Shah, samshah@linkedin.com, LinkedIn, Mountain View, CA

2 Introduction

3-4 ● Largest professional network. ● 300M members from 200 countries. ● 2 new members per second.

5 Service Oriented Architecture

6-7 What is a Hotspot?
● Hotspot: a service responsible for the suboptimal performance of a user-facing functionality.
● Performance measures: latency, cost to serve, error rate.

8 Who uses hotspot detection?
● Engineering teams: minimize latency for the user; increase the throughput of the servers.
● Operations teams: reduce the cost of serving user requests.

9 Goal

10 Data: Service Call Graphs
● Service call metrics are logged into a central system.
● The call graph structure is reconstructed from a random trace id.
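A rough sketch of what this reconstruction step could look like (not the actual LinkedIn pipeline): group the logged call records by their trace id and rebuild each call tree. The field names trace_id, call_id, parent_id, service, start, end are illustrative assumptions, not the real logging schema.

```python
from collections import defaultdict

def build_call_graphs(records):
    """Group logged service calls by trace id and rebuild each call tree.

    Each record is assumed to be a dict with keys 'trace_id', 'call_id',
    'parent_id' (None for the front-end call), 'service', 'start', 'end'.
    Returns {trace_id: {call_id: record extended with a 'children' list}}.
    """
    by_trace = defaultdict(dict)
    for rec in records:
        by_trace[rec['trace_id']][rec['call_id']] = dict(rec, children=[])
    graphs = {}
    for trace_id, calls in by_trace.items():
        for cid, call in calls.items():
            pid = call['parent_id']
            if pid is not None and pid in calls:
                calls[pid]['children'].append(cid)
        graphs[trace_id] = calls
    return graphs
```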

11-14 Example of Service Call Graph (figure, shown incrementally). Nodes: Read profile, Content Service, Context Service, Content Service, Entitlements, Visibility; edge labels: 3, 7, 12, 10, 11.

16 Challenges in mining hotspots

17 Structure of call graphs
● The structure of call graphs changes rapidly across requests, depending on the member's attributes, A/B testing, and changes to the code base.
● Over 90% of the structures are unique for the most requested services.

18 Asynchronous service calls
● Calls A → B and A → C can be:
  ● Serial: C is called after B returns to A.
  ● Parallel: B and C are called at the same time or within a brief time span.
● Parallel service calls are particularly difficult to handle.
● The degree of parallelism is ~20 for some services.
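One way to make the serial/parallel distinction concrete, as a hedged sketch: compare the start time of each sibling call with the end time of the previous one, with a small tolerance standing in for "a brief time span" (the eps threshold is our assumption, not a value from the paper).

```python
def classify_siblings(calls, eps=0.005):
    """Label consecutive sibling calls of one parent as 'parallel' or 'serial'.

    `calls` is a list of (start, end) intervals issued by the same parent,
    sorted by start time; `eps` is an assumed tolerance for calls issued
    "at the same time or within a brief time span".
    """
    labels = []
    for (s1, e1), (s2, e2) in zip(calls, calls[1:]):
        # parallel if the next call starts before (or nearly before) the
        # previous one returns, otherwise serial
        labels.append('parallel' if s2 < e1 + eps else 'serial')
    return labels

# classify_siblings([(0.0, 3.0), (0.1, 2.5), (4.0, 6.0)]) -> ['parallel', 'serial']
```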

19 Related Work
● Hu et al. [SIGCOMM 04, INFOCOM 05]: tools to detect bottlenecks along network paths.
● Mann et al. [USENIX 11]: models to estimate latency as a function of the latencies of individual RPCs.

20 Why existing methods don't work
● The metric cannot be controlled, as it is in bottleneck detection algorithms.
● Millions of small networks have to be analyzed.
● Parallel service calls.

21 Our approach

22-24 Optimize and summarize approach:
● given call graphs,
● find the hotspots in each call graph,
● rank the hotspots.

25 What are the top-k hotspots in a call graph?
● Hotspots in a specific call graph, irrespective of other call graphs for the same type of request.

26 Key Idea: which k services, had they already been optimized, would have led to the maximum reduction in the latency of the request? (Specific to a particular call graph.)
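In notation (our restatement, not taken verbatim from the paper): write T(G) for the observed latency of a call graph G and T_θ(G | S) for its latency had every service in a set S been made θ times faster. The top-k hotspots of G are then

$$\operatorname*{arg\,max}_{\substack{S \subseteq \mathcal{V}(G) \\ |S| = k}} \bigl( T(G) - T_{\theta}(G \mid S) \bigr),$$

where $\mathcal{V}(G)$ denotes the set of services appearing in G.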

27-28 Quantifying the impact of a service
● What if a service were optimized by a factor θ? (reason after the fact)
● Its internal computations become θ times faster.
● There is no effect on the overall latency if its parent is still waiting on another service call to return.
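A small sketch of this after-the-fact reasoning, under the assumption that a call's internal computation is its total duration minus the time covered by its own child calls, and that only this self time shrinks by the factor θ (the function and variable names are ours):

```python
def optimized_end(start, end, child_intervals, theta):
    """New end time of a call if its internal computation were theta times faster.

    Assumes self (internal) time = total duration minus the time covered by
    the call's own child intervals, and that only the self time shrinks.
    """
    total = end - start
    # merge child intervals so overlapping (parallel) children are not double-counted
    merged = []
    for s, e in sorted(child_intervals):
        if merged and s <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], e)
        else:
            merged.append([s, e])
    child_time = sum(e - s for s, e in merged)
    self_time = max(total - child_time, 0.0)
    return end - self_time * (1 - 1.0 / theta)

# e.g. a call spanning [6, 9] with one child at [7, 8], sped up 2x (theta=2):
# self time = 2, saving = 1, new end = 8
```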

29-33 Example (figure, shown incrementally): a call tree whose services execute over the intervals [0,11], [0,3], [1,2], [1.3,1.6], [2.1,2.5], [4,11], [6,9], [7,8]; the final slides show the effect of making one of the services 2x faster.

34 Local effect of optimization

35-37 Negative example (figure, shown incrementally): the same call tree, with intervals [0,11], [0,3], [1,2], [1.3,1.6], [2.1,2.5], [4,11], [6,9], [7,8].

38 Effect propagation (A → B → C)
● Optimizing C reduces the run time of C.
● C returns to B earlier.
● B might return earlier.
● A might return earlier…

39 Propagation Assumption: a service propagates the effects to its parent only if doing so does not change the order of service calls made by the parent.
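A minimal, full-or-nothing sketch of this rule (our simplification, not the paper's exact check): a saving climbs one level only if, even after returning earlier, the call would still return after every sibling call its parent made, so the order of returns the parent observes is unchanged.

```python
def frontend_saving(call, saving, parent, end_time, siblings):
    """Return how much of `saving` reaches the front-end service.

    Full-or-nothing simplification: the saving moves up one level only if
    `call`, even after returning `saving` earlier, still returns after every
    sibling call made by its parent; otherwise nothing propagates.
    `parent`, `end_time`, `siblings` are plain dicts keyed by call id.
    """
    current = call
    while parent.get(current) is not None:
        latest_sibling = max((end_time[s] for s in siblings.get(current, [])),
                             default=float('-inf'))
        if end_time[current] - saving < latest_sibling:
            return 0.0  # returning earlier would reorder the parent's calls
        current = parent[current]
    return saving
```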

40-42 Example (figures): the call graph before and after optimization.

43 Under the propagation assumption

44 Relaxation
● A variation of the propagation assumption that allows a service to propagate fractional effects to its parent.
● Leads to a greedy algorithm.

45 Greedy algorithm to compute the top-k hotspots
● Given an optimization factor θ:
  ● repeatedly select the service that has the maximum impact on the front-end service,
  ● update the times after each selection,
  ● stop after k iterations.
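A compact sketch of this greedy loop. The graph API (services(), apply_speedup()) and the impact_on_frontend helper are placeholders standing in for the propagation logic sketched above, not an implementation from the paper.

```python
def top_k_hotspots(graph, theta, k, impact_on_frontend):
    """Greedily pick the k services whose theta-fold speedup helps the front end most.

    `impact_on_frontend(graph, service, theta)` is assumed to return the
    reduction in front-end latency if `service` were theta times faster;
    `graph.apply_speedup(service, theta)` is assumed to return the graph
    with its timings updated after that speedup.
    """
    hotspots = []
    for _ in range(k):
        candidates = [s for s in graph.services() if s not in hotspots]
        if not candidates:
            break
        best = max(candidates, key=lambda s: impact_on_frontend(graph, s, theta))
        hotspots.append(best)
        graph = graph.apply_speedup(best, theta)  # update times after each selection
    return hotspots
```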

46 Ranking hotspots

47 Rest of the paper
● A similar approach applied to the cost-of-request metric.
● A generalized framework for optimizing arbitrary metrics.
● Other ranking schemes.

48 Results

49 Dataset
Request type | Avg # of call graphs per day* | Avg # of service calls per request | Avg # of subcalls per service | Max # of parallel subcalls
Home    | 10.2 M | 16.90 | 1.88 | 9.02
Mailbox | 3.33 M | 23.31 | 1.9  | 8.88
Profile | 3.14 M | 17.31 | 1.86 | 11.04
Feed    | 1.75 M | 16.29 | 1.87 | 8.97
* Scaled down by a constant factor

50 vs Baseline algorithm

51 User of the system

52 Impact of improvement factor

53 Consistency over a time period

54 Conclusion

55 Conclusions
● Defined hotspots in service oriented architectures.
● A framework to mine hotspots with respect to various performance metrics.
● Experiments on real-world, large-scale datasets.

56 Thanks. Questions?

