Network-Aware Query Processing for Stream-based Applications. Yanif Ahmad, Ugur Cetintemel - Brown University, VLDB 2004.

1 Network-Aware Query Processing for Stream-based Applications. Yanif Ahmad, Ugur Cetintemel - Brown University, VLDB 2004

2 One-line Comment
This paper addresses the operator placement problem in distributed query processing by using network latency information.

3 Contents
- Motivation
- Problem
- Solution Approach
- Central Version of Algorithm
  - Edge
  - Edge+
  - In-Network
  - Latency-Constrained
- Distributed Version of Algorithm
- Experiment
- Critique

4 Motivation
- Small-scale query processing systems are not scalable
  - Many data streams and query requests
- Widely distributed query processing

5 Problem
- Operator placement problem
  - Operators in query processing trees should be dispersed into the network
[Figure: a processing tree (query plan) with operators O00 to O26 mapped onto the nodes of an IP network; legend: operator, node, application node]

6 Problem: formalized version
- Operator placement problem
  - For efficient operator placement
  - Cost: bandwidth
- Notation
  - O: operators
  - A: their connected inputs and outputs
  - V: nodes
  - E: their links
  - c(): link cost (bandwidth)
  - c(a) = 0 for a = (m, n) if [condition not preserved in the transcript]
- Source (operator) locations are determined
[Figure: an arc a from m to n labeled with its cost c(a)]
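The objective on this slide can be written out as follows. This is only a sketch under assumptions: the mapping symbol M (operators to nodes) and the bandwidth symbol bw(a) do not appear in the transcript, and the zero-cost condition is inferred from the later "common location" slide, which states that co-locating an operator with its children gives zero overlay cost.

```latex
c(a) =
\begin{cases}
0 & \text{if } M(m) = M(n) \\
\mathrm{bw}(a) & \text{otherwise}
\end{cases}
\quad \text{for } a = (m, n) \in A,
\qquad
\min_{M : O \to V} \; \sum_{a \in A} c(a)
```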

7 Solution Approach
- Network-aware operator placement algorithms
  - Edge: considers only the sources and the proxy location
  - Edge+: Edge with pair-wise server communication latencies
  - In-Network: sources, proxy, and a subset of all locations
  - Latency-bound algorithm

8 Contents Motivation Problem Solution Approach Central Version of Algorithm Distributed Version of Algorithm Experiment Critique

9 Algorithm Design Principle
- Naïve algorithm for operator placement
  - Calculate all combinations of possible mappings
  - => Too complex
- Greedy algorithm
  - Calculate only for the locations that are likely candidates
  - Place operators in post-order
  - When we place an operator at a location, we can move its children
[Figure: processing tree with operators O00 to O26 next to an IP network with locations S0 and S1; legend: operator, node, application node]
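The post-order greedy idea can be sketched as below. This is a minimal illustration, not the authors' algorithm: the tree encoding, the function names, and the `most_common_child_location` rule are all hypothetical stand-ins for the candidate-selection rules described on the following slides.

```python
from typing import Callable, Dict, List

Tree = Dict[str, List[str]]  # operator -> list of child operators (hypothetical encoding)


def post_order(children: Tree, root: str) -> List[str]:
    """Return operators so that every child appears before its parent."""
    order: List[str] = []

    def visit(op: str) -> None:
        for child in children.get(op, []):
            visit(child)
        order.append(op)

    visit(root)
    return order


def greedy_place(children: Tree, root: str, fixed: Dict[str, str],
                 choose: Callable[[str, List[str]], str]) -> Dict[str, str]:
    """Place operators bottom-up; `fixed` holds the pre-determined leaf locations."""
    placement = dict(fixed)
    for op in post_order(children, root):
        if op not in placement:
            child_locs = [placement[c] for c in children.get(op, [])]
            placement[op] = choose(op, child_locs)
    return placement


def most_common_child_location(op: str, child_locs: List[str]) -> str:
    """Placeholder rule (an assumption): pick the location most children use."""
    return max(child_locs, key=child_locs.count)


# Small made-up tree; leaf locations are illustrative only.
children = {"O00": ["O10", "O11"], "O10": ["O20", "O21"], "O11": ["O22", "O23"]}
fixed = {"O20": "S0", "O21": "S0", "O22": "S1", "O23": "S2"}
placement = greedy_place(children, "O00", fixed, most_common_child_location)
```

Each operator is visited only after all of its children have a location, which is what makes a single bottom-up pass sufficient in this sketch.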

10 Mapping Function
[Figure: processing tree with a root operator, operators O10 to O12, and operators O20 to O29; the mapping-function formula is not preserved in the transcript]

11 Edge
- Location candidates: sources, proxy
- Candidates with high likelihood
  - (1) One of the children's locations
  - (2) A common location
  - (3) The proxy's location
- Link cost [formula not preserved in the transcript]

12 Edge (1): One of the children's locations
- A location that maximizes the total tree cost between the operator and all of its children
[Figure: processing tree with leaf locations S0, S1, and S2; detail of O10 and its children O20, O21, and O22 with link costs 30, 50, and 20]
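One reading of this rule is sketched below, under a stated assumption: placing the operator at a child's location removes the overlay cost of every link to a child at that location, so the chosen location is the one with the largest summed tree cost over co-located children. The function name and the child locations in the example are hypothetical; only the costs 30, 50, and 20 come from the slide.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def best_child_location(child_links: List[Tuple[str, float]]) -> Tuple[str, Dict[str, float]]:
    """child_links: (location of child, tree cost of the link to that child).

    Returns the child location with the largest total co-located tree cost,
    plus the per-location totals for inspection.
    """
    saved: Dict[str, float] = defaultdict(float)
    for location, cost in child_links:
        saved[location] += cost
    best = max(saved, key=lambda loc: saved[loc])
    return best, dict(saved)


# Costs from the slide; the locations assigned to each child are made up.
links = [("S0", 30.0), ("S1", 50.0), ("S1", 20.0)]
best, saved = best_child_location(links)
```

With these made-up locations, S1 collects a total of 70 against 30 for S0, so the operator would be placed at S1.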

13 Edge (2): A common location
- Idea
  - Placing an operator and its children at a common location
  - -> zero overlay cost between the operator and its children
- Common location (cl)
  - A good place for all of its children
  - -> the intersection of each child's dl (the set of descendant leaf locations)
[Figure: processing tree with leaf locations S0, S0, S1, S0, S1, S1, S2, S0, S1, S1; dl(O11) = {S0, S1, S2}, cl(O00) = {S0, S1}]
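The dl and cl sets can be computed with a short recursion. The sketch below assumes a particular grouping of the ten leaves under O10, O11, and O12; that grouping is not recoverable from the transcript and was chosen only so that the result matches the values shown on the slide.

```python
from typing import Dict, List, Set

Tree = Dict[str, List[str]]


def dl(op: str, children: Tree, leaf_loc: Dict[str, str]) -> Set[str]:
    """Set of descendant leaf locations of `op` (a leaf contributes its own location)."""
    kids = children.get(op, [])
    if not kids:
        return {leaf_loc[op]}
    result: Set[str] = set()
    for kid in kids:
        result |= dl(kid, children, leaf_loc)
    return result


def cl(op: str, children: Tree, leaf_loc: Dict[str, str]) -> Set[str]:
    """Common locations of `op`: the intersection of each child's dl set."""
    kids = children.get(op, [])
    if not kids:
        return set()
    return set.intersection(*(dl(kid, children, leaf_loc) for kid in kids))


# Assumed tree shape; leaf locations follow the order listed on the slide.
children = {
    "O00": ["O10", "O11", "O12"],
    "O10": ["O20", "O21", "O22"],
    "O11": ["O23", "O24", "O25", "O26"],
    "O12": ["O27", "O28", "O29"],
}
leaf_loc = {
    "O20": "S0", "O21": "S0", "O22": "S1", "O23": "S0", "O24": "S1",
    "O25": "S1", "O26": "S2", "O27": "S0", "O28": "S1", "O29": "S1",
}
```

Under this assumed shape, `dl("O11", ...)` gives {S0, S1, S2} and `cl("O00", ...)` gives {S0, S1}, matching the slide's annotations.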

14 Edge (3): The proxy's location
- Idea
  - If tree costs are higher near the root
  - -> proxy location, r
[Figure: processing tree with leaf locations S0, S0, S1, S0, S1, S1, S2, S0, S1, S1]

15 Edge: Summary

16 Edge+
- Location candidates: sources, proxy
- Edge with network latency (d) between two locations
- Link cost and mapping function [formulas not preserved in the transcript]

17 In-Network Placement
- Location candidates: arbitrary locations (including sources and proxy)
- Overlay cost and mapping function are the same as in Edge+
- Problem: reducing the candidate location set

18 In-Network Placement
- Approach
  - Remove a location unless its distance to all current child placements is less than all pairwise distances between child placements
[Figure: operators O00, O10, O11, and O12 placed among network nodes N2, N4, N7, and N8, with latencies 40, 30, 20, 50, 60, and 30]
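A literal reading of this pruning rule is sketched below. Two assumptions are made: "less than all pairwise distances" is taken to mean "less than the smallest pairwise distance", and the rule is applied only to candidates other than the child placements themselves. The latency values in the example are made up, since the transcript does not say which figure edges carry which numbers.

```python
from itertools import combinations
from typing import Callable, List


def prune_candidates(candidates: List[str], child_locs: List[str],
                     dist: Callable[[str, str], float]) -> List[str]:
    """Keep a candidate only if it is closer to every child placement
    than any two child placements are to each other."""
    pairwise = [dist(a, b) for a, b in combinations(child_locs, 2)]
    if not pairwise:  # fewer than two child placements: nothing to compare against
        return list(candidates)
    bound = min(pairwise)
    return [v for v in candidates if all(dist(v, c) < bound for c in child_locs)]


# Hypothetical symmetric latency table (milliseconds).
latency = {
    frozenset(("N2", "N8")): 60,
    frozenset(("N4", "N2")): 30,
    frozenset(("N4", "N8")): 40,
    frozenset(("N7", "N2")): 50,
    frozenset(("N7", "N8")): 70,
}


def dist(a: str, b: str) -> float:
    return 0 if a == b else latency[frozenset((a, b))]


kept = prune_candidates(["N4", "N7"], ["N2", "N8"], dist)
```

Here N4 lies within 60 ms of both child placements and is kept, while N7 is 70 ms from N8 and is pruned.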

19 Latency-Constrained Placement
- Find the configuration satisfying the latency constraint
- Latency constraint [formula not preserved in the transcript]
  - P: a set of leaf-to-root paths
[Figure: example with operators O20, O21, and O22, locations S0, S1, N4, N5, and N7, link latencies 20, 50, and 30, and latency bound l = 75]
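A latency check over the leaf-to-root paths in P might look like the sketch below. It assumes the constraint is that the summed latency along every path stays within the bound l; the exact formula is not in the transcript, and the placements and latencies in the example are invented for illustration (only l = 75 comes from the slide).

```python
from typing import Callable, Dict, List


def path_latency(path: List[str], placement: Dict[str, str],
                 dist: Callable[[str, str], float]) -> float:
    """Sum of latencies between the locations of consecutive operators on a path."""
    return sum(dist(placement[a], placement[b]) for a, b in zip(path, path[1:]))


def satisfies_bound(paths: List[List[str]], placement: Dict[str, str],
                    dist: Callable[[str, str], float], bound: float) -> bool:
    """True if every leaf-to-root path stays within the latency bound."""
    return all(path_latency(p, placement, dist) <= bound for p in paths)


# Hypothetical placement and symmetric latency table (milliseconds).
placement = {"O21": "S1", "O22": "S0", "O20": "N4", "O": "N7"}
latency = {
    frozenset(("S1", "N4")): 20,
    frozenset(("S0", "N4")): 30,
    frozenset(("N4", "N7")): 50,
}


def dist(a: str, b: str) -> float:
    return 0 if a == b else latency[frozenset((a, b))]


paths = [["O21", "O20", "O"], ["O22", "O20", "O"]]
ok = satisfies_bound(paths, placement, dist, bound=75)
```

In this made-up configuration the first path costs 70 ms and the second 80 ms, so the placement violates a bound of 75 and a different configuration would have to be found.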

20 Contents Motivation Problem Solution Approach Central Version of Algorithm Distributed Version of Algorithm Experiment Critique

21 Distributed Query Placement
- Reason: the centralized approach is not scalable
  - Substantial network state
  - Algorithm complexity

22 Distributed Query Placement
- Partition a processing tree into subtrees (zones)
- Assign each zone to a coordinator node
[Figure: processing tree with operators O1 to O4, coordinators C1 to C4, and the application proxy]
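The zone idea can be illustrated with the sketch below. The slides do not say how zone boundaries are chosen, so this example simply assumes a given set of zone-root operators and assigns every operator to its nearest ancestor zone root; the tree, the zone roots, and the coordinator names are hypothetical.

```python
from typing import Dict, List, Set

Tree = Dict[str, List[str]]


def assign_zones(children: Tree, root: str, zone_roots: Set[str]) -> Dict[str, List[str]]:
    """Group operators into zones keyed by zone root.

    An operator belongs to the zone of its closest ancestor (or itself)
    that is listed in `zone_roots`; the tree root always starts a zone.
    """
    zones: Dict[str, List[str]] = {}

    def visit(op: str, current: str) -> None:
        if op in zone_roots:
            current = op
        zones.setdefault(current, []).append(op)
        for child in children.get(op, []):
            visit(child, current)

    visit(root, root)
    return zones


# Made-up tree and zone roots.
children = {"O1": ["O2", "O3"], "O2": ["O4", "O5"], "O3": ["O6"]}
zones = assign_zones(children, "O1", {"O1", "O2"})

# Each zone is then handed to one coordinator (names are illustrative).
coordinator_of = {"O1": "C1", "O2": "C2"}
work = {coordinator_of[zone_root]: ops for zone_root, ops in zones.items()}
```

Each coordinator then only needs network state for the operators in its own zone, which is the scalability argument made on the preceding slide.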

23 Distributed Query Placement
[Figure: coordinators C1 to C4 shown in the tree and in the overlay]

24 Experiment
- Experimental setup
  - Processing tree: binary tree, depth 3 to 5
  - Network topology: max pair-wise path delay 500 ms
  - Server and proxy location
    - Uniform: APD = ASD
    - Star: APD = 0.5*ASD
    - Cluster: APD = 2*ASD
  - APD: Average Proxy Distance; ASD: Average Server Distance
[Figure: server and proxy positions for the Uniform, Cluster, and Star configurations]

25 Experiment
- Latency constraints
  - 120 ms (0.9nd, tight delay) vs. 300 ms (2.2nd, loose delay)
- Direct comparison
  - Baseline case: all operators are located at the proxy
- Result
[Figure: result plots for bandwidth consumption and latency stretch]

26 Critique
- Pros
  - Operator placement problem
  - Focus on network-related cost (BW, latency), not processing cost
- Cons
  - High-complexity algorithm: is it practical to apply?
    - Heavy processing
    - Too much time taken to complete the placement
  - Latency information for many locations is needed
  - Sequential convergence in a bottom-up manner
    - => impossible to use with a complex query plan and topology
    - => a simpler algorithm is appropriate
  - Dynamic? Not resilient to dynamic topology changes
    - E.g., a node leaving or a latency change

