Presentation is loading. Please wait.

Presentation is loading. Please wait.

Query Optimization over Web Services Utkarsh Srivastava Jennifer Widom Jennifer Widom Kamesh Munagala Rajeev Motwani.

Similar presentations


Presentation on theme: "Query Optimization over Web Services Utkarsh Srivastava Jennifer Widom Jennifer Widom Kamesh Munagala Rajeev Motwani."— Presentation transcript:

1 Query Optimization over Web Services Utkarsh Srivastava Jennifer Widom Jennifer Widom Kamesh Munagala Rajeev Motwani

2 2 Performance Numbers Relative Contribution to Research 0 20 40 60 80 100 012345 Time in Program (years) Percent Contribution Student Advisor This Work

3 3 Future Directions (sample) Web services with monetary cost Web services with unstable response times (QoS guarantees?) Multiple web services for same data Caching web-service query results More expressive queries, also workflows Web service profiling and statistics-tracking

4 4 New Query Optimization Problem First Steps in Big Problem Our contribution

5 5 Web Services Standardized way of sharing data and functionality Data, Functionality Description and discovery WSDL,UDDI Users/ Clients SOAP Communication Web Services

6 6 Reuters Example Web Services WS 1 Stock symbol NASDAQ Company info WS 2 Stock symbol Stock activity

7 7 Querying Across Web Services WS 1 Stock symbol NASDAQ Company info WS 2 Stock symbol Stock activity Get info about all companies with high-activity stock User/ Client Query Results Reuters Easy Transparent Efficient Etc.

8 8 Same Basic Goal as Traditional DBMS Data Database Management System Query Results User/ Client Declarative Interface Easy Transparent Efficient Etc.

9 9 Web Service Management System Web Service Management System Query User/ Client Results WS 1 NASDAQ WS 2 Reuters Easy Transparent Efficient Etc.

10 10 WSMS Architecture Client WS 1 WS 2 WS n Query + input data Results Declarative Interface WS Invocations Metadata Component Web service registration Schema mapper Query Processing Component Plan execution Response- time profiler Statistics tracker Profiling and Statistics Component WSMS Plan selection

11 11 Running Example Credit card company wants to send offers to people with: a) credit rating > 600, and b) payment history = “good” on prior credit card Company has at its disposal: L : List of potential recipients (identified by SSN) WS 1 : SSN  credit rating WS 2 : SSN  cc number(s) WS 3 : cc number  payment history

12 12 Plan 1 Client WS 1 WS 2 WS 3 WSMS L(SSN) SSN  cr SSN  ccn ccn  ph Filter on cr, keep SSN SSN SSN,cr SSN,ccn SSN,ccn,ph Filter on ph, keep SSN Note: Pipelined processing SSNcr 1500 2700 SSNccn 2123 2456 ccnph 123bad 456good SSN 1 2 2 Query Plan

13 13 Simple Representation of Plan 1 L Results WS 1 WS 3 WS 2 SSN  crSSN  ccn ccn  ph

14 14 Plan 2 Client WS 1 WS 2 WS 3 WSMS L(SSN) SSN  cr SSN  ccn ccn  ph Filter on cr, keep SSN SSN SSN,cr SSN,ccn SSN,ccn,ph Filter on ph, keep SSN SSNcr 1500 2700 SSNccn 2123 2456 ccnph 123bad 456good SSN 1 2 2 Join SSN

15 15 Simple Representation of Plan 2 LResults WS 1 WS 2 WS 3 SSN  cr SSN  ccnccn  ph

16 16 Quiz LResults WS 1 WS 2 WS 3 L Results WS 1 WS 3 WS 2 Which plan is better? Plan 2 Plan 1 Cost metric: steady-state throughput Assume join is “free” Plan 1 is never worse

17 17 Query Optimization Primer Possible query plans: P 1, …, P n Data/access statistics: S Execution cost metric: cost(P i, S) GOAL: Find least-cost plan

18 18 Query Optimization Primer Possible query plans: P 1, …, P n Data/access statistics: S Execution cost metric: cost(P i, S) GOAL: Find least-cost plan

19 19 Queries and Plans “Select-Project-Join” queries over input data L and set of web services WS 1, …, WS n Precedence constraints Output of WS i may be needed as input for WS j Ex: WS 2 : SSN  ccn and WS 3 : ccn  ph Precedence DAG defines space of query plans

20 20 Query Optimization Primer Possible query plans: P 1, …, P n Data/access statistics: S Execution cost metric: cost(P i, S) GOAL: Find least-cost plan

21 21 Statistics 1)Web service response times 2)Web service selectivities Our contribution New Query Optimization Problem

22 22 Statistics: Response Times r i : per-tuple response time of WS i from client WS 1 SSN  cr Client SSN cr Assume independent response times within query plans r1r1 r i ≈ 1/throughput, can be reduced by batching, parallel calls batching (see paper) Our contribution New Query Optimization Problem

23 23 Statistics: Selectivities s i : selectivity of WS i Average # output tuples per input tuple to WS i including post-filtering in query plan WS 1 : SSN  cr, filter cr > 600 If 90% of SSNs have cr > 600 then s 1 = 0.9 WS 2 : SSN  ccn If on average each SSN has 2 credit cards then s 2 = 2.0 Our contribution Assume independent selectivities within query plans New Query Optimization Problem

24 24 Query Optimization Primer Possible query plans: P 1, …, P n Data/access statistics: S Execution cost metric: cost(P i, S) GOAL: Find least-cost plan

25 25 Bottleneck Cost Metric Our contribution New Query Optimization Problem

26 26 Bottleneck Cost Metric Conference Lunch Buffet Dish 1Dish 2Dish 3Dish 4 Average per-tuple processing time = response time of slowest (bottleneck) stage in pipeline Note: selectivities=1 in this example

27 27 Cost Equation for Plan P R i (P): Predecessors of WS i in plan P Fraction of input tuples seen by WS i = WS i response time per input tuple = (assumes WSMS processing is not the bottleneck) Bottleneck cost metric: Π j ∈ R i (P) s j (Π j ∈ R i (P) s j ) r i cost(P) = max 1 ≤i≤n ( ( Π j ∈ R i (P) s j ) r i )

28 28 Contrast with Sum Cost Metric Dish 1Dish 2Dish 3Dish 4 Stream filter ordering Expensive predicate placement “Polite” Lunch Buffet cost(P) = ∑ 1 ≤i≤n ( ( Π j ∈ R i (P) s j ) r i )

29 29 Problem Statement Input: Web services WS 1, …, WS n Response times r 1, …, r n Selectivities s 1, …, s n Precedence constraints among web services Output: Web services arranged into a plan P P respects all precedence constraints cost(P) is minimized

30 30 No Precedence Constraints All selectivities ≤ 1 Theorem: Optimal to order linearly by r i (selectivities irrelevant) General case (optimal): … join at WSMS “selective” web services ordered by response-time “proliferative” web services Results

31 31 With Precedence Constraints cost(P) = max 1 ≤i≤n ( ( Π j ∈ R i (P) s j ) r i )

32 32 With Precedence Constraints Sum cost metric Hard to even obtain a factor O(n  ) of optimal 0 20 40 60 80 100 012345 Time in Program (years) Percent Contribution Student Advisor cost(P) = ∑ 1 ≤i≤n ( ( Π j ∈ R i (P) s j ) r i )

33 33 With Precedence Constraints Bottleneck (max) cost metric Surprisingly, optimal solution in polynomial time O(n 5 ) algorithm in paper – Add one WS at a time to the plan – WS chosen by solving a linear program 0 20 40 60 80 100 012345 Time in Program (years) Percent Contribution Student Advisor cost(P) = max 1 ≤i≤n ( ( Π j ∈ R i (P) s j ) r i )

34 34 Example Revisited LResults WS 1 WS 2 WS 3 L Results WS 1 WS 3 WS 2 Plan 2 Plan 1 SSN  cr SSN  ccn ccn  ph WS 2 WS 1 WS 3 WS 1 WS 2 WS 3 Selective Proliferative WS 2 WS 3 Precedence constraint max 1 ≤i≤n ( ( Π j ∈ R i (P) s j ) r i )

35 35 Implementation Built prototype WSMS query processor Optimizer and execution engine Assumes schema issues resolved, statistics provided Written in Java and uses Apache Axis (open-source SOAP implementation) Experiments (see paper) validate analytical results

36 36 Isn’t Problem the Same as … ? Web Service composition Targeted for workflow-oriented applications No provably optimal strategies Parallel/distributed query optimization Freedom to place query operators Much larger space of execution plans Data integration, mediators For general sources of data Optimization of total resource consumption

37 37 Future Directions (sample) Web services with monetary cost Web services with unstable response times (QoS guarantees?) Multiple web services for same data Caching web-service query results More expressive queries, also workflows Web service profiling and statistics-tracking

38 38 Conclusion Our contribution New Query Optimization Problem

39 39 Conclusion Our contribution New Query Optimization Problem

40 40 Questions? 0 20 40 60 80 100 012345 Time in Program (years) Percent Contribution Student Advisor


Download ppt "Query Optimization over Web Services Utkarsh Srivastava Jennifer Widom Jennifer Widom Kamesh Munagala Rajeev Motwani."

Similar presentations


Ads by Google