Igor EPIMAKHOV Abdelkader HAMEURLAIN Franck MORVAN

1 GeoLoc: Robust Resource Allocation Method for Query Optimization in Data Grid Systems
Igor EPIMAKHOV, Abdelkader HAMEURLAIN, Franck MORVAN
Baltic DB&IS'2012

2 Table of contents
Introduction; Existing methods classification; Contributions; Allocation Space; Allocation Algorithm; Performance Evaluation; Conclusion

3 Introduction
Data Grid: Heterogeneity; Dynamicity; Large Scale

4 Introduction
Query processing (diagram labels): Query execution; Parsing; Query rewrite; Resource allocation; Resource discovery

5 Introduction
Problem
Input: Set of query operations (dependent); Set of nodes; Distribution of relations; Dynamic and static characteristics of the Data Grid
Objective: Select an optimal subset of nodes to allocate resources for the query operations
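The problem inputs on this slide can be pictured as a small data model. The sketch below is only an illustration: the class names, field names and the `is_valid_allocation` helper are hypothetical and do not come from the presentation. It shows one way to represent dependent operations, nodes with static and dynamic characteristics, and the distribution of relations, plus a basic check that a candidate allocation covers every operation.

```python
from dataclasses import dataclass, field


@dataclass
class Operation:
    # Hypothetical representation of one query operation (scan, join, ...).
    name: str
    depends_on: list[str] = field(default_factory=list)  # names of input operations


@dataclass
class GridNode:
    # Hypothetical node description; the keys of both dicts are assumptions.
    name: str
    static: dict[str, float] = field(default_factory=dict)   # e.g. CPU speed
    dynamic: dict[str, float] = field(default_factory=dict)  # e.g. current load


@dataclass
class AllocationProblem:
    operations: list[Operation]
    nodes: list[GridNode]
    distribution: dict[str, list[str]]  # relation fragment -> names of nodes holding it


def is_valid_allocation(problem: AllocationProblem,
                        allocation: dict[str, list[str]]) -> bool:
    """Check that every operation gets at least one node and only known nodes."""
    known = {node.name for node in problem.nodes}
    for op in problem.operations:
        chosen = allocation.get(op.name, [])
        if not chosen or not set(chosen) <= known:
            return False
    return True
```

This only checks feasibility; choosing an optimal subset of nodes is the subject of the slides that follow.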

6 Existing Methods Classification
Control structure: Centralized; Hierarchical; Decentralized

7 Existing Methods Classification
Algorithms: Heuristic; Exact

8 Existing Methods Classification
Strategies:
Static: Resource Allocation; Execution
Dynamic: Resource Allocation; Execution
Hybrid: Resource Allocation; Execution with Dynamic Reallocation

9 Existing Methods Classification
Cooperation type: Classic; Incentive-based (Economic / Reputation)

10 Contributions
Allocation Space Restriction; Algorithm of Resource Allocation; Parallelism: pipeline, intra-operation, inter-operation; Distributed and duplicated relations

11 Allocation Space
Source nodes; Nearest nodes

12 Allocation Algorithm
Assumptions: Each relation is distributed in N equal parts; The Hybrid Hash Join algorithm is used; Results are retransferred from the nodes; Memory is used to reduce I/O operations

13 Allocation Algorithm
Stage 1. Definition of Allocation Space
Input: (1) All nodes with fragments of queried relations; All nodes nearest to (1)
Overall Node Bandwidth: CPU; NET; I/O
Algorithm: Selection of source nodes on the basis of their performance; Placement of Scan operations; Generation of the Allocation Space (source nodes + nearest nodes)
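Stage 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the slide does not say how CPU, NET and I/O are combined into the overall node bandwidth (the sketch takes the minimum, treating the slowest resource as the bottleneck), nor how "nearest" is measured (the sketch uses Euclidean distance on hypothetical coordinates and a hypothetical parameter `k` for the number of neighbours per source node).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    cpu: float  # processing rate, e.g. lines/sec (assumed unit)
    net: float  # network rate, same unit
    io: float   # disk rate, same unit
    x: float    # hypothetical coordinates used only to define "nearest"
    y: float

    @property
    def bandwidth(self) -> float:
        # Assumption: overall bandwidth is limited by the slowest resource.
        return min(self.cpu, self.net, self.io)


def distance(a: Node, b: Node) -> float:
    # Assumption: Euclidean distance stands in for network proximity.
    return math.hypot(a.x - b.x, a.y - b.y)


def build_allocation_space(nodes, placement, k):
    """placement maps each fragment to the names of the nodes holding a copy.

    Returns (source node per fragment, set of node names in the allocation space).
    """
    by_name = {n.name: n for n in nodes}
    # Select one source node per fragment: the holder with the best bandwidth.
    # Scan operations are placed on these source nodes.
    sources = {
        frag: max((by_name[h] for h in holders), key=lambda n: n.bandwidth)
        for frag, holders in placement.items()
    }
    # Allocation space = source nodes + the k nearest nodes of each source.
    space = {src.name for src in sources.values()}
    for src in sources.values():
        neighbours = sorted(
            (n for n in nodes if n.name != src.name),
            key=lambda n: distance(src, n),
        )
        space.update(n.name for n in neighbours[:k])
    return {frag: src.name for frag, src in sources.items()}, space
```

Restricting the candidates to this space is what keeps Stage 2 from having to consider every node in the grid.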

14 Allocation Algorithm
Stage 2. Generation of execution plan
Input: Logical query plan; Generated Allocation Space (AS)
Idea: Parity in bandwidth between Scan and Join operations
Algorithm:
BEGIN
  FOR each join DO
    Compute the time to read and transfer the source relations, Tscan_exec
    DO
      Choose the most efficient node Neff from the AS for placing the join operation
      Add Neff to the join allocation plan, Pjoin
      Estimate the execution time of the join, Tjoin_exec
    WHILE (Tjoin_exec > Tscan_exec)
    Add Pjoin to the query allocation plan, Pquery
  ENDFOR
END
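The Stage 2 loop can be written as a short Python sketch. Several details are assumptions because the slide does not specify them: the cost model (join time is taken as input size divided by the summed bandwidth of the allocated nodes), the rule that a node serves at most one join, and the stop condition when the allocation space runs out. The function name and argument shapes are hypothetical.

```python
def allocate_joins(joins, allocation_space):
    """Greedy sketch of Stage 2.

    joins: list of (join_id, input_lines, t_scan_exec) tuples, where
           t_scan_exec is the precomputed read-and-transfer time of the inputs.
    allocation_space: dict mapping node name -> bandwidth in lines/sec (> 0).
    Returns a dict mapping join_id -> list of nodes allocated to that join.
    """
    free = dict(allocation_space)  # copy, so the caller's dict is untouched
    query_plan = {}
    for join_id, input_lines, t_scan_exec in joins:
        join_plan = []
        join_bandwidth = 0.0
        # Emulates DO ... WHILE (Tjoin_exec > Tscan_exec); also stops when
        # no free node is left (a guard the slide's pseudocode does not show).
        while free:
            n_eff = max(free, key=free.get)        # most efficient free node
            join_bandwidth += free.pop(n_eff)      # assumption: one join per node
            join_plan.append(n_eff)
            t_join_exec = input_lines / join_bandwidth  # assumed cost model
            if t_join_exec <= t_scan_exec:
                break
        query_plan[join_id] = join_plan
    return query_plan
```

For example, with nodes of 100, 50 and 25 lines/sec and a join over 1000 lines whose scans take 7 seconds, the loop adds the first two nodes: one node alone needs 10 seconds, two together need about 6.7 seconds.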

15 Allocation Algorithm
Example
Query: R ⋈ S, where R = R1 ∪ R2 and S = S1 ∪ S2
Fragment placement: R1: n1, n2; R2: n3, n4; S1: n5, n6; S2: n7, n8
[Figure: grid nodes n1 to n8]

17 Allocation Algorithm
Example
Query: R ⋈ S, where R = R1 ∪ R2 and S = S1 ∪ S2
Allocation space: n1, n4, n6, n7, n10, n11, n12, n13, n14, n15, n16, n17, n18, n19, n20, n21, n22, n23, n24, n25, n26
[Figure: grid nodes n1 to n26 with the allocation space highlighted]

19 Allocation Algorithm
Example
Allocation space: n1, n4, n6, n7, n10, n11, n12, n13, n14, n15, n16, n17, n18, n19, n20, n21, n22, n23, n24, n25, n26
Resulting Execution Plan: Scans: n1, n4, n7, n6; Joins: n18, n25, n10, n26, n13, n12, n19
Source nodes: n1, n4, n7, n6. Nodes' Bandwidth: 2000 lines/sec
Nodes allocated for Join: n18, n25, n10, n26, n13, n12, n19. Nodes' Bandwidth (in slide order): 1790, 1920, 1650, 2000, 1500, 1300, 900 lines/sec

20 Performance Evaluation
Experimental conditions: Data Grid simulator; 6000 heterogeneous nodes; Simple, Average and Complex queries; Distributed and duplicated relations
Comparison: Method GeoLoc; Method Gounaris2004

21 Performance Evaluation
Optimization Time

22 Performance Evaluation
Response Time

23 Conclusion
Proposed method is: Efficient; Scalable; Adapted to heterogeneous decentralized Data Grid
Perspective: Adaptation to the Dynamicity of Data Grid

24 Thank you for your attention!

