Distributed Database Management Systems Lecture 32
In the previous lecture: Query Processing; Query Decomposition (QD) and its different phases.
In this lecture: the final phase of QD; the next phase of query optimization: Data Localization.
3- Idempotency of unary ops
   i) Π_{A'}(Π_{A''}(R)) ⇔ Π_{A'}(R), where A' ⊆ A''
   ii) σ_{p1(A1)}(σ_{p2(A2)}(R)) ⇔ σ_{p1(A1) ∧ p2(A2)}(R)
4- Commuting selection with projection
   Π_{A1, …, An}(σ_{p(Ap)}(R)) ⇔ Π_{A1, …, An}(σ_{p(Ap)}(Π_{A1, …, An, Ap}(R)))
5- Commuting selection with binary ops, like join and Cartesian product (CP)
6- Commuting projection with binary ops, like join and Cartesian product (CP)
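Rules 3 and 4 can be checked on a small in-memory relation. The sketch below is only an illustration: the rows of EMP are made up, and `select` and `project` are hypothetical helpers standing in for σ and Π, not part of any real DBMS API.

```python
# Made-up sample rows for EMP(eNo, eName, title); for illustration only.
EMP = [
    {"eNo": "E1", "eName": "Saleem", "title": "Programmer"},
    {"eNo": "E2", "eName": "Ayesha", "title": "Elect. Eng."},
    {"eNo": "E3", "eName": "Saleem", "title": "Mech. Eng."},
    {"eNo": "E4", "eName": "Bilal", "title": "Programmer"},
]


def select(rows, pred):
    """Selection: keep the rows that satisfy the predicate."""
    return [r for r in rows if pred(r)]


def project(rows, attrs):
    """Projection with duplicate elimination (set semantics)."""
    seen, out = set(), []
    for r in rows:
        key = tuple(r[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append({a: r[a] for a in attrs})
    return out


def p1(r):
    return r["title"] != "Programmer"


def p2(r):
    return r["eNo"] > "E1"


def is_programmer(r):
    return r["title"] == "Programmer"


# Rule 3 i): projecting twice equals projecting once on the smaller attribute set.
rule_3i = project(project(EMP, ["eNo", "eName"]), ["eName"]) == project(EMP, ["eName"])

# Rule 3 ii): cascaded selections equal one selection on the conjunction.
rule_3ii = select(select(EMP, p2), p1) == select(EMP, lambda r: p1(r) and p2(r))

# Rule 4: the inner projection must keep Ap (here: title) for the selection to work.
rule_4 = project(select(EMP, is_programmer), ["eName"]) == project(
    select(project(EMP, ["eName", "title"]), is_programmer), ["eName"]
)

print(rule_3i, rule_3ii, rule_4)
```

A check on one sample relation does not prove the rules; it only shows what each side of an equivalence computes.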
Many equivalent query trees can be generated. Comparing all such trees to select the best one is not feasible, so heuristics are applied.
1- Unary ops are separated
2- Unary ops on the same relation are grouped together
3- Unary ops are commuted with binary ops
4- Binary ops are ordered
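[Figure: equivalent query tree. Leaves: ASG, PROJ, EMP. Operators, from the leaves up: × ; ⋈ on pNo ∧ eNo; σ with (pName = 'CAD/CAM') ∧ (dur = 12 ∨ dur = 24) ∧ eName ≠ 'Saleem'; Π eName at the root.]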
[Figure: rewritten query tree. Leaves: PROJ, ASG, EMP. Selection labels: pName = 'CAD/CAM'; dur = 12 ∨ dur = 24; eName ≠ 'Saleem'. Projection labels: pNo; pNo, eNo; eNo, eName; pNo, eName; eName.]
This concludes query decomposition and restructuring. This phase concerns both centralized and distributed environments.
Now we move to the second phase of query optimization: data localization, that is, localization of distributed data (DD). QD works at the global level; this phase transforms the global query into one on local relations (fragments).
The program that reconstructs a global relation from its fragments is called the localization program. A naïve rule… However, it won't be an efficient one.
Reduction During Data Localization
Example schema: EMP(eNo, eName, title)
Horizontal fragmentation:
EMP1 = σ_{eNo ≤ 'E3'}(EMP)
EMP2 = σ_{'E3' < eNo ≤ 'E6'}(EMP)
EMP3 = σ_{eNo > 'E6'}(EMP)
Reduction with Selection
Rule 1: σ_{pi}(Rj) = ∅ if ∀x in R: ¬(pi(x) ∧ pj(x)), where Rj = σ_{pj}(R). That is, the selection predicate pi and the fragment predicate pj conflict.
Query: SELECT * FROM EMP WHERE eNo = 'E7'
Naïve rule: σ_{eNo = 'E7'}(EMP1 ∪ EMP2 ∪ EMP3)
Smart thinking: σ_{eNo = 'E7'}(EMP3)
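The reduction with selection can be sketched in a few lines. This is only an illustration under two assumptions: the fragment predicates are written as Python lambdas, and eNo values are compared as plain strings (which works for 'E1' to 'E9' but not for 'E10'). The name `relevant_fragments` is hypothetical.

```python
# Fragment predicates of EMP1, EMP2, EMP3 on eNo, taken from the example schema.
# Assumption: eNo values compare correctly as strings ('E1' .. 'E9').
FRAGMENT_PREDICATES = {
    "EMP1": lambda e_no: e_no <= "E3",
    "EMP2": lambda e_no: "E3" < e_no <= "E6",
    "EMP3": lambda e_no: e_no > "E6",
}


def relevant_fragments(e_no_value):
    """Return the fragments whose predicate does not conflict with eNo = e_no_value.

    Rule 1: a selection on a fragment is empty when the selection predicate
    and the fragment predicate can never hold together.
    """
    return [name for name, pred in FRAGMENT_PREDICATES.items() if pred(e_no_value)]


print(relevant_fragments("E7"))  # ['EMP3']
```

Only EMP3 survives for eNo = 'E7', which matches the reduced query above.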
Reduction on Join
Joins are distributed over unions and unnecessary joins are avoided:
(R1 ∪ R2) ⋈ R3 = (R1 ⋈ R3) ∪ (R2 ⋈ R3)
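Rule 2: Ri ⋈ Rj = ∅ if ∀x in Ri and ∀y in Rj: ¬(pi(x) ∧ pj(y)), where pi and pj are the fragment predicates of Ri and Rj on the join attribute. Useless joins can therefore be detected by examining the predicates alone.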
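Remember! The reduced query is not always better than the generic one, so we have to be watchful. One advantage of the reduced form is that the partial joins can be executed in parallel.
ASG1 = σ_{eNo ≤ 'E3'}(ASG)
ASG2 = σ_{eNo > 'E3'}(ASG)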
Query: SELECT eName FROM EMP, ASG WHERE EMP.eNo = ASG.eNo
We already know the PHF (primary horizontal fragmentation) of EMP.
Generic query: (EMP1 ∪ EMP2 ∪ EMP3) ⋈_{eNo} (ASG1 ∪ ASG2)
Reduction for PHF with join: (EMP1 ⋈_{eNo} ASG1) ∪ (EMP2 ⋈_{eNo} ASG2) ∪ (EMP3 ⋈_{eNo} ASG2)
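The join reduction can be sketched by distributing the join over both unions and discarding the pairs whose fragment predicates conflict (Rule 2). In this illustration each fragment is described, as an assumption, by a half-open interval (low, high] over the numeric part of eNo, so 'E3' becomes 3; the helper names are hypothetical.

```python
import math
from itertools import product

# Each fragment is described by the interval (low, high] of employee numbers
# it may contain. Assumption: 'E3' is encoded as the number 3.
EMP_FRAGMENTS = {"EMP1": (-math.inf, 3), "EMP2": (3, 6), "EMP3": (6, math.inf)}
ASG_FRAGMENTS = {"ASG1": (-math.inf, 3), "ASG2": (3, math.inf)}


def intervals_overlap(a, b):
    """True if two (low, high] intervals share at least one value."""
    return max(a[0], b[0]) < min(a[1], b[1])


def reduced_joins(left, right):
    """Distribute the join over the unions and keep only the non-empty partial joins."""
    return [
        (l_name, r_name)
        for (l_name, l_iv), (r_name, r_iv) in product(left.items(), right.items())
        if intervals_overlap(l_iv, r_iv)
    ]


print(reduced_joins(EMP_FRAGMENTS, ASG_FRAGMENTS))
# [('EMP1', 'ASG1'), ('EMP2', 'ASG2'), ('EMP3', 'ASG2')]
```

Of the six candidate partial joins, three remain, and these three can run in parallel at different sites.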
Reduction for VF (vertical fragmentation)
The relation is fragmented by projection, with the primary key (PK) as the common attribute. Localization involves a natural join on the PK.
EMP1 = Π_{eNo, eName}(EMP)
EMP2 = Π_{eNo, title}(EMP)
A relation R defined over attributes A = {A1, ..., An} is vertically fragmented as Ri = Π_{A'}(R), where A' ⊆ A.
Rule 3: Π_{D,K}(Ri) is useless if the set of projection attributes D is not in A', where K is the key of R.
Example: SELECT eName FROM EMP
Generic query: Π_{eName}(EMP1 ⋈_{eNo} EMP2)
Reduced query: Π_{eName}(EMP1)
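The reduction for vertical fragments can be sketched as a set computation. This is an illustration, not a definitive implementation: it reads Rule 3 as "a fragment is useless when it contributes no non-key projection attribute", which is an interpretation, and the name `useful_fragments` is hypothetical.

```python
# Vertical fragments of EMP and the key they share, from the example above.
KEY = {"eNo"}
VERTICAL_FRAGMENTS = {
    "EMP1": {"eNo", "eName"},
    "EMP2": {"eNo", "title"},
}


def useful_fragments(projection_attrs):
    """Return the vertical fragments needed to answer a projection.

    Assumed reading of Rule 3: a fragment is useless when none of its
    non-key attributes appears in the projection list.
    """
    needed = set(projection_attrs) - KEY
    if not needed:
        # Only key attributes are requested: any single fragment is enough.
        return [next(iter(VERTICAL_FRAGMENTS))]
    return [
        name
        for name, attrs in VERTICAL_FRAGMENTS.items()
        if (attrs - KEY) & needed
    ]


print(useful_fragments(["eName"]))  # ['EMP1']
```

With a single useful fragment, the join on eNo disappears from the reduced query.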
Reduction for DF (derived fragmentation)
Relation R is fragmented based on a predicate on another relation S. DF should be used when there is a hierarchical relationship between R and S.
Example:
ASG1 = ASG ⋉_{eNo} EMP1
ASG2 = ASG ⋉_{eNo} EMP2
EMP1 = σ_{title = 'Programmer'}(EMP)
EMP2 = σ_{title ≠ 'Programmer'}(EMP)
Query: SELECT * FROM EMP, ASG WHERE ASG.eNo = EMP.eNo AND EMP.title = 'Mech. Eng.'
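Generic query: (ASG1 ∪ ASG2) ⋈_{eNo} σ_{title = 'Mech. Eng.'}(EMP1 ∪ EMP2)
Pushing selection down: (ASG1 ∪ ASG2) ⋈_{eNo} σ_{title = 'Mech. Eng.'}(EMP2). The selection on EMP1 is empty because title = 'Mech. Eng.' conflicts with title = 'Programmer'.
Union moved up: (ASG1 ⋈_{eNo} σ_{title = 'Mech. Eng.'}(EMP2)) ∪ (ASG2 ⋈_{eNo} σ_{title = 'Mech. Eng.'}(EMP2))
Optimal reduced query: ASG2 ⋈_{eNo} σ_{title = 'Mech. Eng.'}(EMP2). The join of ASG1 with EMP2 is empty because ASG1 is derived from EMP1.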
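The three reduction steps for derived fragmentation can be sketched as follows. The sketch assumes that EMP1 and EMP2 are disjoint, so an ASG fragment can only join with the EMP fragment it was derived from; `reduce_query` and the two dictionaries are hypothetical names used for illustration.

```python
# Fragment predicates of EMP on title, from the example above.
EMP_PREDICATES = {
    "EMP1": lambda title: title == "Programmer",
    "EMP2": lambda title: title != "Programmer",
}

# Each ASG fragment and the EMP fragment it was derived from by semijoin.
DERIVED_FROM = {"ASG1": "EMP1", "ASG2": "EMP2"}


def reduce_query(selected_title):
    """Return the (ASG fragment, EMP fragment) partial joins that can be non-empty."""
    # Step 1: push the selection down and drop EMP fragments whose
    # predicate conflicts with title = selected_title.
    emp_kept = [name for name, pred in EMP_PREDICATES.items() if pred(selected_title)]

    # Steps 2 and 3: move the union up (one partial join per pair) and keep
    # only the pairs where the ASG fragment was derived from that EMP fragment.
    return [
        (asg, emp)
        for asg, source in DERIVED_FROM.items()
        for emp in emp_kept
        if source == emp
    ]


print(reduce_query("Mech. Eng."))  # [('ASG2', 'EMP2')]
```

For title = 'Mech. Eng.' a single partial join remains, which is the optimal reduced query shown above.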
Thanks