Distributed Database Management Systems


1 Distributed Database Management Systems
Lecture 20

2 In the Previous Lecture
Continued with VF
Computed CA
Partitioning Algorithm

3 In this Lecture
Continue with VF
Hybrid Fragmentation
Allocation Problem
Replication

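4 [Figure: clustered affinity matrix CA with attribute order A1, A3, A2, A4 (surviving entries: 45, 53, 5, 3, 80, 75, 78); attribute usage matrix of q1 to q4 over A1 to A4; access frequencies accj(qi) at sites S1, S2, S3 (q1: 15, 20, 10; surviving entries of the other rows: q2 5, q3 25, q4 3); split points computed from refj(qi) and accj(qi): z1 = 0 − 45², z2 = 3311, z3 = …]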

5 A1 = jNo, A2 = jName, A3 = budget, A4 = loc
V1 = {jNo, budget}
V2 = {jNo, jName, loc}
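
The choice of V1 and V2 can be checked with a small sketch. It assumes the usual textbook split-quality measure z = CTQ * CBQ − COQ², where CTQ, CBQ and COQ are the total accesses of queries that touch only the top block, only the bottom block, or both. The usage sets and the per-query totals below are assumptions: the transcript preserves only part of the matrices, and the values were chosen to agree with the surviving entries (q1: 15 + 20 + 10 = 45, and z1, z2 as shown on slide 4).

```python
# Assumed attribute usage per query (not fully legible in the transcript).
USE = {
    "q1": {"A1", "A3"},
    "q2": {"A2", "A3"},
    "q3": {"A2", "A4"},
    "q4": {"A3", "A4"},
}
# Assumed total access frequency per query, summed over all sites.
ACC = {"q1": 45, "q2": 5, "q3": 75, "q4": 3}
# Attribute order of the clustered affinity matrix CA.
ORDER = ["A1", "A3", "A2", "A4"]


def split_quality(order, split, use, acc):
    """Return z = CTQ * CBQ - COQ^2 for a split after position `split`."""
    top, bottom = set(order[:split]), set(order[split:])
    ctq = sum(acc[q] for q, attrs in use.items() if attrs <= top)
    cbq = sum(acc[q] for q, attrs in use.items() if attrs <= bottom)
    coq = sum(acc.values()) - ctq - cbq  # queries that touch both blocks
    return ctq * cbq - coq ** 2


z_values = [split_quality(ORDER, s, USE, ACC) for s in (1, 2, 3)]
best_split = max((1, 2, 3), key=lambda s: split_quality(ORDER, s, USE, ACC))
top_block, bottom_block = ORDER[:best_split], ORDER[best_split:]
```

Under these assumptions the best split separates {A1, A3} from {A2, A4}; adding the key jNo (A1) to both sides gives V1 = {jNo, budget} and V2 = {jNo, jName, loc}.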

6 VF: Two Problems
1. Clusters may form in the middle of CA rather than at the sides
2. m-way partitioning

7 VF Correctness

8 A relation R, defined over attribute set A and key K, generates the vertical partitioning FR = {R1, R2, …, Rr}
Completeness: the following should hold for A: A = ∪ Ri (the union of the attributes of all fragments Ri)

9 Reconstruction: can be achieved by R = ⋈K Ri, ∀Ri ∈ FR
Disjointness: TIDs are not considered to be overlapping, since they are maintained by the system; the primary key, which is repeated in every fragment, is likewise an exception
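
The three correctness rules can be checked mechanically at the schema level. The following is a minimal sketch, not part of the lecture: the function name and the result keys are hypothetical, and "reconstructible" only tests the necessary condition that every fragment carries the key K, so that a join on K is possible.

```python
def check_vertical_fragmentation(attributes, key, fragments):
    """Check completeness, reconstructibility and disjointness of a VF.

    attributes: set of attribute names of R
    key:        set of key attributes K
    fragments:  list of sets, the attributes of each fragment Ri
    """
    attributes, key = set(attributes), set(key)
    fragments = [set(f) for f in fragments]

    # Completeness: A equals the union of all fragment attribute sets.
    complete = set().union(*fragments) == attributes

    # Reconstruction by a join on K needs K in every fragment.
    reconstructible = all(key <= f for f in fragments)

    # Disjointness: ignoring the key, no attribute appears twice.
    non_key = [f - key for f in fragments]
    disjoint = all(
        non_key[i].isdisjoint(non_key[j])
        for i in range(len(non_key))
        for j in range(i + 1, len(non_key))
    )
    return {"complete": complete,
            "reconstructible": reconstructible,
            "disjoint": disjoint}


# Fragments V1 and V2 from slide 5, with jNo as the key.
result = check_vertical_fragmentation(
    {"jNo", "jName", "budget", "loc"},
    {"jNo"},
    [{"jNo", "budget"}, {"jNo", "jName", "loc"}],
)
```

For V1 and V2 all three checks hold; dropping loc from V2 would break completeness, and adding budget to V2 would break disjointness.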

10 Hybrid Fragmentation

11 In practice, applications require both types of fragmentation to be combined

12 The fragmentations are nested, i.e., one follows the other, so the result becomes a sort of tree

13 Disjointness and completeness have to be assured at each step, and reconstruction can be obtained by applying Join and Union in reverse order

14 CUST is fragmented into Beta, Delta1 and Delta2
CUST
A/C#    Name    Bal     Branch
AB101   Saeed   4535    MTN
AB202   Laeeq   …       LHR
AB203   Salma   …       LHR
AB109   Shaan   45.32   MTN
Beta = ΠA/C#, Bal (CUST)
Delta1 = σ Branch = "MTN" (ΠA/C#, Name, Branch (CUST))
Delta2 = σ Branch = "LHR" (ΠA/C#, Name, Branch (CUST))
Beta
A/C#    Bal
AB101   4535
AB202   …
AB203   …
AB109   45.32
Delta1
A/C#    Name    Branch
AB101   Saeed   MTN
AB109   Shaan   MTN
Delta2
A/C#    Name    Branch
AB202   Laeeq   LHR
AB203   Salma   LHR
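
The CUST example can be replayed in a few lines. This is only an illustrative sketch: the helper names are hypothetical, rows are modelled as dictionaries, balances that are not legible in the transcript are left as None, and the selection is taken on the Branch attribute.

```python
CUST = [
    {"A/C#": "AB101", "Name": "Saeed", "Bal": 4535, "Branch": "MTN"},
    {"A/C#": "AB202", "Name": "Laeeq", "Bal": None, "Branch": "LHR"},
    {"A/C#": "AB203", "Name": "Salma", "Bal": None, "Branch": "LHR"},
    {"A/C#": "AB109", "Name": "Shaan", "Bal": 45.32, "Branch": "MTN"},
]


def project(rows, attrs):
    """Vertical step: keep only the listed attributes."""
    return [{a: r[a] for a in attrs} for r in rows]


def select(rows, predicate):
    """Horizontal step: keep only the rows satisfying the predicate."""
    return [r for r in rows if predicate(r)]


# Vertical fragmentation first, then horizontal fragmentation of one branch.
beta = project(CUST, ["A/C#", "Bal"])
names = project(CUST, ["A/C#", "Name", "Branch"])
delta1 = select(names, lambda r: r["Branch"] == "MTN")
delta2 = select(names, lambda r: r["Branch"] == "LHR")


def reconstruct(beta_rows, delta_fragments):
    """Reverse order: union the horizontal fragments, then join on A/C#."""
    united = [r for fragment in delta_fragments for r in fragment]
    bal_by_account = {r["A/C#"]: r["Bal"] for r in beta_rows}
    return [{**r, "Bal": bal_by_account[r["A/C#"]]} for r in united]


rebuilt = reconstruct(beta, [delta1, delta2])
```

The rebuilt relation holds the same rows as CUST, although possibly in a different order, which is what reconstruction by Union followed by Join is expected to deliver.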

15 Allocation

16 Given
F = {F1, F2, …, Fn} fragments
S = {S1, S2, …, Sm} network sites
Q = {q1, q2, …, qq} applications
Find the "optimal" distribution of F to S.

17 Optimality
Minimize the processing cost and maximize the system throughput at each site

18 The problem is complex to solve mathematically. To keep things simple, consider the allocation of a single fragment Fk.

19 Read-only queries on Fk: T = {t1, t2, …, tm}, where ti is the read-only traffic for Fk originating at site Si
Update queries on Fk: U = {u1, u2, …, um}, where ui is the update traffic for Fk originating at site Si

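20 Communication Cost
C(T) = {c1,2, c1,3, …, c1,m, …, cm-1,m} for retrieval, where ci,j is the unit cost between sites Si and Sj
C'(U) = {c'1,2, c'1,3, …, c'1,m, …, c'm-1,m} for updates
Storage Cost
D = {d1, d2, …, dm}, where dj is the cost of storing Fk at site Sj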

21 The allocation problem is to find the sites, out of the set of sites S, where copies of Fk will be stored.

22 The specification of the allocation problem will be
min …
where xj = 1 if the fragment Fk is assigned to site Sj, and xj = 0 otherwise
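
The objective itself did not survive in the transcript. A common textbook formulation, assumed here rather than taken from the slide, charges each site's updates to every copy, each site's reads to the cheapest copy, and storage for every copy: cost(I) = Σi ( ui Σ(Sj ∈ I) c'i,j + ti min(Sj ∈ I) ci,j ) + Σ(Sj ∈ I) dj, where I is the set of sites with xj = 1. The sketch below evaluates this cost and searches all non-empty site sets by brute force, which is feasible only for a handful of sites. All numbers are hypothetical.

```python
from itertools import combinations


def allocation_cost(sites, t, u, c, c_upd, d):
    """Cost of storing copies of Fk at the given set of site indices."""
    m = len(d)
    read = sum(t[i] * min(c[i][j] for j in sites) for i in range(m))
    update = sum(u[i] * sum(c_upd[i][j] for j in sites) for i in range(m))
    storage = sum(d[j] for j in sites)
    return read + update + storage


def best_allocation(t, u, c, c_upd, d):
    """Try every non-empty subset of sites and keep the cheapest one."""
    m = len(d)
    candidates = (
        set(subset)
        for size in range(1, m + 1)
        for subset in combinations(range(m), size)
    )
    return min(candidates, key=lambda s: allocation_cost(s, t, u, c, c_upd, d))


# Hypothetical data for three sites (index 0 stands for S1, and so on).
T = [10, 0, 5]                          # read-only traffic per site
U = [1, 1, 1]                           # update traffic per site
C = [[0, 2, 4], [2, 0, 3], [4, 3, 0]]   # unit retrieval cost between sites
C_UPD = C                               # unit update cost, assumed equal
D = [5, 5, 5]                           # storage cost per site

chosen = best_allocation(T, U, C, C_UPD, D)
chosen_cost = allocation_cost(chosen, T, U, C, C_UPD, D)
```

With these numbers the copies go to the two sites that read Fk, because the saved read traffic outweighs the extra update and storage cost.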

23 That concludes our discussion on Fragmentation
Let's summarize it

24 Fragmentation is splitting a table into smaller tables
Alternatives: Horizontal, Vertical, Hybrid

25 Horizontal Fragmentation

26 Splits a table into horizontal subsets (row wise)
Primary and Derived Horizontal Fragmentation

27 We need major simple predicates (Pr); should be complete & minimal
Pr is transformed into Pr'
Minterm (M) predicates from Pr'
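
As a reminder of how minterm predicates arise, the sketch below builds every conjunction in which each simple predicate appears either as is or negated. The function name and the two predicates on Branch are hypothetical, and the sketch does not remove contradictory minterms, which would normally be eliminated using the implications among the predicates.

```python
from itertools import product


def minterm_predicates(simple_predicates):
    """Return all 2^n conjunctions of the predicates or their negations."""
    minterms = []
    for signs in product([True, False], repeat=len(simple_predicates)):
        terms = [
            p if positive else f"NOT ({p})"
            for p, positive in zip(simple_predicates, signs)
        ]
        minterms.append(" AND ".join(terms))
    return minterms


# Hypothetical Pr' with two simple predicates.
PR = ['Branch = "MTN"', 'Branch = "LHR"']
M = minterm_predicates(PR)
```

Two simple predicates give four minterms; the one that requires both branches at once is contradictory and would be dropped before fragments are defined.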

28 Correctness of PHF depends on the Pr’
Derived Horizontal Fragmentation is based on Owner-member link

29 Vertical Fragmentation is more complicated due to more options
Based on attributes’ affinities

30 Affinities are calculated using usage data and access frequencies from different sites
AA is transformed into CA using BEA
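
The affinity calculation can be sketched as follows: the affinity of two attributes is the sum, over all queries that use both, of the query's access frequencies at all sites. The usage sets and the zero or repeated frequencies below are assumptions, since the transcript keeps only part of the matrices, and refj(qi) is assumed to be 1 for every query and site.

```python
ATTRS = ["A1", "A2", "A3", "A4"]

# Assumed attribute usage per query.
USE = {
    "q1": {"A1", "A3"},
    "q2": {"A2", "A3"},
    "q3": {"A2", "A4"},
    "q4": {"A3", "A4"},
}

# Access frequencies at S1, S2, S3; only some entries are legible in the
# transcript, the rest are assumed.
ACC = {
    "q1": [15, 20, 10],
    "q2": [5, 0, 0],
    "q3": [25, 25, 25],
    "q4": [3, 0, 0],
}


def affinity_matrix(attrs, use, acc):
    """Build AA, where AA[i][j] sums accesses of queries using both attributes."""
    n = len(attrs)
    aa = [[0] * n for _ in range(n)]
    for query, used in use.items():
        total = sum(acc[query])  # refj(qi) assumed to be 1 everywhere
        for i, ai in enumerate(attrs):
            for j, aj in enumerate(attrs):
                if ai in used and aj in used:
                    aa[i][j] += total
    return aa


AA = affinity_matrix(ATTRS, USE, ACC)
```

Under these assumptions the non-zero entries of AA are 45, 53, 5, 3, 80, 75 and 78, the same values that survive from the matrix on slide 4.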

31 CA establishes clusters of attributes that are split for efficient access
Hybrid Fragmentation combines HF and VF
That concludes Fragmentation

32 Thanks

