PMIT-6102 Advanced Database Systems By- Jesmin Akhter Assistant Professor, IIT, Jahangirnagar University.

Presentation transcript:


Class Test-02 Solution. It is not guaranteed that all the solutions are correct.

Slide 3: Tutorial Questions
1. Briefly describe the correctness rules of fragmentation.
2. What is the basic difference between primary horizontal fragmentation and derived horizontal fragmentation? Give an example of each.
3. Write an iterative algorithm that generates a complete and minimal set of predicates Pr' from a given set of simple predicates Pr.
4. Show the reconstruction of hybrid fragmentation.
5. Describe the allocation model.
6. Draw the generic layering scheme for distributed query processing.

Slide 4: Tutorial Questions (continued)
7. We assume that relations EMP and ASG are horizontally fragmented as follows:
   EMP1 = σ_{ENO≤"E3"}(EMP)    EMP2 = σ_{ENO>"E3"}(EMP)
   ASG1 = σ_{ENO≤"E3"}(ASG)    ASG2 = σ_{ENO>"E3"}(ASG)
Fragments ASG1, ASG2, EMP1, and EMP2 are stored at sites 1, 2, 3, and 4, respectively, and the result is expected at site 5. Two strategies are given below. Which strategy is better?
(a) Strategy A
   Site 1: ASG1' = σ_{DUR>37}(ASG1), sent to site 3
   Site 2: ASG2' = σ_{DUR>37}(ASG2), sent to site 4
   Site 3: EMP1' = EMP1 ⋈_{ENO} ASG1', sent to site 5
   Site 4: EMP2' = EMP2 ⋈_{ENO} ASG2', sent to site 5
   Site 5: result = EMP1' ∪ EMP2'
(b) Strategy B
   Sites 1 to 4: ASG1, ASG2, EMP1, and EMP2 are sent unchanged to site 5
   Site 5: result = (EMP1 ∪ EMP2) ⋈_{ENO} σ_{DUR>37}(ASG1 ∪ ASG2)

Slide 5: Answer 01
1. Briefly describe the correctness rules of fragmentation.
Correctness rules of fragmentation
The following three rules, taken together, ensure that the database does not undergo semantic change during fragmentation.
Completeness
• A decomposition of relation R into fragments R1, R2, ..., Rn is complete if and only if each data item in R can also be found in some Ri.
• This property is identical to the lossless decomposition property of normalization.
• It ensures that the data in a global relation are mapped into fragments without any loss.

Slide 6: Correctness of Fragmentation
Reconstruction
• If relation R is decomposed into fragments F_R = {R1, R2, ..., Rn}, then there should exist some relational operator ∇ such that R = ∇ Ri, ∀ Ri ∈ F_R.
• The reconstructability of the relation from its fragments ensures that constraints defined on the data in the form of dependencies are preserved.

Slide 7: Correctness of Fragmentation
Disjointness
• If relation R is horizontally decomposed into fragments F_R = {R1, R2, ..., Rn}, and data item dj is in Rj, then dj should not be in any other fragment Rk (k ≠ j).
• This criterion ensures that the horizontal fragments are disjoint.
• If relation R is vertically decomposed, its primary key attributes are typically repeated in all its fragments (for reconstruction).
• Therefore, in the case of vertical partitioning, disjointness is defined only on the non-primary-key attributes of a relation.
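The three correctness rules can be checked mechanically for a horizontal fragmentation. The sketch below is only an illustration: it models a relation as a list of tuples, takes union as the reconstruction operator ∇, and uses made-up PROJ rows (the row values and function names are assumptions, not taken from the slides).

```python
def is_complete(relation, fragments):
    """Completeness: every tuple of the relation appears in some fragment."""
    covered = set().union(*fragments)
    return set(relation) <= covered


def can_reconstruct(relation, fragments):
    """Reconstruction (horizontal case): the union of the fragments is the relation."""
    return set().union(*fragments) == set(relation)


def is_disjoint(fragments):
    """Disjointness: no tuple appears in more than one fragment."""
    seen = set()
    for fragment in fragments:
        rows = set(fragment)
        if seen & rows:
            return False
        seen |= rows
    return True


# Hypothetical PROJ(PNO, LOC) rows, fragmented by LOC.
proj = [("P1", "Montreal"), ("P2", "New York"), ("P3", "New York"), ("P4", "Paris")]
good = [
    [r for r in proj if r[1] == "Montreal"],
    [r for r in proj if r[1] == "New York"],
    [r for r in proj if r[1] == "Paris"],
]
missing_paris = good[:2]                  # violates completeness
overlapping = good + [[proj[0]]]          # violates disjointness
```

For vertical fragments the same idea applies, except that reconstruction uses a join on the key and disjointness is checked on the non-key attributes only.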

Slide 8: Answer 02
2. What is the basic difference between primary horizontal fragmentation and derived horizontal fragmentation? Give an example of each.
• Primary horizontal fragmentation of a relation is performed using predicates that are defined on that relation.
• Derived horizontal fragmentation is the partitioning of a relation that results from predicates defined on another relation.

Slide 9: Primary Horizontal Fragmentation of Relation PROJ
PROJ1 = σ_{LOC="Montreal"}(PROJ)
PROJ2 = σ_{LOC="New York"}(PROJ)
PROJ3 = σ_{LOC="Paris"}(PROJ)

Slide 10: Derived Horizontal Fragmentation
EMP1 = EMP ⋉ SKILL1
EMP2 = EMP ⋉ SKILL2
where
SKILL1 = σ_{SAL≤30000}(SKILL)
SKILL2 = σ_{SAL>30000}(SKILL)

EMP1
ENO  ENAME      TITLE
E3   A. Lee     Mech. Eng.
E4   J. Miller  Programmer
E7   R. Davis   Mech. Eng.

EMP2
ENO  ENAME      TITLE
E1   J. Doe     Elect. Eng.
E2   M. Smith   Syst. Anal.
E5   B. Casey   Syst. Anal.
E6   L. Chu     Elect. Eng.
E8   J. Jones   Syst. Anal.
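A short sketch may make the difference between the two kinds of fragmentation concrete. SKILL is fragmented by a predicate on its own SAL attribute (primary), and EMP is then fragmented by a semijoin with each SKILL fragment (derived). The EMP rows come from the tables above; the salary figures in SKILL are assumed values chosen only so that the split matches those tables.

```python
# EMP(ENO, ENAME, TITLE), as listed in the EMP1 and EMP2 tables.
EMP = [
    ("E1", "J. Doe", "Elect. Eng."),
    ("E2", "M. Smith", "Syst. Anal."),
    ("E3", "A. Lee", "Mech. Eng."),
    ("E4", "J. Miller", "Programmer"),
    ("E5", "B. Casey", "Syst. Anal."),
    ("E6", "L. Chu", "Elect. Eng."),
    ("E7", "R. Davis", "Mech. Eng."),
    ("E8", "J. Jones", "Syst. Anal."),
]

# SKILL(TITLE, SAL): the salaries are assumptions for illustration.
SKILL = [
    ("Elect. Eng.", 40000),
    ("Syst. Anal.", 34000),
    ("Mech. Eng.", 27000),
    ("Programmer", 24000),
]


def select(rows, predicate):
    """Relational selection: keep the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def semijoin_on_title(emp_rows, skill_rows):
    """EMP ⋉ SKILL: keep EMP rows whose TITLE appears in skill_rows."""
    titles = {title for title, _sal in skill_rows}
    return [row for row in emp_rows if row[2] in titles]


# Primary horizontal fragmentation: predicates defined on SKILL itself.
SKILL1 = select(SKILL, lambda s: s[1] <= 30000)
SKILL2 = select(SKILL, lambda s: s[1] > 30000)

# Derived horizontal fragmentation: EMP follows the fragments of SKILL.
EMP1 = semijoin_on_title(EMP, SKILL1)
EMP2 = semijoin_on_title(EMP, SKILL2)
```

Note that EMP has no SAL attribute at all; its fragments are determined entirely by the predicates on SKILL.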

Slide 11: Answer 03
3. Write an iterative algorithm that generates a complete and minimal set of predicates Pr' from a given set of simple predicates Pr.
COM_MIN Algorithm
Given: a relation R and a set of simple predicates Pr
Output: a complete and minimal set of simple predicates Pr' for Pr
Rule 1: a relation or fragment is partitioned into at least two parts which are accessed differently by at least one application.

Slide 12: COM_MIN Algorithm
• Initialization:
  o find a p_i ∈ Pr such that p_i partitions R according to Rule 1
  o set Pr' = {p_i}; Pr ← Pr - {p_i}; F ← {f_i}
• Iteratively add predicates to Pr' until it is complete:
  o find a p_j ∈ Pr such that p_j partitions some f_k, defined according to a minterm predicate over Pr', according to Rule 1
  o set Pr' = Pr' ∪ {p_j}; Pr ← Pr - {p_j}; F ← F ∪ {f_j}
  o if ∃ p_k ∈ Pr' which is nonrelevant, then Pr' ← Pr' - {p_k}; F ← F - {f_k}
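The COM_MIN loop can be sketched in a few lines if Rule 1 is made testable. The version below is a simplification under stated assumptions: predicates and applications are modelled as Python functions over sample rows, and a predicate is taken to satisfy Rule 1 on a fragment when it splits the fragment into two non-empty parts that some application accesses with a different hit ratio. The data, the names, and this data-driven test are illustrative choices, not part of the slides.

```python
def minterm_fragments(rows, preds):
    """Group rows by the truth values of preds: one non-empty fragment per minterm."""
    groups = {}
    for row in rows:
        groups.setdefault(tuple(p(row) for p in preds), []).append(row)
    return list(groups.values())


def hit_ratio(part, app):
    """Fraction of the rows in part that the application accesses."""
    return sum(1 for row in part if app(row)) / len(part)


def satisfies_rule_1(pred, fragment, apps):
    """True if pred splits fragment into two parts accessed differently by some app."""
    yes = [row for row in fragment if pred(row)]
    no = [row for row in fragment if not pred(row)]
    if not yes or not no:
        return False
    return any(hit_ratio(yes, app) != hit_ratio(no, app) for app in apps)


def com_min(rows, preds, apps):
    """Return a dict name -> predicate forming Pr' (complete and minimal for apps)."""
    remaining = dict(preds)   # Pr
    chosen = {}               # Pr'
    progress = True
    while progress:
        progress = False
        fragments = minterm_fragments(rows, list(chosen.values()))
        for name, pred in list(remaining.items()):
            if any(satisfies_rule_1(pred, f, apps) for f in fragments):
                chosen[name] = pred
                del remaining[name]
                progress = True
                # Drop earlier predicates that became nonrelevant.
                for old in [n for n in chosen if n != name]:
                    others = [q for n, q in chosen.items() if n != old]
                    frags = minterm_fragments(rows, others)
                    if not any(satisfies_rule_1(chosen[old], f, apps) for f in frags):
                        del chosen[old]
                break
    return chosen


# Hypothetical sample: every location combined with every budget level.
rows = [{"LOC": loc, "BUDGET": b}
        for loc in ("Montreal", "New York", "Paris")
        for b in (100000, 150000, 250000, 300000)]
preds = {
    "LOC=Montreal": lambda r: r["LOC"] == "Montreal",
    "LOC=New York": lambda r: r["LOC"] == "New York",
    "LOC=Paris": lambda r: r["LOC"] == "Paris",
    "BUDGET<=200000": lambda r: r["BUDGET"] <= 200000,
    "BUDGET>200000": lambda r: r["BUDGET"] > 200000,
}
# Two assumed applications: one reads Montreal projects, one reads low budgets.
apps = [lambda r: r["LOC"] == "Montreal", lambda r: r["BUDGET"] <= 200000]

pr_prime = com_min(rows, preds, apps)
```

With these two applications, no application distinguishes New York from Paris, so those predicates are never added, and BUDGET>200000 is redundant once BUDGET<=200000 is in Pr'.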

Slide 13: Answer 04
4. Show the reconstruction of hybrid fragmentation.
Reconstruction of Hybrid Fragmentation
To reconstruct the original global relation in the case of hybrid fragmentation, one starts at the leaves of the partitioning tree and moves upward by performing joins and unions.
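The bottom-up reconstruction can be sketched as a recursive walk over the partitioning tree: horizontal nodes are undone with a union, vertical nodes with a join on the key. The tree encoding, the relation R(K, A, B), and its rows below are hypothetical and serve only to illustrate the traversal.

```python
def reconstruct(node):
    """Rebuild the rows under node by walking the partitioning tree bottom-up."""
    if node["kind"] == "leaf":
        return node["rows"]
    parts = [reconstruct(child) for child in node["children"]]
    if node["kind"] == "H":
        # Horizontal fragmentation is undone by a union of the children.
        return [row for part in parts for row in part]
    # Vertical fragmentation is undone by joining the children on the key.
    # This assumes every child holds each key value exactly once.
    key = node["key"]
    merged = {}
    for part in parts:
        for row in part:
            merged.setdefault(row[key], {}).update(row)
    return list(merged.values())


# Hypothetical relation R(K, A, B).
R = [{"K": k, "A": f"a{k}", "B": f"b{k}"} for k in (1, 2, 3, 4)]

# R is split horizontally at K <= 2; the K > 2 part is then split vertically.
tree = {
    "kind": "H",
    "children": [
        {"kind": "leaf", "rows": [r for r in R if r["K"] <= 2]},
        {
            "kind": "V",
            "key": "K",
            "children": [
                {"kind": "leaf", "rows": [{"K": r["K"], "A": r["A"]} for r in R if r["K"] > 2]},
                {"kind": "leaf", "rows": [{"K": r["K"], "B": r["B"]} for r in R if r["K"] > 2]},
            ],
        },
    ],
}

rebuilt = sorted(reconstruct(tree), key=lambda r: r["K"])
```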

Slide 14: Answer 05
5. Describe the allocation model.
Allocation Model
General form:
  min(Total Cost)
  subject to
    response time constraint
    storage constraint
    processing constraint
Decision variable:
  x_ij = 1 if fragment F_i is stored at site S_j
  x_ij = 0 otherwise

Slide 15: Allocation Model
Total Cost
  Σ_{all queries} (query processing cost) + Σ_{all sites} Σ_{all fragments} (cost of storing a fragment at a site)
Storage Cost (of fragment F_j at S_k)
  (unit storage cost at S_k) × (size of F_j) × x_jk
Query Processing Cost
We choose a different approach in our model of the database allocation problem (DAP) and specify it as consisting of the processing cost (PC) and the transmission cost (TC). Thus the query processing cost (QPC) for application q_i is:
  processing component + transmission component
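As a small worked instance of the total cost expression, the sketch below sums the storage cost over all fragment and site pairs and adds the query processing costs. The allocation matrix `x[j][k]`, the sizes, the unit costs, and the per-query costs are all made-up numbers, and the function names are hypothetical.

```python
def storage_cost(x, size, unit_cost):
    """Sum of (unit storage cost at S_k) * (size of F_j) * x_jk over all j, k."""
    return sum(
        unit_cost[k] * size[j] * x[j][k]
        for j in range(len(size))
        for k in range(len(unit_cost))
    )


def total_cost(query_costs, x, size, unit_cost):
    """Query processing cost of all queries plus the storage cost of the allocation."""
    return sum(query_costs) + storage_cost(x, size, unit_cost)


# Assumed example: 2 fragments, 2 sites.
size = [100, 50]            # size of F_0, F_1
unit_cost = [2, 3]          # unit storage cost at S_0, S_1
x = [[1, 0],                # F_0 stored at S_0 only
     [1, 1]]                # F_1 replicated at S_0 and S_1
query_costs = [30, 20]      # QPC of each query, taken as given here

stc = storage_cost(x, size, unit_cost)            # 200 + 100 + 150
tot = total_cost(query_costs, x, size, unit_cost)
```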

Slide 16: Allocation Model
Query Processing Cost
• The processing component, PC, consists of three cost factors:
  the access cost (AC) + the integrity enforcement cost (IE) + the concurrency control cost (CC)
• Access cost
  Σ_{all sites} Σ_{all fragments} (no. of update accesses + no. of read accesses) × x_jk × (local processing cost at a site)
  o The first two terms calculate the number of accesses of user query q_i to fragment F_j.
  o We assume that the local costs of processing them are identical.
  o The summation gives the total number of accesses for all the fragments referenced by q_i. Multiplication by LPC_k gives the cost of this access at site S_k.
  o We again use x_jk to select only those cost values for the sites where fragments are stored.
• Integrity enforcement and concurrency control costs
  o These can be calculated similarly.
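The access cost of one query can be computed directly from this description. In the sketch below, `updates[j]` and `reads[j]` are the assumed numbers of update and read accesses of the query to fragment F_j, `lpc[k]` is the local processing cost at site S_k, and `x[j][k]` is the allocation; all values are invented for illustration.

```python
def access_cost(updates, reads, x, lpc):
    """Sum over sites and fragments of (updates + reads) * x_jk * LPC_k."""
    return sum(
        (updates[j] + reads[j]) * x[j][k] * lpc[k]
        for j in range(len(updates))
        for k in range(len(lpc))
    )


# Assumed example: 2 fragments, 2 sites.
updates = [2, 0]        # update accesses of the query to F_0, F_1
reads = [5, 3]          # read accesses of the query to F_0, F_1
lpc = [1, 4]            # local processing cost per access at S_0, S_1
x = [[1, 0],            # F_0 stored at S_0
     [1, 1]]            # F_1 stored at S_0 and S_1

ac = access_cost(updates, reads, x, lpc)   # 7*1 + 3*1 + 3*4
```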

Slide 17: Allocation Model
Query Processing Cost
• Transmission component = cost of processing updates + cost of processing retrievals
• In update queries it is necessary to inform all the sites where replicas exist, while in retrieval queries it is sufficient to access only one of the copies.
• In addition, at the end of an update request there is no data transmission back to the originating site other than a confirmation message, whereas retrieval-only queries may result in significant data transmission.
• Cost of updates
  Σ_{all sites} Σ_{all fragments} (update message cost) + Σ_{all sites} Σ_{all fragments} (acknowledgment cost)
• Retrieval cost
  Σ_{all fragments} min_{all sites} (cost of retrieval command + cost of sending back the result)
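The asymmetry between updates and retrievals shows up clearly in code: updates pay for every replica, retrievals pay only for the cheapest copy. The sketch below assumes per-site message costs and a per-fragment, per-site result transfer cost; every number and name is hypothetical.

```python
def update_cost(x, updated, msg, ack):
    """Each updated fragment costs one update message and one acknowledgment per replica."""
    return sum(
        x[j][k] * (msg[k] + ack[k])
        for j in updated
        for k in range(len(msg))
    )


def retrieval_cost(x, retrieved, cmd, result):
    """Each retrieved fragment is read from the cheapest site that stores a copy.

    Raises ValueError if a retrieved fragment is not stored at any site.
    """
    return sum(
        min(cmd[k] + result[j][k] for k in range(len(cmd)) if x[j][k])
        for j in retrieved
    )


# Assumed example: 2 fragments, 3 sites.
x = [[1, 1, 0],                 # F_0 at S_0 and S_1
     [0, 1, 1]]                 # F_1 at S_1 and S_2
msg = [4, 6, 5]                 # update message cost to each site
ack = [1, 1, 1]                 # acknowledgment cost from each site
cmd = [2, 2, 2]                 # retrieval command cost to each site
result = [[0, 0, 0],            # result transfer cost for F_0 (not retrieved here)
          [10, 30, 20]]         # result transfer cost for F_1 from each site

tcu = update_cost(x, updated=[0], msg=msg, ack=ack)             # (4+1) + (6+1)
tcr = retrieval_cost(x, retrieved=[1], cmd=cmd, result=result)  # min(32, 22)
transmission = tcu + tcr
```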

Slide 18: Allocation Model
Constraints
• Response time
  execution time of query ≤ max. allowable response time for that query
• Storage constraint (for a site)
  Σ_{all fragments} (storage requirement of a fragment at that site) ≤ storage capacity at that site
• Processing constraint (for a site)
  Σ_{all queries} (processing load of a query at that site) ≤ processing capacity of that site
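A candidate allocation is feasible only if all three constraints hold. The sketch below checks them one by one; the input shapes (`load[i][k]` as the processing load of query i at site k, `x[j][k]` as the allocation) and all figures are assumptions made for the example.

```python
def response_time_ok(exec_time, max_time):
    """Every query finishes within its maximum allowable response time."""
    return all(t <= limit for t, limit in zip(exec_time, max_time))


def storage_ok(x, size, capacity):
    """At every site, the fragments stored there fit into the storage capacity."""
    return all(
        sum(size[j] * x[j][k] for j in range(len(size))) <= capacity[k]
        for k in range(len(capacity))
    )


def processing_ok(load, capacity):
    """At every site, the summed load of all queries stays within the processing capacity."""
    return all(
        sum(load[i][k] for i in range(len(load))) <= capacity[k]
        for k in range(len(capacity))
    )


# Assumed example: 2 fragments, 2 queries, 2 sites.
x = [[1, 0], [1, 1]]
size = [100, 50]                 # storage needed: 150 at S_0, 50 at S_1
load = [[5, 2], [3, 4]]          # processing load: 8 at S_0, 6 at S_1

fits = storage_ok(x, size, capacity=[150, 60])
too_small = storage_ok(x, size, capacity=[150, 40])
cpu_fits = processing_ok(load, capacity=[8, 6])
on_time = response_time_ok(exec_time=[12, 7], max_time=[15, 7])
```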

Slide 19: Answer 06
Layers of Query Processing
Input: calculus query on distributed relations
CONTROL SITE
  1. Query decomposition (uses GLOBAL SCHEMA)
     output: algebraic query on distributed relations
  2. Data localization (uses FRAGMENT SCHEMA)
     output: fragment query
  3. Global optimization (uses STATS ON FRAGMENTS)
     output: optimized fragment query with communication operations
LOCAL SITES
  4. Local optimization (uses LOCAL SCHEMAS)
     output: optimized local queries

Solution-07 Slide 20
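One way to compare the two strategies of question 7 is a simple tuple-count cost model. The sketch below is built entirely on assumed figures that do not appear in the slides: a tuple access costs 1 unit, a tuple transfer costs 10 units, EMP has 400 tuples, ASG has 1000 tuples, 20 ASG tuples satisfy DUR > 37 (spread evenly over the two ASG fragments), indexes are available at the fragment sites, and no indexes exist on the data shipped to site 5. It also assumes that Strategy A is the one that selects and joins at the fragment sites, while Strategy B ships every fragment to site 5 first.

```python
TUPLE_ACCESS = 1      # assumed cost of accessing one tuple
TUPLE_TRANSFER = 10   # assumed cost of transferring one tuple
EMP_TUPLES = 400      # assumed cardinality of EMP
ASG_TUPLES = 1000     # assumed cardinality of ASG
MATCHING = 20         # assumed number of ASG tuples with DUR > 37


def cost_distributed():
    """Select and join at the fragment sites, ship only intermediate results."""
    select_asg = MATCHING * TUPLE_ACCESS        # indexed selection on DUR
    ship_asg = MATCHING * TUPLE_TRANSFER        # ASG1', ASG2' to the EMP sites
    join = MATCHING * 2 * TUPLE_ACCESS          # read ASG' tuple, probe EMP by ENO
    ship_result = MATCHING * TUPLE_TRANSFER     # EMP1', EMP2' to site 5
    return select_asg + ship_asg + join + ship_result


def cost_centralized():
    """Ship all fragments to site 5, then select and join there."""
    ship_emp = EMP_TUPLES * TUPLE_TRANSFER
    ship_asg = ASG_TUPLES * TUPLE_TRANSFER
    select_asg = ASG_TUPLES * TUPLE_ACCESS              # full scan, no index
    join = EMP_TUPLES * MATCHING * TUPLE_ACCESS         # nested-loop join
    return ship_emp + ship_asg + select_asg + join


cost_a = cost_distributed()
cost_b = cost_centralized()
```

Under these assumptions the distributed strategy costs 460 units against 23,000 for the centralized one, so the strategy that processes data where it is stored would be preferred. Different cost ratios or data sizes could change the margin.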

Thank You Slide 21