

Distributed Database

Introduction
A major motivation behind the development of database systems is the desire to integrate the operational data of an organization and to provide controlled access to that data. Although integration and controlled access may imply centralization, this is not the intention. In fact, the development of computer networks promotes a decentralized mode of work. This decentralized approach mirrors the organizational structure of many companies, which are logically distributed into divisions, departments, projects, and so on, and physically distributed into offices, plants, and factories, where each unit maintains its own operational data. A distributed database system that reflects this organizational structure, makes the data in all units accessible, and stores data close to where it is most frequently used should improve both the shareability of the data and the efficiency of data access.

Distributed DBMS
A distributed database management system (DDBMS) is the software system that permits the management of the distributed database and makes the distribution transparent to users. A DDBMS consists of a single logical database that is split into a number of fragments. Each fragment is stored on one or more computers under the control of a separate DBMS, with the computers connected by a communications network. Each site is capable of independently processing user requests that require access to local data and is also capable of processing data stored on other computers in the network. Users access the distributed database via applications. Applications are classified as those that do not require data from other sites (local applications) and those that do require data from other sites (global applications). We require a DDBMS to have at least one global application.
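The distinction between local and global applications can be made concrete with a minimal sketch. The catalog FRAGMENT_SITE, the fragment names F1 to F3, and the site names are hypothetical and chosen only for illustration; a real DDBMS keeps this information in its own system catalog.

```python
# Hypothetical catalog: the site at which each fragment is stored.
FRAGMENT_SITE = {"F1": "S1", "F2": "S2", "F3": "S3"}

def classify_application(home_site, fragments_needed):
    """Return 'local' if every needed fragment is stored at home_site, else 'global'."""
    sites = {FRAGMENT_SITE[f] for f in fragments_needed}
    return "local" if sites <= {home_site} else "global"

# An application at S1 that touches only F1 never leaves its own site.
print(classify_application("S1", ["F1"]))
# The same application becomes global as soon as it also needs F2, stored at S2.
print(classify_application("S1", ["F1", "F2"]))
```

Under this sketch, the requirement that a DDBMS has at least one global application means that at least one application must need a fragment stored away from its home site.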

Banking Example
Using distributed database technology, a bank may implement its database system on a number of separate computer systems rather than on a single, centralized mainframe. The computer systems may be located at each local branch office: for example, Amritsar, Patiala, and Qadian. A network linking the computers enables the branches to communicate with each other, and the DDBMS enables them to access data stored at another branch office. Thus, a client living in Amritsar can also check his or her account while staying in Patiala or Qadian.

Distributed Relational Database Design
In this section we examine the factors that have to be considered in the design of a distributed relational database. More specifically, we examine:

• Fragmentation: A relation may be divided into a number of subrelations, called fragments, which are then distributed. There are two main types of fragmentation: (1) horizontal fragmentation and (2) vertical fragmentation.
• Allocation: Each fragment is stored at the site with ‘optimal’ distribution.
• Replication: The DDBMS may maintain a copy of a fragment at several different sites.

The definition and allocation of fragments must be based on how the database is to be used. This involves analyzing transactions. The design should be based on both quantitative and qualitative information. Quantitative information is used in allocation; qualitative information is used in fragmentation. The quantitative information may include:

• The frequency with which a transaction is run.
• The site from which a transaction is run.
• The performance criteria for transactions.

The qualitative information may include information about the transactions that are executed. The definition and allocation of fragments should aim at the following objectives:

• Locality of reference
• Improved reliability and availability
• Acceptable performance
• Balanced storage capacities and costs
• Minimal communication costs

Data Allocation
There are four alternative strategies regarding the placement of data:

• Centralized
• Fragmented
• Complete replication
• Selective replication

We now compare these strategies using the strategic objectives identified above.

Centralized
This strategy consists of a single database and DBMS stored at one site, with users distributed across the network (we referred to this previously as distributed processing). Locality of reference is at its lowest, as all sites except the central site have to use the network for all data accesses. This also means that communication costs are high. Reliability and availability are low, as a failure of the central site results in the loss of the entire database system.

Fragmented (or partitioned)
This strategy partitions the database into disjoint fragments, with each fragment assigned to one site. If data items are located at the site where they are used most frequently, locality of reference is high. As there is no replication, storage costs are low. Reliability and availability are also low, although they are higher than in the centralized case, because the failure of a site results in the loss of only that site’s data. Performance should be good and communication costs low if the distribution is designed properly.
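The effect of allocation on locality of reference can be illustrated with a small calculation. The sketch below is only one simple heuristic (place each fragment at the site that accesses it most often); the access log, the site names, and the frequencies are invented for illustration and do not come from the slides.

```python
from collections import Counter

# Hypothetical access log: (site issuing the transaction, fragment accessed, runs per day).
ACCESS_LOG = [
    ("S1", "EMP1", 90), ("S2", "EMP1", 10),
    ("S2", "EMP2", 70), ("S3", "EMP2", 30),
    ("S3", "EMP3", 50), ("S1", "EMP3", 50),
]

def allocate(access_log):
    """Assign each fragment to the site that accesses it most often."""
    totals = {}
    for site, fragment, freq in access_log:
        totals.setdefault(fragment, Counter())[site] += freq
    # Ties are broken alphabetically by site name so the result is deterministic.
    return {frag: min(c, key=lambda s: (-c[s], s)) for frag, c in totals.items()}

def locality_of_reference(access_log, allocation):
    """Fraction of all accesses that are served at the site that issued them."""
    total = sum(freq for _, _, freq in access_log)
    local = sum(freq for site, frag, freq in access_log if allocation[frag] == site)
    return local / total

fragmented = allocate(ACCESS_LOG)
centralized = {frag: "S1" for frag in fragmented}  # everything stored at one site

print(fragmented)
print(locality_of_reference(ACCESS_LOG, fragmented))
print(locality_of_reference(ACCESS_LOG, centralized))
```

With these made-up numbers, the fragmented allocation serves a larger share of accesses locally than storing everything at S1, which matches the qualitative comparison above.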

Advantages of fragmentation

• Usage
• Efficiency
• Parallelism
• Security

Disadvantages of fragmentation

• Performance
• Integrity

Data Fragmentation
If relation r is fragmented, r is divided into a number of fragments r1, r2, ..., rn. These fragments contain sufficient information to allow reconstruction of the original relation r. As we shall see, this reconstruction can take place through the application of either the union operation or a special type of join operation on the various fragments. There are three different schemes for fragmenting a relation:

• Horizontal fragmentation
• Vertical fragmentation
• Mixed fragmentation

We shall illustrate these approaches by fragmenting the relation EMP, with schema:

EMP (EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO)

Horizontal Fragmentation
In horizontal fragmentation, a relation (table) is divided horizontally. That is, some of the tuples of the relation are placed on one computer and the rest are placed on other computers. A horizontal fragment is a subset of the tuples in that relation. To reconstruct the relation R from its horizontal fragments, a UNION operation can be performed on the fragments. A set of horizontal fragments that together contain all the tuples of relation R is called a complete horizontal fragmentation.

For example, suppose that the relation r is the EMP relation above. This relation can be divided into n different fragments, each of which consists of the tuples of employees belonging to a particular department. Because the EMP relation has three departments (10, 20, and 30), this results in three fragments:

EMP1 = σ_{DEPTNO = 10}(EMP)
EMP2 = σ_{DEPTNO = 20}(EMP)
EMP3 = σ_{DEPTNO = 30}(EMP)

Fragment r1 (EMP1) is stored at the department 10 site, fragment r2 (EMP2) at the department 20 site, and fragment r3 (EMP3) at the department 30 site.

We obtain the reconstruction of the relation r by taking the union of all fragments; that is,

r = r1 ∪ r2 ∪ ... ∪ rn
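Horizontal fragmentation by DEPTNO and reconstruction by union can be sketched in a few lines. The tuples below are made up, only four of the EMP attributes are shown, and the helper names select and union are illustrative rather than part of any DBMS API.

```python
# Illustrative EMP tuples; the values are made up and only four attributes are shown.
EMP = [
    {"EMPNO": 1, "ENAME": "ASHA", "SAL": 3000, "DEPTNO": 10},
    {"EMPNO": 2, "ENAME": "BALJIT", "SAL": 2500, "DEPTNO": 20},
    {"EMPNO": 3, "ENAME": "CHARAN", "SAL": 1800, "DEPTNO": 30},
    {"EMPNO": 4, "ENAME": "DEEPA", "SAL": 2200, "DEPTNO": 10},
]

def select(relation, predicate):
    """Relational selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def union(*fragments):
    """Relational union: merge fragments, dropping duplicate tuples."""
    seen, result = set(), []
    for fragment in fragments:
        for t in fragment:
            key = tuple(sorted(t.items()))
            if key not in seen:
                seen.add(key)
                result.append(t)
    return result

# One horizontal fragment per department.
EMP1 = select(EMP, lambda t: t["DEPTNO"] == 10)
EMP2 = select(EMP, lambda t: t["DEPTNO"] == 20)
EMP3 = select(EMP, lambda t: t["DEPTNO"] == 30)

# The union of the fragments gives back the original relation (order aside).
reconstructed = union(EMP1, EMP2, EMP3)
same = sorted(reconstructed, key=lambda t: t["EMPNO"]) == sorted(EMP, key=lambda t: t["EMPNO"])
print(same)  # prints True
```

Because every employee belongs to exactly one of the three departments, the fragments are disjoint and together complete, so the union loses nothing.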

Vertical Fragmentation
In vertical fragmentation, some of the columns (attributes) are stored on one computer and the rest are stored on other computers. This is useful because each site may not need all the attributes of a relation. A vertical fragment keeps only certain attributes of the relation. The fragmentation should be done such that we can reconstruct relation r from the fragments by taking the natural join, which in practice means that every fragment keeps the primary key of r (EMPNO for the EMP relation):

r = r1 ⋈ r2 ⋈ ... ⋈ rn
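A minimal sketch of vertical fragmentation and its reconstruction by natural join follows. The tuples are made up, the split of attributes between EMP1 and EMP2 is an assumption chosen for illustration, and project and natural_join are illustrative helpers, not a library API.

```python
# Illustrative EMP tuples; the values are made up and only five attributes are shown.
EMP = [
    {"EMPNO": 1, "ENAME": "ASHA", "JOB": "CLERK", "SAL": 3000, "DEPTNO": 10},
    {"EMPNO": 2, "ENAME": "BALJIT", "JOB": "ANALYST", "SAL": 2500, "DEPTNO": 20},
    {"EMPNO": 3, "ENAME": "CHARAN", "JOB": "CLERK", "SAL": 1800, "DEPTNO": 10},
]

def project(relation, attrs):
    """Relational projection: keep only attrs and drop duplicate tuples."""
    seen, result = set(), []
    for t in relation:
        row = tuple((a, t[a]) for a in attrs)
        if row not in seen:
            seen.add(row)
            result.append(dict(row))
    return result

def natural_join(r1, r2):
    """Natural join: combine tuples that agree on all attributes the relations share."""
    if not r1 or not r2:
        return []
    common = [a for a in r1[0] if a in r2[0]]
    return [{**t1, **t2} for t1 in r1 for t2 in r2
            if all(t1[a] == t2[a] for a in common)]

# Both vertical fragments keep the key EMPNO, so the join is lossless.
EMP1 = project(EMP, ["EMPNO", "ENAME", "JOB"])
EMP2 = project(EMP, ["EMPNO", "SAL", "DEPTNO"])
reconstructed = natural_join(EMP1, EMP2)

# Without a shared key the join degenerates into a cross product with spurious tuples.
lossy = natural_join(project(EMP, ["ENAME"]), project(EMP, ["DEPTNO"]))
print(len(reconstructed), len(lossy))
```

The second join shows why the key matters: fragments with no attribute in common pair every name with every department number, so the original relation cannot be recovered.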

Mixed Fragmentation
Mixed fragmentation, also known as hybrid fragmentation, combines horizontal and vertical fragmentation. The relation r is divided into a number of fragment relations r1, r2, ..., rn. Each fragment is obtained by applying either the horizontal or the vertical fragmentation scheme to relation r, or to a fragment of r that was obtained previously. For example, combining the horizontal and vertical fragmentation of the EMP relation yields a mixed fragmentation. Suppose EMP is first divided into the vertical fragments EMP1 and EMP2. We can then further divide fragment EMP1 (assuming it retains the DEPTNO attribute), using the horizontal fragmentation scheme, into the following three fragments:

EMP1a = σ_{DEPTNO = 10}(EMP1)
EMP2a = σ_{DEPTNO = 20}(EMP1)
EMP3a = σ_{DEPTNO = 30}(EMP1)
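One possible reading of this example is sketched below: EMP is first split vertically into EMP1 and EMP2, and EMP1 is then split horizontally by DEPTNO. The tuples and the assignment of attributes to EMP1 and EMP2 are assumptions made for illustration; the slides do not say which attributes each vertical fragment holds.

```python
# Illustrative EMP tuples; the values are made up and only five attributes are shown.
EMP = [
    {"EMPNO": 1, "ENAME": "ASHA", "JOB": "CLERK", "SAL": 3000, "DEPTNO": 10},
    {"EMPNO": 2, "ENAME": "BALJIT", "JOB": "ANALYST", "SAL": 2500, "DEPTNO": 20},
    {"EMPNO": 3, "ENAME": "CHARAN", "JOB": "CLERK", "SAL": 1800, "DEPTNO": 30},
]

def project(relation, attrs):
    """Projection without duplicate removal; safe here because EMPNO is always kept."""
    return [{a: t[a] for a in attrs} for t in relation]

def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def join_on_key(r1, r2, key):
    """Natural join of two fragments whose only shared attribute is the key."""
    index = {t[key]: t for t in r2}
    return [{**t, **index[t[key]]} for t in r1 if t[key] in index]

# Step 1: vertical fragmentation (assumed attribute split, both halves keep EMPNO).
EMP1 = project(EMP, ["EMPNO", "ENAME", "DEPTNO"])
EMP2 = project(EMP, ["EMPNO", "JOB", "SAL"])

# Step 2: horizontal fragmentation of EMP1 by department.
EMP1a = select(EMP1, lambda t: t["DEPTNO"] == 10)
EMP2a = select(EMP1, lambda t: t["DEPTNO"] == 20)
EMP3a = select(EMP1, lambda t: t["DEPTNO"] == 30)

# Reconstruction reverses the steps: union the horizontal pieces, then join with EMP2.
rebuilt_EMP1 = EMP1a + EMP2a + EMP3a  # fragments are disjoint, so concatenation is a union
reconstructed = join_on_key(rebuilt_EMP1, EMP2, "EMPNO")
print(sorted(reconstructed, key=lambda t: t["EMPNO"]) == sorted(EMP, key=lambda t: t["EMPNO"]))
```

Reconstruction applies the inverse operations in the opposite order to the one in which the fragments were produced.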

Data Replication and Fragmentation
The techniques described for data replication and data fragmentation can be applied successively to the same relation. That is, a fragment can be replicated, replicas of fragments can be fragmented further, and so on. For example, consider a distributed system consisting of sites S1, S2, ..., S11. We can fragment EMP into EMP1a, EMP2a, and EMP2 and then, for example, store a copy of EMP1a at sites S1, S3, and S7; a copy of EMP2a at sites S4 and S11; and a copy of EMP2 at sites S2, S8, and S9.
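The placement in this example can be written down as a small replica catalog. The sketch below uses the fragment and site names from the text; the functions read_site and write_sites, and the policy of reading from the first available replica, are assumptions for illustration rather than how any particular DDBMS behaves.

```python
# Replica catalog taken from the example: fragment -> sites holding a copy.
REPLICAS = {
    "EMP1a": ["S1", "S3", "S7"],
    "EMP2a": ["S4", "S11"],
    "EMP2": ["S2", "S8", "S9"],
}

def read_site(fragment, failed=frozenset()):
    """Pick the first replica site that is still up, or None if all are down."""
    for site in REPLICAS[fragment]:
        if site not in failed:
            return site
    return None

def write_sites(fragment):
    """An update has to reach every replica of the fragment."""
    return list(REPLICAS[fragment])

print(read_site("EMP1a"))                  # S1 is up, so it serves the read
print(read_site("EMP1a", failed={"S1"}))   # falls back to the next replica
print(write_sites("EMP2"))                 # all three copies must be updated
```

The sketch shows the trade-off in miniature: extra copies keep a fragment readable when a site fails, but every update has to be propagated to all of them.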

Complete replication
This strategy consists of maintaining a complete copy of the database at each site. Therefore, locality of reference, reliability and availability, and performance are maximized. However, storage costs and communication costs for updates are the highest of all the strategies. To overcome some of these problems, snapshots are sometimes used. A snapshot is a copy of the data at a given time. The copies are updated periodically, for example, hourly or weekly, so they may not always be up to date. Snapshots are also sometimes used to implement views in a distributed database, to reduce the time it takes to perform a database operation on a view.

Selective replication
This strategy is a combination of fragmentation, replication, and centralization. Some data items are fragmented to achieve high locality of reference, and others, which are used at many sites and are not frequently updated, are replicated; otherwise, the data items are centralized. The objective of this strategy is to have all the advantages of the other approaches but none of the disadvantages. This is the most commonly used strategy because of its flexibility.
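The snapshot idea mentioned under complete replication can be sketched as a copy that changes only when it is explicitly refreshed. The Snapshot class and the account data below are hypothetical; in a real system the refresh would be scheduled (hourly, weekly, and so on) rather than called by hand.

```python
class Snapshot:
    """Read-only copy of source data that changes only when refresh() is called."""

    def __init__(self, source):
        self._source = source
        self.refresh()

    def refresh(self):
        # Copy the current state of the source; later source updates are not seen.
        self._copy = dict(self._source)

    def read(self, key):
        return self._copy.get(key)

# Hypothetical primary copy of some account balances.
primary = {"ACC-1": 500}
snapshot = Snapshot(primary)

primary["ACC-1"] = 450             # update applied at the primary site only
stale = snapshot.read("ACC-1")     # the snapshot still holds the old value
snapshot.refresh()                 # periodic refresh brings it up to date
fresh = snapshot.read("ACC-1")
print(stale, fresh)
```

Reads against the snapshot are cheap and local, at the price of possibly returning out-of-date values between refreshes.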