III. Current Trends, Part 1: Distributed DBMSs: Concepts and Design. Lecture 12 (2 hours).

Presentation transcript:

Slide 1/32
III. Current Trends
Part 1: Distributed DBMSs: Concepts and Design
Lecture 12 (2 hours)
Lecturer: Chris Clack
3C13/D6

Slide 2/32: Content
12.1 Objectives
12.2 Overview of Networking
12.3 Introduction to DDBMSs
- Concepts
- Advantages and Disadvantages
- Homogeneous and Heterogeneous
12.4 Functions and Architecture
- Functions of a DDBMS
- Reference Architecture for a DDBMS / Federated MDBS
12.5 Distributed Relational Database Design
- Data Allocation
- Fragmentation
12.6 Transparency in a DDBMS
- Distribution Transparency
- Transaction Transparency
- Performance Transparency
12.7 Date’s 12 Rules for DDBMSs
12.8 Summary

Slide 3/32: Objectives
In this lecture you will learn:
- Concepts.
- Advantages and disadvantages of distributed databases.
- Functions and architecture for a DDBMS.
- Distributed database design.
- Levels of transparency.
- Comparison criteria for DDBMSs.

Slide 4/32: Overview of Networking
- Network: an interconnected collection of autonomous computers, capable of exchanging information.
- Local Area Network (LAN): intended for connecting computers at the same site.
- Wide Area Network (WAN): used when computers or LANs need to be connected over long distances.
- WANs are relatively slow and less reliable than LANs.
- A DDBMS using a LAN provides much faster response time than one using a WAN.

Slide 6/32: Introduction
Concepts
Databases and networks:
1. A centralized DBMS could be physically processed by several computers distributed across a network.
2. There could be several separate DBMSs on several computers distributed across a network.
3. There may be a Distributed DBMS (DDBMS) made up of several DBMSs distributed across a network, each with local autonomy.
- Each participates in at least one global DBMS action.
- The DDBMS can therefore operate as a single global DBMS.

Slide 7/32: Introduction
Concepts
A DDBMS avoids the ‘islands of information’ problem.
- A “Distributed Database” is a logically interrelated collection of shared data (and a description of this data), physically distributed over a computer network.
- A “Distributed DBMS” (DDBMS) is a software system that permits the management of the distributed database and makes the distribution transparent to users.
- Fundamental principle: make distribution transparent to the user. The fact that fragments are stored on different computers is hidden from the users.

Slide 8/32: Introduction
Concepts
A DDBMS has the following characteristics:
- Collection of logically related shared data.
- Data split into fragments.
- Fragments may be replicated.
- Fragments/replicas allocated to sites.
- Sites linked by a communication network.
- Data at each site is under the control of a DBMS.
- DBMSs handle local applications autonomously.
- Each DBMS participates in at least one global application.

Slide 9/32: Introduction
Important difference between DDBMS and distributed processing !
[Figure: a DDBMS compared with distributed processing of a centralised DBMS]

Slide 10/32: Introduction
Distributed processing of a centralised DBMS has the following characteristics:
- Much more tightly coupled than a DDBMS.
- Database design is the same as for a standard DBMS.
- No attempt to reflect organisational structure.
- Much simpler than a DDBMS.
- More secure than a DDBMS.
- No local autonomy.

Slide 11/32: Introduction
Important difference between DDBMS and parallel database
[Figure: a DDBMS compared with parallel database architectures]
Parallel database architectures: shared
a) memory
b) disk
c) nothing

Slide 12/32: Introduction
Why use a DDBMS? (!)
Advantages:
- Reflects organizational structure
- Improved shareability and local autonomy
- Improved availability
- Improved reliability
- Improved performance
- Economics
- Modular growth
Disadvantages:
- Complexity
- Cost
- Security
- Integrity control more difficult
- Lack of standards
- Lack of experience
- Database design more complex

Slide 13/32: Introduction
Homogeneous & Heterogeneous DDBMSs
Homogeneous:
- All sites use the same DBMS product.
- Much easier to design and manage.
- Approach provides incremental growth.
- Allows increased performance.
Heterogeneous:
- Sites may run different DBMS products, with possibly different underlying data models.
- Arises when sites have implemented their own databases and integration is considered later.
- Translations are required to allow for:
  - different hardware
  - different DBMS products
  - different hardware and DBMS products
- Typical solution is to use gateways.

Slide 14/32: Introduction
Open Database access and interoperability
“The Open Group” formed a Specification Working Group (SWG) to provide specifications that create a database infrastructure environment where there is:
- Common SQL API: allows client applications to be written that do not need to know the vendor of the DBMS they are accessing.
- Common database protocol: enables a DBMS from one vendor to communicate directly with a DBMS from another vendor without the need for a gateway.
- Common network protocol: allows communications between different DBMSs.

Slide 15/32: Introduction
Multidatabase system (MDBS) !
MDBS: a DDBMS where each site maintains complete autonomy.
- Resides transparently on top of existing database and file systems and presents a single database to its users.
- Allows users to access and share data without requiring physical database integration.
Two types:
- Federated MDBS: looks like a DDBMS for global users and a centralized DBMS for local users.
- Unfederated MDBS: has no “local” users.

Slide 16/32: Functions and Architecture of a DDBMS

Slide 17/32: Functions and Architecture of a DDBMS
Functions of a DDBMS
Expect a DDBMS to have at least the functionality of a DBMS, and also the following:
- Extended communication services.
- Extended Data Dictionary.
- Distributed query processing.
- Extended concurrency control.
- Extended recovery services.

Slide 18/32: Functions and Architecture of a DDBMS
DDBMS Reference Architecture
A reference architecture consists of:
- Set of global external schemas.
- Global conceptual schema (GCS).
- Fragmentation schema and allocation schema (see later).
- Set of schemas for each local DBMS, conforming to the 3-level ANSI/SPARC architecture.
Comparison with federated MDBS (FMDBS):
- In a DDBMS: the GCS is the union of all local conceptual schemas.
- In an FMDBS: the GCS is a subset of the local conceptual schemas (LCSs), consisting of the data that each local system agrees to share.
- The GCS of a tightly coupled system involves integration of either parts of the LCSs or the local external schemas.
- An FMDBS with no GCS is called loosely coupled.

Slide 19/32: Distributed Relational Database Design

Slide 20/32: Distributed Relational Database Design
Data Allocation !
Four alternative strategies regarding placement of data:
- Centralized: single database and DBMS stored at one site, with users distributed across the network.
- Partitioned: database partitioned into disjoint fragments, each fragment assigned to one site.
- Complete replication: a complete copy of the database is maintained at each site.
- Selective replication: combination of partitioning, replication, and centralization.
[Table: comparison of strategies]
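The four strategies can be told apart by looking at which sites hold each fragment. The following Python sketch is an illustration only and not part of the original slides: the function name `classify_allocation`, the fragment names, and the site names are all hypothetical, and "centralized" is modelled simply as every fragment living at the same single site.

```python
def classify_allocation(allocation, all_sites):
    """Name the allocation strategy for a mapping fragment -> set of sites.

    allocation -- dict: fragment name -> set of sites holding a copy
    all_sites  -- every site in the network
    """
    sites_used = set().union(*allocation.values())
    if len(sites_used) == 1:
        # Everything is stored at one site.
        return "centralized"
    if all(sites == set(all_sites) for sites in allocation.values()):
        # Every site holds a copy of every fragment.
        return "complete replication"
    if all(len(sites) == 1 for sites in allocation.values()):
        # Disjoint fragments, each stored at exactly one site.
        return "partitioned"
    # Some fragments replicated, some not.
    return "selective replication"


sites = ["London", "Glasgow", "Bristol"]  # hypothetical sites
partitioned = {"F1": {"London"}, "F2": {"Glasgow"}, "F3": {"Bristol"}}
selective = {"F1": {"London", "Glasgow"}, "F2": {"Glasgow"}, "F3": {"Bristol"}}

print(classify_allocation(partitioned, sites))
print(classify_allocation(selective, sites))
```

A real allocation schema would also record access frequencies and update costs, which is what the comparison of strategies weighs up.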

Slide 22/32: Distributed Relational Database Design
Fragmentation
Why fragment?
- Usage: applications work with views rather than entire relations.
- Efficiency: data is stored close to where it is most frequently used, and data not needed by local applications is not stored.
- Security: data not needed by local applications is not stored, and so is not available to unauthorized users.
- Parallelism: with fragments as the unit of distribution, a transaction can be divided into several subqueries that operate on fragments.
Disadvantages: performance and integrity.

Slide 23/32: Distributed Relational Database Design
Fragmentation !
Three correctness rules for fragmentation:
1. Completeness: if relation R is decomposed into fragments R1, R2, ..., Rn, each data item that can be found in R must appear in at least one fragment.
2. Reconstruction: it must be possible to define a relational operation that will reconstruct R from the fragments.
- For horizontal fragmentation: Union operation.
- For vertical fragmentation: Join operation.
3. Disjointness: if data item di appears in fragment Ri, then it should not appear in any other fragment.
- Exception: vertical fragmentation, where primary key attributes must be repeated in each fragment to allow reconstruction.
- For horizontal fragmentation, a data item is a tuple.
- For vertical fragmentation, a data item is an attribute.
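For horizontal fragments, the three rules can be checked mechanically. The Python sketch below is an illustration and not part of the original slides: the function name `check_horizontal_fragmentation` and the sample `staff` tuples are hypothetical, and a relation is modelled as a set of tuples.

```python
from itertools import combinations


def check_horizontal_fragmentation(relation, fragments):
    """Check the three correctness rules for horizontal fragments.

    relation  -- set of tuples (the original relation R)
    fragments -- list of sets of tuples (R1 ... Rn)
    """
    union = set().union(*fragments)
    return {
        # Completeness: every tuple of R appears in at least one fragment.
        "completeness": relation <= union,
        # Reconstruction: the Union of the fragments gives back exactly R.
        "reconstruction": union == relation,
        # Disjointness: no tuple appears in more than one fragment.
        "disjointness": all(a.isdisjoint(b) for a, b in combinations(fragments, 2)),
    }


# Hypothetical relation: (staff number, city)
staff = {("S1", "London"), ("S2", "Glasgow"), ("S3", "London")}
london = {t for t in staff if t[1] == "London"}
elsewhere = {t for t in staff if t[1] != "London"}

result = check_horizontal_fragmentation(staff, [london, elsewhere])
print(result)
```

Passing only `[london]` would fail completeness, and passing `[staff, london]` would fail disjointness.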

Slide 24/32: Distributed Relational Database Design
Fragmentation !
Four types of fragmentation:
1. Horizontal: consists of a subset of the tuples of a relation.
- Defined using the Selection operation.
- Determined by looking at the predicates used by transactions.
- Involves finding a set of minimal (complete and relevant) predicates.
- A set of predicates is complete if and only if any two tuples in the same fragment are referenced with the same probability by any application.
- A predicate is relevant if there is at least one application that accesses the resulting fragments differently.
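As a rough illustration of horizontal fragmentation by Selection, the Python sketch below applies one predicate per fragment. It is not taken from the slides: `horizontal_fragments`, the `property_for_rent` rows, and the two predicates are hypothetical, and the predicates are simply assumed to be the ones that the applications' transactions use.

```python
def horizontal_fragments(rows, predicates):
    """Apply one Selection per predicate; returns {fragment name: rows}."""
    return {
        name: [row for row in rows if predicate(row)]
        for name, predicate in predicates.items()
    }


# Hypothetical relation, one dict per tuple.
property_for_rent = [
    {"propertyNo": "P1", "type": "House", "rent": 650},
    {"propertyNo": "P2", "type": "Flat", "rent": 400},
    {"propertyNo": "P3", "type": "Flat", "rent": 350},
]

# Assumed predicates: one application works on houses, another on flats.
predicates = {
    "houses": lambda row: row["type"] == "House",
    "flats": lambda row: row["type"] == "Flat",
}

fragments = horizontal_fragments(property_for_rent, predicates)
for name, rows in fragments.items():
    print(name, [row["propertyNo"] for row in rows])
```

Because every row here satisfies exactly one predicate, the fragments are complete and disjoint. Overlapping or non-exhaustive predicates would break those rules.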

Slide 25/32: Distributed Relational Database Design
Fragmentation !
Four types of fragmentation (continued):
2. Vertical: consists of a subset of the attributes of a relation.
- Defined using the Projection operation.
- Determined by establishing the affinity of one attribute to another.
3. Mixed: a horizontal fragment that is vertically fragmented, or a vertical fragment that is horizontally fragmented.
- Defined using the Selection and Projection operations.
4. Derived: a horizontal fragment that is based on the horizontal fragmentation of a parent relation.
- Ensures that fragments frequently joined together are at the same site.
- Defined using the Semijoin operation.
Other possibility is no fragmentation:
- If a relation is small and not updated frequently, it may be better not to fragment it.
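The Projection, Join, and Semijoin operations named above can be sketched in a few lines of Python. This is an illustration only, not the lecture's own material: the helper names `project`, `join_on`, and `semijoin` and the sample `staff` and `branch` rows are hypothetical, and each vertical fragment is assumed to keep the primary key `staffNo` so that the Join can reconstruct the relation.

```python
def project(rows, attrs):
    """Projection: keep only the named attributes of each row."""
    return [{a: row[a] for a in attrs} for row in rows]


def join_on(left, right, key):
    """Join two fragments on a shared key attribute."""
    index = {row[key]: row for row in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]


def semijoin(child, parent_fragment, key):
    """Semijoin: child rows whose key value occurs in the parent fragment."""
    keys = {row[key] for row in parent_fragment}
    return [row for row in child if row[key] in keys]


# Hypothetical relations.
staff = [
    {"staffNo": "SL21", "name": "John", "branchNo": "B5", "salary": 30000},
    {"staffNo": "SG37", "name": "Ann", "branchNo": "B3", "salary": 12000},
    {"staffNo": "SG14", "name": "David", "branchNo": "B3", "salary": 18000},
]
branch = [{"branchNo": "B3", "city": "Glasgow"}, {"branchNo": "B5", "city": "London"}]

# Vertical fragmentation: the primary key is repeated in both fragments.
v1 = project(staff, ["staffNo", "name", "branchNo"])
v2 = project(staff, ["staffNo", "salary"])
rebuilt = join_on(v1, v2, "staffNo")

# Derived fragmentation: staff follows the horizontal fragment of branch.
branch_b3 = [b for b in branch if b["branchNo"] == "B3"]
staff_b3 = semijoin(staff, branch_b3, "branchNo")

print(rebuilt == staff)
print([s["staffNo"] for s in staff_b3])
```

Placing `staff_b3` at the same site as `branch_b3` is what keeps frequently joined fragments together.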

Slide 26/32: Distributed Relational Database Design
Transparency in a DDBMS
Transparency hides implementation details from users.
Overall objective: to the user, a DDBMS should be equivalent to a centralised DBMS.
- Full transparency is not a universally accepted objective.
Four main types:
1. Distribution transparency
2. Transaction transparency
3. Performance transparency
4. DBMS transparency (only applicable to heterogeneous DDBMSs)

Slide 27/32: Distributed Relational Database Design
1. Distribution Transparency
Distribution transparency allows the user to perceive the database as a single, logical entity.
If the DDBMS exhibits distribution transparency, the user does not need to know:
- Fragmentation transparency: that the data is fragmented.
- Location transparency: the location of data items.
- Otherwise (the user must specify both fragments and their locations), this is called local mapping transparency.
- Replication transparency: the user is unaware of replication of fragments.
Naming transparency: each item in a DDB must have a unique name.
- One solution: create a central name server. Drawbacks:
  - loss of some local autonomy
  - central site may become a bottleneck
  - low availability if the central site fails
- Alternative solution: prefix each object with the identifier of the site that created it, together with identifiers for each fragment and each of its copies. Each site then uses aliases for these names.
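The alternative naming scheme can be sketched as follows. This Python example is an assumption-laden illustration, not something the slides specify: the dotted name layout, the `F`/`C` prefixes, and the `global_name` and `AliasTable` names are all hypothetical.

```python
def global_name(site, obj, fragment, copy):
    """Build a globally unique name from creator site, object, fragment, copy.

    The dotted layout is a hypothetical format chosen for this sketch.
    """
    return f"{site}.{obj}.F{fragment}.C{copy}"


class AliasTable:
    """Per-site table mapping short local aliases to full global names."""

    def __init__(self):
        self._aliases = {}

    def define(self, alias, full_name):
        self._aliases[alias] = full_name

    def resolve(self, alias):
        # Raises KeyError if the alias is unknown at this site.
        return self._aliases[alias]


full = global_name("S1", "Branch", 3, 2)  # copy 2 of fragment 3, created at S1

local_aliases = AliasTable()
local_aliases.define("LocalBranch", full)

print(full)
print(local_aliases.resolve("LocalBranch"))
```

No central name server is consulted: uniqueness comes from the creator-site prefix, and users at a site only ever see the alias.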

Slide 28/32: Distributed Relational Database Design
2. Transaction Transparency
Transaction transparency ensures that all distributed transactions maintain the distributed database’s integrity and consistency.
- A distributed transaction accesses data stored at more than one location.
- Each transaction is divided into a number of subtransactions, one for each site that has to be accessed.
- The DDBMS must ensure the indivisibility of both the global transaction and each of the subtransactions.

Slide 29/32: Distributed Relational Database Design
2. Transaction Transparency
Concurrency transparency:
- All transactions must execute independently and be logically consistent with the results obtained if the transactions were executed one at a time, in some arbitrary serial order.
- Replication makes concurrency more complex.
Failure transparency:
- Must ensure atomicity and durability of the global transaction.
- Means ensuring that the subtransactions of the global transaction either all commit or all abort.
Classification of transactions: in IBM’s Distributed Relational Database Architecture (DRDA), there are four types of transaction:
- Remote request
- Remote unit of work
- Distributed unit of work
- Distributed request
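The "all commit or all abort" requirement can be sketched as a simple vote. The slides do not name a protocol at this point, so the Python example below is an assumption: a coordinator collects one ready/not-ready vote per site, in the spirit of a two-phase commit, and it ignores logging, timeouts, and coordinator failure. The name `run_global_transaction` and the site names are hypothetical.

```python
def run_global_transaction(subtransactions):
    """Decide commit or abort for every subtransaction of a global transaction.

    subtransactions -- dict: site -> callable returning True if that site's
                       subtransaction is ready to commit
    Returns a dict: site -> "commit" or "abort" (the same decision everywhere).
    """
    votes = {}
    for site, prepare in subtransactions.items():
        try:
            votes[site] = bool(prepare())
        except Exception:
            # A failing site counts as a vote to abort.
            votes[site] = False
    decision = "commit" if all(votes.values()) else "abort"
    return {site: decision for site in subtransactions}


def failing_site():
    raise RuntimeError("site unavailable")


all_ready = run_global_transaction({"London": lambda: True, "Glasgow": lambda: True})
one_fails = run_global_transaction({"London": lambda: True, "Glasgow": failing_site})

print(all_ready)
print(one_fails)
```

The point of the sketch is only that no site may commit on its own: a single negative vote or failure forces every subtransaction to abort.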

Slide 30/32: Distributed Relational Database Design
3. Performance Transparency
The DDBMS must:
- suffer no performance degradation due to the distributed architecture
- determine the most cost-effective strategy to execute a request
The Distributed Query Processor (DQP) maps a data request into an ordered sequence of operations on the local databases.
- Must consider the fragmentation, replication, and allocation schemas.
The DQP has to decide:
1. which fragment to access
2. which copy of a fragment to use
3. which location to use
- It produces an execution strategy optimized with respect to some cost function.
Typically, the costs associated with a distributed request include I/O cost, CPU cost, and communication cost.
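The DQP's choice of copy can be illustrated with a toy cost model. The Python sketch below is an assumption, not the lecture's algorithm: it takes the cost function to be the plain sum of I/O, CPU, and communication cost, and the names `Copy` and `choose_copies` and all the numbers are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One stored copy of a fragment at a given site, with estimated costs."""
    fragment: str
    site: str
    io_cost: float
    cpu_cost: float
    comm_cost: float

    def total_cost(self):
        # Assumed cost function: simple sum of the three components.
        return self.io_cost + self.cpu_cost + self.comm_cost


def choose_copies(needed_fragments, copies):
    """For each needed fragment, pick the copy with the lowest total cost."""
    plan = {}
    for fragment in needed_fragments:
        candidates = [c for c in copies if c.fragment == fragment]
        if not candidates:
            raise LookupError(f"no copy of fragment {fragment!r} is allocated")
        plan[fragment] = min(candidates, key=Copy.total_cost)
    return plan


# Hypothetical allocation: F1 is replicated, F2 is stored once.
copies = [
    Copy("F1", "London", io_cost=4.0, cpu_cost=1.0, comm_cost=0.0),
    Copy("F1", "Glasgow", io_cost=2.0, cpu_cost=1.0, comm_cost=6.0),
    Copy("F2", "Glasgow", io_cost=3.0, cpu_cost=1.0, comm_cost=6.0),
]

plan = choose_copies(["F1", "F2"], copies)
for fragment, copy in plan.items():
    print(fragment, copy.site, copy.total_cost())
```

Here the local London copy of F1 wins despite higher I/O cost, because the remote copy's communication cost dominates. This is typical when sites are linked by a WAN.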

Slide 31/32: Date’s 12 Rules for DDBMS
Fundamental principle: to the user, a distributed system should look exactly like a nondistributed system.
1. Local Autonomy
2. No Reliance on a Central Site
3. Continuous Operation
4. Location Independence
5. Fragmentation Independence
6. Replication Independence
7. Distributed Query Processing
8. Distributed Transaction Processing
Ideals:
9. Hardware Independence
10. Operating System Independence
11. Network Independence
12. Database Independence

Slide 32/32: Summary
12.1 Objectives
12.2 Overview of Networking
12.3 Introduction to DDBMSs
- Concepts
- Advantages and Disadvantages
- Homogeneous and Heterogeneous
12.4 Functions and Architecture
- Functions of a DDBMS
- Reference Architecture for a DDBMS / Federated MDBS
12.5 Distributed Relational Database Design
- Data Allocation
- Fragmentation
12.6 Transparency in a DDBMS
- Distribution Transparency
- Transaction Transparency
- Performance Transparency
12.7 Date’s 12 Rules for DDBMSs
NEXT LECTURE: III. Current Trends, Part 2: Distributed DBMSs: Advanced Concepts
- advanced concepts
- protocols for distributed deadlock control
- X/Open Distributed Transaction Processing Model
- Oracle