Distributed Databases

Slides:



Advertisements
Similar presentations
Database Systems: Design, Implementation, and Management
Advertisements

V. Megalooikonomou Distributed Databases (based on notes by Silberchatz,Korth, and Sudarshan and notes by C. Faloutsos at CMU) Temple University – CIS.
ISOM Distributed Databases Arijit Sengupta. ISOM Learning Objectives Understand the concept and necessity of distributed databases Understand the types.
Enterprise Systems Distributed databases and systems - DT
Distributed Databases John Ortiz. Lecture 24Distributed Databases2  Distributed Database (DDB) is a collection of interrelated databases interconnected.
Distributed databases
Distributed Database Systems Dr. Mohamed Osman Hegazi.
Transaction.
MIS 385/MBA 664 Systems Implementation with DBMS/ Database Management Dave Salisbury ( )
Chapter 13 (Web): Distributed Databases
Advanced Database Systems September 2013 Dr. Fatemeh Ahmadi-Abkenari 1.
1 Minggu 12, Pertemuan 23 Introduction to Distributed DBMS (Chapter , 22.6, 3rd ed.) Matakuliah: T0206-Sistem Basisdata Tahun: 2005 Versi: 1.0/0.0.
Distributed Databases Logical next step in geographically dispersed organisations goal is to provide location transparency starting point = a set of decentralised.
Chapter 25 Distributed Databases and Client-Server Architectures Copyright © 2004 Pearson Education, Inc.
ABCSG - Distributed Database 1 Data Management Distributed Database Data Replication.
1 Distributed Databases Chapter Two Types of Applications that Access Distributed Databases The application accesses data at the level of SQL statements.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
DISTRIBUTED DATABASE. Centralized & Distributed Database  Single site database – centralized database –A database is located at a single site or distributed.
Chapter 9 : Distributed Database.
Overview Distributed vs. decentralized Why distributed databases
1 © Prentice Hall, 2002 Chapter 13: Distributed Databases Modern Database Management 6 th Edition Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden.
1 Distributed Databases Chapter What is a Distributed Database? Database whose relations reside on different sites Database some of whose relations.
Chapter 12 Distributed Database Management Systems
©Silberschatz, Korth and Sudarshan18.1Database System Concepts Centralized Systems Run on a single computer system and do not interact with other computer.
Definition of terms Definition of terms Explain business conditions driving distributed databases Explain business conditions driving distributed databases.
Distributed databases
Distributed Databases
Distributed Databases and DBMSs: Concepts and Design
1 Distributed and Parallel Databases. 2 Distributed Databases Distributed Systems goal: –to offer local DB autonomy at geographically distributed locations.
IMS 4212: Distributed Databases 1 Dr. Lawrence West, Management Dept., University of Central Florida Distributed Databases Business needs.
Database Design – Lecture 16
III. Current Trends: 1 - Distributed DBMSsSlide 1/32 III. Current Trends Part 1: Distributed DBMSs: Concepts and Design Lecture 12 (2 hours) Lecturer:
DISTRIBUTED DATABASES IN ADBMS Shilpa Seth
DISTRIBUTED DATABASE SYSTEM.  A distributed database system consists of loosely coupled sites that share no physical component  Database systems that.
Session-9 Data Management for Decision Support
Massively Distributed Database Systems - Distributed DBS Spring 2014 Ki-Joune Li Pusan National University.
Lecture 5: Sun: 1/5/ Distributed Algorithms - Distributed Databases Lecturer/ Kawther Abas CS- 492 : Distributed system &
Session-8 Data Management for Decision Support
Database Systems: Design, Implementation, and Management Tenth Edition Chapter 12 Distributed Database Management Systems.
Database Systems: Design, Implementation, and Management Ninth Edition Chapter 12 Distributed Database Management Systems.
Distributed Database Systems Overview
The Evolution of Distributed DBMS 4Social and Technical Changes in the 1980’s u Business operations became more decentralized geographically. u Competition.
DDBMS Distributed Database Management Systems Fragmentation
Kjell Orsborn UU - DIS - UDBL DATABASE SYSTEMS - 10p Course No. 2AD235 Spring 2002 A second course on development of database systems Kjell.
Distributed Databases
ASMA AHMAD 28 TH APRIL, 2011 Database Systems Distributed Databases I.
1 Distributed Databases BUAD/American University Distributed Databases.
Distributed Database. Introduction A major motivation behind the development of database systems is the desire to integrate the operational data of an.
PMIT-6101 Advanced Database Systems By- Jesmin Akhter Assistant Professor, IIT, Jahangirnagar University.
CS338Parallel and Distributed Databases11-1 Parallel and Distributed Databases Lecture Topics Multi-CPU and distributed systems Monolithic system Client–server.
Topic Distributed DBMS Database Management Systems Fall 2012 Presented by: Osama Ben Omran.
MBA 664 Database Management Systems Dave Salisbury ( )
Chapter 12 Distributed Data Bases. Learning Objectives What a distributed database management system (DDBMS) is and what its components are How database.
Introduction to Distributed Databases Yiwei Wu. Introduction A distributed database is a database in which portions of the database are stored on multiple.
Distributed Database Management Systems. Reading Textbook: Ch. 1, Ch. 3 Textbook: Ch. 1, Ch. 3 For next class: Ch. 4 For next class: Ch. 4 FarkasCSCE.
 Distributed Database Concepts  Parallel Vs Distributed Technology  Advantages  Additional Functions  Distribution Database Design  Data Fragmentation.
1 Lecture 8 Distributed Data Bases: Replication and Fragmentation.
1 Chapter 22 Distributed DBMS Concepts and Design CS 157B Edward Chen.
Chapter 24 Distributed DBMSs – Concepts and Design Pearson Education © 2014.
Distributed Database Design Bayu Adhi Tama, MTI Fasilkom-Unsri Adapted from Connolly, et al., Database Systems 4 th Edition, Pearson Education Limited,
Distributed DBMSs – Concepts and Design Chapter 24 in Textbook.
CMS Advanced Database and Client-Server Applications Distributed Databases slides by Martin Beer and Paul Crowther Connolly and Begg Chapter 22.
Distributed Databases
1 Chapter 22 Distributed DBMSs - Concepts and Design Simplified Transparencies © Pearson Education Limited 1995, 2005.
LM 9. Distributed Database Dr. Lei Li 1. Note: The content of the slides including figures are mainly based on a publicly available textbook chapter:
Chapter 19: Distributed Databases
G063 - Distributed Databases
Distributed Databases and DBMSs: Concepts and Design
Distributed Databases
Presentation transcript:

Distributed Databases Benefits and issues to be considered

What is a Distributed Database? The DB is stored on several computers, interconnected through network. Each site can process local transactions involving data only on that site. Some sites may get involved in global transactions, which access data from several sites.

Examples University of the West Indies Multinational Bursary - Financial information Personnel - Staff information Registry - Student information Multinational HQ in Kingston Manufacturing in Trinidad Warehouse in Miami European HQ in London Each site keeps data on local employees, as well as data relevant to the operation of the site.

Distributed Database Architecture ES1 ES2 ES3 GCS LCS1 LCS2 LCS3 LIS1 LIS2 LIS3

Data Dictionaries A data dictionary in a non-distributed system contains so-called meta-information, information about the data. Examples: structure of tables, data types of attributes. In distributed DBMS’s, data dictionary must also say where the fragments can be found.

Why Distributed Databases I More natural for representing many real world organizations. Local autonomy Local organization is fully responsible for accuracy and safety of its own portion of DB. Improved performance Speed up of query processing Possibility of parallel processing Local processing If data must be accessed often at a site, store it locally.

Why Distributed Databases II Improved availability/reliability DB can continue to function even if some subsystems are down if we replicate data or hardware. Security Avoid destruction of DB by replicating vital data. Incremental growth Increasing size of DB Increasing operations Example: Include R&D facility.

What do we want from a distributed database? No reliance on central master site. Would be a bottleneck. If down, whole system is down. Continuous Operation Enforces reliability and availability Distributed Query Processing A query may require accessing data at a no. of sites. Same data may be available at a no. of sites. Distributed Transaction Management Same copy of a data item may be a a number of sites. Transparency Local Autonomy

Transparency Allow users to use the system as if it is not distributed. Replication transparency Fragmentation transparency Hardware and OS transparency Language and DBMS transparency Applies mostly to multidatabase systems.

Autonomy All operations at a site are controlled by that site. Types of autonomy Design Individual DBs can use data models and transaction management techniques that they prefer. Communication Individual DBs can decide which information they want to make accessible to other sites Execution Individual DBs can decide how to execute transactions submitted to them.

Issues and Problems Distributed database design How should DB and applications be placed across sites. How should data dictionary be placed across sites. Replication and partitioning Distributed query processing How to break down a query into series of data manipulation operations. Reliability If site becomes inaccessible, how do you ensure that DBs at other sites remain consistent and up-to-date?

Replication and Fragmentation Should we maintain several identical copies of (part of) the DB, with each replica stored at a different site? Fragmentation Should we partition a table into separate parts, each stored at a different site? If we do, how should we partition the table?

Replication Advantages: Disadvantages Increased availability If one site goes down, the data may be available from elsewhere. Increased parallelism We may send parts of the same query to different sites. Disadvantages Increased storage space Increased overhead on update. Update needs to be copied to all sites containing the relevant replica.

Fragmentation Why fragment? Why not simply store different complete tables at different sites? Applications usually access only a subset of a relation. Can keep tuples at the site most frequently used. In fragmentation, make sure that we do not lose information.

Types of Fragmentation Horizontal fragmentation Assign different tuples to each fragment (e.g., through a selection in the sense of relational algebra). Vertical fragmentation Assign different attributes to each fragment. Must ensure re-constructability. Mixed Fragmentation Mixture of the two.

Problems in Fragmentation Increased response time if an application needs to access more than one fragment. Especially in vertical fragmentation, ensuring data integrity may become more difficult. Allocation: where to place the various fragments, and whether to replicate it.

Allocation Allocation concerns where to store each fragment and whether to replicate it. Possibilities: Partitioned DB Fragmentation, no replication Partially replicated DB Fragmentation, with each fragment stored at more than one site. Fully replicated DB DB is replicated in full at each site.

Evaluation of Partitioned DB Query processing is moderately difficult, and can be time consuming. Updating of DB is easy. Directory management is moderately difficult Reliability is very low. Requires least amount of space.

Evaluation of Partially Replicated DBs Query processing is moderately difficult, and can be time consuming. Updating of DB is moderately difficult. Directory management is moderately difficult. Reliability is high. Moderate amount of space.

Evaluation of Fully Replicated DBs Query processing is easy, and can be done quickly. Updating of DB is easy but time-consuming. Directory management is easy but time consuming. Reliability is very high. Requires a lot of space.

Query Optimization Each DBMS has a query processor. The query processor takes a high level query (e.g. SQL) and translates it into a set of relational algebraic expressions. Since this can be done in a number of different ways, query processor must choose the best one. This is called query optimization.

Example of query optimization Consider tables: Empl(Eno, Ename, Title) Job(Eno, JobNo, Resp, Dur) Query: SELECT Ename FROM Empl, Job WHERE Empl.Eno = Job.Eno AND Resp = ‘Manager’; Two strategies pename(sResp = ‘Manager’ E J) pename(E s Resp = ‘Manager’(J))

Distributed Query Processing/Optimization Query processing/optimization is more difficult in distributed DBMS Require both global optimization and local optimization A query may require data from more than one site, and communication has a cost. It may be possible to perform some sub-queries in parallel. Cost no longer dependent on only number of tuples accessed.

Example Site 1: E1 = sENO  ‘E3’(E) Site 2: E2 = sENO > ‘E3’(E) Site 3: J1 = sENO  ‘E3’(J) Site 4: J2 = sENO > ‘E3’(J) Results are expected at site 5.

Different Possibilities Strategy 1: Send everything to Site 5 and perform the original query there. Strategy 2: Do selections at sites 3 and 4 Send results to sites 1 and 2 Perform join and projections Send result to site 5. Strategy 2 might seem better but if communication to site 5 is cheap/fast, then 1 may be better.