Introduction to Distributed Databases Yiwei Wu. Introduction A distributed database is a database in which portions of the database are stored on multiple.


Introduction to Distributed Databases Yiwei Wu

Introduction
A distributed database is a database in which portions of the database are stored on multiple computers within a network.
[Figure: centralized DB vs. distributed DB]

Introduction – Cont.
Advantages:
- Reflects organizational structure
- Local autonomy
- Improved availability
- Improved performance
- Economics
- Modularity
Disadvantages:
- Complexity
- Economics
- Security
- Difficult to maintain integrity
- Inexperience

Types of DDBS
- Homogeneous: uses one DBMS for all the servers in the system (e.g., Oracle or MS SQL Server).
- Heterogeneous: uses two or more different DBMSs for different database servers (e.g., Oracle, MS SQL Server, and PostgreSQL).

Data Fragmentation
- Horizontal fragments: subsets of tuples (rows) from a relation (table).
- Vertical fragments: subsets of attributes (columns) from a relation (table).
- Mixed fragment: a fragment that is both horizontally and vertically fragmented.
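The three kinds of fragment can be sketched in a few lines of Python. This is only an illustration: the rows are invented, the helper names are hypothetical, and keeping the key column in every vertical fragment is an assumption added here so that rows could be rebuilt by joining fragments.

```python
# Hypothetical Employee rows; the values are made up for illustration.
employee = [
    {"ssn": "111", "fname": "Ann", "lname": "Lee", "dno": 1},
    {"ssn": "222", "fname": "Bob", "lname": "Kim", "dno": 2},
    {"ssn": "333", "fname": "Cy", "lname": "Roy", "dno": 1},
]


def horizontal_fragment(relation, predicate):
    """Keep whole rows that satisfy the predicate (a selection)."""
    return [row for row in relation if predicate(row)]


def vertical_fragment(relation, attributes, key="ssn"):
    """Keep a subset of columns (a projection).

    Assumption: the key column is always retained so the original
    rows could be reconstructed by joining vertical fragments.
    """
    columns = [key] + [a for a in attributes if a != key]
    return [{c: row[c] for c in columns} for row in relation]


def mixed_fragment(relation, predicate, attributes, key="ssn"):
    """Apply a horizontal cut first, then a vertical cut."""
    rows = horizontal_fragment(relation, predicate)
    return vertical_fragment(rows, attributes, key)


dept1 = horizontal_fragment(employee, lambda r: r["dno"] == 1)
names = vertical_fragment(employee, ["fname", "lname"])
dept1_names = mixed_fragment(employee, lambda r: r["dno"] == 1, ["fname", "lname"])
```

Each fragment could then be placed at a different site, for example dept1 at the site that serves department 1.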

Replication
- Full replication: the whole database is replicated at every site in the distributed system.
- No replication: each fragment is stored at exactly one site.
- Partial replication: some fragments of the database may be replicated whereas others may not.

Query Processing
- Site 1: Employee, 10,000 records, 100 bytes each. R(Employee) = (Fname, Lname, SSN, ..., Dno)
- Site 2: Department, 100 records, 35 bytes each. R(Department) = (Dnumber, Dname, ...)
- Site 3: Result
Q: a join of Employee (site 1) with Department (site 2) whose result is needed at site 3.

Distributed Query
- Transfer Employee to site 3 (10,000 × 100 = 1,000,000 bytes).
- Transfer Department to site 3 (100 × 35 = 3,500 bytes).
- Perform the join at site 3.
Cost: 1,000,000 + 3,500 = 1,003,500 bytes
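The transfer cost of this ship-everything strategy is just record count times record size, summed over both relations. A minimal sketch, using the sizes given on the query-processing slide (the function name is hypothetical):

```python
def transfer_cost(num_records, record_size_bytes):
    """Bytes moved over the network when a whole relation is shipped."""
    return num_records * record_size_bytes


employee_bytes = transfer_cost(10_000, 100)   # Employee: site 1 to site 3
department_bytes = transfer_cost(100, 35)     # Department: site 2 to site 3
naive_total = employee_bytes + department_bytes
print(naive_total)
```

Almost all of the cost comes from shipping Employee, which is why strategies that avoid moving the large relation unreduced are attractive.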

Semijoin
The idea of the semijoin operation is to reduce the number of tuples in a relation before transferring it to another site.
1. Project the join attribute of Department at site 2 and transfer it to site 1. Cost = 4 × 100 = 400 bytes
2. Join with Employee at site 1 and transfer the result to site 3. Cost = 34 × 10,000 = 340,000 bytes
Total cost = 340,400 bytes
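The semijoin steps can be sketched as follows. The toy rows and function names are hypothetical; the byte sizes passed to the cost function (a 4-byte join attribute and 34-byte result rows) are the ones implied by the arithmetic on the slide.

```python
def semijoin_reduce(employee, department):
    """Return the Employee rows that have a matching Department row."""
    # Step 1 (site 2 to site 1): only the join column is shipped.
    join_values = {d["dnumber"] for d in department}
    # Step 2 (at site 1): filter Employee against the shipped values.
    return [e for e in employee if e["dno"] in join_values]


def semijoin_cost(join_attr_bytes, dept_rows, result_row_bytes, result_rows):
    """Bytes transferred: the join column first, then the reduced result."""
    ship_join_column = join_attr_bytes * dept_rows
    ship_result = result_row_bytes * result_rows
    return ship_join_column + ship_result


# Toy data: one employee refers to a department that does not exist.
toy_employee = [
    {"ssn": "111", "dno": 1},
    {"ssn": "222", "dno": 2},
    {"ssn": "333", "dno": 9},
]
toy_department = [{"dnumber": 1}, {"dnumber": 2}]

reduced = semijoin_reduce(toy_employee, toy_department)
total = semijoin_cost(4, 100, 34, 10_000)
print(len(reduced), total)
```

With the slide's numbers the total is roughly a third of the cost of shipping both relations whole.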

Transaction
Two-phase commit protocol, Phase 1: Obtaining a Decision
- The coordinator Ci asks all participants to prepare to commit transaction T.
  - Ci adds a prepare T record to the log and forces the log to stable storage.
  - Ci sends a prepare T message to all sites at which T executed.
- Upon receiving the message, the transaction manager at each site determines whether it can commit the transaction.
  - If not, it adds a no T record to the log and sends an abort T message to Ci.
  - If the transaction can be committed, it:
    - adds a ready T record to the log
    - forces all records for T to stable storage
    - sends a ready T message to Ci

Two-phase commit protocol – Cont.
Phase 2: Recording the Decision
- T can be committed if Ci received a ready T message from all the participating sites; otherwise T must be aborted.
- The coordinator adds a decision record (commit T or abort T) to the log and forces the record onto stable storage. Once the record reaches stable storage, the decision is irrevocable (even if failures occur).
- The coordinator sends a message to each participant informing it of the decision (commit or abort).
- Participants take appropriate action locally.
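The two phases can be sketched as a small in-memory simulation. This is a teaching sketch under strong assumptions: there is no network, no failure handling, and no real stable storage (the "log" is a Python list), and the class, function, and record names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    name: str
    can_commit: bool
    log: list = field(default_factory=list)

    def prepare(self, txn):
        """Phase 1: vote after writing the vote to the local log."""
        if self.can_commit:
            # A real system would force this record to stable storage.
            self.log.append(("ready", txn))
            return "ready"
        self.log.append(("no", txn))
        return "abort"

    def finish(self, txn, decision):
        """Phase 2: apply the coordinator's decision locally."""
        self.log.append((decision, txn))


def two_phase_commit(txn, participants, coordinator_log):
    # Phase 1: log the prepare record, then collect votes.
    coordinator_log.append(("prepare", txn))
    votes = [p.prepare(txn) for p in participants]

    # Phase 2: commit only if every participant voted ready.
    decision = "commit" if all(v == "ready" for v in votes) else "abort"
    coordinator_log.append((decision, txn))  # the decision is now final
    for p in participants:
        p.finish(txn, decision)
    return decision


log_ok, log_bad = [], []
all_ready = [Participant("s1", True), Participant("s2", True)]
one_refuses = [Participant("s1", True), Participant("s2", False)]
outcome_ok = two_phase_commit("T1", all_ready, log_ok)
outcome_bad = two_phase_commit("T2", one_refuses, log_bad)
print(outcome_ok, outcome_bad)
```

A single refusing participant is enough to abort the transaction everywhere, including at sites that voted ready.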

Concurrency Control – algorithms
- Pessimistic: synchronize the execution of user requests before the transaction starts. E.g., two-phase locking protocol, timestamp ordering protocol.
- Optimistic: execute the requests and then perform a validation check to ensure that the execution has not compromised the consistency of the database. E.g., locking-based and timestamp-ordering-based.
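The optimistic approach can be illustrated with a small validation check. This is one possible validation rule, chosen here for brevity and not taken from the slides: a transaction may commit only if no transaction that committed after it started wrote an item it read. All names are hypothetical and the sketch is single-site and single-threaded.

```python
from dataclasses import dataclass, field


@dataclass
class Txn:
    start_ts: int
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)


class OptimisticValidator:
    def __init__(self):
        self.clock = 0
        self.committed = []  # list of (commit_ts, write_set) pairs

    def begin(self):
        self.clock += 1
        return Txn(start_ts=self.clock)

    def try_commit(self, txn):
        """Validate at commit time; return False if txn must restart."""
        for commit_ts, write_set in self.committed:
            overlaps = commit_ts > txn.start_ts
            if overlaps and write_set & txn.read_set:
                return False  # txn read data that was overwritten meanwhile
        self.clock += 1
        self.committed.append((self.clock, frozenset(txn.write_set)))
        return True


validator = OptimisticValidator()
t1 = validator.begin()
t2 = validator.begin()
t1.read_set.add("x")
t1.write_set.add("x")
t2.read_set.add("x")
t2.write_set.add("y")

t1_ok = validator.try_commit(t1)   # nothing committed yet, so it passes
t2_ok = validator.try_commit(t2)   # t1 overwrote x after t2 started

t3 = validator.begin()
t3.read_set.add("x")
t3_ok = validator.try_commit(t3)   # started after t1 committed
print(t1_ok, t2_ok, t3_ok)
```

A pessimistic scheme would instead have made t2 wait for t1's lock on x before reading it.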

Concurrency Control – Replication
- Primary site technique: a simple extension of the centralized locking approach.
- Primary site with backup site: all locking information is maintained at both the primary and the backup sites.
- Primary copy technique: failure of one site affects only the transactions that are accessing locks on items whose primary copies reside at that site; other transactions are not affected.

Deadlock Handling
Centralized approach: a global wait-for graph is constructed and maintained at a single site, which is the deadlock-detection coordinator.
[Figure: local wait-for graphs and the global wait-for graph]
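The centralized approach can be sketched as a union of the local wait-for graphs followed by cycle detection at the coordinator. The graph representation (a dict mapping each waiting transaction to the set of transactions it waits for) and the function names are assumptions made for this example.

```python
def merge_wait_for_graphs(local_graphs):
    """Build the global wait-for graph as the union of the local ones."""
    global_graph = {}
    for graph in local_graphs:
        for waiter, holders in graph.items():
            global_graph.setdefault(waiter, set()).update(holders)
    return global_graph


def has_cycle(graph):
    """Depth-first search; a back edge to an in-progress node is a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}

    def visit(node):
        color[node] = GRAY
        for nxt in graph.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GRAY:
                return True
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(graph))


# At site 1, T1 waits for T2; at site 2, T2 waits for T1.
site1 = {"T1": {"T2"}}
site2 = {"T2": {"T1"}}
global_graph = merge_wait_for_graphs([site1, site2])
print(has_cycle(site1), has_cycle(site2), has_cycle(global_graph))
```

Neither local graph contains a cycle on its own; the deadlock only becomes visible once the coordinator merges them.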

Recovery
- It is quite difficult to determine whether a site is down without exchanging numerous messages with other sites.
- When a transaction is updating data at several sites, it cannot commit until it is sure that the effect of the transaction on every site cannot be lost.
- The two-phase commit protocol is often used to ensure the correctness of distributed commit.

3-tier Client-Server Architecture
- The first, or presentation, tier (the client or front end) deals with the interaction with the user.
- The second, or application, tier processes the requests of all clients.
- The third, or database, tier contains the database management system that manages all persistent data.

3-tier Architecture – Cont.

Summary
- Distributed DBMSs offer site autonomy and distributed administration.
- Storage techniques, concurrency control, and recovery must be revisited in the distributed setting.

Q&A

Thank You