Published by Katelyn Johnston
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe

Types of Distributed Database Systems: Homogeneous
All sites of the database system have an identical setup, i.e., the same database system software.
The underlying operating system may be different.
For example, all sites run Oracle, or DB2, or Sybase, or some other database system, while the underlying operating systems can be a mixture of Linux, Windows, Unix, etc.
Types of Distributed Database Systems: Heterogeneous
Federated: Each site may run a different database system, but data access is managed through a single conceptual schema. This implies that the degree of local autonomy is minimal. Each site must adhere to a centralized access policy. There may be a global schema.
Multidatabase: There is no single conceptual global schema. For data access, a schema is constructed dynamically as needed by the application software.
Federated Database Management Systems: Issues
Differences in data models: relational, object-oriented, hierarchical, network, etc.
Differences in constraints: each site may have its own data access and processing constraints.
Differences in query language: some sites may use SQL, some may use SQL-89, some may use SQL-92, and so on.
Concurrency Control and Recovery
Distributed databases encounter a number of concurrency control and recovery problems that are not present in centralized databases. Some of them are listed below.
Dealing with multiple copies of data items
Failure of individual sites
Communication link failure
Distributed commit
Distributed deadlock
Concurrency Control and Recovery: Details
Dealing with multiple copies of data items: The concurrency control mechanism must maintain global consistency. Likewise, the recovery mechanism must recover all copies and maintain consistency after recovery.
Failure of individual sites: Database availability must not be affected by the failure of one or two sites, and the recovery scheme must recover failed sites before they are made available for use again.
Concurrency Control and Recovery: Details (contd.)
Communication link failure: This failure may create a network partition, which would affect database availability even though all database sites may be running.
Distributed commit: A transaction may be fragmented, and its fragments may be executed by a number of sites. This requires a two-phase or three-phase commit approach for transaction commit.
Distributed deadlock: Since transactions are processed at multiple sites, two or more sites may become involved in a deadlock. This must be resolved in a distributed manner.
Concurrency Control in Distributed Databases
Single-lock-manager approach
Distributed lock manager
  Primary copy
  Majority protocol
  Biased protocol
  Quorum consensus
Single-Lock-Manager Approach
The system maintains a single lock manager that resides in a single chosen site, say S_i (the primary site technique).
When a transaction needs to lock a data item, it sends a lock request to S_i, and the lock manager determines whether the lock can be granted immediately.
If yes, the lock manager sends a message to the site that initiated the request.
If no, the request is delayed until it can be granted, at which time a message is sent to the initiating site.
Single-Lock-Manager Approach (Cont.)
The transaction can read the data item from any one of the sites at which a replica of the data item resides.
Writes must be performed on all replicas of a data item.
Advantages of the scheme: simple implementation; simple deadlock handling.
Disadvantages of the scheme:
Bottleneck: the lock manager site becomes a bottleneck.
Vulnerability: the system is vulnerable to failure of the lock manager site.
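The grant-or-delay behaviour of the single lock manager can be sketched in a few lines. This is a minimal illustration, not the slides' implementation: the class name `SingleLockManager`, its methods, and the restriction to exclusive locks are all assumptions made for brevity, and message passing between sites is replaced by return values.

```python
from collections import deque


class SingleLockManager:
    """Toy central lock manager living at one chosen site (hypothetical API)."""

    def __init__(self):
        self.holder = {}   # item -> transaction currently holding the lock
        self.waiting = {}  # item -> queue of delayed transactions

    def request(self, txn, item):
        """Grant the lock immediately if the item is free; otherwise delay."""
        if item not in self.holder:
            self.holder[item] = txn
            return "granted"
        self.waiting.setdefault(item, deque()).append(txn)
        return "delayed"

    def release(self, txn, item):
        """Release the lock; return the next transaction granted it, if any."""
        if self.holder.get(item) != txn:
            raise ValueError("lock not held by this transaction")
        queue = self.waiting.get(item)
        if queue:
            next_txn = queue.popleft()
            self.holder[item] = next_txn  # the delayed request is now granted
            return next_txn
        del self.holder[item]
        return None
```

Because every request passes through this one object, deadlock handling stays simple, but the sketch also makes the bottleneck and single point of failure visible.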
Distributed Lock Manager
In this approach, the locking functionality is implemented by lock managers at each site.
Lock managers control access to local data items, but special protocols may be used for replicas.
Advantage: work is distributed and can be made robust to failures.
Disadvantage: deadlock detection is more complicated, since lock managers must cooperate for deadlock detection.
Several variants of this approach exist:
  Primary copy
  Majority protocol
  Biased protocol
  Quorum consensus
Primary Copy
Choose one replica of a data item to be the primary copy. The site containing that replica is called the primary site for the data item.
Different data items can have different primary sites.
When a transaction needs to lock a data item Q, it requests a lock at the primary site of Q. It implicitly gets a lock on all replicas of the data item.
Benefit: concurrency control for replicated data is handled similarly to unreplicated data, so the implementation is simple.
Drawback: if the primary site of Q fails, Q is inaccessible even though other sites containing a replica may be accessible.
Majority Protocol
The local lock manager at each site administers lock and unlock requests for data items stored at that site.
When a transaction wishes to lock an unreplicated data item Q residing at site S_i, a message is sent to S_i's lock manager.
If Q is locked in an incompatible mode, the request is delayed until it can be granted.
When the lock request can be granted, the lock manager sends a message back to the initiator indicating that the lock request has been granted.
Majority Protocol (Cont.)
In the case of replicated data: if Q is replicated at n sites, a lock request message must be sent to more than half of the n sites at which Q is stored.
The transaction does not operate on Q until it has obtained a lock on a majority of the replicas of Q.
When writing the data item, the transaction performs writes on all replicas.
Benefit: can be used even when some sites are unavailable.
Drawbacks:
Requires 2(n/2 + 1) messages for handling lock requests and (n/2 + 1) messages for handling unlock requests.
Potential for deadlock even with a single item; e.g., each of 3 transactions may hold locks on 1/3 of the replicas of a data item.
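The message counts and the single-item deadlock can be checked with a small sketch. The function names are illustrative, and n/2 is read as integer (floor) division, which is an assumption about the slide's notation.

```python
def majority_needed(n):
    """Smallest number of replicas that is more than half of n."""
    return n // 2 + 1


def lock_messages(n):
    """One request plus one grant message per site in the majority."""
    return 2 * majority_needed(n)


def unlock_messages(n):
    """One unlock message per site in the majority."""
    return majority_needed(n)


def has_majority(granted_sites, replica_sites):
    """True if locks are held at more than half of the replica sites."""
    replicas = set(replica_sites)
    return len(set(granted_sites) & replicas) >= majority_needed(len(replicas))


# Single-item deadlock: three transactions each hold one of three replicas,
# so none of them reaches a majority and none can proceed.
replicas = {"A", "B", "C"}
holdings = {"T1": {"A"}, "T2": {"B"}, "T3": {"C"}}
anyone_can_proceed = any(has_majority(h, replicas) for h in holdings.values())
```

With three replicas, a transaction needs locks at two sites, which costs four messages to lock and two to unlock.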
Biased Protocol
There is a local lock manager at each site, as in the majority protocol; however, requests for shared locks are handled differently from requests for exclusive locks.
Shared locks: when a transaction needs to lock data item Q, it simply requests a lock on Q from the lock manager at one site containing a replica of Q.
Exclusive locks: when a transaction needs to lock data item Q, it requests a lock on Q from the lock manager at all sites containing a replica of Q.
Advantage: imposes less overhead on read operations.
Disadvantage: additional overhead on writes.
Quorum Consensus Protocol
A generalization of both the majority and biased protocols.
Each site is assigned a weight. Let S be the total of all site weights.
Choose two values, a read quorum Q_r and a write quorum Q_w, such that Q_r + Q_w > S and 2 * Q_w > S.
Quorums can be chosen (and S computed) separately for each item.
Each read must lock enough replicas that the sum of the site weights is >= Q_r.
Each write must lock enough replicas that the sum of the site weights is >= Q_w.
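The two quorum conditions are easy to check mechanically. The sketch below uses hypothetical site names and unit weights to show how the majority and biased protocols fall out as special cases; the function names are illustrative only.

```python
def valid_quorums(weights, q_r, q_w):
    """Check Q_r + Q_w > S and 2 * Q_w > S, where S is the total site weight."""
    s = sum(weights.values())
    return q_r + q_w > s and 2 * q_w > s


def quorum_reached(locked_sites, weights, quorum):
    """True if the weights of the locked sites add up to at least the quorum."""
    return sum(weights[site] for site in locked_sites) >= quorum


weights = {"A": 1, "B": 1, "C": 1}  # hypothetical sites, S = 3

majority_like = valid_quorums(weights, q_r=2, q_w=2)  # reads and writes lock a majority
biased_like = valid_quorums(weights, q_r=1, q_w=3)    # read one replica, write all
too_weak = valid_quorums(weights, q_r=1, q_w=2)       # a read could miss the latest write
```

The first condition forces every read quorum to overlap every write quorum; the second prevents two writes from locking disjoint sets of replicas at the same time.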
Timestamping
Timestamp-based concurrency-control protocols can be used in distributed systems.
Each transaction must be given a unique timestamp.
Main problem: how to generate a timestamp in a distributed fashion.
Each site generates a unique local timestamp using either a logical counter or the local clock.
A globally unique timestamp is obtained by concatenating the unique local timestamp with the unique site identifier.
Timestamping (Cont.)
A site with a slow clock will assign smaller timestamps.
This is still logically correct: serializability is not affected. However, it disadvantages transactions.
To fix this problem:
Define within each site S_i a logical clock (LC_i), which generates the unique local timestamp.
Require that S_i advance its logical clock whenever a request is received from a transaction T_i with timestamp <x, y> and x is greater than the current value of LC_i. In this case, site S_i advances its logical clock to the value x + 1.
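A minimal sketch of this scheme follows, assuming timestamps are represented as (counter, site identifier) pairs so that the local counter is the most significant part. The class and method names are hypothetical.

```python
class SiteClock:
    """Logical clock LC_i for one site; names here are illustrative."""

    def __init__(self, site_id):
        self.site_id = site_id
        self.lc = 0

    def new_timestamp(self):
        """Return a globally unique timestamp (local counter, site id)."""
        self.lc += 1
        return (self.lc, self.site_id)

    def observe(self, timestamp):
        """Advance the clock to x + 1 if an incoming timestamp has x > LC_i."""
        x, _site = timestamp
        if x > self.lc:
            self.lc = x + 1


fast, slow = SiteClock(site_id=1), SiteClock(site_id=2)
for _ in range(5):
    fast.new_timestamp()          # the fast site runs ahead
incoming = fast.new_timestamp()   # (6, 1)
slow.observe(incoming)            # the slow site catches up: LC becomes 7
later = slow.new_timestamp()      # (8, 2), ordered after the incoming timestamp
```

Python compares tuples element by element, so the counter decides the order and the site identifier only breaks ties between sites that produced the same counter value.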
Replication with Weak Consistency
Many commercial databases support replication of data with weak degrees of consistency (i.e., without a guarantee of serializability).
E.g., master-slave replication: updates are performed at a single master site and propagated to slave sites.
Propagation is not part of the update transaction; it is decoupled. It may happen immediately after the transaction commits, or it may be periodic.
Data may only be read at slave sites, not updated, so there is no need to obtain locks at any remote site.
Particularly useful for distributing information, e.g., from a central office to branch offices.
Also useful for running read-only queries offline from the main database.
Replication with Weak Consistency (Cont.)
Replicas should see a transaction-consistent snapshot of the database, that is, a state of the database reflecting all effects of all transactions up to some point in the serialization order, and no effects of any later transactions.
E.g., Oracle provides a create snapshot statement to create a snapshot of a relation or a set of relations at a remote site.
A snapshot is refreshed either by recomputation or by incremental update.
Refresh may be automatic (continuous or periodic) or manual.
Multimaster and Lazy Replication
With multimaster replication (also called update-anywhere replication), updates are permitted at any replica and are automatically propagated to all replicas.
This is the basic model in distributed databases, where transactions are unaware of the details of replication and the database system propagates updates as part of the same transaction, coupled with two-phase commit.
Many systems support lazy propagation, where updates are transmitted after the transaction commits.
This allows updates to occur even if some sites are disconnected from the network, but at the cost of consistency.
Concurrency Control and Recovery
Distributed concurrency control based on a distinguished copy of a data item
Primary site technique: A single site is designated as the primary site, which serves as a coordinator for transaction management.
Concurrency Control and Recovery
Transaction management: Concurrency control and commit are managed by this site.
In two-phase locking, this site manages the locking and releasing of data items. If all transactions follow the two-phase policy at all sites, then serializability is guaranteed.
Recovery in a Distributed Database
Single lock manager approach (primary site approach): All transaction management activities go to the primary site, which is likely to overload it. If the primary site fails, the entire system is inaccessible.
To aid recovery, a backup site is designated that behaves as a shadow of the primary site. If the primary site fails, the backup site can act as the primary site.
Concurrency Control and Recovery
Primary copy technique: In this approach, instead of a site, a data item partition is designated as the primary copy. To lock a data item, just the primary copy of the data item is locked.
Advantages: Since primary copies are distributed over various sites, a single site is not overloaded with locking and unlocking requests.
Disadvantages: Identification of a primary copy is complex. A distributed directory must be maintained, possibly at all sites.
Recovery in a Distributed Database
Recovery from a coordinator failure: In both approaches, a coordinator site or copy may become unavailable. This requires the selection of a new coordinator.
Primary site approach with no backup site: Abort and restart all active transactions at all sites. Elect a new coordinator and initiate transaction processing.
Primary site approach with a backup site: Suspend all active transactions, designate the backup site as the primary site, and identify a new backup site. The new primary site receives all transaction management information to resume processing.
Primary and backup sites both fail, or there is no backup site: Use an election process to select a new coordinator site.
Concurrency Control and Recovery
Concurrency control based on voting: There is no primary copy or coordinator.
A lock request is sent to the sites that hold the data item.
If a majority of sites grant the lock, the requesting transaction gets the data item.
Locking information (granted or denied) is sent to all these sites.
To avoid an unacceptably long wait, a time-out period is defined. If the requesting transaction does not get any vote information within that period, the transaction is aborted.
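The voting decision can be sketched as a function of the votes received so far. The function name, the three-valued result, and the choice to abort on time-out whenever no majority has been reached are assumptions made for illustration; the slide only states that a transaction that receives no vote information in time is aborted.

```python
def decide_lock(votes, replica_count, timed_out):
    """Decide the outcome of a voting-based lock request.

    votes: dict mapping site -> True (lock granted) or False (lock denied),
           containing only the replies received so far.
    replica_count: number of sites holding a copy of the data item.
    timed_out: whether the time-out period has elapsed.
    """
    grants = sum(1 for granted in votes.values() if granted)
    if grants > replica_count // 2:
        return "granted"   # a majority of the sites granted the lock
    if timed_out:
        return "abort"     # assumed policy: give up rather than wait further
    return "waiting"
```

For example, with five replicas, three grant votes are enough to proceed even if the remaining two sites never reply.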
Client-Server Database Architecture
It consists of clients running client software, a set of servers that provide all database functionality, and a reliable communication infrastructure.
Client-Server Database Architecture
Clients reach the server for a desired service, but the server does not reach clients.
The server software is responsible for local data management at a site, much like centralized DBMS software.
The client software is responsible for most of the distribution functions.
The communication software manages communication among clients and servers.
Client-Server Database Architecture
The processing of an SQL query goes as follows:
The client parses a user query and decomposes it into a number of independent subqueries.
Each subquery is sent to the appropriate site for execution.
Each server processes its subquery and sends the result to the client.
The client combines the results of the subqueries and produces the final result.
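The decompose, execute, and combine steps can be sketched as follows. Everything here is hypothetical: the site names, the toy salary fragments, and the representation of a subquery as a minimum-salary threshold stand in for real SQL parsing and network calls.

```python
def run_distributed_query(subqueries, servers, combine):
    """Send each (site, subquery) pair to its server and combine the results."""
    partial_results = [servers[site](subquery) for site, subquery in subqueries]
    return combine(partial_results)


# Hypothetical fragments of an employee salary table held at two sites.
fragments = {
    "site1": [("Ann", 50), ("Bob", 70)],
    "site2": [("Cy", 90)],
}


def make_server(rows):
    """Each server evaluates its subquery against its local rows only."""
    return lambda min_salary: [row for row in rows if row[1] >= min_salary]


servers = {site: make_server(rows) for site, rows in fragments.items()}

# The client has already decomposed "salary >= 60" into one subquery per site.
subqueries = [("site1", 60), ("site2", 60)]
result = run_distributed_query(
    subqueries,
    servers,
    combine=lambda parts: [row for part in parts for row in part],
)
```

Here the client combines results by concatenation; a join or aggregate query would need a different `combine` step at the client.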
Recap
Distributed Database Concepts
Data Fragmentation, Replication and Allocation
Types of Distributed Database Systems
Query Processing
Concurrency Control and Recovery
3-Tier Client-Server Architecture