Presentation is loading. Please wait.

Presentation is loading. Please wait.

Distributed databases

Similar presentations


Presentation on theme: "Distributed databases"— Presentation transcript:

1 Distributed databases
A brief introduction (Figure numbers may not be the same as in the book) Distributed databases

2 Distributed database concepts
Distributed database (DDB) Collection of multiple logically interrelated databases distributed over a computer network Distributed database management systems (DDBMS) Software systems managing a distributed database, making distribution transparent to the users Distributed databases

3 Distributed databases
Transparency Hiding implementation details from the users of the database Data organization transparency Location transparency Use does not depend on location Naming transparency Naming is independent from location Replication transparency Copies can be kept for availability, performance, and availability User are unaware of the existence of these copies Fragmentation transparency One table is divided into more locations Horizontal fragmentation Table divided by rows Vertical fragmentation Table divided by columns Distributed databases

4 Example: Replication and horizontal fragmentation
Distributed databases

5 Reliability and Availability
Two common advantages of distributed databases Reliability The probability that a system is running at a certain time point Availability The probability that a system is continuously available during a time interval Distributed databases

6 Advantages of distributed databases
Improved ease and flexibility of application development Transparency: Developers do not have to know … Increased reliability and availability Faults are isolated to a single site Improved performance Data localization, means less network traffic Parallelism Easier expansion Easy to add more data, processors, etc. Distributed databases

7 Types of distributed database systems
Degree of homogeneity Homogeneous: All local DBMSs run identical software Heterogeneous: Local DMBSs run different software Autonomy Local autonomy: Local site can function as a standalone DBMS No autonomy: Local site can not function as a standalone DBMS Distributed databases

8 Classification of distributed databases

9 Database system architectures
Distributed databases

10 Distributed databases
General architecture Distributed databases

11 Component architecture of distributed databases

12 Distributed databases
Data fragmentation Which site should store which portion of the database? Simple fragmentation Each site has a whole relation Horizontal fragmentation Subset of rows in each site Sometimes based on location Vertical fragmentation Subset of columns in each site Primary key must be in all sites Mixed / hybrid fragmentation Horizontal + vertical fragmentation Described by fragmentation schema Distributed databases

13 Example fragmentation
Distributed databases

14 Example fragmentation, continued
Distributed databases

15 Distributed databases
Data replication Replication to improve availability Fully replicated database All data is replicated to each site Non replication All data is stored at exactly one site Partial replication Some data is replicated to some sites Described by replication schema Distributed databases

16 Distributed query processing
Query mapping Query mapped from SQL to relational algebra using the global conceptual schema Localization Map query on the global schema to separate queries on the local schemas Using fragmentation and replication information Global query optimization Cost = CPU time + I/O time + communication time Local query optimization Same as in centralized databases Distributed databases

17 Distributed transaction management, Two-phase commit protocol (2PC)
Global transaction manager / coordinator Coordinates the results of local transaction managers. All local transaction managers must be able to ”commit”, before actually doing the ”commit” Two-Phase commit protocol (2PC) Phase 1 Individual databases tell the coordinator that they have finished transaction All individual databases have finished: Coordinator sends ”prepare for commit” to all databases Individual databases answer ”read to commit” or ”cannot commit” Phase 2 If all databases answered ”ready to commit”, coordinator sends ”commit” to all databases If one (or more) databases answered ”cannot commit”, coordinator sends ”abort” to all databases. Timeout: if one (or more) databases does not answer within a given amount of time, coordinator sends ”abort”. Distributed databases

18 Two-phase commit protocol (2PC)
Problems with 2PC Coordinator crashes: All participating sites are waiting No way of knowing whether participating sites really got the ”commit” / ”abort” Distributed databases

19 Three-phase commit (3PC)
Distributed databases


Download ppt "Distributed databases"

Similar presentations


Ads by Google