CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB Applications C. Faloutsos – A. Pavlo Lecture#23: Distributed Database Systems (R&G ch. 22)


CMU SCS Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB Applications C. Faloutsos – A. Pavlo Lecture#23: Distributed Database Systems (R&G ch. 22)

Administrivia – Final Exam
Who: You
What: R&G Chapters
When: Monday May 11th, 5:30pm–8:30pm
Where: GHC 4401
Why: Databases will help your love life.
Faloutsos/Pavlo CMU SCS 15-415/615

Today’s Class
High-level overview of distributed DBMSs. Not meant to be a detailed examination of all aspects of these systems.

System Votes
[figure: class votes on database systems]

Today’s Class
– Overview & Background
– Design Issues
– Distributed OLTP
– Distributed OLAP

Why Do We Need Parallel/Distributed DBMSs?
PayPal in 2008…
– Single, monolithic Oracle installation.
– Had to manually move data every xmas.
– Legal restrictions.

Why Do We Need Parallel/Distributed DBMSs?
– Increased Performance.
– Increased Availability.
– Potentially Lower TCO.

Parallel/Distributed DBMS
Database is spread out across multiple resources to improve parallelism. Appears as a single database instance to the application.
– A SQL query for a single-node DBMS should generate the same result on a parallel or distributed DBMS.

Parallel vs. Distributed
Parallel DBMSs:
– Nodes are physically close to each other.
– Nodes connected with high-speed LAN.
– Communication cost is assumed to be small.
Distributed DBMSs:
– Nodes can be far from each other.
– Nodes connected using public network.
– Communication cost and problems cannot be ignored.

Database Architectures
The goal is to parallelize operations across multiple resources:
– CPU
– Memory
– Network
– Disk

Database Architectures
[figure: the three architectures, each connected by a network – shared memory, shared disk, shared nothing]

Shared Memory
CPUs and disks have access to common memory via a fast interconnect.
– Very efficient to send messages between processors.
– Sometimes called “shared everything”.
Examples: All single-node DBMSs.

Shared Disk
All CPUs can access all disks directly via an interconnect, but each has its own private memory.
– Easy fault tolerance.
– Easy consistency since there is a single copy of the DB.
Examples: Oracle Exadata, ScaleDB.

Shared Nothing
Each DBMS instance has its own CPU, memory, and disk. Nodes only communicate with each other via the network.
– Easy to increase capacity.
– Hard to ensure consistency.
Examples: Vertica, Parallel DB2, MongoDB.

Early Systems
– MUFFIN – UC Berkeley (1979)
– SDD-1 – CCA (1980)
– System R* – IBM Research (1984)
– Gamma – Univ. of Wisconsin (1986)
– NonStop SQL – Tandem (1987)
(Pictured: Bernstein, Mohan, DeWitt, Gray, Stonebraker)

Inter- vs. Intra-query Parallelism
Inter-Query: Different queries or txns are executed concurrently.
– Increases throughput & reduces latency.
– Already discussed for shared-memory DBMSs.
Intra-Query: Execute the operations of a single query in parallel.
– Decreases latency for long-running queries.

Parallel/Distributed DBMSs
Advantages:
– Data sharing.
– Reliability and availability.
– Speed-up of query processing.
Disadvantages:
– May increase processing overhead.
– Harder to ensure ACID guarantees.
– More database design issues.

Today’s Class
– Overview & Background
– Design Issues
– Distributed OLTP
– Distributed OLAP

Design Issues
How do we store data across nodes?
How does the application find data?
How to execute queries on distributed data?
– Push query to data.
– Pull data to query.
How does the DBMS ensure correctness?

Database Partitioning
Split the database across multiple resources:
– Disks, nodes, processors.
– Sometimes called “sharding”.
The DBMS executes query fragments on each partition and then combines the results to produce a single answer.

Naïve Table Partitioning
Each node stores one and only one table. Assumes that each node has enough storage space for a table.

Naïve Table Partitioning
[figure: Table1 and Table2 each assigned wholly to a different node]
Ideal Query: SELECT * FROM table
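The routing logic behind naïve table partitioning fits in a few lines; this is only a sketch, and the table names and node ids below are made-up examples, not from the slides:

```python
# Naive table partitioning sketch: every table lives wholly on one node,
# so routing a single-table query is just a lookup of that table's node.
# The mapping here is a hypothetical example.
NODE_FOR_TABLE = {"table1": "node-A", "table2": "node-B"}

def route_scan(table_name):
    """Return the one node that can answer 'SELECT * FROM <table_name>'."""
    return NODE_FOR_TABLE[table_name]
```

A query that joins table1 and table2 would still need data from two nodes, which is why this scheme only suits single-table queries.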

Horizontal Partitioning
Split a table’s tuples into disjoint subsets.
– Choose column(s) that divide the database equally in terms of size, load, or usage.
– Each tuple contains all of its columns.
Three main approaches:
– Round-robin Partitioning.
– Hash Partitioning.
– Range Partitioning.

Horizontal Partitioning
[figure: tuples assigned to partitions P1–P5 by a partitioning key]
Ideal Query: SELECT * FROM table WHERE partitionKey = ?
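The three assignment strategies can be sketched as follows; the partition count and range boundaries are arbitrary illustrations:

```python
import bisect

def round_robin_partition(tuple_position, num_partitions):
    # The i-th inserted tuple goes to partition i mod n.
    return tuple_position % num_partitions

def hash_partition(partition_key, num_partitions):
    # The same key always hashes to the same partition within one process.
    return hash(partition_key) % num_partitions

# Range partitioning: partition 0 holds keys below bounds[0]; partition p
# holds keys in [bounds[p-1], bounds[p]); the last partition holds the rest.
RANGE_BOUNDS = [100, 200, 300, 400]   # arbitrary split points for 5 partitions

def range_partition(partition_key):
    return bisect.bisect_right(RANGE_BOUNDS, partition_key)
```

Hash partitioning spreads load evenly but destroys ordering; range partitioning keeps ordering (good for range scans) but can skew if keys cluster.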

Vertical Partitioning
Split the columns of tuples into fragments:
– Each fragment contains all of the tuples’ values for its column(s).
Need to include the primary key or a unique record id with each partition to ensure that the original tuple can be reconstructed.

Vertical Partitioning
[figure: columns of the table split across partitions P1–P5]
Ideal Query: SELECT column FROM table
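A vertical split and its reconstruction can be sketched like this; the dict-based rows and the "id" column name are assumptions for illustration:

```python
def vertical_partition(rows, column_groups):
    """Split each row (a dict) into fragments, one per column group.
    Every fragment keeps the 'id' key so tuples can be reassembled."""
    return [
        [{"id": row["id"], **{c: row[c] for c in group}} for row in rows]
        for group in column_groups
    ]

def reconstruct(fragments):
    """Rebuild full tuples by merging fragments on the record id."""
    by_id = {}
    for frag in fragments:
        for piece in frag:
            by_id.setdefault(piece["id"], {}).update(piece)
    return list(by_id.values())
```

Reconstruction is effectively a join on the record id, which is exactly why every fragment must carry it.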

Replication
Partition Replication: Store a copy of an entire partition in multiple locations.
– Master–Slave Replication.
Table Replication: Store an entire copy of a table in each partition.
– Usually small, read-only tables.
The DBMS ensures that updates are propagated to all replicas in either case.

Replication
[figure: partition replication (master and slave copies of P1, P2) vs. table replication (full copy of the table on Node 1 and Node 2)]
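A minimal master–slave propagation sketch; the class and method names are invented for illustration and do not correspond to any real DBMS API:

```python
class ReplicatedPartition:
    """Toy master-slave replication: writes go to the master, and the
    'DBMS' synchronously propagates them to every slave replica."""
    def __init__(self, num_slaves=2):
        self.master = {}
        self.slaves = [{} for _ in range(num_slaves)]

    def write(self, key, value):
        self.master[key] = value      # update the master copy first
        for slave in self.slaves:     # then propagate to all replicas
            slave[key] = value

    def read_from_slave(self, key, slave_idx=0):
        # Reads can be offloaded to a replica once propagation is done.
        return self.slaves[slave_idx].get(key)
```

Real systems often propagate asynchronously for performance, which is where the consistency questions in the CAP discussion below come from.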

Data Transparency
Users should not be required to know where data is physically located, or how tables are partitioned or replicated. A SQL query that works on a single-node DBMS should work the same on a distributed DBMS.

OLTP vs. OLAP
On-line Transaction Processing:
– Short-lived txns.
– Small footprint.
– Repetitive operations.
On-line Analytical Processing:
– Long-running queries.
– Complex joins.
– Exploratory queries.

Workload Characterization
[figure: operation complexity (simple→complex) vs. workload focus (writes→reads); OLTP sits at simple/writes, OLAP at complex/reads, social networks in between]
Michael Stonebraker – “Ten Rules For Scalable Performance In ‘Simple Operation’ Datastores”

Today’s Class
– Overview & Background
– Design Issues
– Distributed OLTP
– Distributed OLAP

Distributed OLTP
Execute txns on a distributed DBMS. Used for user-facing applications:
– Example: Credit card processing.
Key Challenges:
– Consistency
– Availability

Single-Node vs. Distributed Transactions
Single-node txns do not require the DBMS to coordinate behavior between nodes.
A distributed txn is any txn that involves more than one node.
– Requires expensive coordination.

Simple Example
[figure: application server runs Begin, executes queries on Node 1 and Node 2, then Commits]

Transaction Coordination
Assuming that our DBMS supports multi-operation txns, we need some way to coordinate their execution in the system.
Two different approaches:
– Centralized: Global “traffic cop”.
– Decentralized: Nodes organize themselves.

TP Monitors
Example of a centralized coordinator. Originally developed in the 1970s to provide txns between terminals + mainframe databases.
– Examples: ATMs, Airline Reservations.
Many DBMSs now support the same functionality internally.

Centralized Coordinator
[figure: the application server sends a Lock Request to the coordinator, which acknowledges; at commit time the application server sends a Commit Request and the coordinator asks the partitions “Safe to commit?”]

Centralized Coordinator
[figure: a middleware layer sits between the application server and the partitions, routing query requests and asking “Safe to commit?”]

Decentralized Coordinator
[figure: the application server sends the Commit Request to one partition node, and the nodes ask each other “Safe to commit?”]

Observation
Q: How do we ensure that all nodes agree to commit a txn?
– What happens if a node fails?
– What happens if our messages show up late?

CAP Theorem
Eric Brewer proposed that it is impossible for a distributed system to always be:
– Consistent
– Always Available
– Network Partition Tolerant
Proved in 2002. Pick two!

CAP Theorem
[figure: triangle of Consistency (linearizability), Availability (all up nodes can satisfy all requests), and Partition Tolerance (still operate correctly despite message loss); having all three is “No Man’s Land”]

CAP – Consistency
[figure: an application server sets A=2, B=9 on the master (Node 1); once replicated to Node 2, a reader must see both changes or no changes]

CAP – Availability
[figure: Node 2 goes down; the surviving node must still be able to answer reads for A and B]

CAP – Partition Tolerance
[figure: the network splits; one side sets A=2, B=9 while the other sets A=3, B=6, leaving the masters in conflict]

CAP Theorem
Relational DBMSs: CA/CP
– Examples: IBM DB2, MySQL Cluster, VoltDB
NoSQL DBMSs: AP
– Examples: Cassandra, Riak, DynamoDB
These are essentially the same!

Atomic Commit Protocol
When a multi-node txn finishes, the DBMS needs to ask all of the nodes involved whether it is safe to commit.
– All nodes must agree on the outcome.
Examples:
– Two-Phase Commit
– Three-Phase Commit
– Paxos

Two-Phase Commit
[figure: the application server sends a Commit Request to the coordinator (Node 1); Phase 1: coordinator sends Prepare to the participants (Nodes 2, 3), which answer OK; Phase 2: coordinator sends Commit]

Two-Phase Commit
[figure: one participant answers ABORT during the prepare phase, so in Phase 2 the coordinator aborts the txn on every node]

Two-Phase Commit
Each node has to record the outcome of each phase in a stable storage log.
Q: What happens if the coordinator crashes?
– Participants have to decide what to do.
Q: What happens if a participant crashes?
– The coordinator assumes that it responded with an abort if it hasn’t sent an acknowledgement yet.
The nodes have to block until they can figure out the correct action to take.
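The two phases can be sketched as follows; this is a minimal in-memory model (the lists stand in for stable-storage logs, and a real implementation must force-write each log record before sending the corresponding message):

```python
class Participant:
    def __init__(self, can_commit=True):
        self.can_commit = can_commit
        self.log = []                 # stand-in for the stable-storage log

    def prepare(self):
        self.log.append("PREPARED")   # logged before voting
        return self.can_commit        # vote: True = OK, False = abort

    def finish(self, decision):
        self.log.append(decision)     # record the final outcome

def two_phase_commit(participants):
    votes = [p.prepare() for p in participants]     # Phase 1: prepare
    decision = "COMMIT" if all(votes) else "ABORT"  # unanimity required
    for p in participants:                          # Phase 2: broadcast decision
        p.finish(decision)
    return decision
```

A single False vote aborts the txn everywhere, which is the "all nodes must agree" property from the previous slide.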

Three-Phase Commit
The coordinator first tells the other nodes that it intends to commit the txn. If the coordinator fails, then the participants elect a new coordinator and finish the commit. Nodes do not have to block if there are no network partitions.
Failure doesn’t always mean a hard crash.

Paxos
Consensus protocol where a coordinator proposes an outcome (e.g., commit or abort) and then the participants vote on whether that outcome should succeed. Does not block if a majority of participants are available, and has provably minimal message delays in the best case.
– First correct protocol that was provably resilient in the face of asynchronous networks.

2PC vs. Paxos
Two-Phase Commit: blocks if the coordinator fails after the prepare message is sent, until the coordinator recovers.
Paxos: non-blocking as long as a majority of participants are alive, provided there is a sufficiently long period without further failures.
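The majority-quorum property that makes Paxos non-blocking can be illustrated with a toy vote counter; this shows only the quorum arithmetic, not the full two-phase proposer/acceptor protocol:

```python
def quorum_decision(responses, num_acceptors):
    """responses: votes from the acceptors that could be reached
    (True = accept). A value is chosen only if a strict majority of
    ALL acceptors, reachable or not, accepted it."""
    accepted = sum(1 for vote in responses if vote)
    return "CHOSEN" if accepted > num_acceptors // 2 else "NO DECISION"
```

With 5 acceptors, a proposer can make progress even if 2 are down or unreachable, which is why Paxos tolerates minority failures where 2PC would block on its single coordinator.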

Distributed Concurrency Control
Need to allow multiple txns to execute simultaneously across multiple nodes.
– Many of the same protocols from single-node DBMSs can be adapted.
This is harder because of:
– Replication.
– Network Communication Overhead.
– Node Failures.

Distributed 2PL
[figure: two application servers issue conflicting writes (Set A=2, B=9 vs. Set A=0, B=7) against A on Node 1 and B on Node 2; locks must be acquired on both nodes]
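The per-node lock tables behind distributed 2PL can be sketched like this; it is a toy exclusive-lock manager with no blocking or deadlock detection, and the names are invented:

```python
class NodeLockManager:
    """Each node keeps its own exclusive-lock table. A distributed txn
    must acquire locks on every node whose data it touches."""
    def __init__(self):
        self.locks = {}   # data item -> id of the txn holding the lock

    def acquire(self, txn_id, item):
        holder = self.locks.get(item)
        if holder is not None and holder != txn_id:
            return False              # conflict: the txn would block here
        self.locks[item] = txn_id
        return True

    def release_all(self, txn_id):
        # 2PL shrinking phase: drop every lock the txn holds on this node
        self.locks = {k: v for k, v in self.locks.items() if v != txn_id}
```

Because each node decides conflicts independently, two txns can end up waiting for each other on different nodes, which is why distributed deadlock detection becomes necessary.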

Recovery
Q: What do we do if a node crashes in a CA/CP DBMS?
– If the node is replicated, use Paxos to elect a new primary.
– If the node is the last replica, halt the DBMS.
The node can recover from checkpoints + logs and then catch up with the primary.

Today’s Class
– Overview & Background
– Design Issues
– Distributed OLTP
– Distributed OLAP

Distributed OLAP
Execute analytical queries that examine large portions of the database. Used for back-end data warehouses:
– Example: Data mining.
Key Challenges:
– Data movement.
– Query planning.

Distributed OLAP
[figure: the application server issues a single complex query that spans partitions P1–P5]

Distributed Joins Are Hard
Assume tables are horizontally partitioned:
– Table1 Partition Key → table1.key
– Table2 Partition Key → table2.key
Q: How to execute?
SELECT * FROM table1, table2 WHERE table1.val = table2.val
The naïve solution is to send all partitions to a single node and compute the join there.
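The naïve strategy can be sketched as gathering every partition at one node and joining there; the dict-based rows and "val" column follow the query above, and are illustrative assumptions:

```python
def naive_distributed_join(table1_parts, table2_parts):
    """Ship ALL partitions of both tables to a single node, then
    evaluate table1.val = table2.val with a nested-loop join."""
    table1 = [row for part in table1_parts for row in part]  # full shuffle
    table2 = [row for part in table2_parts for row in part]  # full shuffle
    return [(r1, r2)
            for r1 in table1
            for r2 in table2
            if r1["val"] == r2["val"]]
```

The network cost is the size of both entire tables, even if only a few rows match, which is what semi-joins try to avoid.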

Semi-Joins
Main Idea: First distribute the join attributes between nodes and then recreate the full tuples in the final output.
– Send just enough data from each table to compute which rows to include in the output.
Lots of choices make this problem hard:
– What to materialize?
– Which table to send?
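Under the same illustrative assumptions (dict rows, join on "val"), a semi-join can be sketched so that only table2's join column travels to table1's nodes, and only matching table1 rows travel onward:

```python
def semi_join_filter(table1_parts, table2_parts):
    """Step 1: project table2 down to its join attribute and broadcast it.
    Step 2: each table1 node filters locally; only the matching rows are
    shipped to the node that assembles the final join output."""
    join_vals = set()
    for part in table2_parts:                  # small projection, not full rows
        join_vals.update(row["val"] for row in part)
    matching = []
    for part in table1_parts:                  # local filtering at each node
        matching.extend(row for row in part if row["val"] in join_vals)
    return matching
```

Compared with the naïve plan, only the projected join column and the matching rows cross the network, at the cost of an extra round of messages.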

Summary
Everything is harder in a distributed setting:
– Concurrency Control
– Query Execution
– Recovery

Next Class
Discuss distributed OLAP more.
– You’ll learn why MapReduce was a bad idea.
Compare OldSQL vs. NoSQL vs. NewSQL.
Real-world Systems.