© 2014 Zvi M. Kedem
Unit 12: Advanced Concepts
Characteristics of Some Applications
- A typical application: a security trading system
- Fast response
- Fault tolerance: must continue operating
- Fast application development
- Correctness is less important for decision making (though not for trade execution): note the conceptual similarity to OLAP
- Runs on clusters of machines, so it is really a distributed database plus trading algorithms
- Such applications do not use relational databases: too heavyweight
- We will look at some concepts of distributed databases
Distributing The Data
Collection of Machines, Each Running a DBMS
- Each machine runs some DBMS, not necessarily a relational database system
- But each has some version of:
  - Physical implementation: file system, indexes, ...
  - Query processor
  - Recovery mechanism
  - Concurrency mechanism
- The new issue: coordinating the concurrent execution of several machines
Issues to Revisit
- ACID properties
- Query execution planning
- We will talk very briefly about:
  - Recovery
  - Concurrency
  - Query execution planning
Recovery
Global Recovery
- We have a local recovery manager on each machine
- It is able to guarantee A (Atomicity), C (Consistency), and D (Durability) for transactions executing on its own machine
- We need to guarantee ACD also for transactions that run on more than one machine
- For example, such a transaction must be either committed or aborted globally: the work on every machine must be either committed or aborted (rolled back) as a unit
Our Old Example: Money Transfer
- Items a and b are stored on a disk attached to some machine running a DBMS
- Transfer $5 from account a to b:
  1. transaction starts
  2. read a into xa (a local variable in RAM)
  3. xa := xa − 5
  4. write xa onto a
  5. read b into xb (a local variable in RAM)
  6. xb := xb + 5
  7. write xb onto b
  8. transaction ends
- If the initial values are a = 8 and b = 1, then after the execution a = 3 and b = 6
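The eight steps above can be sketched against a single-machine DBMS. The sketch below uses Python's sqlite3 with a made-up "accounts" table; on one machine, the local recovery manager alone provides atomicity and durability.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit; we issue BEGIN ourselves
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('a', 8), ('b', 1)")

try:
    conn.execute("BEGIN")                                                   # 1. transaction starts
    (xa,) = conn.execute("SELECT balance FROM accounts WHERE name = 'a'").fetchone()  # 2. read a
    xa = xa - 5                                                             # 3. xa := xa - 5
    conn.execute("UPDATE accounts SET balance = ? WHERE name = 'a'", (xa,)) # 4. write xa onto a
    (xb,) = conn.execute("SELECT balance FROM accounts WHERE name = 'b'").fetchone()  # 5. read b
    xb = xb + 5                                                             # 6. xb := xb + 5
    conn.execute("UPDATE accounts SET balance = ? WHERE name = 'b'", (xb,)) # 7. write xb onto b
    conn.execute("COMMIT")                                                  # 8. transaction ends
except sqlite3.Error:
    conn.execute("ROLLBACK")  # atomicity: either both writes persist or neither

print(dict(conn.execute("SELECT name, balance FROM accounts")))  # {'a': 3, 'b': 6}
```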
Old Example: New Scenario
- There are 3 DBMS machines: nodes in a cluster
- M1 is the coordinator
- M2 is a participant
- M3 is a participant
- The user interacts with M1
- M2 stores a on its local disk
- M3 stores b on its local disk
Our New Example: Money Transfer
- The user asks to transfer $5 from account a to b
- M1 will be the coordinator
- M2 and M3 will be the participants
- A very rough sketch of the execution:
  1. M1 starts a global transaction
  2. M1 tells M2 to subtract 5 from a
  3. M1 tells M3 to add 5 to b
  4. M2 starts a local transaction to subtract 5 from a
  5. M3 starts a local transaction to add 5 to b
  6. M1, M2, and M3 cooperate so that "everything" is atomically committed or aborted: all the local transactions commit, or all abort
Two-Phase Commit Protocol (the original slides show simplified flowcharts for these cases)
- All participants ready: all commit
- A participant aborts: all abort
- A participant is not ready: all abort
- Some (other) participant is not ready: all abort
- Coordinator decides: global commit
- A participant is uncertain: it must wait
Two-Phase Commit: Many Optimizations Possible
- A participant can report that it is ready on its own initiative
- A participant can report that it must abort on its own initiative
- If a participant crashes while uncertain, it can ask the other participants whether they know what the decision was
- ...
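The flows above boil down to one decision rule: the coordinator commits globally only if every participant votes that it is ready. A minimal single-process sketch (class and method names are illustrative, not from the slides):

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "working"

    def prepare(self):
        # Phase 1: vote YES only if the local transaction can be made durable.
        self.state = "ready" if self.can_commit else "aborted"
        return self.state == "ready"

    def finish(self, decision):
        # Phase 2: obey the coordinator's global decision.
        self.state = decision

def two_phase_commit(participants):
    # Phase 1: collect votes; a single NO forces a global abort.
    votes = [p.prepare() for p in participants]
    decision = "committed" if all(votes) else "aborted"
    # The coordinator would log the decision here (the commit point), then Phase 2:
    for p in participants:
        if p.state != "aborted":
            p.finish(decision)
    return decision

print(two_phase_commit([Participant("M2"), Participant("M3")]))                   # committed
print(two_phase_commit([Participant("M2"), Participant("M3", can_commit=False)])) # aborted
```

A real implementation adds logging, timeouts, and crash recovery on both sides; this sketch shows only the voting logic.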
Another Issue: Global Deadlock Handling
- Assume a system with strict two-phase locking (locks held until after commit)
- The system uses two-phase commit
- M1 "spawned" two transactions:
  - T[1,1] executing at site S1
  - T[1,2] executing at site S2
- M2 "spawned" two transactions:
  - T[2,1] executing at site S1
  - T[2,2] executing at site S2
- Only after the global commit of M1 can the locks of T[1,1] and T[1,2] be released
- Only after the global commit of M2 can the locks of T[2,1] and T[2,2] be released
- S1 contains items a and b
- S2 contains items c and d
Another Issue: Global Deadlock Handling
- At S1: T[1,1] locks a; T[2,1] locks b; T[1,1] waits to lock b
- At S2: T[1,2] locks c; T[2,2] locks d; T[2,2] waits to lock c
- For T[1,1] to continue, T[2,1] has to release a lock
- That can only happen after M2 (with T[2,1] and T[2,2]) has committed globally
- For T[2,2] to continue, T[1,2] has to release a lock
- That can only happen after M1 (with T[1,1] and T[1,2]) has committed globally
Another Issue: Global Deadlock Handling
- We have a global deadlock
- Yet there is no local deadlock anywhere
- This makes it difficult to detect
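The invisibility of the deadlock can be illustrated with wait-for graphs, collapsing each T[i,j] into its global transaction Mi: each site's local graph is acyclic, but their union contains a cycle. A small sketch with a hypothetical has_cycle helper (an edge u -> v means "u waits for a lock held by v"):

```python
def has_cycle(edges):
    # Detect a cycle in the directed wait-for graph by depth-first search.
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)
    visiting, done = set(), set()

    def dfs(node):
        if node in visiting:
            return True          # found a back edge: a cycle
        if node in done:
            return False
        visiting.add(node)
        if any(dfs(nxt) for nxt in graph.get(node, [])):
            return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(dfs(n) for n in list(graph))

s1 = [("M1", "M2")]   # at S1: T[1,1] waits for b, which is held by T[2,1]
s2 = [("M2", "M1")]   # at S2: T[2,2] waits for c, which is held by T[1,2]
print(has_cycle(s1), has_cycle(s2), has_cycle(s1 + s2))  # False False True
```

Detecting this in practice requires combining the local graphs at some site, which is exactly what makes global deadlock detection expensive.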
Concurrency
Global Concurrency Management
- We assume that we know how to manage recovery, that is, a distributed transaction either commits or aborts at all sites on which it executes
- So ACD is guaranteed
- We need to guarantee I (Isolation) also for transactions that run on more than one machine
- Each machine runs a local concurrency manager, which we assume operates using rigorous locking
- All locks are held until after the local commit or abort on each machine
- In case of global commit, all the locks are held until after the global commit decision: the coordinator writes the commit record on its log
- This guarantees global serializability
Extension to Multiple Copies (Replication): One Machine vs. Two Machines
Advantages of Data Replication
- It may be useful to replicate some data
- To improve fault tolerance:
  - If Machine 1 crashes, we can still access "the blue data" on Machine 2
- To improve efficiency:
  - Both Machine 1 and Machine 2 can access "the blue data" locally
  - So they do not have to use the network to access that data, and can access it fast
Problems With Data Replication
- We need to keep the replicated data consistent
- "The blue data" has to be the same on Machine 1 and on Machine 2
- So, if some transaction running on Machine 1 modifies "the blue data", we must make sure that the same modification is made (preferably transparently, by the system) to "the blue data" on Machine 2
- So perhaps we could use the following protocol:
  - If a transaction wants to modify "the blue data" on one machine, we transparently make sure that it is modified in the same way on both machines
  - If a transaction wants to read "the blue data", it can read it from any machine
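The protocol sketched above is the classic read-any / write-all scheme: every write must reach all replicas, while a read may use any single one. A minimal in-memory illustration (the replica list and function names are made up):

```python
replicas = [{"blue": 10}, {"blue": 10}]   # copies of "the blue data" on Machine 1 and Machine 2

def write_all(key, value):
    # A write must be applied to every replica to keep the copies identical.
    for copy in replicas:
        copy[key] = value

def read_any(key, machine=0):
    # A read may use any single replica, since all copies agree after each write.
    return replicas[machine][key]

write_all("blue", 42)
print(read_any("blue", machine=0), read_any("blue", machine=1))  # 42 42
```

The weakness, discussed next, is that write_all blocks as soon as any replica is unreachable.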
A Nightmare Scenario: Network Partition
- The network partitions into two sets of machines that cannot communicate with each other:
  1. Machine 1
  2. Machine 2 and Machine 3
- Now no transaction can be allowed to modify "the blue data"
- Because if this were possible, the modification could be applied on only one side of the partition
- Then "the blue data" would not be consistent
- A transaction that reads "the blue data" on Machine 1 would get a different result than a transaction that reads "the blue data" on Machine 2
Thomas Majority Rule (Example: Sufficient For Understanding)
- There is a data item X that is replicated on 5 machines: M1, M2, M3, M4, M5
- A majority of these machines is 3
- The data item is stored as a pair (X, T), where T is the timestamp at which it was last written, assuming the existence of a global clock known to everybody (easy to implement, e.g., an atomic clock broadcasting on radio from Colorado)
- To write X, access a majority (at least 3) of the sites and replace the existing (X, T) with (Xnew, Tcurrent)
- To read X, access a majority (at least 3) of the sites and read their pairs (X, T); find the one in which T is the largest and return the corresponding X
Thomas Majority Rule (Example: Sufficiently General)
- (In the original slides, the copies of (X, T) at the majority of sites used in each step are shown in red)
- Initial state at the 5 sites: (10,0) (10,0) (10,0) (10,0) (10,0)
- Majority used to write 20 into X at time 1 is M1, M2, M3: (20,1) (20,1) (20,1) (10,0) (10,0)
- Majority used to write 30 into X at time 3 is M2, M3, M4: (20,1) (30,3) (30,3) (30,3) (10,0)
- Majority used to read X at time 6 is M3, M4, M5; retrieved: (30,3) (30,3) (10,0)
- Since the largest timestamp is 3, the correct value for X is 30
- The protocol works because any two sets of at least 3 of the 5 machines have at least one machine in common, so a read majority always contains at least one copy carrying the latest timestamp
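The example above can be replayed in a few lines. In the sketch below, replicas, write, and read are illustrative names, and plain integers stand in for the global clock:

```python
replicas = [(10, 0)] * 5   # (X, T) pairs on M1..M5, all starting at (10, 0)
MAJORITY = 3

def write(value, timestamp, sites):
    # Overwrite (X, T) at any majority of the sites.
    assert len(sites) >= MAJORITY
    for i in sites:
        replicas[i] = (value, timestamp)

def read(sites):
    # Read any majority and return the value with the largest timestamp:
    # any two majorities intersect, so one of these copies is the latest.
    assert len(sites) >= MAJORITY
    return max((replicas[i] for i in sites), key=lambda pair: pair[1])[0]

write(20, 1, [0, 1, 2])   # time 1: majority M1, M2, M3
write(30, 3, [1, 2, 3])   # time 3: majority M2, M3, M4
print(read([2, 3, 4]))    # time 6: majority M3, M4, M5 -> 30
```

Note that M5 still holds the stale pair (10, 0) after the read; the rule never requires bringing minority copies up to date.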
Thomas Majority Rule: General Network Partitioning
- Machines in a partition that does not include a majority of the copies cannot act on those copies:
  - Cannot read
  - Cannot write
- So this does not solve the problem of "the blue data": with only two copies, a majority means we always need to access both
Query Execution Planning
New Issue: Movement of Data
- We now have another cost to consider: moving data among machines
- We will look at one example where we try just to decrease the cost of moving data
- We have two machines: M1 and M2
- On M1 we have a relation R(A,B)
- On M2 we have a relation S(C,D)
- Assume for simplicity that R and S are of the same size
- We want to compute
  SELECT A, C
  FROM R, S
  WHERE R.B = S.D;
  and have the result at M2
An Execution Plan
One choice:
- Copy S to M1
- Compute the result
- Send the result to M2
A better choice?
- Copy R to M2
- Compute the result (it is already at M2)
But if S is small and R is large, the first choice may be better:
- Copy S to M1
- Compute the result
- Send the result to M2
Even Better Execution Plan, If the Parameters Are Right
- On M2 compute:
  INSERT INTO TEMP1 SELECT DISTINCT D FROM S;
- Copy TEMP1 to M1
- On M1 compute:
  INSERT INTO TEMP2 SELECT A, B FROM R, TEMP1 WHERE B = D;
- Copy TEMP2 to M2
- On M2 compute:
  INSERT INTO ANSWER SELECT A, C FROM TEMP2, S WHERE B = D;
- Very good if TEMP1 and TEMP2 are relatively small
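The plan above can be simulated with two separate in-memory SQLite databases standing in for M1 and M2 (the sample data is made up); only TEMP1 and TEMP2 ever cross the "network":

```python
import sqlite3

m1 = sqlite3.connect(":memory:")   # M1 holds R(A, B)
m2 = sqlite3.connect(":memory:")   # M2 holds S(C, D)
m1.execute("CREATE TABLE R (A, B)")
m1.executemany("INSERT INTO R VALUES (?, ?)", [(1, "x"), (2, "y"), (3, "z")])
m2.execute("CREATE TABLE S (C, D)")
m2.executemany("INSERT INTO S VALUES (?, ?)", [(10, "x"), (20, "x"), (30, "q")])

# Step 1 (on M2): TEMP1 = the distinct join-column values of S.
temp1 = m2.execute("SELECT DISTINCT D FROM S").fetchall()

# Step 2: "ship" TEMP1 to M1 -- only these few values move.
m1.execute("CREATE TABLE TEMP1 (D)")
m1.executemany("INSERT INTO TEMP1 VALUES (?)", temp1)

# Step 3 (on M1): TEMP2 = the semijoin, i.e. R-tuples that have a match in S.
temp2 = m1.execute("SELECT A, B FROM R, TEMP1 WHERE B = D").fetchall()

# Step 4: ship TEMP2 to M2 and finish the join there.
m2.execute("CREATE TABLE TEMP2 (A, B)")
m2.executemany("INSERT INTO TEMP2 VALUES (?, ?)", temp2)
answer = m2.execute("SELECT A, C FROM TEMP2, S WHERE B = D ORDER BY A, C").fetchall()
print(answer)   # [(1, 10), (1, 20)]
```

Here TEMP2 has a single row, so shipping it is far cheaper than shipping all of R, which is the whole point of the plan.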
We Used a Semijoin
- Our TEMP2 was the left semijoin of R and S, that is, the set of all tuples of R for which there is a "matching" tuple in S (under the WHERE equality condition)
- Notation: R ⋉ S
- Similarly, we can define a right semijoin, denoted by R ⋊ S
NoSQL Has To Compromise
CAP Theorem
- Stated without defining the terms precisely: if we have more than one machine and replicate the data, we can get only 2 of the following 3 properties:
  1. Consistency (you always see a consistent state when accessing the data)
  2. Availability (if you can access a machine, it can read and write the items it stores)
  3. Partition tolerance (you can keep working in the presence of network partitions)
- So, to get A and P, you may be willing to sacrifice C
Key Ideas
- NoSQL databases and distributed databases
- Two-phase commit
- Global deadlocks
- Concurrency control with distributed data
- Query processing with distributed data
- The CAP theorem