Quiz 2 Review.


1 Quiz 2 Review

2 Key Concepts
ACID Transactions
Serializability
Concurrency control
  Two-phase locking
  Optimistic concurrency control
Recovery
Eventual consistency / Dynamo
Distributed databases
Main memory DBs

3 View Serializability
A particular ordering of instructions in a schedule S is view equivalent to a serial ordering S' iff:
  Every value read in S is the same value that was read by the same read in S'.
  The final write of every object is done by the same transaction T in S and S'.

4 Conflict Serializability
A schedule is conflict serializable if it is possible to swap non-conflicting operations to derive a serial schedule. For all pairs of conflicting operations {O1 in T1, O2 in T2} either O1 always precedes O2, or O2 always precedes O1.

5 Precedence Graph
Given transactions Ti and Tj, create an edge Ti → Tj if:
  Ti reads or writes some A before Tj writes A, or
  Ti writes some A before Tj reads A.
If there are cycles in this graph, the schedule is not conflict serializable.
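
The precedence-graph test can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the schedule encoding as `(transaction, op, item)` tuples and the function names `precedence_graph`, `has_cycle`, and `is_conflict_serializable` are assumptions made for this example.

```python
def precedence_graph(schedule):
    """Build the set of edges (Ti, Tj) from a schedule.

    schedule is a list of (txn, op, item) tuples in execution order,
    where op is "R" or "W" (an encoding assumed for this sketch).
    """
    edges = set()
    for i, (ti, op_i, item_i) in enumerate(schedule):
        for tj, op_j, item_j in schedule[i + 1:]:
            # Two operations conflict if they come from different
            # transactions, touch the same item, and at least one writes.
            if ti != tj and item_i == item_j and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
    in_progress, finished = set(), set()

    def visit(node):
        in_progress.add(node)
        for nxt in graph.get(node, ()):
            if nxt in in_progress:
                return True
            if nxt not in finished and visit(nxt):
                return True
        in_progress.discard(node)
        finished.add(node)
        return False

    return any(node not in finished and visit(node) for node in list(graph))


def is_conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))
```

For example, the schedule `T1.R(A), T2.W(A), T2.R(B), T1.W(B)` yields edges T1 → T2 (on A) and T2 → T1 (on B), so it is not conflict serializable.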

6 Two-Phase Locking
Transaction steps:
  Lock acquisition (growing)
  Lock release (shrinking)
Locks may be shared (read), exclusive (write), or upgraded (was read, now write).
The phantom problem arises when locking does not reflect the granularity of data access.
  E.g., T1 inserts one tuple at a time while T2 scans the whole table repeatedly.
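
The two-phase rule (no lock may be acquired once any lock has been released) and the shared/exclusive compatibility rule can be checked mechanically. The sketch below is illustrative only; the `("lock" | "unlock", item)` encoding, the mode names `"S"` and `"X"`, and both function names are assumptions for this example.

```python
def compatible(held_mode, requested_mode):
    """Shared ("S") locks coexist; anything involving exclusive ("X") does not."""
    return held_mode == "S" and requested_mode == "S"


def obeys_two_phase_locking(actions):
    """Check one transaction's lock actions against the two-phase rule.

    actions is a list of ("lock", item) or ("unlock", item) tuples in
    the order the transaction issues them.
    """
    shrinking = False
    for kind, _item in actions:
        if kind == "unlock":
            # The first release ends the growing phase.
            shrinking = True
        elif kind == "lock" and shrinking:
            # Acquiring after a release violates 2PL.
            return False
    return True
```

A transaction that locks A, releases A, and then locks B fails this check, while one that locks A and B before releasing either passes.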

7 Deadlock Detection
Determine if two or more xactions are waiting for one another.
Usually solved by ordering lock acquisition.
  Only works if we know all locks needed up front.
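
One common way to determine whether transactions are waiting for one another is to search a waits-for graph for a cycle. The following is a minimal sketch under that assumption; the `waits_for` dictionary layout and the name `find_deadlock` are hypothetical and chosen for this example.

```python
def find_deadlock(waits_for):
    """Return one cycle of mutually waiting transactions, or None.

    waits_for maps each transaction to the set of transactions whose
    locks it is currently waiting on.
    """
    visited = set()

    def dfs(node, path, on_path):
        visited.add(node)
        path.append(node)
        on_path.add(node)
        for nxt in sorted(waits_for.get(node, ())):
            if nxt in on_path:
                # Found a back edge: the cycle is the tail of the path.
                return path[path.index(nxt):]
            if nxt not in visited:
                cycle = dfs(nxt, path, on_path)
                if cycle:
                    return cycle
        path.pop()
        on_path.discard(node)
        return None

    for start in sorted(waits_for):
        if start not in visited:
            cycle = dfs(start, [], set())
            if cycle:
                return cycle
    return None
```

With `waits_for = {"T1": {"T2"}, "T2": {"T1"}}` the function reports the cycle T1, T2; a system would then typically abort one of the transactions in the cycle.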

8 Study Break: Locking
Consider the following transactions:
T1: read(A), write(A), release(A), read(B), write(B), release(B)
T2: read(B), write(B), release(B), read(A), write(A), release(A)
Draw a schedule that:
  Uses 2PL and both transactions succeed
  Has a deadlock

9 Study Break Solution
Successful: T1.RA, T1.WA, T1.RB, T1.WB, T1.rel(A), T1.rel(B), T2.RB, T2.WB, T2.RA, T2.WA, T2.rel(B), T2.rel(A)
Deadlocked: T1.RA, T1.WA, T2.RB, T2.WB, T1.RB, T1.WB (deadlocks)

10 Optimistic Concurrency Control
Don’t wait for locks.
Check for serializability after the fact.
3 phases: read, verify, write
Parallel vs serialized verification
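
The read / verify / write phases can be illustrated with a toy in-memory store that uses serialized verification: a transaction passes only if no transaction that committed after it started wrote something it read. This is a sketch under those assumptions; the `OptimisticStore` class and its method names are invented for this example.

```python
class OptimisticStore:
    """Toy key-value store with serialized optimistic validation."""

    def __init__(self):
        self.data = {}
        self.commit_log = []  # write sets of committed txns, in commit order

    def begin(self):
        # Remember how many commits existed when this transaction started.
        return {"start": len(self.commit_log), "reads": set(), "writes": {}}

    def read(self, txn, key):
        # Read phase: track the read set, see our own buffered writes first.
        txn["reads"].add(key)
        if key in txn["writes"]:
            return txn["writes"][key]
        return self.data.get(key)

    def write(self, txn, key, value):
        # Writes are buffered privately until commit.
        txn["writes"][key] = value

    def commit(self, txn):
        # Verify phase: conflict if a later committer wrote what we read.
        for write_set in self.commit_log[txn["start"]:]:
            if write_set & txn["reads"]:
                return False  # caller should abort and retry
        # Write phase: install buffered writes and record the write set.
        self.data.update(txn["writes"])
        self.commit_log.append(set(txn["writes"]))
        return True
```

If two transactions both read `x` and then both try to write it, the first commit succeeds and the second fails validation, so no lock is ever held but the outcome stays serializable.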

11 Recovery
Goal: losers undone, winners redone
Write-ahead logging
  Log all writes, plus xaction start / end
STEAL/FORCE
ARIES: 3 passes
  ANALYSIS: determine where to start replay from
  REDO: “repeat history”
  UNDO: remove effects of losers
“Physiological logging”
Dirty page table (DPT) / transaction table (XT)
Cheap checkpoints: just flush DPT/XT
Dirty pages flushed (not logged)
CLRs for recovering during crashes in recovery
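
The three passes can be sketched over a toy log to show the "losers undone, winners redone" outcome. This is deliberately far simpler than ARIES: it has no LSNs, dirty page table, checkpoints, or CLRs, and the record layout and the `recover` function are assumptions made for this example.

```python
def recover(log, disk):
    """Replay a toy write-ahead log against the on-disk state.

    log is a list of records in order:
      ("begin", txn), ("update", txn, key, before, after), ("commit", txn)
    disk is a dict holding whatever values reached disk before the crash.
    """
    state = dict(disk)

    # ANALYSIS: winners committed; losers began but never committed.
    winners = {rec[1] for rec in log if rec[0] == "commit"}
    losers = {rec[1] for rec in log if rec[0] == "begin"} - winners

    # REDO: repeat history by reapplying every update, winner or loser.
    for rec in log:
        if rec[0] == "update":
            _, _txn, key, _before, after = rec
            state[key] = after

    # UNDO: walk backwards restoring before-images of loser updates.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] in losers:
            _, _txn, key, before, _after = rec
            state[key] = before

    return state
```

In the usage below, T1 committed but its write never reached disk (no FORCE), while T2's uncommitted write did reach disk (STEAL); recovery repairs both.

```python
log = [
    ("begin", "T1"), ("update", "T1", "A", 0, 10),
    ("begin", "T2"), ("update", "T2", "B", 0, 20),
    ("commit", "T1"),
]
print(recover(log, {"A": 0, "B": 20}))
```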

12 Study Break: Recovery
In an ARIES-based logging system, how could you simplify the system if you knew that the database never wrote dirty pages from uncommitted transactions to disk?
Suppose you are designing a database for processing thousands of small transactions per second. Your system uses a multi-threaded design, with one thread per transaction, employs strict two-phase locking for concurrency control, and does write-ahead logging with a FORCE of the log tail before each commit for recovery. Its single disk is used for both the heap pages and the log. You find that your system can only process about 100 transactions per second, and observe that the average RAM utilization and CPU load of the system while processing transactions is very low. Assuming you cannot add new hardware, what would you recommend as a way to improve the throughput of the system?
Strict 2PL = release all locks only at transaction’s end

13 Recovery Solutions
This implements NO-STEAL (!STEAL) semantics, so no undo phase is necessary in recovery. Therefore, we do not need a before image for logging.
This issue is caused by either deadlocks or forcing the log to disk at the end of each transaction. Can improve with deterministic ordering of batches of transactions or grouping together commits.
Deterministic ordering a la the dining philosophers
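
Grouping commits together amortizes one disk force over several transactions. The sketch below only counts forces to make that effect visible; the `GroupCommitLog` class, its `batch_size` parameter, and the in-memory `durable` list standing in for the disk are all assumptions for this example. A real system would typically also force on a timeout so that a partly filled batch does not wait indefinitely.

```python
class GroupCommitLog:
    """Toy log that forces to 'disk' once per batch of commits."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.pending = []   # commit records not yet forced
        self.durable = []   # stand-in for records safely on disk
        self.forces = 0     # number of (expensive) disk forces issued

    def commit(self, txn_id):
        # Queue the commit record; the transaction is only durable
        # (and may only be acknowledged) after the next force.
        self.pending.append(txn_id)
        if len(self.pending) >= self.batch_size:
            self.force()

    def force(self):
        # One force makes every pending commit record durable at once.
        if self.pending:
            self.durable.extend(self.pending)
            self.pending.clear()
            self.forces += 1
```

With `batch_size=1`, 100 commits cost 100 forces; with `batch_size=10`, the same 100 commits cost 10 forces, so a disk-bound system could in principle commit about ten times as many transactions per second.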

14 Main Memory Databases
Parallelize transaction processing where the whole DB fits into collective memory (over all nodes).
Maintains ACID properties.
Locking makes distributed transactions slow.
Use replication to address durability.

15 Readings: Dynamo
Dynamo & Eventual Consistency
CAP theorem & ACID vs BASE
Uses consistent hashing & replication
Multiple versions of each data item
Vector clocks to track changes
Configurable numbers of replicas/readers/writers (N/R/W) to tune consistency guarantees
Eventual consistency: all replicas eventually agree on the state of an item
BASE = basically available, soft state, eventually consistent
CAP:
  Consistency (all nodes see the same data at the same time)
  Availability (a guarantee that every request receives a response about whether it succeeded or failed)
  Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system)
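
Two of the mechanisms listed above lend themselves to a short sketch: comparing vector clocks to decide whether one version supersedes another, and checking whether read and write quorums overlap. The clock representation (a dict from node name to counter), the node names, and the function names are assumptions for this example rather than Dynamo's actual interfaces.

```python
def descends(a, b):
    """True if clock a has seen every event recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    """Classify two versions by their vector clocks."""
    a_ge_b, b_ge_a = descends(a, b), descends(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "a supersedes b"
    if b_ge_a:
        return "b supersedes a"
    # Neither saw the other's update: both versions must be kept
    # until something reconciles them.
    return "concurrent"


def quorums_overlap(n, r, w):
    """With N replicas, any R readers intersect any W writers iff R + W > N."""
    return r + w > n
```

For instance, `compare({"X": 2, "Y": 1}, {"X": 1, "Y": 1})` reports that the first version supersedes the second, while clocks `{"X": 2}` and `{"Y": 1}` are concurrent. With N=3, choosing R=2 and W=2 gives overlapping quorums, whereas R=1 and W=1 trades that overlap for lower latency.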

16 Study Break: CAP Theorem
For each of the protocols below, determine which two parts of CAP it implements:
  A main memory database that uses optimistic concurrency control
  A sloppy quorum where several replicas agree on whether a write succeeds or fails and a single host notifies the user of their write’s status
  A distributed database that uses 2PL and uses deadlock detection for long-running transactions
CA AP CP

17 Parallel / Distributed Databases
Shared memory vs shared nothing
Goal: linear speedup (not always possible)
Horizontal partitioning
  Hash / range / round robin
Parallel query processing
  Filter / projection: applied locally
  Joins
    Proper partitioning: done locally
    Improper partitioning: requires repartitioning, or replication of one table on all nodes
  Group BYs
    “Pre-aggregation”: compute partial aggregates locally (SUMs, COUNTs, etc.), and then merge
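
Hash partitioning and pre-aggregation can be sketched together: each node computes a partial (SUM, COUNT) over its own partition and a coordinator merges the partials. The row layout, column names, and function names below are assumptions for this example, and plain Python lists stand in for the nodes.

```python
def hash_partition(rows, key, num_nodes):
    """Assign each row to a node by hashing its partitioning key."""
    partitions = [[] for _ in range(num_nodes)]
    for row in rows:
        partitions[hash(row[key]) % num_nodes].append(row)
    return partitions


def local_aggregate(partition, column):
    """Runs on each node: partial SUM and COUNT over local rows only."""
    return (sum(row[column] for row in partition), len(partition))


def merge(partials):
    """Runs on the coordinator: combine partial aggregates."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total, count


rows = [{"id": i, "score": 10 * i} for i in range(1, 7)]
partials = [local_aggregate(p, "score") for p in hash_partition(rows, "id", 2)]
total, count = merge(partials)
average = total / count  # AVG must be derived from SUM and COUNT, not averaged per node
```

Only two small tuples cross the network instead of every row, which is why the partial aggregates are computed before the merge step.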

18 Study Break: Distributed Query Planning
Using split and merge, create a distributed plan for each of the following queries. Use two nodes in your solutions. Presume Node 1 coordinates the output.
SELECT count(*) FROM students;
SELECT * FROM employee, office WHERE office.name='Tech' AND employee.oid = office.oid;
  Presume no tables are partitioned on oid

19 Study Break Solution

