The End of an Architectural Era (It’s Time for a Complete Rewrite)


In Summary
  The paper argues that RDBMSs as we know them were designed for a different world:
    –Ad hoc queries
    –Interactive use
    –Dumb clients
  Time for a change?

Design Considerations
  Database fits in main memory
  Transactions rarely wait → single thread (thread per core, where appropriate, sharing nothing)
  Dynamically scale by adding/removing nodes
  Replication for fault tolerance
  All replicas actively processing
  No need to redo, only temporary undo
  Self-tuning (“no knobs”)

Design Considerations
  Run application logic in the same process as the DBMS (stored procedures)
  Use optimistic concurrency control methods
  Avoid commit protocols requiring a wait for other sites

H-Store
  Specify transaction classes in advance
    –Class example: “delete all rows from Orders where customer = $(customer)”
  Specify table definitions in advance
  Many processing/storage nodes, divided into replica groups
  Techniques to accelerate specific subsets of possible transactions
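To make "transaction classes specified in advance" concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in engine. The registry name TRANSACTION_CLASSES, the class name delete_customer_orders, and the Orders columns are assumptions for illustration only; this is not H-Store's interface.

```python
import sqlite3

# Hypothetical registry: every transaction class is declared before any
# request arrives, as a named, parameterised statement.
TRANSACTION_CLASSES = {
    "delete_customer_orders": "DELETE FROM Orders WHERE customer = :customer",
}


def run_class(conn, name, **params):
    """Run a pre-declared transaction class; anything undeclared is rejected."""
    if name not in TRANSACTION_CLASSES:
        raise KeyError(f"unknown transaction class: {name}")
    with conn:  # commits on success, rolls back if an exception escapes
        cursor = conn.execute(TRANSACTION_CLASSES[name], params)
        return cursor.rowcount


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Orders (id INTEGER PRIMARY KEY, customer TEXT)")
    conn.executemany(
        "INSERT INTO Orders (customer) VALUES (?)",
        [("steve",), ("steve",), ("dave",)],
    )
    deleted = run_class(conn, "delete_customer_orders", customer="steve")
    remaining = conn.execute("SELECT COUNT(*) FROM Orders").fetchone()[0]
    return deleted, remaining
```

Because the full set of statements is known up front, a system can analyse each class ahead of time, which is what the later slides on one-shot and sterile transactions rely on.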

Replica Consistency
  Timestamp ordering
  Timestamps assigned locally
  Clocks “nearly in sync” (NTP)
  Wait a “small period of time” to avoid misordering transactions
  The paper refers to a “maximum delay”; is that not unbounded on an Ethernet?
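The wait-then-order idea can be sketched as a small priority queue. This is an illustrative model only: the class name ReplicaQueue, the integer clock, and the tie-break on site id are assumptions, and the scheme is only correct if the wait really exceeds the worst-case network delay plus clock skew, which is the point the slide questions.

```python
import heapq


class ReplicaQueue:
    """Holds incoming transactions and releases them in timestamp order,
    but only once `wait` time units have passed since their timestamp."""

    def __init__(self, wait):
        self.wait = wait
        self._heap = []

    def receive(self, timestamp, site_id, txn):
        # Ties on timestamp are broken by the originating site id.
        heapq.heappush(self._heap, (timestamp, site_id, txn))

    def release(self, now):
        """Return every transaction whose waiting period has elapsed."""
        ready = []
        while self._heap and self._heap[0][0] + self.wait <= now:
            _, _, txn = heapq.heappop(self._heap)
            ready.append(txn)
        return ready
```

For example, with wait=5, a transaction stamped 8 that arrives after one stamped 10 is still released first, because neither is released until its own waiting period is over.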

Optimistic Concurrency
  Transactions are short
  No local locking at all
  Design to avoid contention

Tree Schemas
  Customer → Order → Line

Constrained Tree Applications
  For each transaction, all queries refer to the same entry in the root table or to rows related to it
  Horizontally divide the root table according to ranges or hash-ranges on its primary key (not automated)
  Divide the other tables such that rows are colocated with their related rows in the root table
  Site 1: Steve; Steve’s Orders; Steve’s Orders’ Lines
  Site 2: Dave; Dave’s Orders; Dave’s Orders’ Lines
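A rough sketch of hash-partitioning the root table and colocating descendants with their root row follows. NUM_SITES, the use of CRC32 as the hash, and the dictionary shapes are all assumptions chosen for brevity; the slides do not say how the hash ranges are computed.

```python
import zlib

NUM_SITES = 2  # assumed cluster size


def site_for(customer_id):
    """Pick a site from a stable hash of the root table's primary key."""
    return zlib.crc32(customer_id.encode("utf-8")) % NUM_SITES


def place(customers, orders, lines):
    """Assign every row to a site.

    customers: list of customer ids (root table)
    orders:    {order_id: customer_id}
    lines:     {line_id: order_id}
    """
    sites = [{"customers": [], "orders": [], "lines": []} for _ in range(NUM_SITES)]
    for customer in customers:
        sites[site_for(customer)]["customers"].append(customer)
    for order_id, customer in orders.items():
        # An order follows its customer.
        sites[site_for(customer)]["orders"].append(order_id)
    for line_id, order_id in lines.items():
        # A line follows its order, and therefore its customer.
        sites[site_for(orders[order_id])]["lines"].append(line_id)
    return sites
```

Whichever site a customer hashes to, all of that customer's orders and order lines land on the same site, so a transaction touching one customer's tree stays single-sited.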

Making non-trees faster
  Single-sited transactions similar to tree case
  Replicate read-only data
  Try to make applications “one-shot”:
    –No intra-transaction dataflow
    –No inter-site communication within transactions
    –Vertically partition tables to achieve this (not automated)
    –Enables decomposition of transactions into single-sited subplans
    –Decompose and dispatch: no need for further communication

Vertical Partitioning Example
  Tables: Managers, Drivers, Cars
  Transaction 1: For a given manager, find his cars and set their colour
  Transaction 2: For a given driver, find his cars and mark them as sold
  Store the OWNED column of Cars with its associated driver
  Store the COLOUR column of Cars with its associated manager
  Store the primary key of Cars in both locations (it is read only)
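The split of Cars can be sketched as follows. The field names (id, manager, driver, colour, owned) and the assumption that each car has exactly one manager and one driver are illustrative, not taken from the slides.

```python
def vertically_partition(cars):
    """Split Cars into two fragments keyed by the row each is stored with.

    cars: list of dicts with keys id, manager, driver, colour, owned.
    The car id (read-only primary key) appears in both fragments.
    """
    by_manager, by_driver = {}, {}
    for car in cars:
        by_manager.setdefault(car["manager"], {})[car["id"]] = {"colour": car["colour"]}
        by_driver.setdefault(car["driver"], {})[car["id"]] = {"owned": car["owned"]}
    return by_manager, by_driver


def set_colour(by_manager, manager, colour):
    """Transaction 1: touches only the fragment stored with the manager."""
    for fragment in by_manager.get(manager, {}).values():
        fragment["colour"] = colour


def mark_sold(by_driver, driver):
    """Transaction 2: touches only the fragment stored with the driver."""
    for fragment in by_driver.get(driver, {}).values():
        fragment["owned"] = False
```

Each transaction now reads and writes a single fragment, so neither needs data from the other's site.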

One-Shot Example
  Good: “Find a manager and mark his cars as blue, then find and delete a given driver”
    –Second phrase doesn’t depend on first
    –Can decompose even though manager and driver may not be colocated
  Bad: “Get a given employee’s salary, and deduct that figure from a department’s budget”
    –Dependency
    –Many-to-many relationship
    –Where to store the departments?
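The "good" case can be sketched as two subplans that are dispatched without either seeing the other's result. The site layout and the helper names paint_cars_blue, delete_driver, and dispatch_one_shot are hypothetical.

```python
def paint_cars_blue(manager):
    """Subplan that runs entirely at the site holding the manager's cars."""
    def plan(local):
        for car in local["cars_by_manager"].get(manager, []):
            local["colour"][car] = "blue"
    return plan


def delete_driver(driver):
    """Subplan that runs entirely at the site holding the driver."""
    def plan(local):
        local["drivers"].discard(driver)
    return plan


def dispatch_one_shot(subplans, sites):
    """Send each subplan to its site; no subplan receives another's output."""
    for site_name, plan in subplans:
        plan(sites[site_name])
```

The "bad" case cannot be written this way: the budget update needs the salary value read by the first step, so data would have to flow between subplans.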

Two-phase transactions
  Read things from many sites
  Maybe abort
  Write things to many sites
  Strongly two-phase: based on reads, can make a site-local abort decision
  No undo logs required
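A minimal single-site sketch of the read / maybe-abort / write shape shows why no undo log is needed: the abort decision is taken before anything is written. The Abort exception, run_two_phase, and the transfer example are illustrative assumptions.

```python
class Abort(Exception):
    """Raised by the read phase when the transaction must not proceed."""


def run_two_phase(store, read_phase, write_phase):
    values = read_phase(store)   # may raise Abort; nothing has been written yet
    write_phase(store, values)   # reached only if the read phase did not abort


def transfer(store, src, dst, amount):
    """Move `amount` from src to dst; returns True on commit, False on abort."""
    def read_phase(s):
        if s[src] < amount:
            raise Abort("insufficient funds")
        return s[src], s[dst]

    def write_phase(s, values):
        src_balance, dst_balance = values
        s[src] = src_balance - amount
        s[dst] = dst_balance + amount

    try:
        run_two_phase(store, read_phase, write_phase)
        return True
    except Abort:
        return False
```

An aborted transfer leaves the store untouched without any rollback step, because the write phase never started.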

Sterile Transactions
  A transaction which may run arbitrarily interleaved with any other transaction and always produce the same final state
  Obviously no need for concurrency control
  However, no guarantee that the transaction’s commit/abort decision will be unaffected
  Need a vote
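One way to build intuition for this property is to check whether a set of operations reaches the same final state in every order. The sketch below tests whole-operation orderings only, which is a weaker condition than arbitrary interleaving, and the helpers add and double are made up for the example.

```python
from itertools import permutations


def final_states(initial, operations):
    """Return the set of distinct final states over all orderings."""
    states = set()
    for order in permutations(operations):
        state = dict(initial)
        for op in order:
            op(state)
        states.add(tuple(sorted(state.items())))
    return states


def add(n):
    def op(state):
        state["stock"] += n
    return op


def double(state):
    state["stock"] *= 2
```

Two increments always end in the same state, so their order does not matter; an increment combined with a doubling does not, so those two would need ordering.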

General Transactions: Basic
  Decompose to individual sites
  At a site, wait for the same “small period of time”
  Execute if there are no uncommitted earlier transactions pending, abort otherwise
  Need an undo log, as might abort later
  Wait for results from sites, issue next wave
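The per-site rule can be sketched as below. The Site class, its method names, and the representation of a subplan as a dictionary of writes are assumptions; the waiting period and the multi-wave coordination are left out.

```python
_MISSING = object()  # marks keys that did not exist before a write


class Site:
    def __init__(self, data):
        self.data = dict(data)
        self.pending = {}  # timestamp -> undo log of an uncommitted transaction

    def execute(self, ts, writes):
        """Apply writes unless an earlier-stamped transaction is uncommitted."""
        if any(earlier < ts for earlier in self.pending):
            return False  # abort: an earlier transaction is still pending
        undo = {key: self.data.get(key, _MISSING) for key in writes}
        self.data.update(writes)
        self.pending[ts] = undo
        return True

    def commit(self, ts):
        self.pending.pop(ts)  # the undo log is no longer needed

    def abort(self, ts):
        """Roll back a previously executed transaction using its undo log."""
        for key, old in self.pending.pop(ts).items():
            if old is _MISSING:
                self.data.pop(key, None)
            else:
                self.data[key] = old
```

Unlike the two-phase case, writes here happen before the global outcome is known, which is why the undo log is kept until commit.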

General Transactions: Advanced
  If there are too many aborts:
    –Wait longer to determine if a plan belonging to an earlier transaction appears
  If there are still too many aborts:
    –Track the complete read-set and write-set of each transaction, and abort only if strictly necessary
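Read-set and write-set tracking comes down to a conflict test like the one below. This is the standard textbook condition, offered as a sketch; the slides do not spell out the exact rule H-Store applies.

```python
def conflicts(a_reads, a_writes, b_reads, b_writes):
    """True if transactions A and B cannot safely run in either order.

    They conflict when one writes something the other reads or writes.
    All four arguments are sets of item keys.
    """
    return bool(a_writes & (b_reads | b_writes)) or bool(a_reads & b_writes)
```

Two transactions that only read a common item, or that touch disjoint items, pass this test and need not be aborted.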

Performance
  TPC-C benchmark
  Five transaction classes
  Able to make all classes one-shot
  Needed to vertically partition a table which never experiences inserts or deletes
  Able to make every transaction sterile → the “small delay” was removed for the test!
  82x speedup over an undisclosed “very popular commercial RDBMS”