The Birds and Bees Talk (aka Replication) Sushanth 22 nd April 2015, Hive Contributor’s Meetup.

Slides:



Advertisements
Similar presentations
Remus: High Availability via Asynchronous Virtual Machine Replication
Advertisements

Chapter 16: Recovery System
Transaction Program unit that accesses the database
Transactions and Recovery Checkpointing Souhad Daraghma.
Consistency and Replication Chapter 7 Part II Replica Management & Consistency Protocols.
Distributed Databases John Ortiz. Lecture 24Distributed Databases2  Distributed Database (DDB) is a collection of interrelated databases interconnected.
1 CSIS 7102 Spring 2004 Lecture 8: Recovery (overview) Dr. King-Ip Lin.
Recovery CPSC 356 Database Ellen Walker Hiram College (Includes figures from Database Systems by Connolly & Begg, © Addison Wesley 2002)
CMPT 401 Summer 2007 Dr. Alexandra Fedorova Lecture X: Transactions.
1 Cheriton School of Computer Science 2 Department of Computer Science RemusDB: Transparent High Availability for Database Systems Umar Farooq Minhas 1,
How to corrupt your data by accident BY: LLOYD ALBIN 9/3/2013.
Recovery from Crashes. Transactions A process that reads or modifies the DB is called a transaction. It is a unit of execution of database operations.
CMPT Dr. Alexandra Fedorova Lecture X: Transactions.
Recovery from Crashes. ACID A transaction is atomic -- all or none property. If it executes partly, an invalid state is likely to result. A transaction,
Database Replication techniques: a Three Parameter Classification Authors : Database Replication techniques: a Three Parameter Classification Authors :
Recovery 10/18/05. Implementing atomicity Note, when a transaction commits, the portion of the system implementing durability ensures the transaction’s.
CS 582 / CMPE 481 Distributed Systems
CS 245Notes 101 CS 245: Database System Principles Notes 10: More TP Hector Garcia-Molina.
Transactional Services Ricardo Jiménez-Peris Marta Patiño-Martínez Technical University of Madrid 1 st Adapt Workshop 23 rd -24 th September 2002 Madrid,
Distributed Systems Fall 2009 Replication Fall 20095DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.
CS 603 Data Replication February 25, Data Replication: Why? Fault Tolerance –Hot backup –Catastrophic failure Performance –Parallelism –Decreased.
Transaction. A transaction is an event which occurs on the database. Generally a transaction reads a value from the database or writes a value to the.
Chapter 1 Introduction to Databases
Building Highly Available Systems with SQL Server™ 2005 Robert Rea Brandon Consulting.
Building Highly Available Systems with SQL Server™ 2005 Vineet Gupta Evangelist – Data and Integration Microsoft Corp.
Transactions and Reliability. File system components Disk management Naming Reliability  What are the reliability issues in file systems? Security.
Distributed Systems Tutorial 11 – Yahoo! PNUTS written by Alex Libov Based on OSCON 2011 presentation winter semester,
Data Management Kelly Clynes Caitlin Minteer. Agenda Globus Toolkit Basic Data Management Systems Overview of Data Management Data Movement Grid FTP Reliable.
Sofia, Bulgaria | 9-10 October SQL Server 2005 High Availability for developers Vladimir Tchalkov Crossroad Ltd. Vladimir Tchalkov Crossroad Ltd.
1 Oracle Database 11g – Flashback Data Archive. 2 Data History and Retention Data retention and change control requirements are growing Regulatory oversight.
Replication and Consistency. Reference The Dangers of Replication and a Solution, Jim Gray, Pat Helland, Patrick O'Neil, and Dennis Shasha. In Proceedings.
1099 Why Use InterBase? Bill Todd The Database Group, Inc.
Command. RHS – SOC 2 Executing a command Executing a command appears simple at first, but many details to consider: –Who creates a command? –Who invokes.
CS 245Notes 101 CS 245: Database System Principles Notes 10: More TP Hector Garcia-Molina.
HANDLING FAILURES. Warning This is a first draft I welcome your corrections.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Data Versioning Lecturer.
Triggers A Quick Reference and Summary BIT 275. Triggers SQL code permits you to access only one table for an INSERT, UPDATE, or DELETE statement. The.
Concurrency and Transaction Processing. Concurrency models 1. Pessimistic –avoids conflicts by acquiring locks on data that is being read, so no other.
Replicated Databases. Reading Textbook: Ch.13 Textbook: Ch.13 FarkasCSCE Spring
1 File Systems: Consistency Issues. 2 File Systems: Consistency Issues File systems maintains many data structures  Free list/bit vector  Directories.
Chapter 16 Recovery Yonsei University 1 st Semester, 2015 Sanghyun Park.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation MongoDB Architecture.
1 Memory Management Chapter 7. 2 Memory Management Subdividing memory to accommodate multiple processes Memory needs to be allocated to ensure a reasonable.
IM NTU Distributed Information Systems 2004 Replication Management -- 1 Replication Management Yih-Kuen Tsay Dept. of Information Management National Taiwan.
SQL Server 2005 Implementation and Maintenance Chapter 12: Achieving High Availability Through Replication.
IT Database Administration Section 09. Backup and Recovery Backup: The available options Full Consistent (cold) Backup Database shutdown, all files.
Feb 1, 2001CSCI {4,6}900: Ubiquitous Computing1 Eager Replication and mobile nodes Read on disconnected clients may give stale data Eager replication prohibits.
Recovery technique. Recovery concept Recovery from transactions failure mean data restored to the most recent consistent state just before the time of.
Transactional Recovery and Checkpoints. Difference How is this different from schedule recovery? It is the details to implementing schedule recovery –It.
Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition, Lecture 24: GFS.
Hands-On Microsoft Windows Server 2008 Chapter 7 Configuring and Managing Data Storage.
Log Shipping, Mirroring, Replication and Clustering Which should I use? That depends on a few questions we must ask the user. We will go over these questions.
Database Recovery Techniques
Database Administration
Transactional Recovery and Checkpoints
Antonio Abalos Castillo
MongoDB Distributed Write and Read
Cassandra Transaction Processing
Lecturer : Dr. Pavle Mogin
A Technical Overview of Microsoft® SQL Server™ 2005 High Availability Beta 2 Matthew Stephen IT Pro Evangelist (SQL Server)
Introduction to NewSQL
MDEV Cross-Engine SQL Backup
Let Me Finish... Isolating Write Operations
CSE 451: Operating Systems Spring 2011 Journaling File Systems
Printed on Monday, December 31, 2018 at 2:03 PM.
Recovery System.
Let Me Finish... Isolating Write Operations
Database Recovery 1 Purpose of Database Recovery
Distributed Database Management Systems
CSE 451: Operating Systems Spring 2010 Module 14
Presentation transcript:

The Birds and Bees Talk (aka Replication) Sushanth 22 nd April 2015, Hive Contributor’s Meetup

Need : Disaster recovery Solution : Replication, enabled through data management tools like Falcon –Nice to have : Load balancing (but not our initial thrust / reason we do this.) Idea : We already have export & import, and we already have notifications, so it should be a simple matter to extend them to implement replication, right? Page 2 Replication

Every transaction that makes a change writes out a redo log item that contains the change that happened. This delta is replayed on all the replicas to make them a faithful copy of the original. We add replication capability to hive, but leave the actual execution to data management tools like Falcon. Page 3 Replication : Basic Idea

Approaches not considered: –Feed-definition-based Other tools like Falcon already support this Complex config, more in data management scope –HDFS+Metadata repl Multiple sources of truth headaches Partial replication issues Page 4 Before we go on..

Transaction location : WHERE? –primary copy write : M-S models simple concurrency control not great for non-read load-balancing because primary is still a chokepoint –update anywhere : M-M models good load balancing solution complex concurrency control Page 5 Taxonomy : Transaction source

Synchronization point : WHEN? –eager strong consistency can have performance impact, longer response times –lazy fast, optimizable can have stale data, temporary inconsistency Page 6 Taxonomy : Synchronization Strategy

Great for load-balancing, but make concurrency control very difficult to achieve. "Good enough" if we achieve read-load- balancing for analytics loads. –If need be, we can allow replication isolation T1 can replicate from A->B => T1 cannot be replicated from B->A, but does not prevent T2 being replicated from B->A We will ignore these for now until we solidify other aspects. Page 7 M-M models

Eager => Log is guaranteed written for every update that happens on the primary Lazy => Redo log is generated from the primary copy by a separate background daemon, done asynchronously Page 8 Synchronization

Log & Primary Copy Write approach – Write the log & the primary copy before returning successfully from a transaction Costly in terms of disk & time Synchronous, so blocking Page 9 Eager Synchronization

Write out an in-place edit log first, and return successfully when that gets written, manage compaction semantics and versioning semantics for readers (similar to what's implemented for ACID on hive) –Good on performance, effectively asynchronous –Increased complexity –What happens if we return success to the user upon writing the delta, and then find that it fails on log-apply/compaction? –Not compatible with our notion of external tables - dependent on external writer to perform operations in a required manner, rather than traditionally used approaches of simply writing in data into the directory and registering new data with hive. Page 10 Eager Synchronization

Redo Logs generated asynchronously, typically by background daemons/tools Lazy-Redo Concurrent copy Approach: –Primary copy returns immediately, but needs to be able to provide snapshots or MVCC to make sure that this data does not change yet again during the time the redo log is being generated, keeps older copy around. Complex to implement What about DROP TABLE/etc ? External tables and external management make this more difficult. Page 11 Lazy Synchronization

Robust-repeatable-replay of lazy logs (yes, I’m going for alliteration here : help me out here by suggesting more things I can add) –No guarantees that every single event translates to a delta => No need to keep versions at source. –Captures latest version at source for every event –Destination-apply needs to be robust and apply if the delta being applied is newer in “state” than the destination. –Can be temporarily out-of-synch, but will keep rubber-banding to latest state. Bad for load balancing usecases Great for DR Stretches definition of “eventual” in eventual consistency. Page 12 Lazy Synchronization

Event log for all operations on source warehouse {Create,Alter,Drop}{Database,Table,Partition} Export -> copy -> Import : –Generify : Commands on source wh, copy if needed, Commands on dest wh Translate each event to a ReplicationTask –(pluggable ReplicationTaskFactory allows third party implementation plugins) Third-party tool, like Falcon, consumes and executes ReplicationTasks. Page 13 Back to basics

Creates => Export/Import if import is newer than dest object Alters => Export metadata/Import metadata if import is newer than dest object Drop = > Drop if event that spawned the drop is newer than dest object. Page 14 Brass Tacks

EXPORT [FOR [METADATA] REPLICATION(…)] DROP TABLE … [FOR REPLICATION(‘eventid’)] ALTER TABLE DROP PARTITION … [FOR REPLICATION(‘eventid’)] Page 15 DDL Additions

Planned: –Event nullification/collapsing –Views, Functions, Roles replication –ACID interop –HDFS based eventlog rather than metastore-based one. Maybe?: –Single EXPORT/IMPORT command for database level –Export-to-queue & Import-from-queue semantics & kafka interop? –Other repl implementations like Hive->MySQL/Oracle/? & MySQL/Oracle/?->Hive Page 16 Future Work

Umbrella jira : HIVE-7973 Documentation tracking : HIVE evelopment (work-in-progress) Kemme, Bettina, Ricardo Jiménez-Peris, Marta Patiño-Martínez, and Gustavo Alonso. "12.2 Basic Taxonomy for Replica Control Approaches." Replication Theory and Practice. By Bernadette Charron-Bost, Fernando Pedone, and André Schiper. Berlin: Springer-Verlag, N. pag. Print. References Page 17