 Mainak Ghosh, Wenting Wang, Gopalakrishna Holla, Indranil Gupta.

Presentation transcript:


NoSQL is predicted to become a $3.4B industry by 2018.

- Problem: changing database- or table-level configuration parameters
  - Primary/shard key (MongoDB, Cassandra)
  - Ring size (Cassandra)
- Challenge: affects a lot of data at once
- Motivation:
  - Initially the DB is configured based on guesses; the sysadmin needs to experiment with different parameters, seamlessly and efficiently
  - Later, as the workload changes and the business/use case evolves: change parameters, but do it on a live database

- Existing solution: create a new schema, then export and re-import the data
- Why is this bad?
  - Massive unavailability of data: every second of outage costs $1.1K at Amazon and $1.5K at Google
  - A reconfiguration change caused an outage at Foursquare
  - A manual change of primary key at Google took 2 years and involved 2 dozen teams
- Need a solution that is automated, seamless, and efficient

- Fast completion time: minimize the amount of data transfer required
- High availability: CRUD operations should be answered as much as possible, with little impact on latency
- Network awareness: reconfiguration should adapt to underlying network latencies

- Master-slave replication
- Range-based partitioning
- Flexibility in data assignment
- Applies to MongoDB, RethinkDB, CouchDB, etc.
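These systems partition each collection into chunks by contiguous ranges of the shard key. A minimal sketch of range-based routing (the split points and shard names below are made up for illustration):

```python
import bisect

# Hypothetical split points dividing the shard-key space into chunks;
# chunk i covers keys in [SPLITS[i-1], SPLITS[i]).
SPLITS = [100, 200, 300]
SHARDS = ["S1", "S2", "S3", "S4"]  # chunk i lives on SHARDS[i]

def shard_for_key(key):
    """Route a shard-key value to the shard owning its range."""
    return SHARDS[bisect.bisect_right(SPLITS, key)]
```

Flexibility in data assignment means the chunk-to-shard mapping itself can be changed, which is what reconfiguration exploits.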

Old arrangement (K_old = old shard-key values, K_new = new shard-key values per record):
  S1: K_old 1 2 3 | K_new 2 4 8
  S2: K_old 4 5 6 | K_new 1 6 3
  S3: K_old 7 8 9 | K_new 9 5 7
New arrangement (data re-sharded by K_new):
  S1: K_old 4 1 6 | K_new 1 2 3
  S2: K_old 2 8 5 | K_new 4 5 6
  S3: K_old 9 3 7 | K_new 7 8 9

Lemma 1: The greedy algorithm is optimal in total network transfer volume.
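The greedy rule behind Lemma 1 assigns each new chunk to the server that already holds the most of that chunk's records. A sketch under assumed data structures (each server and chunk represented as a set of new-shard-key values, as in the example arrangement):

```python
def greedy_assign(new_chunks, server_records):
    """Assign each new chunk to the server already storing the most of
    its records, minimising the total number of records moved.

    new_chunks: {chunk_id: set of new-shard-key values in the chunk}
    server_records: {server: set of new-shard-key values it stores now}
    """
    plan = {}
    for chunk, keys in new_chunks.items():
        overlap = {s: len(keys & held) for s, held in server_records.items()}
        plan[chunk] = max(overlap, key=overlap.get)  # ties broken arbitrarily
    return plan

# Old arrangement from the example: new-key values held by each server.
servers = {"S1": {2, 4, 8}, "S2": {1, 6, 3}, "S3": {9, 5, 7}}
chunks = {"C1": {1, 2, 3}, "C2": {4, 5, 6}, "C3": {7, 8, 9}}
plan = greedy_assign(chunks, servers)
```

Greedy minimises transfer volume, but nothing stops it from piling several new chunks onto one server, leaving chunk counts unbalanced.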


Greedy vs. Hungarian assignment of new chunks across servers S1-S3 (figure).
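The Hungarian variant instead computes a balanced one-to-one matching of new chunks to servers that minimises records moved. A real system would use the O(n^3) Hungarian algorithm; at this toy scale a brute-force minimum-cost perfect matching (a stand-in, with the same illustrative data) gives the same answer:

```python
from itertools import permutations

def balanced_assign(new_chunks, server_records):
    """One new chunk per server, minimising total records moved (records
    in a chunk that its assigned server does not already hold). Brute
    force stands in for the Hungarian algorithm at this toy size."""
    chunks = list(new_chunks)
    servers = list(server_records)
    best_cost, best_plan = float("inf"), None
    for perm in permutations(servers):
        cost = sum(len(new_chunks[c] - server_records[s])
                   for c, s in zip(chunks, perm))
        if cost < best_cost:
            best_cost, best_plan = cost, dict(zip(chunks, perm))
    return best_plan, best_cost

# Example data (new-key values): old placement and the three new chunks.
servers = {"S1": {2, 4, 8}, "S2": {1, 6, 3}, "S3": {9, 5, 7}}
chunks = {"C1": {1, 2, 3}, "C2": {4, 5, 6}, "C3": {7, 8, 9}}
plan, moved = balanced_assign(chunks, servers)
```

Unlike greedy, every server ends up owning exactly one new chunk, at a possibly slightly higher transfer cost.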

Sharded setup: a config server and a front end route queries to replica sets RS0-RS2, each with a primary. A query on the shard key, e.g. select * from table where user_id = 20, is routed to the single replica set owning that key (here RS1); a query on another field, e.g. select * from table where product_id = 20, must go to all replica sets.
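The routing above can be sketched as follows (the range map is hypothetical; the point is that only shard-key queries can be targeted at a single replica set):

```python
SHARD_KEY = "user_id"
# Hypothetical range-partitioned shard map: [lo, hi) -> replica set.
SHARD_MAP = {(0, 10): "RS0", (10, 30): "RS1", (30, 100): "RS2"}

def route(query):
    """Return the replica sets a query predicate must be sent to."""
    if SHARD_KEY in query:
        v = query[SHARD_KEY]
        return [rs for (lo, hi), rs in SHARD_MAP.items() if lo <= v < hi]
    # Predicate on a non-shard-key field (e.g. product_id): scatter-gather.
    return list(SHARD_MAP.values())
```

With this map, a query with user_id = 20 goes only to RS1, while one filtering on product_id fans out to all three replica sets.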

The shard-key change is initiated at the front end with db.collection.changeShardKey(product_id: 1); each replica set RS0-RS2 has a primary and a secondary.

On receiving db.collection.changeShardKey(product_id: 1), the config server (1) runs the assignment algorithm and (2) generates a placement plan. Meanwhile the primaries keep serving operations such as update table set price = 20 where user_id = 20.

Migration then proceeds in iterations (Iteration 1, Iteration 2) across the replica sets according to the placement plan.

Once migration completes, the roles flip: the reconfigured secondaries become primaries and vice versa, and queries now route by the new shard key, e.g. update table set price = 20 where product_id = 20.

- Chunk-based scheme: assigns one socket per chunk per source-destination pair of communicating servers
- Problems: inter-rack latency and unequal data sizes lead to stragglers


The WFS strategy is 30% better than the naïve chunk-based scheme and 9% better than Orchestra [Chowdhury et al.].
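One way to read the WFS strategy: share a fixed socket budget across server pairs in proportion to how much work each transfer represents, instead of one socket per chunk. The weighting below (bytes to move times link RTT, so pairs that would otherwise finish last get more parallel connections) is an assumption for illustration, not the system's exact formula:

```python
def allocate_sockets(transfers, total_sockets):
    """Weighted fair sharing of a socket budget across server pairs.

    transfers: {pair: (bytes_to_move, rtt_ms)}. Weight = bytes * rtt
    (assumed), so larger transfers and slower links get more sockets,
    evening out completion times across pairs.
    """
    weights = {p: size * rtt for p, (size, rtt) in transfers.items()}
    total = sum(weights.values())
    return {p: max(1, round(total_sockets * w / total))
            for p, w in weights.items()}
```

A pair moving the same volume over a 2 ms inter-rack link thus gets twice the sockets of a pair on a 1 ms intra-rack link.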

- Morphus chooses slaves for reconfiguration during the first (isolation) phase
- In a geo-distributed setting, a naïve choice can lead to bulk transfers over the wide-area network
- Solution: localize bulk transfer by choosing replicas in the same datacenter; Morphus extracts the datacenter information from each replica's metadata
- 2x-3x improvement observed in experiments
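The datacenter-aware replica choice can be sketched as follows (the metadata layout and names are hypothetical):

```python
def pick_migration_replicas(replica_sets, local_dc):
    """For each replica set, prefer a replica tagged with the given
    datacenter so bulk transfer stays off the WAN; otherwise fall back
    to any replica."""
    chosen = {}
    for rs, replicas in replica_sets.items():  # {replica: datacenter tag}
        same_dc = [r for r, dc in replicas.items() if dc == local_dc]
        chosen[rs] = same_dc[0] if same_dc else next(iter(replicas))
    return chosen
```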

- Dataset: Amazon Reviews [SNAP]
- Clusters: Emulab d710 nodes with a 100 Mbps LAN switch, and Google Cloud (n1-standard-4 VMs) with a 1 Gbps network
- Workload: custom generator similar to YCSB, implementing Uniform, Zipfian, and Latest key-access distributions
- Morphus: implemented on top of MongoDB
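A YCSB-style Zipfian key chooser can be sketched with inverse-CDF sampling (theta = 0.99 mirrors YCSB's default skew; everything else here is illustrative):

```python
import bisect
import random

def zipf_sampler(n_keys, theta=0.99, seed=42):
    """Return a sampler over key ids [0, n_keys) where id k is drawn
    with probability proportional to 1 / (k + 1) ** theta."""
    weights = [1.0 / (r ** theta) for r in range(1, n_keys + 1)]
    total = sum(weights)
    cdf, acc = [], 0.0
    for w in weights:
        acc += w / total
        cdf.append(acc)
    rng = random.Random(seed)
    # min() guards against float round-off pushing the index past the end.
    return lambda: min(bisect.bisect_left(cdf, rng.random()), n_keys - 1)

sample = zipf_sampler(1000)
counts = {}
for _ in range(10_000):
    k = sample()
    counts[k] = counts.get(k, 0) + 1
```

The head of the distribution dominates: the most popular key is drawn far more often than mid-ranked keys, mirroring the skew of real key-access patterns.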



Read and write success rates under the Read Only (99.9% reads, no writes), Uniform, Latest, and Zipf access distributions: Morphus has a small impact on data availability.

Hungarian performs well in both scenarios and should be preferred over the Greedy and Random schemes.

Sub-linear increase in reconfiguration time as data and cluster size increase.

- Online schema change [Rae et al.]: resulting availability is lower
- Live data migration: attempted in databases [Albatross, ShuttleDB] and VMs [Bradford et al.], a similar approach to ours; however, Albatross and ShuttleDB ship the whole state to a set of empty servers
- Reactive reconfiguration, proposed by Squall [Elmore et al.]: fully available, but takes longer

- Morphus is a system that allows live reconfiguration of a sharded NoSQL database
- Morphus is implemented on top of MongoDB
- Morphus uses network-efficient algorithms for placing chunks while changing the shard key
- Morphus minimally affects data availability during reconfiguration
- Morphus mitigates stragglers during data migration by optimizing for the underlying network latencies

Morphus scales super-linearly with data size; a significant portion of the time is spent in data migration.

Increasing the number of replicas improves Morphus's performance.

- Chunk-based scheme: assigns one socket per chunk per source-destination pair of communicating servers
- What if the topology is as shown? Each shape represents a node in a single replica set; the circle is the front end. Intra-rack latency is 1 ms while inter-rack latency is 2 ms

Questions?

- Growing quickly: a $3.4B industry by 2018
- Fast reads and writes (much faster than MySQL and other relational databases)
- Many companies run critical infrastructure on them: Google, Facebook, Yahoo, and many others
- Many open-source NoSQL databases: MongoDB, Cassandra, Riak, etc.

- Initially the DB is configured based on guesses; the sysadmin needs to experiment with different parameters, seamlessly and efficiently
- Later, as the workload changes and the business/use case evolves: change parameters, but do it on a live database