Swift - 杨雨 (Yang Yu) / 2012-08-11



Content
1. Principles and Architecture
2. The Practice of Swift
3. Problems and Improvements

Storage Types

Type             | Protocol                    | Application
Block storage    | SATA, SCSI, iSCSI           | SAN, NAS, EBS
File storage     | ext3/4, XFS, NTFS           | PCs, servers, NFS
Object storage   | HTTP, REST                  | Amazon S3, Google Cloud Storage, Rackspace Cloud Files
Specific storage | specific protocols over TCP | MySQL, MongoDB, HDFS

Storage Types (figure)

Targets
High Reliability = High Durability + High Availability
High Durability - replicas and recovery
High Availability - replicas and partitioning
Low Cost - commodity hardware
Scale Out - no single bottleneck, shared-nothing

Reliability Consistent Hashing

Reliability Consistent Hashing with Virtual Node

Reliability Consistent Hashing with Virtual Node

Reliability
The advantages of consistent hashing:
1. Metadata is small (AWS S3 holds 762 billion objects);
2. Distribution uniformity;
3. Peer-to-peer communication;
4. Load balance.
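The scheme above can be sketched in a few lines of Python. This is an illustrative toy (the node names, virtual-node count, and use of MD5 are assumptions), not Swift's actual ring implementation:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Toy consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes_per_node=100):
        self._ring = []  # sorted list of (hash, node) points on the ring
        for node in nodes:
            for i in range(vnodes_per_node):
                self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get_node(self, key):
        # find the first virtual node clockwise from the key's hash
        idx = bisect.bisect(self._ring, (self._hash(key), ""))
        if idx == len(self._ring):
            idx = 0  # wrap around the ring
        return self._ring[idx][1]

ring = ConsistentHashRing(["dev1", "dev2", "dev3"])
owner = ring.get_node("photos/cat.jpg")
```

Because each physical node contributes many virtual points, keys spread far more evenly than with one point per node, and adding or removing a node only remaps the keys adjacent to its points.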

Reliability
Virtual node (partition) - object distribution uniformity
Weight - allocate partitions dynamically
Zone - partition tolerance
Zones group devices by physical location, power separation, network separation, or any other attribute that lessens the chance of multiple replicas being unavailable at the same time.
Swift: Ring - the index file for locating an object in the cluster.
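A sketch of how zones can enter replica placement: walk the ring clockwise and skip devices whose zone is already used. The device and zone names are made up, and real Swift computes placement offline when the ring file is built, so treat this purely as an illustration of the idea:

```python
import bisect

def pick_replicas(ring_points, key_hash, replicas=3):
    """Choose devices for a key so that no two replicas share a zone.
    ring_points is a sorted list of (hash, device, zone) triples.
    Returns fewer devices if there are not enough distinct zones."""
    chosen, zones = [], set()
    n = len(ring_points)
    start = bisect.bisect(ring_points, (key_hash, "", ""))
    for step in range(n):  # walk clockwise, wrapping around
        _, dev, zone = ring_points[(start + step) % n]
        if zone not in zones:
            chosen.append(dev)
            zones.add(zone)
        if len(chosen) == replicas:
            break
    return chosen

points = sorted([(10, "d1", "z1"), (20, "d2", "z1"),
                 (30, "d3", "z2"), (40, "d4", "z3")])
```

With this constraint, losing one zone (a rack, a power feed) can take out at most one replica of any object.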

Reliability

Durability: %? AWS S3 with %.
Factors:
- Annualized failure rate of devices (disk, network, ...)
- Bit-rot rate, and the time to detect bit rot
- Mean time to recovery
More about durability: openstack-swift-reliable-enough-for.html
Swift: Auditor - detects bit rot; Replicator - keeps object replicas consistent.

Consistency Model

Quorum + Object Version + Async Replication
Quorum protocol:
N - the number of nodes that store replicas of the data;
W - the number of replicas that must acknowledge an update before it completes;
R - the number of replicas contacted when a data object is read.
If W + R > N, the write set and the read set always overlap, which guarantees strong consistency. In Swift, N/W/R are configurable. A typical configuration is N=3, W=2, R=1 or 2, so Swift can provide two consistency models: strong and eventual.
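The W + R > N rule can be demonstrated with a minimal simulation. The dict-based replicas and the integer version scheme are illustrative, not Swift's wire protocol; which replicas answer a read depends on timing in a real cluster, so here we pick them explicitly:

```python
def quorum_write(replicas, key, value, version, w):
    """Write to replicas until W have acked; the remaining replicas
    would be brought up to date asynchronously by the replicator."""
    acks = 0
    for rep in replicas:
        rep[key] = (version, value)
        acks += 1
        if acks == w:
            break
    return acks >= w

def quorum_read(responders, key, r):
    """Ask R replicas and return the value with the highest version."""
    answers = [rep.get(key, (0, None)) for rep in responders[:r]]
    return max(answers)[1]

replicas = [{}, {}, {}]                    # N = 3
quorum_write(replicas, "k", "v1", 1, w=2)  # only 2 of 3 updated so far
```

With N=3 and W=2, any R=2 read must include at least one updated replica (strong), while an R=1 read can land entirely on the stale one (eventual).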

Consistency Model Weak Consistency ( N=3,W=2,R=1 )

Consistency Model Strong Consistency ( N=3,W=2,R=2 )

Consistency Model A special case: dirty reads

Architecture Prototype

Metadata

account
|-- container1
|------ obj1
|------ obj2
|-- container2
|------ objN

How to store the relationships among accounts, containers, and objects?
- A relational database
- NoSQL: Cassandra, MongoDB
- A relational database with sharding

Metadata
The Swift way: sqlite + consistent hashing + quorum.
A sqlite db file is treated as an object, so the database is highly available, durable, and eventually consistent.
Swift's targets: no single point of failure, no bottleneck, scale out.
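A toy version of a per-container listing database illustrates the idea: a single small sqlite file holds one container's object index, and that file is then replicated like any other object. The schema here is an assumption for illustration; Swift's real container schema and its pending-update pipeline are more elaborate:

```python
import sqlite3

# One sqlite file per container, holding that container's object listing.
conn = sqlite3.connect(":memory:")  # stands in for a per-container .db file
conn.execute("""CREATE TABLE object (
                   name        TEXT PRIMARY KEY,
                   created_at  TEXT,
                   size        INTEGER,
                   etag        TEXT,
                   deleted     INTEGER DEFAULT 0)""")
conn.execute("INSERT INTO object VALUES ('obj1', '2012-08-11', 1024, 'abc', 0)")
conn.execute("INSERT INTO object VALUES ('obj2', '2012-08-11', 2048, 'def', 0)")

# A container listing is just an ordered query over the file
listing = [row[0] for row in conn.execute(
    "SELECT name FROM object WHERE deleted = 0 ORDER BY name")]
```

Because each container is its own small file, the metadata tier shards naturally across the ring instead of needing one big central database.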

Architecture

Swift

Our Works
Swift as the SAE storage:
- Auth module for SAE (key pair)
- Keystone for SAE (token)
- HTTP Cache-Control module
- Quota (limits the number of containers and objects, and the storage usage)
- Domain remap: app-domain.stor.sinaapp.com/obj to sinas3.com/v1/SAE_app/domain
- Rsync with bwlimit
- Billing
- Storage firewall
Swift as the SWS Simple Storage Service:
- Container-unique module: container.sinas3.com/object
- Keystone middleware for auth protocol conversion
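The domain-remap item boils down to a hostname-to-path rewrite. A minimal sketch of that mapping, assuming the `<app>-<domain>` naming shown on the slide (the exact production layout may differ):

```python
def remap(host, path):
    """Rewrite app-domain.stor.sinaapp.com/<obj> into a Swift path of the
    form /v1/SAE_<app>/<domain>/<obj>."""
    suffix = ".stor.sinaapp.com"
    if not host.endswith(suffix):
        return path  # not a storage vanity domain: pass through untouched
    app, _, domain = host[:-len(suffix)].partition("-")
    return f"/v1/SAE_{app}/{domain}{path}"
```

In practice this kind of rewrite lives in a small WSGI middleware in front of the proxy, so applications can serve objects from their own domain without knowing the Swift account layout.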

Our Steps for Switching SAE Storage
- Bypass read/write test, one month online
- Gradual switchover, step by step, over one month
Ops:
- Monitoring: I/O, CPU, memory, disk
- LogCenter: syslog-ng, statistics and analytics

Problems & Improvements
The async processes that maintain eventual consistency are inefficient: replicator, auditor, container-updater.
Logic: loop over all objects or dbs on disk and query the replicas' servers to determine whether to sync.
Results:
- High I/O and high CPU usage;
- Stress on the account/container/object servers, hurting availability;
- Long convergence time to eventual consistency, hurting durability;
- The list operation is not consistent.
How to improve?
1. Run the replicator, auditor, and container-updater during idle time;
2. An appropriate deployment;
3. A new protocol for keeping replicas consistent (based on a log and a message queue);
4. Add new nodes, scale out.
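One way to cheapen the replica-comparison loop is to exchange a single digest per partition instead of walking every object, in the spirit of Swift's suffix-hash comparison. This is a simplified sketch (real Swift hashes per-suffix directories and caches the digests on disk):

```python
import hashlib

def partition_hash(object_names):
    """Digest a partition's sorted object listing, so two replicas can
    compare one hash instead of querying object by object."""
    h = hashlib.md5()
    for name in sorted(object_names):
        h.update(name.encode())
    return h.hexdigest()

local = {"a.data", "b.data"}
remote = {"a.data", "b.data"}
# identical digests -> skip the expensive per-object sync pass entirely
needs_sync = partition_hash(local) != partition_hash(remote)
```

When the digests differ, only then does the replicator fall back to the per-object comparison, so the steady-state cost of an already-consistent cluster drops to one digest exchange per partition.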

Problems & Improvements
The performance of sqlite:
- Quotas for objects and containers
- Run sqlite on high-performance I/O devices
- Database centralized or distributed?
The bandwidth of rsync is not under control:
- Out-of-band management
- Add bandwidth limits for rsync

Problems & Improvements An appropriate deployment

Q&A GTalk: Blog: