Open Storage: Intel’s Investments in Object Storage


1 Open Storage: Intel’s Investments in Object Storage
Paul Luse and Tushar Gohad, Storage Division, Intel

2 Transforming the Datacenter through Open Standards
Transforming the Business, Transforming the Ecosystem, Transforming the Infrastructure
- Speed up new application and services deployment on software-defined infrastructure created from widely available IA servers.
- Strengthen open solutions with Intel code contributions and silicon innovations to speed up development, while building a foundation of trust.
- Assure OpenStack-based cloud implementations offer the highest levels of agility, automation and efficiency using IA platform innovations.
Notes:
- Transforming the Infrastructure: Intel code contributions and silicon innovations speed software development, deployment, and operation, while building trust in the cloud.
- Transforming the Business: OpenStack lowers cost and speeds provisioning of applications and services.
- Transforming the Future: Software Defined Infrastructure will let your applications define their environment.

3 Legal Disclaimers Copyright © 2014 Intel Corporation. All rights reserved Intel, the Intel logo, Xeon, Atom, and QuickAssist are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. All products, computer systems, dates and figures specified are preliminary based on current expectations, and are subject to change without notice. Intel® Advanced Vector Extensions (Intel® AVX)* are designed to achieve higher throughput to certain integer and floating point operations.  Due to varying processor power characteristics, utilizing AVX instructions may cause a) some parts to operate at less than the rated frequency and b) some parts with Intel® Turbo Boost Technology 2.0 to not achieve any or maximum turbo frequencies.  Performance varies depending on hardware, software, and system configuration and you should consult your system manufacturer for more information.  *Intel® Advanced Vector Extensions refers to Intel® AVX, Intel® AVX2 or Intel® AVX-512.  For more information on Intel® Turbo Boost Technology 2.0, visit No computer system can provide absolute security.  Requires an enabled Intel® processor, enabled chipset, firmware and/or software optimized to use the technologies.   Consult your system manufacturer and/or software vendor for more information. No computer system can provide absolute security. Requires an Intel® Identity Protection Technology-enabled system, including an enabled Intel® processor, enabled chipset, firmware, software, and Intel integrated graphics (in some cases) and participating website/service. Intel assumes no liability for lost or stolen data and/or systems or any resulting damages. For more information, visit Consult your system manufacturer and/or software vendor for more information. No computer system can provide absolute security.  
Requires an enabled Intel® processor, enabled chipset, firmware, software and may require a subscription with a capable service provider (may not be available in all countries).  Intel assumes no liability for lost or stolen data and/or systems or any other damages resulting thereof.  Consult your system or service provider for availability and functionality.  No computer system can provide absolute reliability, availability or serviceability.  Requires an Intel® Xeon® processor E7-8800/4800/2800 v2 product families or Intel® Itanium® 9500 series-based system (or follow-on generations of either.)  Built-in reliability features available on select Intel® processors may require additional software, hardware, services and/or an internet connection.  Results may vary depending upon configuration.  Consult your system manufacturer for more details. For systems also featuring Resilient System Technologies:  No computer system can provide absolute reliability, availability or serviceability.  Requires an Intel® Run Sure Technology-enabled system, including an enabled Intel processor and enabled technology(ies).  Built-in reliability features available on select Intel® processors may require additional software, hardware, services and/or an Internet connection.  Results may vary depending upon configuration.  Consult your system manufacturer for more details. For systems also featuring Resilient Memory Technologies:  No computer system can provide absolute reliability, availability or serviceability.  Requires an Intel® Run Sure Technology-enabled system, including an enabled Intel® processor and enabled technology(ies).  built-in reliability features available on select Intel® processors may require additional software, hardware, services and/or an Internet connection.  Results may vary depending upon configuration.  Consult your system manufacturer for more details.  The original equipment manufacturer must provide TPM functionality, which requires a TPM-supported BIOS. 
TPM functionality must be initialized and may not be available in all countries. Requires a system with Intel® Turbo Boost Technology. Intel Turbo Boost Technology and Intel Turbo Boost Technology 2.0 are only available on select Intel® processors. Consult your system manufacturer. Performance varies depending on hardware, software, and system configuration. For more information, visit Intel® Virtualization Technology requires a computer system with an enabled Intel® processor, BIOS, and virtual machine monitor (VMM). Functionality, performance or other benefits will vary depending on hardware and software configurations. Software applications may not be compatible with all operating systems. Consult your PC manufacturer. For more information, visit

4 Agenda
Storage Policies in Swift
- Swift Primer / Storage Policies Overview
- Swift Storage Policy Implementation
- Usage Models
Erasure Coding Policy in Swift
- Erasure Coding (EC) Policy
- Swift EC Design Considerations and Proposed Architecture
- Python EC Library (PyECLib)
- Intel® Intelligent Storage Acceleration Library (ISA-L)
COSBench
- Cloud Object Storage Benchmark
- Status, User adoption, Roadmap
Public Swift Test Cluster
OPENSTACK SUMMIT 2014 Intel Confidential test 2

5 Storage Policies for OpenStack Swift

6 Swift Primer OpenStack Object Store
Distributed, scale-out object storage
CAP
- Eventually consistent
- Highly available: no single point of failure
- Partition tolerant
Well suited for unstructured data
- Uses container model for grouping objects with like characteristics
- Objects are identified by their paths and have user-defined metadata associated with them
Accessed via RESTful interface
- GET, PUT, DELETE
Built on standard hardware
- Cost effective, efficient

7 The Big Picture
Clients upload and download objects (Obj A) through a RESTful API.
Access Tier: Auth Service, Load Balancer, Proxy nodes
- Handle incoming requests
- Handle failures, ganged responses
- Scalable shared nothing architecture
- Consistent hashing ring distribution
Capacity Tier: Storage Nodes in Zone 1 to Zone 5
- Actual object storage
- Variable replication count
- Data integrity services
- Scale-out capacity
Diagram labels: Copy 1, Copy 2, Copy 3 of Obj A on storage nodes.
Scalable for concurrency and/or capacity independently
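The "consistent hashing ring distribution" bullet can be illustrated with a small sketch. It is loosely modeled on the idea of hashing an object path to a partition and mapping each partition to devices in distinct zones. The partition power, device names, and the round-robin placement below are assumptions made for illustration; this is not Swift's ring builder.

```python
import hashlib
import struct

PART_POWER = 8   # assumption: 2**8 = 256 partitions
REPLICAS = 3     # assumption: three copies, as in the diagram


def partition_for(path, part_power=PART_POWER):
    """Map an object path to a partition using the top bits of its MD5 hash."""
    digest = hashlib.md5(path.encode("utf-8")).digest()
    return struct.unpack(">I", digest[:4])[0] >> (32 - part_power)


def build_ring(devices, part_power=PART_POWER, replicas=REPLICAS):
    """For every partition, pick `replicas` devices that sit in distinct zones.

    `devices` is a list of (name, zone) tuples. Placement is a simple
    round-robin walk, which is far cruder than a real ring builder.
    """
    ring = []
    for part in range(2 ** part_power):
        chosen, zones = [], set()
        for i in range(len(devices)):
            name, zone = devices[(part + i) % len(devices)]
            if zone not in zones:
                chosen.append(name)
                zones.add(zone)
            if len(chosen) == replicas:
                break
        ring.append(chosen)
    return ring


# Hypothetical cluster: one device in each of five zones.
DEVICES = [("node%d" % z, z) for z in range(1, 6)]
RING = build_ring(DEVICES)


def nodes_for(path):
    """Return the devices that should hold the copies of `path`."""
    return RING[partition_for(path)]


print(nodes_for("/AUTH_test/photos/cat.jpg"))
```

Because the lookup depends only on the hash of the path and the ring table, any proxy can compute an object's location without consulting a central directory.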

8 Why Storage Policies? New Opportunities for Swift
Are all nodes equal? Would you like 2x or 3x? Can I add something like Erasure Codes?
Current durability scheme applies to the entire cluster
- Can do replication of 2x, 3x, etc.; however, the entire cluster must use that setting
There are no core capabilities to expose or make use of differentiated hardware within the cluster
- If several nodes of a cluster have newer/faster characteristics, their benefits can't be fully realized (the administrator/users are at the mercy of the dispersion algorithm alone for data placement)
There's no extensibility for additional durability schemes
- Use of erasure codes (EC)
- Mixed use of schemes (some nodes do 2x, some do 3x, some do EC)

9 Why Storage Policies? New Opportunities for Swift
Support Grouping of Storage
- Expose or make use of differentiated hardware within a single cluster
- Performance tiers: a tier with high-speed SSDs can be defined for better performance characteristics
Multiple Durability Schemes
- Erasure coded
- Mixed-mode replicated (Gold 3x, Silver 2x, etc.)
Other Usage Models
- Geo tagging: ensure geographical location of data within a container
Community effort w/ primary contributions from Intel and SwiftStack*

10 Adding Storage Policies to Swift
3 Different Policies: 3 Different Rings
- Triple Replication: 3 locations, same object
- Reduced Replication: 2 locations, same object
- Erasure Codes: n locations, object fragments
Introduction of multiple object rings
- Swift supports multiple rings already, but only one for object; the others are for account and container DB
- Each container is associated with a potentially different ring
Introduction of container tag: X-Storage-Policy
- New immutable container metadata
- Policy change accomplished via data movement
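As a rough illustration of the X-Storage-Policy container tag, the sketch below assembles the pieces of a container-creation request. The storage URL, token, container name, and the policy name "silver" are hypothetical placeholders; only the header name comes from the slide. The function does not send anything, so it can be handed to whichever HTTP client is in use.

```python
def build_create_container_request(storage_url, token, container, policy):
    """Describe a PUT that creates a container bound to a storage policy.

    The policy is set once, at creation time, because the slide describes
    it as immutable container metadata.
    """
    return {
        "method": "PUT",
        "url": "%s/%s" % (storage_url.rstrip("/"), container),
        "headers": {
            "X-Auth-Token": token,          # placeholder credential
            "X-Storage-Policy": policy,     # header named on the slide
        },
    }


# Hypothetical values for illustration only.
request = build_create_container_request(
    "https://swift.example.com/v1/AUTH_demo", "token-placeholder",
    "archive", "silver")
print(request["method"], request["url"])
print(request["headers"]["X-Storage-Policy"])
```

Objects later written to that container would then land in the ring associated with the chosen policy, without the client having to repeat the header on each upload.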

11 Storage Policy Touch Points
Proxy Nodes
- wsgi server
- middleware (partially, modules like list_endpoints)
- swift proxy wsgi application: account controller, object controller, container controller, helper functions
Storage Nodes
- wsgi server
- middleware
- swift object wsgi application: replicator, expirer, auditor, updater
- swift account wsgi application: replicator, reaper, auditor, helper functions, DB schema updates
- swift container wsgi application: replicator, sync, auditor, updater

12 Usage Model – Reduced Redundancy
Container with 2x Policy, Container with 3x Policy
When 3 copies are overkill, a CSP can now offer an alternative, simple durability scheme at a different price point. Some or all of the nodes in the cluster can be at this reduced redundancy level, and per-account reporting features give easy access to per-policy usage statistics.

13 Performance Tier
Container with HDD Policy, Container with SSD Policy
SSDs were previously limited to being used for account/container DBs. Either whole storage nodes or just specific high-speed disks (SSDs) scattered throughout the cluster can be included in a specific ring that provides better performance characteristics. Note: entire systems can comprise a policy as well…

14 Geo Tagging
Geo #1, Geo #2
While not fully fleshed out in all parts of Swift, storage policies could let applications ensure the geographical location of data stored within a container. Where regulatory and other concerns require it, data in a container could be guaranteed to remain in a specific geography even with replication and DR considerations.

15 Erasure Codes
Container with EC Policy (EC Fragments), Container with 3x Policy
This is a large feature being built on top of policies and is targeted for the Juno release. With EC, a CSP can maintain high levels of durability while dramatically decreasing the storage footprint within the cluster. EC is becoming popular among many open source and closed source storage projects and enables new usage models that may have been cost prohibitive before (e.g. cold storage). Note: EC could also be on dedicated HW…

16 Erasure Coding Policy in OpenStack Swift

17 Erasure Codes
Object split into k data and m parity chunks (D1, D2, D3, …, Dk, P1, …, Pm) and distributed across the cluster
- Space-optimal redundancy and high availability: k = 10, m = 4 needs 1.4x raw capacity, roughly 50% of the space required by 3x replication
- Higher compute and network requirements
- Suitable for archival workloads (high write %)
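The space comparison on this slide can be checked with a few lines of arithmetic. The helper name below is made up for illustration; the k = 10, m = 4 and 3x values are the ones quoted on the slide.

```python
def raw_capacity_factor(k, m):
    """Raw bytes stored per byte of user data for a k+m erasure code."""
    return (k + m) / k


ec_factor = raw_capacity_factor(10, 4)   # 14 fragments for 10 fragments of data
replication_factor = 3.0                 # three full copies
ratio = ec_factor / replication_factor

print("EC raw capacity factor: %.1fx" % ec_factor)        # prints 1.4x
print("EC space relative to 3x replication: %.0f%%" % (ratio * 100))  # prints 47%

# Fault tolerance for comparison: a 10+4 code survives the loss of any
# 4 fragments, while 3x replication survives the loss of 2 copies.
```

The exact ratio is about 47%, which the slide rounds to 50%.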

18 Swift with Erasure Coding Policy
Clients upload and download objects (Obj A) through a RESTful API, similar to S3.
Access Tier (Concurrency): Auth Service, Load Balancer, Proxy nodes
- Applications control policy
- Inline EC (EC Encoder, EC Decoder)
Capacity Tier (Storage): storage nodes in Zone 1 to Zone 5
- Supports multiple policies
- EC flexibility via plug-in
- Fragments Frag 1, Frag 2, Frag 3, Frag 4, …, Frag k + m spread across storage nodes
Redundancy: n = k data fragments + m parity fragments

19 EC Policy – Design Considerations
- First significant (non-replication) Storage Policy in OpenStack Swift
- In-line, proxy-centric datapath design
  - Erasure Code encode/decode during PUT/GET done at the proxy server
  - Aligned with Swift architecture to focus demanding services in the access tier
- Erasure Coding Policy applied at the container level
  - New container metadata will identify whether objects within it are erasure coded
  - Follows from the generic Swift storage policy design
- Keep it simple and leverage current architecture
  - Multiple new storage node services required to assure Erasure Code chunk integrity as well as Erasure Code stripe integrity; modeled after replica services
  - Storage nodes participate in Erasure Code encode/decode for reconstruction, analogous to replication services synchronizing objects
Community effort w/ primary contributions from Intel, Box*, SwiftStack*

20 Erasure Coding Policy Touchpoints
Proxy Nodes
- wsgi server
- middleware
- swift proxy wsgi application: existing modules, controller modifications
- EC Library Interface: Plug in 1, Plug in 2
Storage Nodes
- wsgi server
- middleware
- swift object wsgi application: existing modules, EC Auditor, EC Reconstructor, metadata changes
- EC Library Interface: Plug in 1, Plug in 2
- swift container wsgi application: existing modules
- swift account wsgi application: metadata changes

21 Python Erasure Code Library (PyECLib)
- Python interface wrapper library with pluggable C erasure code backends
- Backend support planned in v1.0: Jerasure, Flat-XOR, Intel® ISA-L EC
- BSD-licensed, hosted on bitbucket: https://bitbucket.org/kmgreen2/pyeclib
- Used by Swift at the proxy server and storage node level; most of the erasure coding details are opaque to Swift
- Jointly developed by Box*, Intel and the Swift community
- Not to be confused with PyEC, a Python library for Evolutionary Computing
Diagram, proxy side: swift proxy server (existing modules, EC modifications to the Object Controller) on top of PyECLib (Python), which sits on Jerasure (C) and ISA-L (C, asm)
Diagram, storage side: swift object server (existing modules, EC Auditor, EC Reconstructor) on top of PyECLib (Python), which sits on Jerasure and ISA-L (C, asm)
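To make the "wrapper with pluggable backends" idea concrete, here is a minimal pure-Python sketch. Every class and function name in it is hypothetical and it does not reproduce the PyECLib API. The only backend shown is a toy k + 1 XOR parity code that can rebuild a single missing fragment, standing in for the C backends named on the slide.

```python
import struct
from abc import ABC, abstractmethod
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


class ECBackend(ABC):
    """Hypothetical interface that each pluggable backend would implement."""

    @abstractmethod
    def encode(self, data, k):
        """Return a list of fragments for `data`."""

    @abstractmethod
    def decode(self, fragments, k):
        """Return the original data; missing fragments are passed as None."""


class XorBackend(ECBackend):
    """Toy code: k data fragments plus one XOR parity fragment (m = 1)."""

    def encode(self, data, k):
        payload = struct.pack(">I", len(data)) + data   # length prefix
        frag_len = -(-len(payload) // k)                # ceiling division
        payload = payload.ljust(frag_len * k, b"\0")    # zero padding
        frags = [payload[i * frag_len:(i + 1) * frag_len] for i in range(k)]
        frags.append(reduce(xor_bytes, frags))          # parity fragment
        return frags

    def decode(self, fragments, k):
        frags = list(fragments)
        missing = [i for i, f in enumerate(frags) if f is None]
        if len(missing) > 1:
            raise ValueError("XOR parity can rebuild at most one fragment")
        if missing:
            present = [f for f in frags if f is not None]
            frags[missing[0]] = reduce(xor_bytes, present)
        payload = b"".join(frags[:k])
        (length,) = struct.unpack(">I", payload[:4])
        return payload[4:4 + length]


BACKENDS = {"xor": XorBackend}   # registry: backend name -> implementation


class ECDriverSketch:
    """Caller-facing wrapper; the caller never touches backend details."""

    def __init__(self, k, ec_type="xor"):
        self.k = k
        self.backend = BACKENDS[ec_type]()

    def encode(self, data):
        return self.backend.encode(data, self.k)

    def decode(self, fragments):
        return self.backend.decode(fragments, self.k)


driver = ECDriverSketch(k=4)
fragments = driver.encode(b"hello erasure coded world")
fragments[2] = None                      # simulate a lost fragment
print(driver.decode(fragments))
```

Swapping in a stronger code would mean registering another backend under a new name; the caller-facing encode and decode calls stay the same, which is the property the slide describes as keeping erasure coding details opaque to Swift.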

22 Intel® ISA-L EC library
Part of Intel® Intelligent Storage Acceleration Library
- Provides primitives for accelerating storage functions
- Encryption, compression, de-duplication, integrity checks
Current Open Source version provides Erasure Code support
- Fast Reed Solomon (RS) Block Erasure Codes
- Includes optimizations for Intel® architecture
- Uses Intel® SIMD instructions for parallelization
- Order of magnitude faster than commonly used lookup table methods
- Makes other non-RS methods designed for speed irrelevant
Hosted at https://01.org/storage-acceleration-library
BSD Licensed
ISA-L Primitives (v2.10)
- SSE PQ Gen (16+2)
- SSE XOR Gen (16+1)
- Reed Solomon EC (16+6,2)
- SSE MB: SHA-1
- SSE MB: SHA-256
- SSE MB: SHA-512
- SSE MB: MD5
- CRC T10
- CRC IEEE (802.3)
- CRC32 iSCSI
- AES-XTS 128
- AES-XTS 256
- Compress “Deflate”: IGZIP0, IGZIP1, IGZIP0C, IGZIP1C
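For context on the "lookup table methods" that the slide uses as a baseline, the sketch below shows one common form of table-driven multiplication in GF(2^8), the field arithmetic underlying Reed Solomon codes. The reduction polynomial 0x11D and the log/exp table layout are assumptions chosen for illustration. This is not ISA-L code and implies nothing about how ISA-L is implemented.

```python
PRIM_POLY = 0x11D  # assumption: x^8 + x^4 + x^3 + x^2 + 1


def build_tables():
    """Build exponent and logarithm tables for the generator element 2."""
    exp = [0] * 510
    log = [0] * 256
    x = 1
    for i in range(255):
        exp[i] = x
        log[x] = i
        x <<= 1
        if x & 0x100:          # reduce when the product leaves 8 bits
            x ^= PRIM_POLY
    for i in range(255, 510):  # duplicate so log sums need no modulo
        exp[i] = exp[i - 255]
    return exp, log


GF_EXP, GF_LOG = build_tables()


def gf_mul(a, b):
    """Multiply two field elements with two table lookups and one add."""
    if a == 0 or b == 0:
        return 0
    return GF_EXP[GF_LOG[a] + GF_LOG[b]]


def gf_mul_bitwise(a, b):
    """Reference shift-and-xor multiply, used to cross-check the tables."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIM_POLY
        b >>= 1
    return result


print(hex(gf_mul(2, 128)))  # 0x1d: 0x100 reduced by the polynomial
```

A table-driven approach like this handles one byte per lookup, which is the kind of per-byte cost that the slide says SIMD instructions improve on.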

23 Project Status
Target: Summer ’14
- PyECLib upstream on bitbucket and PyPI
- Storage Policies in plan for OpenStack Juno
- EC expected to coincide with OpenStack Juno
Ongoing Development Activities
- The community uses a Trello discussion board: https://trello.com/b/LlvIFIQs/swift-erasure-codes
- Launchpad blueprints: https://blueprints.launchpad.net/swift
Additional Information
- Attend the Swift track in the design summit (B302, Thu 5:00pm)
- Talk to us on #openstack-swift or on the Trello discussion board
- To give PyECLib a test run, install it from https://pypi.python.org/pypi/PyECLib
- For information on ISA-L, check out

24 COSBench: Cloud Object Storage Benchmark

25 What is COSBench?
Open Source Cloud Object Storage Benchmarking Tool
- Announced at the Portland design summit 2013
- Open Source (Apache License)
- Cross Platform (Java + Apache OSGI)
- Distributed load testing framework
- Pluggable adaptors for multiple object storage backends
- Flexible workload definition
- Web-based real-time performance monitoring
- Rich performance metric reporting (performance timeline, response time histogram)
Storage backends listed: OpenStack* Swift; Amplidata* Amplistor; Ceph (librados, rados GW (swift), rados GW (s3)); Amazon* S3; Scality sproxyd; CDMI (CDMI-base, CDMI-swift); None; Mock
Auth options listed: tempauth, swauth, keystone, direct, none, basic/digest, integrated, swauth/keystone, mock
Analogy: Iometer (block), COSBench (object)

26 Workload Configuration
- Flexible load control
- Object size distribution
- Read/Write operations
- Workflow for complex stages
- Flexible configuration for complex workloads

27 Web Console
Test Generators, Active Workloads, History

28 Performance Reporting
Rich performance data help characterization

29 Progress since Havana
New Features
- New Object Store Backends
  - Amazon S3 adapter
  - Ceph adapter (librados based, and radosgw based)
  - CDMI adapter (swift through cdmi middleware, scality)
- Authentication Support
  - HTTP basic and digest
- Core Functionality
  - New selectors / new operator (new selectors: sequential/histogram; new operator: filewrite)
  - Object integrity checking
  - Response time breakdown
  - Job management
- Misc: SSL, web console authentication, direct swift access, …
User Interface Improvements
- Batch Workload Configuration UI
  - Adds Batch Test Configuration to COSBench
  - Makes COSBench workload configuration more like IOmeter
Bug Fixes
- 85 issues resolved
Roadmap
- 0.3.0 (13Q2): Open source baseline
- 0.3.x (13Q4): S3 adapter, Ceph adapter
- 0.4.x (14Q2): Usability, CDMI adapter
- 0.5.x (*14Q4): Workload suite, Multi-part
- 0.6.x (*15Q2): Profiling tool, Google/MS Storage

30 User Adoption
GitHub activity (2 weeks)

31 Contributing to COSBench
Active code repository and community
- Repository: https://github.com/intel-cloud/cosbench
- License: Apache v2.0
- Mailing-List:

32 Public Swift Test Cluster

33 Public Swift Test Cluster
Joint effort by SwiftStack*, Intel and HGST*
6 Swift PACO Nodes
- 8-core Intel(R) Atom(TM) CPU 2.40GHz
- 16GB main memory, 2x Intel X540 10GbE, 4x 1GbE
Storage
- 12x HGST* 6TB Ultrastar(R) He6 Helium-filled HDDs
Operating Environment
- Ubuntu/Red Hat Linux, OpenStack Swift 1.13
Load Balancing / Management / Control / Monitoring
- Using SwiftStack* Controller
