
1 2 TIERS™ Model: Performance of Flash, with Capacity and Enterprise Reliability of Object Stores
Percy Tzelnic, Office of the CTO, EMC
© Copyright 2015 EMC Corporation. All rights reserved. Under NDA.

2 Long Static, HPC Storage Now Changing Rapidly
[Timeline figure. Periods: 2002-2009, 2012-2016, 2016-2017, 2017-. Components shown: Supercomputer, Tape Archive, Disk-based Parallel File System, Object Store, EMC.]

3 HPC Research Leads To Enterprise Solutions
EMC IOD
EMC developed open source IOD
  – Exascale I/O technology (based on 2011 CRADA research in Burst Buffer)
  – DOE funded exascale Storage Research, 2012 to 2014 (Fast Forward)
  – Semantic data storage, new storage APIs
  – Small, fast burst buffers tiering to larger, slower object stores
  – Repurposed as simplified data architecture for Enterprise
Technology trickle-down: 2 TIERS™
  – Fast acceleration tier
  – Large capacity tier
  – Performance of flash
  – Retention and capacity of data lake
  – Global POSIX namespace over one trillion objects

4 Technology Context In The Enterprise As The 2nd Platform Evolves Towards The 3rd
Storage Array is being disrupted:
  – Flash replaces disk for +100x performance (flash array)
  – Cloud replaces disk for +100x capacity (object store)
Moving older/cold data to cloud is inevitable
  – Cloud approaching $0 for data at rest
  – Capacity disks from arrays move to the cloud, leaving the Array as a Flash-only Fast Tier, on-premise
We can no longer package Performance and Capacity in one box at an attractive price/value point
  – Split the two, hence 2 TIERS™ (Fast Tier and Capacity Tier)
2 TIERS™ is a Game Changer!

5 Fast Data Use Cases
1. Real-Time Analytics – High Performance (low latency, high bandwidth)
2. Ingest Fast Data – High Speed, High Volume data Ingest
3. Fast & Big Data Ecosystem – Processed Fast Data exported to Data Lake for Big Data Analytics
4. Enterprise 2nd Platform Analytics – HPDA workloads (e.g., Simulation)
Enterprise Fast Data (ingest, real-time analytics) is an emerging market
EMC-IOD integrated with Flash products (DSSD, SIO) for Fast Data
EMC-IOD integrated with Data Lake products (Isilon, ECS) for Capacity
EMC Solutions are Powered by Intel® Xeon® Processor Technology

6 [Diagram comparing storage on the 2nd Platform and the 3rd Platform. Labels shown, in extraction order:]
  – Cloud; Array, O(1); File System; Database, Data Warehouse; block, file; VNX, VMAX; Scale-up, No Scale-out; Fast (Auto-tiering), Capacity
  – 3rd Platform; block, file, object; Geo Scale-out; Fast Tier, Scale-out O(1,000); (Policy-tiering); Hyperscale O(100,000), Capacity Tier; Analytics (Structured, unstructured, in-memory)
  – Isilon Cloud Pool; Intermediate Scale-out “Array”, O(100); Twin Strata; 2nd Platform

7 Related Work
Many see the need for similar technology
Products and Open Source are moving towards multi-personality Data Lakes over Object Stores
This is good confirmation of two widely resonating concepts:
  – Object Store for Capacity, Flash for Performance!
  – But Users think in folders, not objects! We need a Hierarchical Namespace…
  – Representing an Object Store with a huge number of objects as a Hierarchical Namespace is a challenge
Closest approach: LANL MarFS
  – Shares origins with 2 TIERS™
  – Different market targets: Extreme HPC (LANL) vs. Enterprise (EMC)

8 MarFS and 2 TIERS™
MarFS and 2 TIERS™ similarities
  – Motivation: Object Store for capacity, but users expect POSIX interface
  – Basic architecture: a POSIX namespace served from a parallel file system; data storage in a 1 trillion object store
  – Similar challenge: metadata performance for a POSIX namespace holding 1 trillion files
MarFS and 2 TIERS™ differences
  – 2 TIERS™ has an acceleration tier for data performance
  – Different, complementary techniques for metadata performance
At LANL: MarFS
[Diagram, Los Alamos Natl. Lab – MarFS. Components shown:]
  – GPFS Server (NSD)
  – Dual Copy Raided enterprise class HDD or SSD
  – Metadata (may have some small data: object lists that are too large to fit in xattrs)
  – Batch FTA: Mounted GPFS archive, NFS, PanFS, Lustre, and GPFS-MDS hidden; Pftool, obj client, PSI
  – Object Data Lakes
BOF 165 … jointly with Gary Grider, LANL: “Two Tiers Scalable Storage: Building POSIX-Like Namespaces with Object Stores”, Wednesday, 5:30-7:00, Hilton Salon A
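
The basic architecture shared by MarFS and 2 TIERS™ (a hierarchical namespace kept as file-system metadata, with the file data held in a flat object store) can be illustrated with a small sketch. This is not MarFS or EMC IOD code. The class names, the chunk size, and the object ID scheme are all hypothetical, and two in-memory dictionaries stand in for the parallel file system and the object store. Python 3.9 or later is assumed.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class FileMeta:
    """Namespace entry: file size plus the ordered list of objects holding the data."""
    size: int = 0
    object_ids: list[str] = field(default_factory=list)


class ToyNamespace:
    """Hypothetical illustration: paths live in a metadata map, bytes in a flat object store."""

    def __init__(self, chunk_size: int = 4):
        self.chunk_size = chunk_size
        self.metadata: dict[str, FileMeta] = {}  # stands in for the parallel file system
        self.objects: dict[str, bytes] = {}      # stands in for the object store

    def write(self, path: str, data: bytes) -> None:
        # Drop objects left over from a previous version of this file.
        for oid in self.metadata.get(path, FileMeta()).object_ids:
            self.objects.pop(oid, None)
        meta = FileMeta(size=len(data))
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            # Object IDs are flat keys; the hierarchy exists only in the metadata map.
            oid = hashlib.sha256(f"{path}:{offset}".encode()).hexdigest()[:16]
            self.objects[oid] = chunk
            meta.object_ids.append(oid)
        self.metadata[path] = meta

    def read(self, path: str) -> bytes:
        meta = self.metadata[path]
        return b"".join(self.objects[oid] for oid in meta.object_ids)

    def listdir(self, directory: str) -> list[str]:
        # Directory listings are answered from metadata alone, never from the object store.
        prefix = directory.rstrip("/") + "/"
        names = {p[len(prefix):].split("/", 1)[0]
                 for p in self.metadata if p.startswith(prefix)}
        return sorted(names)
```

In this toy model every namespace operation touches only the metadata map, which mirrors why both projects name metadata performance as the shared challenge once the namespace holds on the order of a trillion entries.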

9 Building Blocks Of 2 TIERS™
A parallel file system supplies the Hierarchical Namespace
  – e.g., OrangeFS
A flash-based acceleration tier
  – e.g., ScaleIO, DSSD
A capacity scale-out data lake
  – e.g., ECS, Isilon
A software package that binds them all together, tiering Data and Metadata; EMC will make this package Open Source
  – EMC IOD, 2 TIERS™
  – Software Defined Storage

10 OrangeFS Choice For 2 TIERS™
Stateless design of underlying PVFS2
  – Lightweight Linux kernel module, multi-threaded client
  – Leverage Linux containers for additional scale & HA
Modular design
  – Abstract key-value interface for metadata
  – Abstract storage interface for data
  – Abstract networking allows RDMA, IP
Client changes NOT required
Supports Windows and Macintosh clients
Future roadmap: OFS V3
  – Changes for Cloud PaaS; consistent with our direction (2017+)
OrangeFS is maintained and developed by Omnibond, Clemson, SC
  – Agile and responsive open source community
  – Performance comparable to other PFS
  – History of 4-5 years in production

11 Unique Differentiation Of 2 TIERS™
1. Single File System Namespace with dynamically loadable namespace subsets (DLN)
2. Tiering of both Data and Metadata
3. Fast Tier Performance Target: greater than 10x Capacity Data Lake
4. Direct access (read-only) to the Capacity Tier, bypassing the Fast Tier
5. 2 TIERS™ provides Tiering and Non-Tiering modes
6. No client changes required
7. No changes to the products required to instantiate Flash for Fast Tier and Object Store for Capacity Tier
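
Items 4 and 5 above (read-only direct access to the Capacity Tier, and Tiering versus Non-Tiering modes) can be sketched as a toy read/write path. The slides do not specify the semantics of either mode, so everything here is an assumption made for illustration: the `TwoTierStore` class, its method names, the write-to-fast-tier-first behaviour, and promote-on-read are all hypothetical and are not the EMC IOD API.

```python
class TwoTierStore:
    """Hypothetical model of a Fast Tier in front of a Capacity Tier."""

    def __init__(self, tiering: bool = True):
        self.tiering = tiering
        self.fast: dict[str, bytes] = {}      # stands in for flash
        self.capacity: dict[str, bytes] = {}  # stands in for the object store

    def write(self, key: str, data: bytes) -> None:
        # Assumption: in tiering mode writes land on the Fast Tier first;
        # in non-tiering mode they go straight to the Capacity Tier.
        target = self.fast if self.tiering else self.capacity
        target[key] = data

    def tier_down(self) -> None:
        """Persist everything on the Fast Tier to the Capacity Tier, then evict it."""
        self.capacity.update(self.fast)
        self.fast.clear()

    def read(self, key: str, direct: bool = False) -> bytes:
        if direct:
            # Read-only direct access: never consults or populates the Fast Tier.
            return self.capacity[key]
        if key in self.fast:
            return self.fast[key]
        data = self.capacity[key]
        if self.tiering:
            self.fast[key] = data  # assumed policy: promote on first access
        return data
```

One consequence of this design is visible in the sketch: a direct read sees only what has already been tiered down, so data still resident on the Fast Tier is invisible to it until `tier_down()` runs.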

12 2 TIERS™ Designed For The 3rd Platform
Disaggregates the monolithic memory / storage / IO Stack and recasts it into loosely coupled “Fast Tier” and “Capacity Tier”
2 TIERS™ is Open Source Software Defined Storage
2 TIERS™ is all about Independent Scaling:
  – Scale-out for Fast Tier, O(1,000)
  – Hyperscale for Capacity Tier, O(100,000)
2 TIERS™ deploys equally well on the 2nd Platform, albeit limited at the Enterprise scale

13 2 TIERS™ Local Fast Tier Example
[Diagram. App Cluster (Compute Servers) + IO Nodes (ION) + Local Flash: four nodes, each stacking App, SIO, EMC IOD, Flash. Below them: Isilon, ECS Data Lakes.]
Note: Compute Server interconnect RDMA, for best performance

14 2 TIERS™ Network Fast Tier Example
[Diagram. App Cluster (Compute Servers): App nodes connected over RDMA. IO Nodes (ION): EMC IOD. DSSD attached via PCIe SAN. Below them: Isilon, ECS Data Lakes.]
Note: Compute Server & IOD interconnect RDMA, for best performance

15 Four 2 TIERS™ Instantiations
[Diagram. Labels shown, in extraction order:]
  – POSIX Namespace; DLN; Fast; Capacity
  – No Tiering (1); POSIX Namespace; PFS for DSSD
  – No Tiering (2); Local FS (Mac, Win, Linux); Fast (Higher B/W > ISLN)
  – No Tiering (3); POSIX Namespace; Scale-out
  – 2 TIERS™; ECS, ISLN; DSSD; ISLN; Object; PCIe; NFS; OFS; Syncer; DSSD / ScaleIO; OFS; OFS; ScaleIO; Flash / HDD

16 2 TIERS™ Proof Of Concept
Demo 1
1. Load several DLNs from objects in the Capacity Tier (ECS) into the Fast Tier (DSSD)
2. Run a Life Sciences job (BLAST) on one of the DLNs, on an 8-server cluster
3. At job completion, evict the DLNs as objects into the Capacity Tier, with a new version for the one used in the run; leave the Fast Tier empty
Demo 2
1. There is no Fast Tier
2. A Translator function maps the data and metadata the application requires, from a DLN contained in an object on the Capacity Tier, to files that the job accesses in local storage
3. At job completion, everything is cleared from local storage, while the modified files are written as new objects into the Capacity Tier
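
The Demo 1 sequence (load DLNs, run a job against one of them, evict everything and version only what changed) can be sketched as follows. This is a minimal model under stated assumptions, not the demo's actual code: the `CapacityTier` and `FastTier` classes, the dirty flag, and the 1-based version numbers are hypothetical, and a DLN is reduced to a dictionary of file names to bytes.

```python
class CapacityTier:
    """Toy versioned object store: each DLN name maps to a list of versions."""

    def __init__(self):
        self._versions: dict[str, list[dict[str, bytes]]] = {}

    def put(self, name: str, files: dict[str, bytes]) -> int:
        self._versions.setdefault(name, []).append(dict(files))
        return len(self._versions[name])  # 1-based version number

    def get_latest(self, name: str) -> tuple[int, dict[str, bytes]]:
        versions = self._versions[name]
        return len(versions), dict(versions[-1])


class FastTier:
    """Toy Fast Tier holding the DLNs that are currently loaded."""

    def __init__(self):
        self.loaded: dict[str, dict] = {}

    def load(self, capacity: CapacityTier, name: str) -> None:
        # Step 1: bring the latest version of a DLN into the Fast Tier.
        version, files = capacity.get_latest(name)
        self.loaded[name] = {"version": version, "files": files, "dirty": False}

    def write(self, name: str, path: str, data: bytes) -> None:
        # Step 2: a job modifies files inside a loaded DLN.
        entry = self.loaded[name]
        entry["files"][path] = data
        entry["dirty"] = True

    def evict_all(self, capacity: CapacityTier) -> dict[str, int]:
        # Step 3: modified DLNs get a new version; untouched ones keep theirs.
        result = {}
        for name, entry in self.loaded.items():
            if entry["dirty"]:
                result[name] = capacity.put(name, entry["files"])
            else:
                result[name] = entry["version"]
        self.loaded.clear()  # leave the Fast Tier empty
        return result
```

A usage example with made-up DLN names and file contents:

```python
cap = CapacityTier()
cap.put("dln_a", {"query.fa": b"ACGT"})
cap.put("dln_b", {"ref.db": b"TTGA"})

fast = FastTier()
fast.load(cap, "dln_a")
fast.load(cap, "dln_b")
fast.write("dln_a", "hits.txt", b"match")  # the job touches only dln_a
versions = fast.evict_all(cap)             # dln_a gets a new version, dln_b does not
```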

17 [Timeline diagram. Labels shown:]
  – Capacity Tier at Time T0: 2T metadata, 2T file data, IT Packed DLNs
  – Capacity Tier at Time T1
  – Read-only, Read-through Translation Service on Local FUSE File System. TIME T0: Load DLN d, version v
  – Fast Tier on Distributed OrangeFS. TIME T0: Promote DLN d, version v. TIME T1: Persist DLN d, version v+1
  – Legend: promoted, new, modified
  – Other components: Flash, hyperstub, App, Local Store
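
The diagram refers to "Packed DLNs" held as objects in the Capacity Tier. The slides do not describe the packing format, so the sketch below simply assumes a tar archive as a stand-in, to show how a namespace subset (paths plus file data) could travel as one object and be unpacked again on load. The function names `pack_dln` and `unpack_dln` are hypothetical; only the Python standard library `tarfile` and `io` modules are used.

```python
import io
import tarfile


def pack_dln(files: dict[str, bytes]) -> bytes:
    """Pack a set of relative paths and their contents into a single byte string."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path in sorted(files):
            info = tarfile.TarInfo(name=path)  # carries the per-file metadata
            info.size = len(files[path])
            tar.addfile(info, io.BytesIO(files[path]))
    return buf.getvalue()


def unpack_dln(blob: bytes) -> dict[str, bytes]:
    """Recover the paths and contents from a packed object."""
    out: dict[str, bytes] = {}
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
        for member in tar.getmembers():
            if member.isfile():
                out[member.name] = tar.extractfile(member).read()
    return out
```

Packing metadata and data together means a whole namespace subset can be promoted or persisted with a single object get or put, at the cost of rewriting the whole object when any file in it changes, which fits the versioned persist step (version v to v+1) shown in the timeline.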
