Roberto Tolini - NetApp Business Solutions Architect EMEA NetApp Distributed Content Repositories: What Are We Doing in Real Life?
2 Agenda: “Big Content” and Object Storage; StorageGRID: Overview and Architecture; Where does it fit? Use cases and target markets; Competition overview; How to prove it works? PoC, test and demo capabilities; Summary, resources, and contacts for EMEA
3 An Introduction Big Content and Object Storage
4 What Does Your Corporate Data Look Like? Human-generated and machine-generated file data represent ~80% of all corporate data. This data cannot be deleted, even though 97% of it will never be touched again. It’s too expensive to keep this data on primary storage.
6 All That Data Is Stressing the Infrastructure. Challenges: rapid, untamed growth of unstructured data; perpetual retention of large and growing datasets; distributed users and app environment. Needs: PB scale, billions of objects, reduced operational overhead, efficient management; policy-based placement, seamless technology refresh; predictable, location-independent access anywhere, anytime.
7 So: what exactly is Object Storage? Block: specific location on disks / memory (tracks, sectors). File: specific folder in fixed logical order (file path, file name, date). Object: flexible container size, data and metadata, unique ID.
8 Distributed Content Repositories, Based on NetApp StorageGRID Software. Large content repository for big, unstructured data: billions of data sets, dozens of petabytes. Create, manage and consume content globally: predictable access to data independent of location; policy-controlled data stores at each site. Intelligent data classification and access: metadata-based management.
10 StorageGRID: A bit of history −Acquisition of Bycast Inc. in 2010, with a decade of object storage innovation −Footprint in long-term archive and the healthcare market −Since 2010, expansion from healthcare to telecom/service providers −IBM OEM customers transitioned to the NetApp-branded product −Currently in version 9 (9.0.2) −First product to support the industry-standard object storage interface CDMI
11 NetApp StorageGRID Solution. [Diagram: applications at Sites 1 through N access the grid over CIFS, NFS, and HTTP/CDMI; data is stored on disk storage and tape.] MULTIPLE: APPLICATIONS + SITES + PROTOCOLS. MULTIPLE: TARGETS and TIERS. MULTIPLE: TENANTS and POLICIES and ADMINISTRATORS.
12 StorageGRID®: ILM Policy Management. [Diagram: App1 and App2 at Site 1 and Site 2 access the grid over CIFS, NFS, and HTTP.] MULTIPLE: APPLICATIONS + SITES + PROTOCOLS. ILM evaluation decides: 1. Number of copies 2. Storage location 3. Storage tier 4. Retention period. Policy examples: File1 metadata: FPTH starts with “/app1share/*”; File2 metadata: XTYP equals “bronze”.
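The ILM evaluation on this slide can be pictured as metadata rules that map an object to a placement. The following is only a minimal sketch: the `Rule` and `Placement` types, the first-match-wins order, and every placement value (copies, sites, tiers, retention) are assumptions made for illustration, not the StorageGRID policy format. Only the two metadata conditions (FPTH prefix, XTYP value) come from the slide.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Placement:
    """The four outcomes the slide lists for an ILM evaluation."""
    copies: int
    locations: Tuple[str, ...]
    tier: str
    retention_days: Optional[int]  # None = keep indefinitely (assumption)


@dataclass(frozen=True)
class Rule:
    name: str
    matches: Callable[[dict], bool]
    placement: Placement


# Hypothetical rules; the placement values are invented for the example.
RULES = [
    Rule("app1-share",
         lambda md: md.get("FPTH", "").startswith("/app1share/"),
         Placement(2, ("site1", "site2"), "disk", 3650)),
    Rule("bronze",
         lambda md: md.get("XTYP") == "bronze",
         Placement(1, ("site2",), "tape", 365)),
]
DEFAULT = Placement(1, ("site1",), "disk", None)


def evaluate(metadata, rules=RULES, default=DEFAULT):
    """Return the placement of the first rule whose condition matches."""
    for rule in rules:
        if rule.matches(metadata):
            return rule.placement
    return default
```

Calling `evaluate({"XTYP": "bronze"})` returns the single-copy tape placement of the hypothetical "bronze" rule; an object matching no rule falls through to `DEFAULT`.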
13 Distributed Content Repository: StorageGRID features and capabilities summary. [Diagram: applications at Sites 1 through N access StorageGRID; data is stored on NetApp E-Series storage systems and tape.] MULTIPLE: APPLICATIONS + SITES + PROTOCOLS. MULTIPLE: TARGETS + VENDORS + TIERS. MULTIPLE: TENANTS + POLICIES + ADMINISTRATORS. Technical Overview −Multi-protocol: CIFS, NFS, RESTful HTTP −Scale-out architecture: capacity, count, sites, tiers, tenants, throughput −Policy management: copies, locations and tiers on ingest / over time −Object storage: compression, encryption, fingerprint + metadata, WORM −HA and DR: NDO, active+active for data and metadata, self-healing
15 HTTP API via Gateway: Simple Client Implementation. The Gateway load-balances sessions across available Storage Nodes. Storage Nodes perform HTTP API transactions.
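As a rough illustration of a gateway spreading sessions over available Storage Nodes, here is a minimal sketch. The slide does not say which balancing algorithm the Gateway uses; round-robin is an assumption, and the `Gateway` class, its methods, and the node names are hypothetical.

```python
class Gateway:
    """Toy gateway: hands out Storage Nodes round-robin, skipping unavailable ones."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        self._available = set(self._nodes)
        self._cursor = 0

    def mark_down(self, node):
        """Stop sending sessions to a node (e.g. after a failed health check)."""
        self._available.discard(node)

    def mark_up(self, node):
        """Resume sending sessions to a known node."""
        if node in self._nodes:
            self._available.add(node)

    def pick_node(self):
        """Return the next available node; raise if none is left."""
        for _ in range(len(self._nodes)):
            node = self._nodes[self._cursor]
            self._cursor = (self._cursor + 1) % len(self._nodes)
            if node in self._available:
                return node
        raise RuntimeError("no Storage Node available")
```

The client only ever talks to the gateway address, which is what keeps the client implementation simple: node failures are absorbed by `pick_node` rather than by application code.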
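16 Sample Code – Storing Data. [Sequence diagram between application code and the grid: the application sends an HTTP “PUT” request; the grid responds that the “PUT” is accepted; data transfer takes place; the grid responds that the object was received and a UUID is returned to the app.]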
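The PUT flow on this slide could look roughly like the sketch below, using only the Python standard library. The `/objects` path, the plain-text UUID response body, and the accepted status codes are assumptions for illustration; they are not taken from the SG API or CDMI documentation, and the sketch collapses the "accepted" and "received" responses into a single HTTP response.

```python
import urllib.request
import uuid


def build_put_request(base_url, data, content_type="application/octet-stream"):
    """Build an HTTP PUT carrying the object bytes (endpoint path is hypothetical)."""
    req = urllib.request.Request(
        url=base_url.rstrip("/") + "/objects", data=data, method="PUT"
    )
    req.add_header("Content-Type", content_type)
    return req


def parse_put_response(status, body):
    """Extract the object UUID from a successful response (body format is assumed)."""
    if status not in (200, 201):
        raise RuntimeError(f"PUT rejected with HTTP status {status}")
    return uuid.UUID(body.decode("ascii").strip())


def store_object(base_url, data):
    """Send the object through the gateway URL and return the UUID the grid assigned."""
    with urllib.request.urlopen(build_put_request(base_url, data), timeout=10) as resp:
        return parse_put_response(resp.status, resp.read())
```

The application keeps only the returned UUID; later retrievals would reference the object by that identifier rather than by a file path.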
17 What Is StorageGRID? StorageGRID is an object storage software solution. It is a software component (Bycast). It runs on a computing layer (default option: VMs). It holds the “intelligence” and manages data according to defined policies. Data (objects) are stored in a storage layer; different types are supported. The whole “solution” is referred to as “NetApp DCR.”
18 DCR Solution components: StorageGRID software, servers or hosts, storage, network. Example: 2 sites - DC and DR solution.
20 Where does it fit? Use cases and target markets
21 DCR Solution: Target Markets. Where is the DCR Solution a good fit? −In general: wherever there is a need for preservation, compliance, data integrity, distributed repositories (multi-site, multi-tenant), high availability, scalability, etc. And where it’s not… −In general: highly transactional data, “dumb” storage, “none of the above”, etc. Target markets and use case examples: −Healthcare: PACS (imaging), Electronic Health Records −File and email archiving −“Dropbox-like”, iCloud-like cloud services (sharing, synchronizing) −Cloud archiving, backup: legal archiving, service providers, knowledge preservation
22 What Do I Need to Consider? Customer’s existing application(s) −StorageGRID is mainly accessed by applications rather than by users directly. Examples: PACS, document management, etc. −Is there an existing integration/reference with the customer’s application? (API integration vs. file system) Customer needs or problems to solve −We might need to propose an application that can solve the customer’s problems and leverage StorageGRID capabilities. Examples: archiving, “dropbox” −It might make sense to bring in a partner ISV or develop an ad hoc solution with a NetApp partner
23 File System vs. API: Why Should I Care? StorageGRID can be accessed in two ways: −File system: by exporting a CIFS/NFS share to users/applications −HTTP API (SG API or CDMI): by presenting a URL Why does it matter? What are the differences? −File system: simple and immediate; no integration needed. −API: needs a “connector” to the application (integration) Why develop or integrate with the API? −Almost infinite scalability, a truly unique namespace, and much more efficient use of metadata.
24 Why NetApp StorageGRID? General Considerations. Main reasons for the choice: −Applications can leverage object storage to enhance metadata use for data management −Applications can use a truly global namespace via API −StorageGRID provides real distributed content management (DR, ILM for data, HTTP access) −Performance and scalability model (scale-out with “blocks”); almost infinite scalability −Built-in long-term data integrity guarantee
25 Use Case 1: Healthcare (PACS and EHR). Business requirements: −Preserve medical records for the long term (integrity guarantee) −Ensure compliance with regulations (HIPAA, EU, etc.) −Guaranteed data accessibility, distribution and sharing Solution: −A PACS and/or EHR application implemented on NetApp StorageGRID infrastructure
26 Solution Summary: One-Pager. Managed Health Records repository. Description: The solution allows patient health records (PACS, others) to be stored in a «cloud» (Grid). Data can be totally offsite (only a local «cache» installed at the customer), totally onsite, or both onsite and offsite («cache» + local storage at the customer). Data integrity guarantee (self-healing), DR and compliance (HIPAA, etc.). Managed object lifecycle. Target Market Segments: small-medium hospitals; large hospitals (DR); service providers. Delivery Models: on premises/local; cloud service (example: Iron Mountain); managed service. Infrastructure: NetApp StorageGRID; front-end application (PACS); (optional) middleware application (e.g. DeJearnette, ForeCare, etc.). Billing (for Cloud model or managed service): flat subscription (level of service); per effective usage (GB or objects stored/retrieved). Administrator: web-based configuration for infrastructure (local and centralized), capacity, access, etc. Offer elements (Core / Optionals): levels of service (onsite infrastructure vs. remote infrastructure, retention, etc.); «Compliance» on data (audit, WORM, managed lifecycle, etc.). [Diagram: Patients; Hospital; Medical record generation (exam, X-ray, etc.); Doctors and healthcare professionals; Interface with PACS; Local and/or Cloud Archive, SP facilities.]
27 Use Case 2: File and Email Archiving. Business requirements: −Offload of less accessed files from primary storage −Archiving of files (with or without legal value) −Archiving of emails (MS Outlook, Lotus Domino, etc.) Solution: −An application for “file (or email) archiving” implemented on NetApp StorageGRID infrastructure
28 Use Case (Solution) Overview. It is a solution that includes a file (and, in some cases, email) archiving/tiering application that enables offload of user content from primary storage to other data storage tier(s). NetApp StorageGRID is used as the secondary tier and provides the distributed content infrastructure. The application moves data from primary storage to StorageGRID based on different parameters (age, metadata, etc.). The solution can leverage StorageGRID data management features (ILM, data protection, self-healing, multi-site synchronization, etc.).
29 File Archiving: Theory Of Operations. Inactive files are moved to StorageGRID over HTTP/CDMI (stubbed or not stubbed, depending on the methodology used). Triggers: event-based, policy-based, user-initiated, MS SharePoint, etc.
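A policy-based selection of inactive files, as described on this slide, might be sketched as follows. The `FileRecord` type, the 180-day idle threshold, and the stub naming are all hypothetical; real archiving applications define their own policies and stub formats.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class FileRecord:
    path: str
    last_access: datetime


def select_inactive(files, now, max_idle_days=180):
    """Return the files whose last access is older than the idle threshold."""
    cutoff = now - timedelta(days=max_idle_days)
    return [f for f in files if f.last_access < cutoff]


def plan_archive(files, now, max_idle_days=180, leave_stub=True):
    """Build a list of (action, path) steps; nothing is actually moved here."""
    plan = []
    for f in select_inactive(files, now, max_idle_days):
        plan.append(("move_to_grid", f.path))
        if leave_stub:
            # A stub keeps the original path visible on primary storage.
            plan.append(("write_stub", f.path + ".stub"))
    return plan
```

Separating selection from the move makes it easy to swap the age criterion for a metadata-based one, which the deck lists as another possible parameter.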
31 Solutions Examples: A Real-life “Taste”. File archiving and email archiving −Symantec EV (API integration) File archiving/tiering −NTP Software OSCC (API integration) −PoINT Software Storage Manager (API integration) −F5 ARX (CIFS “integration”, validated architecture) Other solutions (general approach) −FSG: CIFS/NFS shares are used whenever there is no specific API integration (any other application)
32 Use Case Example 3: Cloud File Sharing. Business requirements: −A “Cloud File Hosting” service for retail customers (end users) and/or businesses (“private Dropbox”-style) Solution: −A partner application for “Cloud File Hosting” implemented on NetApp StorageGRID infrastructure
33 What Is Cloud File Sharing? In general, it is a file hosting solution for individuals (retail customers) or enterprises; it was developed for NetApp StorageGRID. It synchronizes files across desktops, laptops and smartphones (iPhone, Android, Blackberry and Windows Mobile), allowing users to share and access them from anywhere. It can be “white labelled”, customized, run stand-alone or integrated with billing systems, CRM, LDAP (for user access). NetApp StorageGRID provides the distributed content infrastructure.
34 Solution Summary: One-Pager. Secure file sharing, private «Dropbox». Description: The solution allows documents and information to be shared within working groups inside the company or with external entities. Multi-channel access (desktop, web, mobile, etc.) with content synchronization. Target Market Segments: professionals; small-medium enterprises; end users; large corporates. Sales Model: online (SP portal); direct/indirect sales (B2B). Activation: self-provisioning through the provider Web portal (user creation, level of service, etc.). Infrastructure provider: NetApp StorageGRID; front-end application (partner or SP-customized). Billing: flat subscription; per effective usage (GB or objects stored/retrieved). Users: online folder creation; user authentication; users invited and files shared via e-mail or SMS (with or without password); file synchronization between PC, tablet and smartphone; search and archive capabilities; access to folders limited to selected groups (employees, suppliers, customers). Administrator: web-based configuration for infrastructure capacity, access, etc. Offer elements (Core / Optionals): levels of service; «Compliance» on data (audit, WORM, managed lifecycle, etc.).
35 Solutions Examples: A Real-life "Taste". Turk Telekom “BuluttDepo” (MRD “Nimbus” application) −API-based integration with StorageGRID −Developed specifically for Service Providers Mezeo Cloud −API-based integration with StorageGRID −Multi-purpose “Cloud” application Other solutions (general approach) −FSG: CIFS/NFS shares are used whenever there is no specific API integration (any other application)
37 Who Is The “Enemy”? Well, first and foremost ourselves... but we’re improving −The border between object storage and “scale-out NAS” solutions is blurred. It is not always easy to understand which is the best fit, and we often end up competing with both object storage and NAS solutions. Main competitors: −EMC: Atmos and Centera (typically in the banking sector), Isilon (typically in Service Providers) −HDS: HCP (Hitachi Content Platform) and HUS (Unified Storage, now with HTTP and object interface) −IBM SONAS (scale-out NAS) −DDN WOS (Web Object Scaler) and others (less in EMEA, more in the U.S.)
38 Strategies to Win. Understand the workload −Object counts and sizes: sweet-spot object >500KB; counts up to 8B; capacity up to 35PB −Performance requirements: 100MB/s ingest per file system namespace; 10Gbps aggregate ingest/retrieve via object APIs. Look for the ISV that completes the puzzle −Enterprise Archive −Media −Healthcare. Do not undervalue E-Series! –Rock-solid enterprise arrays: 350,000 systems deployed worldwide –Density and performance: 1.8PB – 2.4PB per rack
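The sizing figures on this slide lend themselves to a quick fit check. The sketch below is only an illustration: the function name, the warning texts, and the use of decimal units (1 KB = 1,000 bytes, 1 PB = 10^15 bytes) are assumptions; the three thresholds are the ones quoted on the slide.

```python
# Thresholds quoted on the slide.
SWEET_SPOT_KB = 500            # sweet-spot object size is above this
MAX_OBJECTS = 8_000_000_000    # counts up to 8B
MAX_CAPACITY_PB = 35           # capacity up to 35PB


def check_workload(object_count, avg_object_kb):
    """Return (estimated capacity in PB, list of warnings) for a candidate workload."""
    capacity_pb = object_count * avg_object_kb * 1000 / 1e15  # decimal units assumed
    warnings = []
    if avg_object_kb <= SWEET_SPOT_KB:
        warnings.append("average object size is below the >500KB sweet spot")
    if object_count > MAX_OBJECTS:
        warnings.append("object count exceeds 8B")
    if capacity_pb > MAX_CAPACITY_PB:
        warnings.append("capacity exceeds 35PB")
    return capacity_pb, warnings
```

For instance, one billion objects averaging 1,000 KB each come to about 1 PB and raise no warnings, while ten billion 100 KB objects trip both the size and the count checks.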
39 Object Storage Vendor Ecosystem. [Diagram of vendor logos grouped into: Scale-out Object Store (Traditional); Scale-out Object Store (Startups); Open Source; Key Value Store. Surviving labels: (Centera), (Atmos).]
40 Competitive Resources. We have some good material on FieldPortal −Forrester Report: Total Economic Impact of NetApp DCR Solution https://fieldportal.netapp.com/Core/DownloadDoc.aspx?documentID=91451&contentID=122053 −ESG Lab Validation report NetApp DCR https://fieldportal.netapp.com/Core/DownloadDoc.aspx?documentID=80710&contentID=99326 −EMC Atmos: CAT Competitive presentation https://fieldportal.netapp.com/Core/DownloadDoc.aspx?documentID=94618&contentID=129854 Other resources are available internally at the moment (just ask if you need them), but they will soon be made available on FieldPortal.
41 How to prove it works? Test, PoC and performance testing guidelines
42 StorageGRID: demo/PoC capabilities. Lab-on-demand: targeted at online demos (1-2 hours); requires access to NetApp LoD. StorageGRID-in-a-Laptop (SiL): complete set of functions, can be done «on-the-fly»; Grid Nodes consolidated and pre-configured; fits in a laptop, low resource consumption. StorageGRID-in-a-Box (SiB): complete set of functions, can be done onsite; needs a server (or servers); allows for higher performance.
43 StorageGRID: demo/PoC capabilities (cont.). “Full system” (SG full stack of components): full Grid deployment; allows for full performance; needs a server (or servers) and E-Series storage; needs onsite work for implementation. Lead time impacted: −Purchase of demo equipment −HW delivery time −Talk to your TPM!
44 NetApp StorageGRID: Lab-on-Demand. https://labondemand.netapp.com Needs registration (partner, possibly customer). Fully guided lab (1h) or «free session».
45 StorageGRID-in-a-Laptop (SiL): overview. Pre-packaged set of 2 VMs to be installed on VMware Workstation 7/8 or ESXi 5. Can be deployed at the customer site. Can be installed on a laptop or server. Includes a set of prepackaged test cases/scripts (additional Linux VM) and a PoC guide. Limited customization options. Compressed images: ~7GB. Space needed: ~120GB (max). Ask us for details.
46 NetApp StorageGRID: SiL topology. Additional Linux VM for running tests/scripts. Individual VMs can be spread across multiple servers if needed (smaller servers, using existing resources). Some nodes can be shut down if needed. VMware ESX/ESXi/Workstation/Server.
48 StorageGRID-in-a-Box (SiB): overview. Pre-packaged set of VMs to install on VMware ESXi 5. Can be deployed at the customer site. Needs at least one server to be installed on. Includes a set of prepackaged test cases/scripts (additional Linux VM) and a PoC guide. Can be customized according to test needs. Can be spread across multiple servers/sites. Compressed images: ~40GB; space needed: 800GB-1.2TB. Ask us for details.
49 SiB Configuration (example). One Intel server with two 6-core Xeon processors, 48GB memory, 8x600GB SAS disks, four GbE NICs. One 8-port 1GbE switch (or customer switch). VMware ESXi 5. StorageGRID Software License.
50 Typical Test Cases and proof points. File system access: −CIFS/NFS basic access, FSG cache behavior, replication, WORM, etc. HTTP API access: −SG API and/or CDMI ingest/retrieve, metadata update, etc. ILM (object lifecycle management): −Automatic content placement based on metadata, etc. Data integrity (content self-healing and inclusive protection). HA and “no single-point-of-failure design”. Integration with application(s): involve application vendors or developers!
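For the data integrity proof point, a PoC script could corrupt one replica and check that it gets repaired. The sketch below shows the general idea of fingerprint-based self-healing; SHA-256, the `replicas` dictionary, and the `heal` function are assumptions for illustration, since the deck does not say how StorageGRID computes fingerprints or repairs copies.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Content fingerprint; SHA-256 is an assumed choice of hash."""
    return hashlib.sha256(data).hexdigest()


def heal(replicas: dict, expected: str) -> list:
    """Overwrite corrupted replicas with a verified copy; return the repaired sites."""
    good = next((d for d in replicas.values() if fingerprint(d) == expected), None)
    if good is None:
        raise ValueError("no intact replica left to heal from")
    repaired = []
    for site, data in list(replicas.items()):
        if fingerprint(data) != expected:
            replicas[site] = good
            repaired.append(site)
    return repaired
```

A test case would store the fingerprint at ingest, damage the copy at one site, run the check, and confirm that every site again holds bytes matching the original fingerprint.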
52 Key Takeaways: What Should I Remember? Learn what StorageGRID is and the use cases where we can position it most effectively. Understand the critical points to address in each of them and leverage existing experience. Get in touch with people who can help you in EMEA. (You’re welcome.)
53 Resources and Information. NetApp DCR Solution: −http://www.netapp.com/us/solutions/big-data/distributed-content-repositories.html NetApp Fieldportal DCR solution landing page: −https://fieldportal.netapp.com/applications/storagegrid.aspx#14550 Contacts for EMEA −Roberto Tolini: email@example.com@netapp.com −Philippe Wackers: firstname.lastname@example.org@netapp.com −Or contact your local NetApp office/TPM