Published by Margarita Underdown. Modified over 10 years ago.
1
NetApp Distributed Content Repositories: What Are We Doing in Real Life?
Roberto Tolini - NetApp Business Solutions Architect EMEA
2
Agenda
“Big Content” and Object Storage
StorageGRID: Overview and Architecture
Where does it fit? Use cases and target markets
Competition overview
How to prove it works? PoC, test, and demo capabilities
Summary, resources, and contacts for EMEA
3
Big Content and Object Storage
An Introduction
4
What Does Your Corporate Data Look Like?
Human-generated and machine-generated file data represent ~80% of all corporate data
This data cannot be deleted, even though…
…97% of this data will never be touched again
It’s too expensive to keep this data on primary storage
5
Some numbers to start
6
All That Data Is Stressing the Infrastructure
Challenges
Rapid, untamed growth of unstructured data
Perpetually retain large and growing datasets
Distributed users and app environment
Needs
PB scale, billions of objects, reduced operational overhead, efficient management
Policy-based placement, seamless technology refresh
Predictable, location-independent access anywhere, anytime
7
So: What Exactly Is Object Storage?
Block: specific location on disks / memory; tracks; sectors
File: specific folder in fixed logical order; file path; file name; date
Object: flexible container size; data and metadata; unique ID
A content repository or cloud can be an “object storage” system, but what does that really mean? Every method of storing data involves ways to address, or refer to, that data.
Block-level storage refers to content stored at specific locations on disks or in memory. This can be a very efficient way to store databases with values that have a fixed length and change frequently. Values in a data table can be mapped directly to locations on disk with little translation involved.
File-level storage requires that every file be given a name and stored in a specific folder. File systems are limited in the number of files and folders they can reference, and in the length of path names. This tends not to be an issue on desktop computers that only store millions of files, but becomes a limitation for enterprise storage involving billions or trillions of files. Because file systems are hierarchical, folders must always be arranged in a fixed logical order. For example, a set of folders containing invoices might be arranged by customer, then by the type of service provided. Changing this structure to a different arrangement later can be very complex and time-consuming.
Instead of providing a block-oriented interface that reads and writes fixed-size blocks of data, or organizing data in a hierarchical series of file folders, Object Storage organizes data into flexible-sized data containers called objects. Each object has both data (an uninterpreted sequence of bytes) and metadata (an extensible set of attributes describing the object). Object-based storage uses unique IDs to identify files and packages these along with extensible metadata about the object. This allows data to be referenced and queried based on anything about the file.
The types of identifier tags used allow for the indexing of files in quantities several orders of magnitude higher than a file system, making object storage ideal for enterprise storage distributed over wide areas.
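The object model described in these notes (uninterpreted bytes, extensible metadata, a unique ID, no folder hierarchy) can be illustrated with a minimal in-memory sketch. This is not StorageGRID code: the `ObjectStore` class, its method names, and the metadata keys are hypothetical and chosen only to mirror the invoice example above.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: opaque bytes plus an extensible set of attributes."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class ObjectStore:
    """Toy in-memory object store (hypothetical, for illustration only)."""

    def __init__(self):
        self._objects = {}  # flat namespace: unique ID -> StoredObject

    def put(self, data, **metadata):
        """Store bytes with metadata and return a newly generated unique ID."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = StoredObject(data, dict(metadata))
        return object_id

    def get(self, object_id):
        """Retrieve the bytes by ID; no folder path is involved."""
        return self._objects[object_id].data

    def find(self, **criteria):
        """Return IDs of objects whose metadata matches every criterion."""
        return [
            oid for oid, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]


store = ObjectStore()
first_id = store.put(b"invoice-1", customer="ACME", service="hosting")
store.put(b"invoice-2", customer="ACME", service="backup")
store.put(b"invoice-3", customer="Globex", service="hosting")

# The same objects can be grouped by customer or by service without
# rearranging any folder hierarchy.
by_customer = store.find(customer="ACME")
by_service = store.find(service="hosting")
```

Because the namespace is flat and lookups go through metadata, regrouping invoices by service instead of by customer is a different query rather than a restructuring of folders.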
8
Distributed Content Repositories Based on NetApp StorageGRID Software
Large content repository for big, unstructured data
Billions of data sets, dozens of petabytes
Create, manage and consume content globally
Predictable access to data independent of location
Policy-controlled data stores at each site
Intelligent data classification and access
Metadata-based management
The NetApp solution for Distributed Content Repositories is based on NetApp StorageGRID, which was designed from the ground up to solve Big Content challenges in globally distributed environments. With the Distributed Content Repository solution, customers can store petabytes of capacity and billions of files in a single, global container that can span dozens or hundreds of sites across the globe. In addition, StorageGRID leverages object storage technology to offer long retention times (often measured in decades) and the ability to store, manage and retrieve data based on metadata, or “data about your data”. StorageGRID uses metadata-based management for data classification and access, meaning that StorageGRID manages where data is physically stored, how many copies exist (and where) for disaster recovery purposes, how long those copies are retained, and when they are destroyed. Further, metadata-based access to your data means that instead of looking for a file name, you simply look for “Mortgage documents”, customer “John Doe”, account number “123456”, which greatly simplifies how your applications interact with your storage.
9
NetApp StorageGRID: Overview and Architecture
10
A Bit of History: StorageGRID
Acquisition of Bycast Inc. in 2010, bringing a decade of object storage innovation
Footprint in long-term archive and the healthcare market
Since 2010, expansion from healthcare to telecom/service providers
IBM OEM customers transitioned to the NetApp-branded product
Currently in version 9 (9.0.2)
First product to support CDMI, the industry-standard object storage interface
11
NetApp StorageGRID Solution
[Diagram: Site 1, Site 2, Site 3 … Site N, each with applications accessing StorageGRID over CIFS, NFS, and HTTP/CDMI; storage targets are disk storage and tape]
MULTIPLE: APPLICATIONS + SITES + PROTOCOLS
MULTIPLE: TENANTS and POLICIES and ADMINISTRATORS
MULTIPLE: TARGETS and TIERS
StorageGRID makes it possible for complex storage networks involving multiple applications using multiple protocols spread across multiple sites to all be seamlessly managed as a single entity. StorageGRID can provide secure public or private cloud storage services to multiple tenants, each with their own policies and administrators. It also allows for storage to be organized into arbitrary storage pools that can overlap and be grouped by tier.
NetApp Confidential – Limited Use
12
ILM Policy Management
[Diagram: App1 and App2 write File1 and File2 to StorageGRID® over CIFS, NFS, or HTTP at Site 1 and Site 2. Example metadata conditions: FPTH starts with “/app1share/*”; XTYP equals “bronze”. ILM evaluation applies a policy that sets number of copies, storage location, storage tier, and retention period.]
Let’s walk through a quick example of how data flows into storage.
Applications write data to StorageGRID as they would to any network-attached file system with folders and subfolders.
When the data is written, two parallel activities take place.
First, the metadata for the file is evaluated and a policy is applied; these policies determine the number of copies to be created, the location for each copy, what tier of storage the copy resides on at each location, and what happens to it over time.
Second, the data itself is compressed and encrypted, and a digital fingerprint is generated.
Together, the data and metadata are combined to create a managed object.
In this example, one copy is stored at Site 1 and replicated, according to policy, to Site 2.
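The policy step in this walkthrough (metadata in, placement decision out) can be sketched as a first-match rule list. This is only an illustration of the idea, not StorageGRID's ILM engine: the `Rule` and `Placement` types, the rule order, and every copy count, tier, and retention value below are assumptions. Only the two metadata conditions (`FPTH` prefix and `XTYP` value) come from the slide.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Placement:
    """Outcome of a policy: the four things the slide says a policy controls."""
    copies: int
    sites: tuple
    tier: str
    retention_years: int


@dataclass(frozen=True)
class Rule:
    matches: Callable[[dict], bool]  # predicate over the object's metadata
    placement: Placement


# Hypothetical rules; all values are invented for the example.
RULES = [
    Rule(lambda md: md.get("FPTH", "").startswith("/app1share/"),
         Placement(copies=2, sites=("Site 1", "Site 2"), tier="disk",
                   retention_years=10)),
    Rule(lambda md: md.get("XTYP") == "bronze",
         Placement(copies=1, sites=("Site 1",), tier="tape",
                   retention_years=3)),
]

# Assumed fallback when no rule matches.
DEFAULT = Placement(copies=2, sites=("Site 1", "Site 2"), tier="disk",
                    retention_years=7)


def evaluate(metadata, rules=RULES, default=DEFAULT):
    """Return the placement of the first rule whose predicate matches."""
    for rule in rules:
        if rule.matches(metadata):
            return rule.placement
    return default


file1 = evaluate({"FPTH": "/app1share/reports/q1.pdf"})
file2 = evaluate({"FPTH": "/other/file2.dat", "XTYP": "bronze"})
```

First-match ordering is a design choice made here for simplicity; a real policy engine may resolve overlapping rules differently.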
13
Distributed Content Repository: StorageGRID Features and Capabilities Summary
[Diagram: Site 1, Site 2, Site 3 … Site N, each with applications on top of StorageGRID; storage targets are NetApp E-Series Storage Systems and tape]
MULTIPLE: APPLICATIONS + SITES + PROTOCOLS
MULTIPLE: TENANTS + POLICIES + ADMINISTRATORS
MULTIPLE: TARGETS + VENDORS + TIERS
Technical Overview
Multi-protocol: CIFS, NFS, RESTful HTTP
Scale-out architecture: capacity, count, sites, tiers, tenants, throughput
Policy management: copies, locations & tiers on ingest / over time
Object storage: compression, encryption, fingerprint + metadata, WORM
HA & DR: NDO, active+active for data and metadata, self-healing
Notes (Q&A on this slide):
Can the Grid be shared among different applications/tenants? YES! Multiple sites/tenants/applications and several methods of separating them.
What is the difference between file system and API access? SG can be accessed by applications or users either through a file system (CIFS/NFS) and/or through a URL that applications write to (HTTP). The APIs are used to perform I/O operations on SG.
SG API vs. CDMI: what are the differences? They do similar things. Both use the HTTP protocol. The SG API is the “Bycast API”. CDMI (Cloud Data Management Interface) is a newer API developed by SNIA that has become an ISO standard. CDMI is the future of development; however, the SG API will still be supported for a long time.
14
Example: SG data Flows in Multi-Site
15
HTTP API via Gateway: Simple Client Implementation
Gateway load-balances sessions across available Storage Nodes
Storage Nodes perform HTTP API transactions
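The slide does not say which load-balancing algorithm the Gateway uses. As one possible reading, the sketch below assumes simple round-robin over the Storage Nodes that are currently available; the `Gateway` class, its methods, and the node names are hypothetical.

```python
from itertools import cycle


class Gateway:
    """Toy gateway: hands each new session to the next available node."""

    def __init__(self, nodes):
        self._available = {node: True for node in nodes}
        self._ring = cycle(nodes)      # endless round-robin iterator
        self._count = len(nodes)

    def mark(self, node, available):
        """Record whether a Storage Node can currently take sessions."""
        self._available[node] = available

    def pick_node(self):
        """Return the next available node, skipping unavailable ones."""
        for _ in range(self._count):   # at most one full pass over the ring
            node = next(self._ring)
            if self._available[node]:
                return node
        raise RuntimeError("no Storage Node available")


gateway = Gateway(["sn1", "sn2", "sn3"])
gateway.mark("sn2", False)             # simulate one node being down
picks = [gateway.pick_node() for _ in range(4)]
```

The client only ever talks to the Gateway, which is what keeps the client implementation simple: it needs no knowledge of how many Storage Nodes exist or which ones are up.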
16
Sample Code – Storing Data
Application code sends an HTTP “PUT” request
Grid response: “PUT” accepted
Data transfer
Grid response: object received, UUID returned to app
(c) 2005 Bycast Inc.
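The exchange on this slide (PUT request, data transfer, UUID returned to the application) can be tried end to end against a local mock. The sketch below is not the StorageGRID or CDMI API: the `/objects` path, the 201 status, and the plain-text UUID response are assumptions made for the mock, and the separate "PUT accepted" step is collapsed into a single request/response for brevity.

```python
import http.client
import threading
import uuid
from http.server import BaseHTTPRequestHandler, HTTPServer

STORED = {}  # object UUID -> bytes; stands in for the grid's storage


class MockGridHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a Storage Node's HTTP endpoint."""

    def do_PUT(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)          # data transfer
        object_id = str(uuid.uuid4())           # grid assigns the UUID
        STORED[object_id] = body
        payload = object_id.encode("ascii")
        self.send_response(201)                 # object received
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)               # UUID returned to app

    def log_message(self, format, *args):
        pass  # silence per-request logging


def store_object(host, port, data):
    """Application side: PUT the bytes and return the UUID from the response."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("PUT", "/objects", body=data,
                     headers={"Content-Type": "application/octet-stream"})
        response = conn.getresponse()
        if response.status != 201:
            raise RuntimeError(f"PUT failed with status {response.status}")
        return response.read().decode("ascii")
    finally:
        conn.close()


# Port 0 lets the OS choose a free port for the mock server.
server = HTTPServer(("127.0.0.1", 0), MockGridHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
object_id = store_object("127.0.0.1", server.server_port, b"example payload")
server.shutdown()
server.server_close()
```

The point for the application developer is that the UUID, not a file path, is what must be kept in order to retrieve the object later.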
17
What Is StorageGRID?
StorageGRID:
Is an Object Storage software solution
Is a software component (Bycast)
Runs on a computing layer (default option: VMs)
Holds the “intelligence”; manages data according to defined policies
Data (objects) are stored in a storage layer; different types are supported.
The whole “solution” is referred to as “NetApp DCR.”
18
DCR Solution components
Components: StorageGRID software; servers or hosts; storage; network
Example: 2 sites, DC and DR solution
Notes (Q&A on this slide):
Is StorageGRID available in blocks only? NO! We are flexible. Using “blocks” is aimed at making things easier (think about “FlexPods”).
Can I configure a customized solution? YES! Depending on customer needs we can create any block. Just remember: it’s not ONLY about storage capacity; we also need to design for number of objects, I/O, HA, retention, data protection, ILM, etc. We are here to help you.
Can I use storage other than E-Series? YES. Under certain restrictions you can use FAS or even 3rd-party storage (special cases). FPVR needed for the moment.
You mentioned a “DCR Rack”. Is it available in EMEA? Not yet. It is available in the US only.
19
Building Block Software – 9.0 (main)
Software stack: NetApp StorageGRID 9.0 (9.0.2 today); SUSE Linux Enterprise Server 11 SP2; VMware ESXi 5.0 (upd1)
Notes (Q&A on this slide):
Do we have a preferred server vendor? NO! We are flexible. Depending on country and deal we can use any vendor or even customer-supplied servers. We just provide the requirements (HW resources) and the VM-to-physical mapping (see the blocks as an example).
Can we use a different hypervisor? Not yet, but it will come. For the time being we use VMware ESXi. Note that we don’t need any advanced VMware feature for SG.
What do we quote as NetApp? And the partner?
NetApp: storage and StorageGRID license. PS for implementation (*)
Partner: servers, SLES and VMware vSphere (**), unless the customer has them already. API integration services (if needed).
(*) Unless the partner is enabled to deliver PS services for SG.
(**) For small deployments the “free” ESXi is enough.
20
Where does it fit? Use cases and target markets
21
DCR Solution: Target Markets
Where is the DCR Solution a good fit? In general: wherever there is a need for preservation, compliance, data integrity, distributed repositories (multi-site, multi-tenant), high availability, scalability, etc.
And where it’s not… In general: highly transactional data, “dumb” storage, “none of the above”, etc.
Target markets and use case examples:
Healthcare: PACS (imaging), Electronic Health Records
File and email archiving
“Dropbox-like”, iCloud-like cloud services (sharing, synchronizing)
Cloud archiving, backup: legal archiving, service providers, knowledge preservation
RT notes: The slide is aimed at showing the best targets for the DCR solution. The general rule is that the solution fits best where data must be preserved over the long term while ensuring its integrity and accessibility. Technology refresh is another key point. Target markets and use cases are based on current field experience and are the most common uses of this solution in real-life customer environments. The list is not exhaustive but represents 90% of the current deployments.
22
What Do I Need to Consider?
Customer’s existing application(s)
StorageGRID is mainly accessed by applications rather than by users directly. Examples: PACS, document management, etc.
Is there an existing integration/reference with the customer application? (API integration vs. file system)
Customer needs or problems to solve
We might need to propose an application that can solve customer problems and leverage StorageGRID capabilities. Examples: archiving, “Dropbox”
It might make sense to bring in a partner ISV or develop an ad hoc solution with a NetApp partner
23
File System vs. API: Why Should I Care?
StorageGRID can be accessed in two ways:
File system: by exporting a CIFS/NFS share to users/applications
HTTP API (SG API or CDMI): by presenting a URL
Why does it matter? What are the differences?
File system: simple and immediate. No integration needed.
API: needs a “connector” to the application (integration)
Why develop or integrate with the API? Almost infinite scalability, a truly unique namespace, and much more efficient use of metadata.
24
Why NetApp StorageGRID? General Considerations
Main reasons for the choice:
Applications can leverage Object Storage to enhance metadata use for data management
Applications can use a truly global namespace via API
StorageGRID provides real distributed content management (DR, ILM for data, HTTP access)
Performance and scalability model (scale-out with “blocks”). Almost infinite scalability.
Built-in long-term data integrity guarantee
25
Use Case 1: Healthcare (PACS and EHR)
Business requirements:
Preserve medical records for the long term (integrity guarantee)
Ensure compliance with regulations (HIPAA, EU, etc.)
Guaranteed data accessibility, distribution and sharing
Solution: A PACS and/or EHR application implemented on NetApp StorageGRID infrastructure
26
Solution Summary: One-Pager
[Diagram: patients and hospital (medical record generation: exam, X-ray, etc.), interface with PACS, local and/or cloud archive at SP facilities, administrator; callouts 1 to 4]
Delivery Models: On premises/local; Cloud service (example: Iron Mountain); Managed service
Billing (for cloud model or managed service): Flat subscription (level of service); Per effective usage (GB or objects stored/retrieved)
Infrastructure: NetApp StorageGRID; Front-end application (PACS); (optional) Middleware application (e.g. DeJearnette, ForeCare, etc.)
Administrator: Web-based configuration for infrastructure (local and centralized), capacity, access, etc.
Offer elements
Core: Managed Health Records repository
Optionals: Levels of service (onsite infrastructure vs. remote infrastructure, retention, etc.); «Compliance» on data (audit, WORM, managed lifecycle, etc.)
Description: The solution allows patient health records (PACS, others) to be stored in a «cloud» (Grid). Data can be totally offsite (only a local «cache» installed at the customer), totally onsite, or both onsite and offsite («cache» + local storage at the customer). Data integrity guarantee (self-healing), DR and compliance (HIPAA, etc.). Managed object lifecycle.
Target Market Segments: Doctors and healthcare professionals; Small-medium hospitals; Large hospitals (DR); Service providers
27
Use Case 2: File and Email Archiving
Business requirements:
Offload of less accessed files from primary storage
Archiving of files (with or without legal value)
Archiving of emails (MS Outlook, Lotus Domino, etc.)
Solution: An application for “file (or email) archiving” implemented on NetApp StorageGRID infrastructure
28
Use Case (Solution) Overview
It is a solution that includes a file (and, in some cases, email) archiving/tiering application that enables offload of user content from primary storage to other data storage tier(s).
NetApp StorageGRID is used as the secondary tier and provides the distributed content infrastructure.
The application moves data from primary storage to StorageGRID based on different parameters (age, metadata, etc.).
The solution can leverage StorageGRID data management features (ILM, data protection, self-healing, multi-site synchronization, etc.).
29
File Archiving: Theory Of Operations
Inactive files are moved to StorageGRID (stubbed or not stubbed depending on the methodology used)
Triggers: event-based, policy-based, user-initiated, MS SharePoint, etc.
Transport: HTTP/CDMI to StorageGRID
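A policy-based pass of the kind described here can be sketched as two steps: select files that have been inactive for longer than a threshold, then replace each one with a stub after its content has been uploaded. The sketch assumes that "inactive" means an old modification time, and the stub format, the function names, and the `upload` callback are all hypothetical; real archiving products each define their own stubbing mechanism.

```python
import os
import tempfile
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def find_inactive_files(root, max_idle_days, now=None):
    """Return files under root not modified within the last max_idle_days."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    return sorted(
        path for path in Path(root).rglob("*")
        if path.is_file() and path.stat().st_mtime < cutoff
    )


def archive_and_stub(path, upload):
    """Upload the file content, then overwrite the file with a small stub."""
    object_id = upload(path.read_bytes())   # e.g. an HTTP PUT to the grid
    path.write_text(f"STUB object_id={object_id}\n")
    return object_id


with tempfile.TemporaryDirectory() as tmp:
    old_file = Path(tmp, "old_report.doc")
    new_file = Path(tmp, "new_report.doc")
    old_file.write_text("last year's report")
    new_file.write_text("this week's report")

    # Backdate one file by a year so that it counts as inactive.
    one_year_ago = time.time() - 365 * SECONDS_PER_DAY
    os.utime(old_file, (one_year_ago, one_year_ago))

    candidates = find_inactive_files(tmp, max_idle_days=180)
    candidate_names = [path.name for path in candidates]
    for path in candidates:
        archive_and_stub(path, upload=lambda data: "hypothetical-id-1")
    stub_text = old_file.read_text()
    untouched_text = new_file.read_text()
```

In the non-stubbed variant the file would simply be deleted from primary storage after upload, and the application would keep the object ID in its own index.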
30
File Archiving: Theory Of Operations #2
31
Solutions Examples: A Real-life “Taste”
File archiving and email archiving
Symantec EV (API integration)
File archiving/tiering
NTP Software OSCC (API integration)
PoINT Software Storage Manager (API integration)
F5 ARX (CIFS “integration”, validated architecture)
Other solutions (general approach)
FSG: CIFS/NFS shares are used whenever there is no specific API integration (any other application)
32
Use Case Example 3: Cloud File Sharing
Business requirements: A “Cloud File Hosting” Service for retail customers (end users) and/or businesses (“private Dropbox”-style). Solution: A partner application for “Cloud File Hosting” implemented on NetApp StorageGRID infrastructure
33
What Is Cloud File Sharing?
In general, it is a file hosting solution for individuals (retail customers) or enterprises; it was developed for NetApp StorageGRID
It synchronizes files across desktops, laptops and smartphones (iPhone, Android, BlackBerry and Windows Mobile), allowing users to share and access them from everywhere
Can be “white labelled”, customized, run stand-alone or integrated with billing systems, CRM, LDAP (for user access)
NetApp StorageGRID provides the distributed content infrastructure
34
Solution Summary: One-Pager
[Diagram: users, infrastructure provider, administrator; callouts 1 to 4]
Sales Model: Online (SP portal); Direct/indirect sales (B2B)
Activation: Self-provisioning through the provider’s Web portal (user creation, level of service, etc.)
Billing: Flat subscription; Per effective usage (GB or objects stored/retrieved)
Users: File synchronization between PC, tablet and smartphone; search and archive capabilities; access to folders limited to selected groups (employees, suppliers, customers); online folder creation; user authentication; users invited and files shared via email or SMS (with or without password)
Infrastructure provider: NetApp StorageGRID; Front-end application (partner or SP-customized)
Administrator: Web-based configuration for infrastructure capacity, access, etc.
Offer elements
Core: Secure file sharing; Private «Dropbox»
Optionals: Levels of service; «Compliance» on data (audit, WORM, managed lifecycle, etc.)
Description: The solution allows documents and information to be shared within working groups inside the company or with external entities. Multi-channel access (desktop, web, mobile, etc.) with content synchronization.
Target Market Segments: Professionals; Small-medium enterprises; End users; Large corporates
35
Solutions Examples: A Real-life "Taste"
Turk Telekom “BuluttDepo” (MRD “Nimbus” application)
API-based integration with StorageGRID
Developed specifically for Service Providers
Mezeo Cloud
Multi-purpose “Cloud” application
Other solutions (general approach)
FSG: CIFS/NFS shares are used whenever there is no specific API integration (any other application)
36
Competition overview
37
Who Is The “Enemy”?
Well, first and foremost ourselves... but we’re improving
“Blurred” border between object storage and “scale-out NAS” solutions. It is not always easy to understand which is the best fit. We often end up competing with both Object Storage and NAS solutions.
Main competitors:
EMC: Atmos and Centera (typically in the banking sector), Isilon (typically in Service Providers)
HDS: HCP (Hitachi Content Platform) and HUS (Unified Storage, now with HTTP and object interface)
IBM SONAS (scale-out NAS)
DDN WOS (Web Object Scaler) and others (less in EMEA, more in the U.S.)
38
Strategies to Win
Understand the workload
Object counts & sizes: sweet-spot object >500KB; counts up to 8B, capacity to 35PB
Performance requirements: 100MB/s ingest per file system namespace; 10Gbps aggregate ingest/retrieve via object APIs
Look for the ISV that completes the puzzle
Enterprise Archive Media Healthcare
Do not undervalue E-Series!
Rock-solid enterprise arrays: 350,000 systems deployed WW
Density and performance: 1.8PB – 2.4PB per rack
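The figures on this slide lend themselves to a quick qualification check when scoping a workload. The sketch below only does the arithmetic: it assumes decimal units (1 PB = 10^15 bytes), assumes that the 35PB figure applies to raw capacity including all copies, and ignores compression and metadata overhead. The function name and the example workload are hypothetical.

```python
import math

# Figures taken from the slide.
MAX_OBJECTS = 8_000_000_000          # counts up to 8B
MAX_CAPACITY_PB = 35                 # capacity to 35PB
SWEET_SPOT_BYTES = 500 * 1000        # sweet-spot object > 500KB
RACK_PB_LOW, RACK_PB_HIGH = 1.8, 2.4 # E-Series density per rack

BYTES_PER_PB = 10**15                # assumption: decimal petabytes


def assess_workload(object_count, avg_object_bytes, copies=1):
    """Compare a workload against the slide's limits and estimate racks."""
    raw_pb = object_count * avg_object_bytes * copies / BYTES_PER_PB
    return {
        "raw_capacity_pb": raw_pb,
        "count_ok": object_count <= MAX_OBJECTS,
        "capacity_ok": raw_pb <= MAX_CAPACITY_PB,
        "size_in_sweet_spot": avg_object_bytes > SWEET_SPOT_BYTES,
        "racks_min": math.ceil(raw_pb / RACK_PB_HIGH),  # densest racks
        "racks_max": math.ceil(raw_pb / RACK_PB_LOW),   # least dense racks
    }


# Hypothetical workload: 2 billion objects of 2 MB each, 2 copies.
result = assess_workload(2_000_000_000, 2_000_000, copies=2)
```

For this example the raw capacity works out to 2×10^9 × 2×10^6 × 2 = 8×10^15 bytes, or 8 PB, which falls between four and five racks at the stated density.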
39
Object Storage Vendor Ecosystem
[Diagram of vendor logos; surviving labels: Scale-out Object Store, (Traditional), (Startups), Open Source, Key Value Store, (Atmos), (Centera)]
40
Competitive Resources
We have some good material on FieldPortal:
Forrester Report: Total Economic Impact of NetApp DCR Solution
ESG Lab Validation report: NetApp DCR
EMC Atmos: CAT competitive presentation
Other resources are available internally at the moment (just ask if you need them), but they will soon be made available on FieldPortal.
41
How to prove it works? Test, PoC and performance testing guidelines
42
StorageGRID: Demo/PoC Capabilities
Lab-on-demand
Targeted for online demos (1-2 hours)
Requires access to NetApp LoD
StorageGRID-in-a-Laptop (SiL)
Complete set of functions, can be done «on-the-fly»
Grid Nodes consolidated and pre-configured
Fits in a laptop, low resource consumption
StorageGRID-in-a-Box (SiB)
Complete set of functions, can be done onsite
Needs a server (or servers)
Allows for higher performance
43
StorageGRID: Demo/PoC Capabilities (cont.)
“Full system” (SG full stack of components)
Full Grid deployment
Allows for full performance
Needs a server (or servers) and E-Series storage
Needs onsite work for implementation
Lead time impacted by: purchase of demo equipment; HW delivery time
Talk to your TPM!
44
NetApp StorageGRID: Lab-on-Demand
Needs registration (partner, possibly customer)
Fully guided lab (1h) or «free session»
Session is: «Insight 2012 BD Hands-On StorageGRID 9.0»
Lab needs to be reserved (timeframe must be specified)
After the session is done, the lab is reset to its initial status
Full guide for the lab available online or as PDF
45
StorageGRID-in-a-Laptop (SiL): overview
Pre-packaged set of 2 VMs to be installed on VMware Workstation 7/8 or ESXi 5
Can be deployed at customer site
Can be installed on a laptop or server
Includes a set of prepackaged test cases/scripts (additional Linux VM) and PoC guide
Limited customization options
Compressed images: ~7GB
Needed space: ~120GB (max)
Ask us for details
46
NetApp StorageGRID: SiL topology
Additional Linux VM for running tests/scripts
Individual VMs can be spread across multiple servers if needed (smaller servers, using existing resources)
Some nodes can be turned off if needed
VMware ESX/ESXi/Workstation/Server
47
NetApp StorageGRID: SiL overview
48
StorageGRID-in-a-Box (SiB): overview
Pre-packaged set of VMs to install on VMware ESXi 5
Can be deployed at customer site
Needs at least one server to be installed on
Includes a set of prepackaged test cases/scripts (additional Linux VM) and PoC guide
Can be customized according to test needs
Can be spread across multiple servers/sites
Compressed images: ~40GB; needed space: 800GB-1.2TB
Ask us for details
49
SiB Configuration (example)
One Intel server with two 6-core Xeon processors, 48GB memory, 8x600GB SAS disks, four GbE NICs
One 8-port 1GbE switch (or customer switch)
VMware ESXi 5
StorageGRID software license
This is just a configuration example. The server specifications can vary according to the desired capacity. It can also be deployed on more than a single server.
50
Typical Test Cases and proof points
File system access: CIFS/NFS basic access, FSG cache behavior, replication, WORM, etc.
HTTP API access: SG API and/or CDMI ingest/retrieve, metadata update, etc.
ILM (object lifecycle management): automatic content placement based on metadata, etc.
Data integrity (content self-healing and inclusive protection)
HA and “no single point of failure” design
Integration with application(s): involve application vendors or developers!
51
Summary, Resources, and Contacts for EMEA
52
Key Takeaways: What Should I Remember?
Learn what StorageGRID is and in which use cases we can position it most effectively
Understand the critical points to address in each of them and leverage existing experience
Get in touch with people who can help you in EMEA. (You’re welcome.)
53
Resources and Information
NetApp DCR Solution: repositories.html
NetApp Fieldportal DCR solution landing page:
Contacts for EMEA
Roberto Tolini:
Philippe Wackers:
Or contact your local NetApp office/TPM