Presentation on theme: "+ Thriving at Petabyte scale & beyond"— Presentation transcript:
1+ Thriving at Petabyte scale & beyond Software-Defined Object Storage for the POST-RAID era
2 Legacy file storage technology is limiting the ability for entities to “scale”
Constrained Expansion & Extensibility. Rigid & Brittle Durability. Restricted Accessibility.
Web servers and load balancers needed; SMB and NFS are not web protocols; limited query and analysis; max out at PB level; data silos; tedious management; service downtime for upgrades; specialized staff; RAID rebuild times too long; backup costs too high; not all data should be treated the same, anyway.
Traditional storage limits the growth of cloud. File systems max out at petabyte levels or even less. They require time and attention to back up, maintain, restore, rebuild and expand, and they require specialized staff to work on them.
The traditional file system method of recovery, RAID, does not scale. It takes the same amount of time to rebuild a 10 TB RAID volume in a 100 TB cluster as it does in a 500 TB cluster, and this is just not practical in real-world use.
Finally, SMB (CIFS) and NFS are simply not web protocols and require web servers, load balancers and many other systems to support the use of their data in the cloud. Why not go to a simpler and more direct model for cloud storage?
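To put rough numbers on the rebuild argument, here is a minimal sketch. It assumes a RAID rebuild is bottlenecked by the write throughput of a single replacement drive, which is why the cluster size never enters the calculation. The 100 MB/s figure, the function name `rebuild_hours`, and the `parallel_workers` contrast case are illustrative assumptions, not measurements or anything stated in the slides.

```python
def rebuild_hours(volume_tb: float, throughput_mb_per_s: float,
                  parallel_workers: int = 1) -> float:
    """Hours needed to re-create `volume_tb` of data.

    parallel_workers=1 models a RAID rebuild onto one replacement drive.
    A larger value models recovery work shared by several nodes
    (an illustrative assumption, not a description of any product).
    """
    megabytes = volume_tb * 1_000_000  # decimal TB -> MB
    seconds = megabytes / (throughput_mb_per_s * parallel_workers)
    return seconds / 3600


# Assumed figure: 100 MB/s sustained rebuild throughput per drive or node.
for cluster_tb in (100, 500):
    # The cluster size is deliberately unused: it does not affect the result.
    hours = rebuild_hours(volume_tb=10, throughput_mb_per_s=100)
    print(f"10 TB RAID volume in a {cluster_tb} TB cluster: {hours:.1f} h")

print(f"Same 10 TB shared across 20 workers: {rebuild_hours(10, 100, 20):.1f} h")
```

Under these assumptions the single-drive rebuild takes a little over a day whether the cluster holds 100 TB or 500 TB, while spreading the same work over more workers shrinks the time proportionally.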
3 The Need...
Multi-protocol ingest support (http, file, S3, Swift, block); Internet accessible, web-scale; ability to run on standard commodity hardware; simple to deploy, manage, scale & evolve; ability to continuously protect any amount of data, natively; secure and compliant; future-proof (durable & extensible).
Our clients' requirements are straightforward. Storage that supports object, file and block. Storage that runs on standard hardware and automatically protects data, no matter how much there is. A product that is easy to deploy, simple to scale and evolves with hardware so a forklift upgrade is never required. A software-defined storage product that supports existing workloads and enables new data services for the future. We call it scalable software that works, so you can focus on your business.
4 The Solution – Caringo SWARM (CAStor Overview)
Caringo’s Swarm object storage software aggregates the disk capacity across heterogeneous clusters of commodity servers to create massively & dynamically scalable multi-tenant storage systems that are self-managing, self-healing, high-performance, and durable through time, where files and their descriptive metadata (i.e. objects) are accessible whenever and wherever they are needed by users, administrators and data services.
5 How We Do It – Efficient Software
Swarm runs from RAM on any x86 hardware with Linux; tune to any workload; boots from bare metal in minutes; works on raw disk level; no file system, no software install; leaves 98% drive capacity for data.
[Diagram: a node with RAM, CPU, NIC, HTTP and SNMP access, 98% available]
A node to Caringo is simply an x86 server with RAM, CPU, NIC and disk. No exotic hardware is required. You can use 1 GbE or 10 GbE, and SATA, SAS or flash: any kind of media.
No Caringo code is ever installed on your disks. This leaves up to 98% of the disk for data storage. Isn’t your data what a storage system should have on its disks?
Swarm boots from bare metal; there is no file system or OS installation, and the only way in or out is via HTTP or SNMP.
That’s what we call a node, and we’ll call it Node 1.
6 How We Do It – Resource Virtualization
[Diagram: Swarm with Node 1 (RAM, CPU, NIC)]
7 How We Do It – Resource Virtualization
[Diagram: Swarm with Node 1 (RAM, CPU, NIC) and Node 2 (more RAM, more CPU, NIC)]
Node 2 can be the same or different. You can use newer hardware or mix and match. The flexibility of Caringo Swarm allows you to configure for your use cases. In this example, there is more of everything (there are reasons to have different configurations, and we can help tune them to your use case). Let’s call this Node 2.
8 How We Do It – Resource Virtualization
[Diagram: Swarm with Node 1 (RAM, CPU, NIC) and Node 2 (more RAM, more CPU, NIC)]
9 How We Do It – Resource Virtualization
[Diagram: Swarm with Node 1, Node 2, Node 3, Node …]
Add another: node 3, node 4…
10 How We Do It – Emergent Behavior
No single points of failure; SWARM follows simple rules to manage a complex system; nodes cooperate to perform processes; improves with size!
And you get a swarm. The swarm provides growth and scalability, there are no single points of failure, and the nodes cooperate to perform processes. In this manner, complex system behavior emerges from simple rules. Simplicity means fewer things break, better performance, more robustness and great responsiveness.
11 How We Do It – Encapsulated Data
Metadata stored with data; no metadata databases to manage.
[Diagram: data + metadata]
Simplicity is key at Caringo and we use it right down to the data level. Remember, simplicity empowers performance, fast recovery and robustness, but it all starts with the data model.
Caringo believes in the pure object model. Data plus metadata (information about the data) is encapsulated in an object. There are no metadata databases to manage and the metadata is stored with the object. Simple, consistent and powerful.
12 How We Do It – Encapsulated Data
Object is encapsulated; includes all system, policy and custom metadata.
The metadata includes system metadata, policy information that you set and custom metadata that you model and design. All this information is encapsulated inside the object. This means when you go to get an object, everything is there.
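The encapsulation idea can be sketched with a small, hypothetical data model. The class name `EncapsulatedObject`, its field names, and the length-prefixed JSON header are assumptions made for illustration only; the slides do not describe Swarm's actual on-disk or wire format.

```python
import json
from dataclasses import dataclass, field


@dataclass
class EncapsulatedObject:
    """Hypothetical object that carries its own metadata (not Swarm's real format)."""
    data: bytes
    system: dict = field(default_factory=dict)   # e.g. content type, size
    policy: dict = field(default_factory=dict)   # e.g. protection settings
    custom: dict = field(default_factory=dict)   # user-defined attributes

    def pack(self) -> bytes:
        """Serialize metadata and data into one self-describing blob."""
        header = json.dumps(
            {"system": self.system, "policy": self.policy, "custom": self.custom},
            sort_keys=True,
        ).encode("utf-8")
        # 4-byte big-endian header length, then the header, then the payload.
        return len(header).to_bytes(4, "big") + header + self.data

    @classmethod
    def unpack(cls, blob: bytes) -> "EncapsulatedObject":
        """Rebuild the object from the blob alone, with no external metadata store."""
        header_len = int.from_bytes(blob[:4], "big")
        meta = json.loads(blob[4:4 + header_len].decode("utf-8"))
        return cls(data=blob[4 + header_len:], **meta)


original = EncapsulatedObject(
    data=b"scan-bytes",
    system={"content-type": "application/dicom"},
    policy={"replicas": 2},
    custom={"department": "radiology"},
)
restored = EncapsulatedObject.unpack(original.pack())
print(restored == original)
```

Because the metadata travels inside the blob, reading the object back needs nothing but the blob itself, which is the property the slide describes as "everything is there".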
13 How We Do It – Encapsulated Data
Object is portable & dynamic; decoupled from hardware, location and apps; freeing data to be accessed in multiple ways.
This gives your objects pure portability and makes data dynamic. Your data is decoupled from location, from hardware, from applications. Anyone with the right authentication and authorization (which is included in the metadata) can get to the data no matter where they are or where the data may be. You never have to know the location of the data or the application that wrote it, and you can access it remotely from a mobile device or another computer.
Think about this for a moment. Data freed from hardware, location and application means you can use that data in more ways than ever before. There are no multiple places to go to get the metadata, the data, the authentication… everything is inside a Caringo Object.
Since Caringo Swarm is based on HTTP, the lingua franca of the internet, where there is internet access, there is access to your data and everything about it.
This simple model is the basis for incredible power. This is the power of the Caringo Swarm.
14 Swarm Adapts
Swarm lets you grow at the pace you choose and adapts to your requirements or workload.
Plug in new servers for more capacity & performance, addressing needs in real time; new resources available in minutes; right resources to the right locations; local and geo-distribution; easily retire old hardware.
[Diagram: Swarm spanning site 1, site 2, site 3]
Swarm does not dictate how you grow a cluster. Your business needs do. Multiple clusters can have different configurations. In fact, you can store data differently in different clusters. If you have 2 replicas of an object in your primary cluster, you can store that object as an erasure coded object in the DR cluster.
One push of a button on the console lets you retire a node while the system is up and running. That node will distribute its data into the rest of the cluster, wait until all the data has been protected and then shut itself down and wait to be taken out at a time of your choosing.
Data can be distributed locally or in a geo-distributed manner. A configuration spanning 3 different data centers can lose an entire data center and you can still recover all your data. Caringo brings great flexibility in the way systems can be configured, and of course we are available to assist in planning and implementation.
15 Swarm Protects
Swarm automatically manages the data protection you choose.
Protection per cluster, bucket or at the object level; seamlessly move between replication and erasure coding; store objects with different protection levels on the same server; auto-managed through Lifepoints; protect data based on value.
[Diagram: Replication, Erasure coding, Any SLA]
Data protection can be done with 2 basic methods: replication and erasure coding. Replication produces an identical replica of your object and stores it on another node. If you lose one node or one disk, you always have a replica to use. The system will quickly re-replicate any lost data. Erasure coding (EC) is a way of splitting up an object into data and parity segments. EC is similar to a software version of RAID, but it is more flexible, faster and scalable. In Caringo Swarm, you can choose any EC method. If you choose 5,2 that means there are 5 data segments and 2 parity segments. You can lose any 2 segments (or nodes or disks) and still recover your object. 16,9 means you can lose any 9 segments and still recover your data.
There is no single best way to protect data. Caringo provides the flexibility to use the best method for each use case and you can set that down to the object level. With this flexibility, you can support multiple SLAs for multiple customers or clients inside a single cluster.
Any object can have any protection associated with it at any time. You can set a default for the cluster or for a tenant or you can set protection down to the object level. You can even change the protection over time through Lifepoints and the system will manage this for you automatically.
Caringo is the only vendor that supports the use of replication and erasure coding in the same cluster, in the same nodes with no special purpose nodes or single points of failure.
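The arithmetic behind the 5,2 and 16,9 examples can be sketched as follows. The function names and dictionary keys are hypothetical. The only facts used are the ones in the notes: a replica is a full copy, and an EC scheme with k data and p parity segments survives the loss of any p segments. The raw-capacity ratio is simple division and ignores any per-segment overhead a real system may add.

```python
def replication(replicas: int) -> dict:
    """Protection profile for keeping `replicas` full copies of an object."""
    return {
        "scheme": f"{replicas} replicas",
        "tolerated_losses": replicas - 1,        # one copy must survive
        "raw_per_usable_byte": float(replicas),  # each copy is full size
    }


def erasure_coding(data_segments: int, parity_segments: int) -> dict:
    """Protection profile for an EC scheme written as data,parity (e.g. 5,2)."""
    total = data_segments + parity_segments
    return {
        "scheme": f"EC {data_segments},{parity_segments}",
        "tolerated_losses": parity_segments,     # any `parity` segments may be lost
        "raw_per_usable_byte": total / data_segments,
    }


for profile in (replication(2), erasure_coding(5, 2), erasure_coding(16, 9)):
    print(
        f"{profile['scheme']:>12}: survives {profile['tolerated_losses']} loss(es), "
        f"stores {profile['raw_per_usable_byte']:.4g} raw bytes per usable byte"
    )
```

Under these assumptions, 2 replicas cost 2 raw bytes per usable byte and survive 1 loss, EC 5,2 costs 1.4 and survives 2, and EC 16,9 costs 1.5625 and survives 9, which illustrates why the choice depends on the value of the data and the SLA.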
16 Swarm Repairs
Swarm watches for issues and repairs them automatically so you don’t have to.
Health Processor proactively checks integrity, availability and cardinality; Fast Volume Recovery, content aware, repairs only damaged objects; assurance of data quality for the entire enterprise; provides business continuity.
Proactive integrity checking means assurance of data quality for the entire enterprise. Business continuity is dependent on the viability of a company’s data. Caringo assures data quality and thus provides business continuity. Data is an organization’s lifeblood and Caringo takes assurance of data quality seriously.
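A toy version of the integrity and cardinality checks named above might look like this. It is a sketch under stated assumptions: the in-memory `nodes` dictionary, the SHA-256 digests, and the `health_check` function are all hypothetical, and the slides do not describe how Swarm's Health Processor is implemented.

```python
import hashlib


def digest(data: bytes) -> str:
    """Content hash used here as the integrity reference (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def health_check(nodes: dict, expected: dict, replicas: int) -> list:
    """Verify each copy, drop corrupt ones, and restore the wanted replica count.

    nodes:    {node_id: {object_name: bytes}}, a stand-in for a cluster
    expected: {object_name: hex digest recorded when the object was written}
    Returns a log of the repairs performed; only damaged objects are touched.
    """
    repairs = []
    for name, want in expected.items():
        good = []
        for node_id, store in nodes.items():
            if name not in store:
                continue
            if digest(store[name]) == want:       # integrity check
                good.append(node_id)
            else:
                del store[name]
                repairs.append(f"dropped corrupt copy of {name} on {node_id}")
        if not good:                              # availability check
            repairs.append(f"{name} is unrecoverable: no intact copy")
            continue
        spare = [n for n in nodes if n not in good]
        while len(good) < replicas and spare:     # cardinality check
            target = spare.pop(0)
            nodes[target][name] = nodes[good[0]][name]
            good.append(target)
            repairs.append(f"re-replicated {name} to {target}")
    return repairs


cluster = {"n1": {"a": b"hello"}, "n2": {"a": b"hellx"}, "n3": {}}
for line in health_check(cluster, {"a": digest(b"hello")}, replicas=2):
    print(line)
```

In this example the damaged copy on `n2` is discarded and replaced from the intact copy on `n1`, while `n3` is left alone because two good copies already exist.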
17 Portfolio Overview
Scalable storage software for unified object, file and block.
CloudScaler: HTTP 1.1, Amazon S3. FileScaler: SMB, NFS. BlockScaler: iSCSI (Beta).
Caringo Swarm: adaptable performance for unified storage under a single namespace.
Select the interface and hardware for your workload and use case: Big Data/Hadoop, Archive, Enterprise Apps, Cloud.
18 It’s Simple…Really!
Mix and match HW w/ any drive size; grow from one form factor to another, at runtime, incrementally; never forklift upgrade again.
Ethernet switches: 1Gb or 10Gb.
Caringo SWARM Object Storage Cluster: Small; Medium, Large; R420; R720xd or +JBOD. Add Caringo Software Capacity Licenses, per TB, and annual support/maintenance.
Caringo Cluster Services Node (Physical or Virtual): Small, Medium, Large; R420 or VM. Add Caringo CSN Software License and annual support/maintenance.
19 Simple…But Powerful!
Add Storage Services: Search & Discovery; Multi-Tenant STaaS; Hadoop Analytics.
Ingest from many sources*: HTTP, S3, Application, Gateway, File.
Remote Sites: Distribute, Consolidate, Share automatically.
Scale in any runtime.
Automate Retention, Protection & Durability down to the object level.
20 Object storage - it's all about use cases
Video Surveillance (2.5 PB); File Retention; DOD Medical Health Systems (16 PB); PACS Archiving; M2M and analytics (4 PB); Consumer/enterprise backup-to-cloud service (2+ PB); S3-like cloud storage services via Swarm & CloudScaler; Web content delivery; plus hundreds more…
21 Next Steps – Where/How to get Started?
Identify your target use case(s): see the Use Case slide.
If in doubt: “Drain your Filer Swamps” – File System Optimization Tier!
Contact your Dell Account Representative for pre-sales support. All Caringo Software & Services are available through Dell.
OR, Contact
22 Thank You
Software-Defined Object Storage for the POST-RAID era