CERN IT Department, CH-1211 Genève 23, Switzerland (www.cern.ch/it)
Storage Overview and IT-DM Lessons Learned
Luca Canali, IT-DM
DM Group Meeting, 10-3-2009

Presentation transcript:


Outline
Goal: review of storage technology
– HW layer (hard disks, storage arrays)
– Interconnect (how to attach storage to the server)
– Service layer (filesystems)
Expose current hot topics in storage
– Identify challenges
– Stimulate ideas for management of large data volumes

Why storage is a very interesting area in the coming years
The storage market is very conservative
– A few vendors share the market for large enterprise solutions
– Enterprise storage typically carries a high price premium
Opportunities
– Commodity-HW, grid-like solutions provide an order-of-magnitude gain in cost/performance
– New products coming to market promise many changes: solid state disks, high-capacity disks, high-performance and low-cost interconnects

HW layer – HD, the basic element
Hard disk technology
– The basic building block of storage for 40 years
– Main intrinsic limitation: latency

HD specs
HDs are limited
– Seek time in particular is unavoidable (7.2k to 15k rpm, ~2-10 ms), which caps random IOPS
– Throughput ~100 MB/s, typically limited by the interface
– Capacity range: 300 GB - 2 TB
– Failure modes: mechanical, electrical, magnetic, firmware issues. MTBF: 500k - 1.5M hours
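The random-IOPS ceiling follows directly from the mechanical latencies quoted above; a minimal back-of-the-envelope sketch (the rpm and seek-time figures are illustrative assumptions, not measurements):

```python
def hdd_random_iops(rpm, avg_seek_ms):
    """Estimate random-read IOPS from seek time plus rotational latency."""
    # Average rotational latency: half a revolution, in milliseconds.
    rot_latency_ms = 0.5 * 60_000 / rpm
    return 1000 / (avg_seek_ms + rot_latency_ms)

# Assumed figures: a 15k rpm enterprise disk vs. a 7.2k rpm SATA disk.
print(round(hdd_random_iops(15_000, 4.0)))  # ~167 IOPS
print(round(hdd_random_iops(7_200, 9.0)))   # ~76 IOPS
```

This is why a single HD, regardless of its capacity, delivers only on the order of 100-200 random IOPS.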

Enterprise disks
Performance
– Enterprise disks offer more performance: they spin faster and use better interconnect protocols (e.g. SAS vs. SATA)
– Typically of low capacity
– Our experience: often not competitive in cost/performance vs. SATA

HD failure rates
Failure rate
– Our experience: it depends on vendor, temperature, infant mortality, and age
– At FAST '07, two papers (one from Google) showed that vendor specs often need to be 'adjusted' in real life
– The Google data seriously questioned the usefulness of SMART probes and the correlation of temperature/age/usage with MTBF
– Another study showed that consumer and enterprise disks have similar failure patterns and lifetimes; moreover, HD failures within RAID sets are correlated

HD wrap-up
HD is an old but evergreen technology
– Disk capacities have increased by an order of magnitude in just a few years
– At the same time, prices have gone down (below 0.1 USD per GB for consumer products)
– 1.5 TB consumer disks and 450 GB enterprise disks are common
– 2.5'' drives are becoming standard to reduce power consumption

Scaling out the disk
The challenge for storage systems: scale out single-disk performance to meet demands
– Throughput
– IOPS
– Latency
– Capacity
Sizing storage systems
– Must focus on the critical metric(s)
– Avoid the 'capacity trap'
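The 'capacity trap' can be made concrete with a quick sizing calculation: an array must be sized on every critical metric, and for random-I/O workloads the spindle count needed for IOPS usually dwarfs the count needed for raw capacity. A hypothetical example (all figures assumed for illustration):

```python
import math

def disks_needed(capacity_tb, iops, disk_tb, disk_iops):
    """Size an array on BOTH capacity and IOPS; the larger count wins."""
    by_capacity = math.ceil(capacity_tb / disk_tb)
    by_iops = math.ceil(iops / disk_iops)
    return max(by_capacity, by_iops)

# 20 TB and 16,000 random IOPS on 1 TB SATA disks doing ~80 IOPS each:
# capacity alone would need 20 disks, but the IOPS target needs 200.
print(disks_needed(20, 16_000, 1, 80))  # 200
```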

RAID and redundancy
Storage arrays are the traditional approach: they implement RAID to protect data
– Parity-based: RAID5, RAID6
– Stripe and mirror: RAID10
Scalability problem of this method
– For very large configurations, MTBF becomes comparable to the RAID rebuild time (!)
– Challenge: RAID does not scale
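The rebuild-time concern can be sketched numerically (disk size and sustained rebuild rate are assumed figures): as disks grow, a RAID set spends many hours in degraded mode after one failure, and a second failure in that window loses data.

```python
def raid_rebuild_hours(disk_tb, rebuild_mb_per_s):
    """Time to rewrite one replacement disk at a given sustained rate."""
    return disk_tb * 1e12 / (rebuild_mb_per_s * 1e6) / 3600

# A 2 TB disk rebuilt at a sustained 50 MB/s (array still serving I/O):
print(round(raid_rebuild_hours(2, 50), 1))  # ~11.1 hours in degraded mode
```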

Beyond RAID
Google and Amazon don't use RAID
Main idea:
– Divide data into 'chunks'
– Write multiple copies of each chunk
– Google File System: writes each chunk in 3 copies
– Amazon S3: writes copies to different destinations, i.e. data-center mirroring
Additional advantages:
– Removes the constraint of storing redundancy locally inside one storage array
– Chunks can be moved, refreshed, or relocated easily
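The chunk-replication idea can be sketched in a few lines. This is a toy illustration of the placement policy only (the 3-copy count mirrors the GFS example above; the node names and random placement are assumptions, not how GFS or S3 actually choose targets):

```python
import random

def place_chunks(num_chunks, nodes, copies=3):
    """Assign each chunk to `copies` distinct nodes."""
    if copies > len(nodes):
        raise ValueError("need at least as many nodes as copies")
    # random.sample picks distinct nodes, so losing any single node
    # still leaves copies-1 replicas of every chunk elsewhere.
    return {c: random.sample(nodes, copies) for c in range(num_chunks)}

nodes = [f"node{i}" for i in range(8)]
placement = place_chunks(100, nodes)
assert all(len(set(replicas)) == 3 for replicas in placement.values())
```

Because redundancy lives in the placement map rather than inside one array, re-replicating after a failure or migrating data is just writing new copies and updating the map.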

Our experience
Physics DB storage uses ASM
– Volume manager and cluster file system integrated with Oracle
– Soon to be also a general-purpose cluster file system (11gR2 beta testing)
– Oracle files are divided into chunks
– Chunks are distributed evenly across the storage
– Chunks are written in multiple copies (2 or 3, depending on file type and configuration)
– Allows the use of low-cost storage arrays: no RAID support needed

Scalable and distributed file systems on commodity HW
– Allow managing and protecting large volumes of data
– An approach proven by Google and Amazon, Sun's ZFS, and Oracle's ASM
– Can provide order-of-magnitude savings on HW acquisition
– Additional economies of scale from cloud and virtualization deployment models
– Challenge: solid, scalable distributed file systems are hard to build

The interconnect
Several technologies available
– SAN
– NAS
– iSCSI
– Direct attach

The interconnect
Throughput challenge
– It takes about 3 hours to copy/back up 1 TB over a 1 Gbps network
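The 3-hour figure is simple arithmetic; a sketch (the ~75% effective-efficiency factor for protocol overhead is an assumption):

```python
def transfer_hours(data_tb, link_gbps, efficiency=0.75):
    """Hours to move data_tb terabytes over a link, net of protocol overhead."""
    bits = data_tb * 1e12 * 8
    return bits / (link_gbps * 1e9 * efficiency) / 3600

print(round(transfer_hours(1, 1), 1))   # ~3.0 hours: 1 TB over 1 Gbps
print(round(transfer_hours(1, 10), 1))  # ~0.3 hours over 10 Gbps
```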

IP-based connectivity
– NAS and iSCSI suffer from the limited performance of Gbps Ethernet
– 10 Gbps Ethernet may/will(?) change the picture
– At present not widely deployed on servers because of cost
– Moreover, TCP/IP processing adds CPU overhead

Specialized storage networks
SAN is the de facto standard for most enterprise-level storage
– Fast, low overhead on the server CPU, easy to configure
Our experience (and that of Tier1s): SAN networks with up to 64 ports at low cost
– Measured: 8 Gbps transfer rate (4+4 Gbps dual-ported HBAs for redundancy and load balancing)
– Proof of concept: LAN-free FC backup reached full utilization of the tape heads
– Scalable: proof-of-concept 'Oracle supercluster' of 410 SATA disks and 14 dual quad-core servers

NAS
CERN's experience with NAS for databases
– A NetApp filer can use several protocols, the main one being NFS
– Throughput is limited by TCP/IP; trunking can alleviate the problem, but the main solution may/will(?) be a move to 10 Gbps
– The filer contains a server with its own CPU and OS: the proprietary Data ONTAP OS runs on the filer box
– The proprietary WAFL filesystem can, in particular, create read-only snapshots
– The additional features worsen cost/performance

iSCSI
iSCSI is interesting for cost reduction
– Many concerns about performance though, due to the IP interconnect
– Adoption seems limited to low-end systems at the moment
Our experience:
– IT-FIO is acquiring some test units; we have been told that some test HW will be made available for IT-DM databases

The quest for ultimate latency reduction
Solid state disks provide unique specs
– Seek times at least an order of magnitude better than the best HDs
– A single disk can provide >10k random-read IOPS
– High read throughput
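The >10k IOPS figure follows from the latency gap; a sketch with assumed latencies (~0.1 ms for a flash random read vs. ~6 ms for a mechanical one):

```python
def random_read_iops(latency_ms):
    """IOPS a single device sustains if each random read costs latency_ms."""
    return 1000 / latency_ms

print(round(random_read_iops(0.1)))  # flash SSD: ~10,000 IOPS
print(round(random_read_iops(6.0)))  # 15k rpm HD: ~167 IOPS
```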

SSD (flash) problems
Flash-based SSDs still suffer from major problems for enterprise use
– Cost/GB: more than 10x that of 'normal' HDs
– Small capacity compared to HDs
– Several issues with write performance: a limited number of erase cycles, and entire cells must be rewritten (an issue for transactional workloads)
– Workarounds for write performance and cell lifetime are being implemented; quality differs across vendors and grades
– A field in rapid evolution

Conclusions
Storage technologies are in a very interesting evolution phase
– On one side, 'old-fashioned' storage technologies deliver more capacity and performance for a lower price every year
– New technologies are emerging for scaling out very large data sets (see Google, Amazon, Oracle's ASM, Sun's ZFS)
– 10 Gbps Ethernet and SSDs have the potential to change storage in the coming years (but are not mature yet)

Acknowledgments
Many thanks to Jacek, Dawid and Maria; Eric and Nilo; Helge, Tim Bell and Bernd