CERN IT-Storage Strategy Outlook
Alberto Pace, Luca Mascetti, Julien Leduc

CERN IT-Storage Group

Storage Services
TSM, CASTOR, AFS, Ceph, NFS, CVMFS, RBD, S3, SSD

Vision for CERN Storage
- Build a flexible and uniform infrastructure
- Maintain and enhance (grid) data management tools

Vision for CERN Storage
- Unlimited storage for LHC experiments, at exabyte scale:
  - EOS disk pools (on-demand reliability, on-demand performance) – currently 165 PB deployed
  - CERN Tape Archive (high reliability, low cost, data preservation) – currently 190 PB stored
- A generic home directory service – currently 3 PB:
  - Covers all requirements of personal storage
  - Shared among all clients and services
  - FUSE mounts, NFS and SMB exports, desktop sync, and Web/HTTP/DAV access

EOS Development Strategy
- Evolve the flexible storage infrastructure as a scale-out solution towards exabyte scale:
  - one storage – many views on data – analysis/analytics service integration
  - suitable for LAN & WAN deployments
  - enable cost-optimized deployments
- Evolve the file-system interface /eos:
  - multi-platform access using standard protocol bridges: CIFS, NFS, HTTP
- Evolve namespace scalability: scale-out implementation
- Evolve the scale-out storage back-end: support object disks, object stores, public cloud, cold storage
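
To make the "one storage – many views" idea concrete, here is a minimal sketch assuming an EOS instance is FUSE-mounted under /eos (the path and file name are hypothetical): the same namespace that the CIFS, NFS and HTTP bridges expose is also reachable through ordinary POSIX calls.

```python
# Minimal sketch: with EOS FUSE-mounted under /eos (hypothetical path),
# plain POSIX I/O works, while protocol bridges (CIFS, NFS, HTTP) expose
# the same namespace to other platforms.
import os

path = "/eos/user/j/jdoe/notes.txt"  # hypothetical user area

with open(path, "w") as f:
    f.write("hello EOS\n")

st = os.stat(path)  # standard metadata call, served through the FUSE mount
print(f"{path}: {st.st_size} bytes")
```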

EOS Challenges
- Remote access APIs ⇒ filesystem API
- Metadata scale-up ⇒ scale-out:
  - in-memory namespace ⇒ in-memory scale-up namespace cache + persistency in a scale-out KV store
- 1st key project: EOS FUSE rewrite
- 2nd key project: EOS Citrine & QuarkDB
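
A toy illustration of the namespace evolution (a sketch under stated assumptions, not the actual EOS Citrine/QuarkDB implementation): hot metadata entries are served from an in-memory LRU cache, while the full namespace persists in a scale-out KV store.

```python
# Toy sketch: in-memory namespace cache with write-through persistency
# to a KV store (stand-in for a distributed store such as QuarkDB).
from collections import OrderedDict

class KVStore:
    """Stand-in for a scale-out key-value store."""
    def __init__(self):
        self._data = {}
    def put(self, key, value):
        self._data[key] = value
    def get(self, key):
        return self._data[key]

class NamespaceCache:
    def __init__(self, backend, capacity=100_000):
        self.backend = backend
        self.capacity = capacity
        self.cache = OrderedDict()  # LRU order: oldest entry first

    def _evict_if_needed(self):
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # drop least recently used

    def set_metadata(self, path, meta):
        self.backend.put(path, meta)  # write-through: persist first
        self.cache[path] = meta
        self.cache.move_to_end(path)
        self._evict_if_needed()

    def get_metadata(self, path):
        if path in self.cache:  # hot entry: answered from memory
            self.cache.move_to_end(path)
            return self.cache[path]
        meta = self.backend.get(path)  # cold entry: faulted in from the KV store
        self.cache[path] = meta
        self._evict_if_needed()
        return meta
```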

CERNBox
- Strategic service to access EOS storage
- Unprecedented performance for LHC experiment data workflows
- Access to all CERN physics (and accelerator) bulk data
- Disconnected / offline work
- Easy sharing model
- Main integration access point: SWAN, Office Online, …
- Checksums and consistency
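
For the checksum-and-consistency point, a hedged sketch of what a sync client can do: compare the checksum of the local copy with the one the server reports, and re-transfer only on mismatch. Adler-32 is used purely as an example; the slide does not specify the algorithm or the API.

```python
# Illustrative sketch: detect out-of-sync files by checksum comparison.
# The choice of Adler-32 is an assumption for the example.
import zlib

def adler32_of(path, chunk_size=1 << 20):
    value = 1  # Adler-32 initial value
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            value = zlib.adler32(chunk, value)
    return value & 0xFFFFFFFF

def needs_resync(local_path, server_checksum):
    """True if the local copy differs from the server-reported checksum."""
    return adler32_of(local_path) != server_checksum
```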

Architecture (diagram)
- Users: physicists, engineers, administrative staff
- Web access: web browser ⇒ CERNBox web interface
- Offline access: CERNBox sync client ⇒ HTTP sync protocol
- Online access: HTTP, FUSE, WebDAV, NFS, SMB, xroot – used by clusters (Batch, Plus, Analytics, …) and applications (Page 1, LHC Logging, …)
- Storage back-end: disk pools (EOS) and the tape archive
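
Because the same storage is exposed over HTTP/WebDAV, a remote client can fetch a file with plain HTTP, as in the sketch below; the endpoint, path and credentials are placeholders, not real CERNBox URLs.

```python
# Hedged example: reading a file through the WebDAV/HTTP access path.
# URL and credentials below are hypothetical placeholders.
import requests

url = "https://cernbox.example.ch/webdav/home/notes.txt"  # hypothetical endpoint
resp = requests.get(url, auth=("jdoe", "secret"))
resp.raise_for_status()

with open("notes.txt", "wb") as f:
    f.write(resp.content)
```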

Data Archiving Evolution
- Data written to tape, 2008-2016 (chart)
- Preserving the past:
  - Protect against bit rot (data corruption, bit flips, environmental elements, media wear-out and breakage, …)
  - Migrate data across technology generations, avoiding obsolescence – some of our data is 40 years old
- … and anticipating the future:
  - 2017-2018: 100 PB/year
  - 2021++: 150 PB/year
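
One standard way to detect bit rot, sketched here with no claim about CERN's actual tooling, is a periodic scrub: re-read every archived file and compare it with a checksum recorded at write time; mismatches can then be repaired from another replica or tape copy.

```python
# Sketch of a checksum scrub over an archive. The catalogue format
# (a JSON map of path -> checksum recorded at ingest) is an assumption.
import hashlib
import json

def sha256_of(path, chunk_size=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

def scrub(catalogue_path):
    with open(catalogue_path) as f:
        catalogue = json.load(f)
    for path, recorded in catalogue.items():
        if sha256_of(path) != recorded:
            # Candidate for repair from a healthy replica or tape copy.
            print(f"bit rot detected: {path}")
```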

Storage Services Evolution
NFS, CVMFS, RBD, S3, TSM, CTA, CASTOR, Ceph, SSD

Data Archiving: Evolution
- EOS is the strategic storage platform
- Tape is the strategic long-term archive medium
- EOS + tape = meet CTA: the CERN Tape Archive
- CTA will streamline and optimise data paths, software and infrastructure

What is CTA? CTA is:
- A tape backend for EOS
- A preemptive tape drive scheduler
- A clean separation between disk and tape
Architecture (diagram): one EOS instance per experiment (namespace and redirector plus EOS disk servers) sends archive and retrieve requests to a central CTA instance (CTA front-end, CTA tape servers, tape drives and tape libraries). The CTA metadata holds the tape file catalogue, archive/retrieve queues, tape drive statuses, archive routes and mount policies, as well as scheduling information and tape file locations. Tape files appear in the EOS namespace as replicas, and the EOS workflow engine glues EOS to CTA.
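
To make "preemptive tape drive scheduler" concrete, here is a toy sketch (not CTA's actual algorithm): mount requests queue with a priority derived from a mount policy, and when no drive is free, a higher-priority request may preempt the lowest-priority running mount, whose work is re-queued.

```python
# Toy preemptive drive scheduler. CTA's real scheduling (mount policies,
# archive routes, drive states) is richer; this only shows the idea.
import heapq

class DriveScheduler:
    def __init__(self, num_drives):
        self.free_drives = list(range(num_drives))
        self.running = {}  # drive -> (priority, tape)
        self.queue = []    # min-heap of (-priority, tape)

    def submit(self, tape, priority):
        heapq.heappush(self.queue, (-priority, tape))
        self._dispatch()

    def _dispatch(self):
        while self.queue:
            neg_prio, tape = self.queue[0]
            if self.free_drives:               # a drive is idle: mount directly
                heapq.heappop(self.queue)
                self.running[self.free_drives.pop()] = (-neg_prio, tape)
                continue
            # All drives busy: consider preempting the lowest-priority mount.
            drive, (prio, victim) = min(self.running.items(),
                                        key=lambda kv: kv[1][0])
            if -neg_prio > prio:
                heapq.heappop(self.queue)
                heapq.heappush(self.queue, (-prio, victim))  # re-queue preempted work
                self.running[drive] = (-neg_prio, tape)
            else:
                break  # nothing preemptable; wait for a drive to free up
```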

What is CTA?
- EOS plus CTA is a "drop-in" replacement for CASTOR
- Same tape format as CASTOR – only the metadata needs to be migrated
- Current deployments (diagram): experiment ⇒ EOS ⇒ CASTOR ⇒ tape libraries
- Future deployments (diagram): experiment ⇒ EOS + CTA ⇒ tape libraries

When is CTA?
- Mid 2017: first release, for additional-copy use cases
- Mid 2018: ready for small experiments
- End 2018: ready for LHC experiments

CTA outside CERN
- CTA will be usable anywhere EOS is used
- CTA could go behind another disk storage system, but this would require contributing some development:
  - The disk storage system must manage the disk and tape lifecycle of each file
  - The disk storage system needs to be able to transfer files from/to CTA (currently xroot; CTA can easily be extended to support other protocols)
- CTA currently uses Oracle for the tape file catalogue:
  - CTA has a thin RDBMS layer that isolates Oracle specifics from the rest of CTA
  - The RDBMS layer means CTA could be extended to run with a different database technology
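
A minimal sketch of what such a thin RDBMS layer can look like (class, table and column names here are hypothetical, not CTA's actual code): catalogue logic depends only on an abstract interface, so supporting another database technology means adding one subclass.

```python
# Sketch of a thin RDBMS abstraction layer. Vendor specifics live in
# one subclass each; the rest of the code sees only the interface.
from abc import ABC, abstractmethod

class CatalogueDB(ABC):
    @abstractmethod
    def connect(self, conn_string): ...
    @abstractmethod
    def insert_tape_file(self, vid, fseq, disk_file_id): ...

class OracleCatalogueDB(CatalogueDB):
    def connect(self, conn_string):
        import cx_Oracle  # Oracle specifics are confined to this class
        self.conn = cx_Oracle.connect(conn_string)
    def insert_tape_file(self, vid, fseq, disk_file_id):
        with self.conn.cursor() as cur:
            cur.execute("INSERT INTO tape_file (vid, fseq, disk_file_id) "
                        "VALUES (:1, :2, :3)", (vid, fseq, disk_file_id))

class SQLiteCatalogueDB(CatalogueDB):
    def connect(self, conn_string):
        import sqlite3
        self.conn = sqlite3.connect(conn_string)
    def insert_tape_file(self, vid, fseq, disk_file_id):
        self.conn.execute("INSERT INTO tape_file (vid, fseq, disk_file_id) "
                          "VALUES (?, ?, ?)", (vid, fseq, disk_file_id))
```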

Backup Service
- TSM cost is proportional to the backed-up volume
- Minimize the backup volume in view of the upcoming license re-tendering
- Ongoing efforts to limit the annual growth rate to 24% (down from 50% in the past):
  - Moved AFS backups to CASTOR
  - Working with the two other large backup customers
- Total backup data since 2012 (chart)
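
A quick worked projection of why the growth rate matters for a volume-based license (the 10 PB starting volume is hypothetical): compound growth at the historical 50% rate diverges quickly from the targeted 24%.

```python
# Worked example: compound growth of the backup volume over five years
# at the historical (50%) vs. targeted (24%) annual growth rate.
start_pb = 10.0  # hypothetical starting volume
for rate in (0.50, 0.24):
    volume = start_pb * (1 + rate) ** 5
    print(f"{rate:.0%}/year: {start_pb:.0f} PB -> {volume:.1f} PB after 5 years")
# 50%/year: 10 PB -> 75.9 PB after 5 years
# 24%/year: 10 PB -> 29.3 PB after 5 years
```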

Questions?