Data & Storage Services CERN IT Department CH-1211 Genève 23 Switzerland www.cern.ch/i t DSS Data architecture challenges for CERN and the High Energy.

Slides:



Advertisements
Similar presentations
A CASE FOR REDUNDANT ARRAYS OF INEXPENSIVE DISKS (RAID) D. A. Patterson, G. A. Gibson, R. H. Katz University of California, Berkeley.
Advertisements

RAID Oh yes Whats RAID? Redundant Array (of) Independent Disks. A scheme involving multiple disks which replicates data across multiple drives. Methods.
Analysis and Construction of Functional Regenerating Codes with Uncoded Repair for Distributed Storage Systems Yuchong Hu, Patrick P. C. Lee, Kenneth.
CSCE430/830 Computer Architecture
Henry C. H. Chen and Patrick P. C. Lee
HAIL (High-Availability and Integrity Layer) for Cloud Storage
RETHINK BACKUP & ARCHIVE. 2 Backup and Archive are Top IT Priorities Which of the following would you consider to be your org’s most important IT priorities.
Click to edit Master title style European AFS and Kerberos Conference Welcome to CERN CERN, Accelerating Science and Innovation CERN, Accelerating Science.
RAID- Redundant Array of Inexpensive Drives. Purpose Provide faster data access and larger storage Provide data redundancy.
What is it? Hierarchical storage software developed in collaboration with five US department of Energy Labs since 1992 Allows storage management of 100s.
Peer-to-peer archival data trading Brian Cooper and Hector Garcia-Molina Stanford University.
1 Stanford Archival Repository Project Brian Cooper Arturo Crespo Hector Garcia-Molina Department of Computer Science Stanford University.
CSE 451: Operating Systems Winter 2010 Module 13 Redundant Arrays of Inexpensive Disks (RAID) and OS structure Mark Zbikowski Gary Kimura.
Quantum Confidential | LATTUS OBJECT STORAGE JANET LAFLEUR SR PRODUCT MARKETING MANAGER QUANTUM.
1© Copyright 2013 EMC Corporation. All rights reserved. EMC and Microsoft SharePoint Server Performance Name Title Date.
CERN IT Department CH-1211 Genève 23 Switzerland t Next generation of virtual infrastructure with Hyper-V Michal Kwiatek, Juraj Sucik, Rafal.
CERN openlab Open Day 10 June 2015 KL Yong Sergio Ruocco Data Center Technologies Division Speeding-up Large-Scale Storage with Non-Volatile Memory.
ICOM 6005 – Database Management Systems Design Dr. Manuel Rodríguez-Martínez Electrical and Computer Engineering Department Lecture 6 – RAID ©Manuel Rodriguez.
1© Copyright 2013 EMC Corporation. All rights reserved. EMC and Microsoft SharePoint Server Data Protection Name Title Date.
RAID: High-Performance, Reliable Secondary Storage Mei Qing & Chaoxia Liao Nov. 20, 2003.
Data & Storage Services CERN IT Department CH-1211 Genève 23 Switzerland t DSS Introduction to data management.
CERN - IT Department CH-1211 Genève 23 Switzerland t Monitoring the ATLAS Distributed Data Management System Ricardo Rocha (CERN) on behalf.
Data & Storage Services CERN IT Department CH-1211 Genève 23 Switzerland t DSS CERN and Computing … … and Storage Alberto Pace Head, Data.
RAID COP 5611 Advanced Operating Systems Adapted from Andy Wang’s slides at FSU.
Copyright © 2006 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill Technology Education Copyright © 2006 by The McGraw-Hill Companies,
Data & Storage Services CERN IT Department CH-1211 Genève 23 Switzerland t DSS Introduction to data management.
A BigData Tour – HDFS, Ceph and MapReduce These slides are possible thanks to these sources – Jonathan Drusi - SCInet Toronto – Hadoop Tutorial, Amir Payberah.
Meeting the Data Protection Demands of a 24x7 Economy Steve Morihiro VP, Programs & Technology Quantum Storage Solutions Group
Huawei IT : Make IT Simple, Make Business Agile
CERN IT Department CH-1211 Geneva 23 Switzerland t The Experiment Dashboard ISGC th April 2008 Pablo Saiz, Julia Andreeva, Benjamin.
Data & Storage Services CERN IT Department CH-1211 Genève 23 Switzerland t DSS From data management to storage services to the next challenges.
CERN IT Department CH-1211 Genève 23 Switzerland t Internet Services Job Monitoring for the LHC experiments Irina Sidorova (CERN, JINR) on.
Libraries, Archives, and Digital Preservation: The Reality of What We Must Do Leslie Johnston Acting Director, National Digital Information Infrastructure.
CERN IT Department CH-1211 Genève 23 Switzerland t Castor development status Alberto Pace LCG-LHCC Referees Meeting, May 5 th, 2008 DRAFT.
Strong Security for Distributed File Systems Group A3 Ka Hou Wong Jahanzeb Faizan Jonathan Sippel.
Continuous Backup for Business CrashPlan PRO offers a paradigm of backup that includes a single solution for on-site and off-site backups that is more.
Data & Storage Services CERN IT Department CH-1211 Genève 23 Switzerland t DSS Virtual visit to the computer Centre Photo credit to Alexey.
CERN IT Department CH-1211 Genève 23 Switzerland t Frédéric Hemmer IT Department Head - CERN 23 rd August 2010 Status of LHC Computing from.
CERN – IT Department CH-1211 Genève 23 Switzerland t Working with Large Data Sets Tim Smith CERN/IT Open Access and Research Data Session.
11 CLUSTERING AND AVAILABILITY Chapter 11. Chapter 11: CLUSTERING AND AVAILABILITY2 OVERVIEW  Describe the clustering capabilities of Microsoft Windows.
Future home directories at CERN
Data & Storage Services CERN IT Department CH-1211 Genève 23 Switzerland t DSS Castor incident (and follow up) Alberto Pace.
CERN - IT Department CH-1211 Genève 23 Switzerland t High Availability Databases based on Oracle 10g RAC on Linux WLCG Tier2 Tutorials, CERN,
CERN - IT Department CH-1211 Genève 23 Switzerland CCRC Tape Metrics Tier-0 Tim Bell January 2008.
Data & Storage Services CERN IT Department CH-1211 Genève 23 Switzerland t DSS Introduction to data management Alberto Pace.
Robustness in the Salus scalable block store Yang Wang, Manos Kapritsos, Zuocheng Ren, Prince Mahajan, Jeevitha Kirubanandam, Lorenzo Alvisi, and Mike.
CERN IT Department CH-1211 Genève 23 Switzerland t The Tape Service at CERN Vladimír Bahyl IT-FIO-TSI June 2009.
Data & Storage Services CERN IT Department CH-1211 Genève 23 Switzerland t DSS Computing for the LHC Alberto Pace
Joint Institute for Nuclear Research Synthesis of the simulation and monitoring processes for the data storage and big data processing development in physical.
Cloud Computing Vs RAID Group 21 Fangfei Li John Soh Course: CSCI4707.
IT-DSS Alberto Pace2 ? Detecting particles (experiments) Accelerating particle beams Large-scale computing (Analysis) Discovery We are here The mission.
Seminar On Rain Technology
St. Petersburg, 2016 Openstack Disk Storage vs Amazon Disk Storage Computing Clusters, Grids and Cloud Erasmus Mundus Master Program in PERCCOM Author:
Aaron Stanley King. What is SQL Azure? “SQL Azure is a scalable and cost-effective on- demand data storage and query processing service. SQL Azure is.
Data & Storage Services CERN IT Department CH-1211 Genève 23 Switzerland t DSS Storage plans for CERN and for the Tier 0 Alberto Pace (and.
CERN IT-Storage Strategy Outlook Alberto Pace, Luca Mascetti, Julien Leduc
A Tale of Two Erasure Codes in HDFS
CMPE Database Systems Workshop June 16 Class Meeting
Business Continuity & Disaster Recovery
Managing Storage in a (large) Grid data center
Intelligent Archiving for Media & Entertainment
Experiences and Outlook Data Preservation and Long Term Analysis
Thoughts on Computing Upgrade Activities
Business Continuity & Disaster Recovery
Real IBM C exam questions and answers
RAID RAID Mukesh N Tekwani
ICOM 6005 – Database Management Systems Design
CSE 451: Operating Systems Winter 2009 Module 13 Redundant Arrays of Inexpensive Disks (RAID) and OS structure Mark Zbikowski Gary Kimura 1.
CSE 451: Operating Systems Winter 2012 Redundant Arrays of Inexpensive Disks (RAID) and OS structure Mark Zbikowski Gary Kimura 1.
RAID RAID Mukesh N Tekwani April 23, 2019
Presentation transcript:

Data & Storage Services CERN IT Department CH-1211 Genève 23 Switzerland t DSS Data architecture challenges for CERN and the High Energy Physics community in the coming years Alberto Pace Head, Data and Storage Services CERN, Geneva, Switzerland

CERN IT Department CH-1211 Genève 23 Switzerland t Internet Services DSS 2 Storage infrastructure at CERN Physics data and File services for the physics community –More than 2’000 servers, 80’000 disks, –2.2 billion files, 40 PB of online storage –100 PB of offline storage on > 50’000 tape cartridges, 150 tape dtives –Sustaining write rates of 6 GB/s Important grow rates –Doubled between 2012 and 2013 –Expected to double every 18 months

CERN IT Department CH-1211 Genève 23 Switzerland t Internet Services DSS 3 The mission of data management We are the repository of the world data of High Energy Physics Mandate: –Store –Preserve –Distribute (i.e. make accessible)

CERN IT Department CH-1211 Genève 23 Switzerland t Internet Services DSS 4 Challenges ahead Scalability –Ensuring that the existing infrastructure can scale up by a factor of 10 (exabyte scale) Reliability –Ensure that we can deliver an arbitrary high level of reliability Performance –Ensure that we can deliver an arbitrary level of performance Why ?

CERN IT Department CH-1211 Genève 23 Switzerland t Internet Services DSS 5 The cloud model of the past User would store the data in a “unique pool of data” No need to move the data, it is managed “in the cloud” Simple and unique quality of service

CERN IT Department CH-1211 Genève 23 Switzerland t Internet Services DSS 6 Data pools Different quality of services –Three parameters: (Performance, Reliability, Cost) –You can have two but not three Slow Expensive Unreliable TapesDisks Flash, Solid State Disks Mirrored disks

CERN IT Department CH-1211 Genève 23 Switzerland t Internet Services DSS 7 Cloud storage ≠ big data Unique quality of service becomes too simplistic when dealing with big data –Recent data to be analyzed require very high storage performance. A generic solution can be too slow for this data –Data that has been processed must be archived reliably at minimum cost. A generic solution can be too expensive for this data –Derived data can, in some case, be recomputed. A generic solution can be too reliable for this data (and therefore too expensive)

Data & Storage Services Real life data management ? Examples from LHC experiment data models  Two building blocks to empower data processing  Data pools with different quality of services  Tools for data transfer between pools

CERN IT Department CH-1211 Genève 23 Switzerland t Internet Services DSS 9 Challenges ahead Key requirements: Simple, Scalable, Consistent, Reliable, Available, Manageable, Flexible, Performing, Cheap, Secure. Aiming for “à la carte” services (storage pools) with on- demand “quality of service”

CERN IT Department CH-1211 Genève 23 Switzerland t Internet Services DSS 10 Challenges: what is needed Scalability –Ensuring that the existing infrastructure can scale up by a factor of 10 (exabyte scale) Reliability –Ensure that we can deliver an arbitrary level of reliability Performance –Ensure that we can deliver an arbitrary level of performance Cost –Ensure that for a given level of reliability / performance, we are “cost efficient”

Data & Storage Services CERN IT Department CH-1211 Genève 23 Switzerland t DSS Examples of challenges

CERN IT Department CH-1211 Genève 23 Switzerland t Internet Services DSS Scalability improvements Reduced mount rates

CERN IT Department CH-1211 Genève 23 Switzerland t Internet Services DSS 13 Reliability improvements Change “on the fly” reliability of storage pools by supporting multiple data encoding algorithms –Plain (reliability of the service = reliability of the hardware) –Replication Reliable, maximum performance, but heavy storage overhead Example: 3 copies, 200% overhead –Reed-Solomon, double, triple parity, NetRaid5, NetRaid6 Maximum reliability, minimum storage overhead Example 10+3, can lose any 3, remaining 10 are enough to reconstruct, only 30 % storage overhead –Low Density Parity Check (LDPC) / Fountain Codes / Raptor Codes Excellent performance, more storage overhead Example: 8+6, can lose any 3, remaining 11 are enough to reconstruct, 75 % storage overhead (See next slide)

Data & Storage Services Example: 8+6 LDPC checksum 100% (original data size) 175% (total size on disk) Any 11 of the 14 chunks are enough to reconstruct the data using only XOR operations (very fast) 0.. 7: original data : data xor-ed following the arrows in the graph 138% (min size required to reconstruct) You are allowed to lose any 3 chunks (21 %)

Data & Storage Services Internet Services Performance Draining example...

CERN IT Department CH-1211 Genève 23 Switzerland t Internet Services DSS 16 High performance parallel read Data reassembled directly on the client

CERN IT Department CH-1211 Genève 23 Switzerland t Internet Services DSS 17 Conclusion We are only at the beginning of what “Data management” services can do Efficient “Data management” services still need to be invented ! Plenty of research opportunities –Algorithms (encoding / decoding / error corrections) –Storage media, SSD (r)evolution, role of tapes –Innovative operation procedure (data verification, data protection, data movements) –Architectures (reducing “dependencies of failures, scalability, …)