Slide 1: Storage Resource Sharing with CASTOR (16/4/2004)
Olof Barring, Benjamin Couturier, Jean-Damien Durand, Emil Knezo, Sebastien Ponce (CERN); Vitali Motyakov (IHEP)

Slide 2: Introduction
CERN, the European Laboratory for Particle Physics (Geneva), celebrating its 50th anniversary.
2007: the Large Hadron Collider (LHC)
– 4 international experiments: Alice, Atlas, CMS, LHCb
– High-energy proton or heavy-ion collisions (energy up to 14 TeV for proton beams and 1150 TeV for lead-ion beams)
– Far larger amounts of data to store and analyze than in previous experiments (up to 4 GB/s, 10 petabytes per year in 2008)

Slide 3: An LHC Experiment

Slide 4: The LHC Computing Grid (diagram)
Tiered model: CERN Tier 0; national Tier 1 centres (Germany, USA, UK, France, Italy, China, Japan, CERN); Tier 2 centres at labs and universities forming grids for regional groups; Tier 3 physics-department resources and desktops forming grids for physics study groups.

Slide 5: CASTOR
CERN Advanced STORage Manager
– Hierarchical Storage Manager (HSM) used to store user and physics files
– Manages the secondary and tertiary storage
History
– Development started in 1999, based on SHIFT, CERN's tape and disk management system since the beginning of the 1990s (SHIFT was awarded a 21st Century Achievement Award by Computerworld in 2001)
– In production since the beginning of

Slide 6: CASTOR deployment
CASTOR teams at CERN
– Development team (5)
– Operations team (4)
Hardware setup at CERN
– Disk servers: ~370 disk servers, ~300 TB of staging pools, ~35 stagers (disk pool managers)
– Tapes and libraries: ~90 tape drives; 2 sets of 5 Powderhorn silos; 1 Timberwolf (1 x 600 cartridges); 1 L700 (1 x 288 cartridges)
Deployed in other HEP institutes
– PIC Barcelona
– CNAF Bologna
– …

Slide 7: CASTOR Data Evolution (plot)
Total data stored over time, with axis marks at 1 PB and 2 PB; the visible drop is due to the deletion of 0.75 million ALICE MDC4 files (1 GB each).

Slide 8: CASTOR Architecture (diagram)
Application using the RFIO API.
Central services (Oracle or MySQL backed): CASTOR name server (logical name space), Volume and Drive Queue Manager (VDQM, PVL), Volume Manager (VMGR, PVL).
Disk subsystem: Stager (disk pool manager), rfiod (disk mover), disk cache.
Tape subsystem: Remote Tape Copy (tape mover), tpdaemon (PVR).

Slide 9: Main Characteristics
POSIX-like client interface
– Use of RFIO in the HEP community (see the sketch below)
Modular / highly distributed
– A set of central servers
– Disk subsystem
– Tape subsystem
Allows for tape resource sharing
Grid interfaces
– GridFTP
– Storage Resource Manager (V1.1) (a cooperation between CERN, FNAL, JLAB, LBNL and RAL)
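To illustrate the POSIX-like interface, here is a minimal sketch of a client reading a CASTOR file through RFIO. It assumes the CASTOR client library is installed; the header name (rfio_api.h) and the /castor path used are assumptions for illustration, while the rfio_* calls mirror their POSIX counterparts.

```c
/* Minimal sketch: reading a CASTOR file via the RFIO API.
 * Assumes the CASTOR client library is installed; the header name
 * (rfio_api.h) and the /castor path below are illustrative assumptions. */
#include <stdio.h>
#include <fcntl.h>
#include <rfio_api.h>   /* declares rfio_open, rfio_read, rfio_close */

int main(void)
{
    char buf[4096];
    int  n;

    /* Open a file in the CASTOR name space, exactly like open(2). */
    int fd = rfio_open("/castor/cern.ch/user/a/auser/example.dat", O_RDONLY, 0);
    if (fd < 0) {
        perror("rfio_open");   /* RFIO also keeps its own error code (serrno) */
        return 1;
    }

    /* Read it like read(2); the stager and disk mover handle staging and I/O. */
    while ((n = rfio_read(fd, buf, sizeof(buf))) > 0) {
        fwrite(buf, 1, n, stdout);
    }

    rfio_close(fd);
    return 0;
}
```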

Slide 10: Platform/Hardware Support
Multi-platform support
– Linux, Solaris, AIX, HP-UX, Digital UNIX, IRIX
– The clients and some of the servers run on Windows NT/2K
Supported drives
– DLT/SDLT, LTO, IBM 3590, STK 9840, STK 9940A/B (and old drives already supported by SHIFT)
Libraries
– SCSI libraries
– ADIC Scalar, IBM 3494, IBM 3584, Odetics, Sony DMS24, STK Powderhorn (with ACSLS), STK L700 (with SCSI or ACSLS)

Slide 11: Requirements for LHC (1)
CASTOR currently performs satisfactorily:
– Alice Data Challenge: 300 MB/s for a week; 600 MB/s maintained for half a day
– High request load: tens of thousands of requests per day per TB of disk cache
However, when LHC starts in 2007:
– A single stager should scale up to 500 to 1000 requests per second
– Expected system configuration: ~4 PB of disk cache; 10 PB stored on tape per year (peak rate of 4 GB/s); ~ disks, 150 tape drives (a rough rate check follows below)
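As a back-of-the-envelope check on these figures (only the 10 PB/year and 4 GB/s numbers come from the slide; the rest is plain arithmetic), storing 10 PB in a year corresponds to an average of roughly 0.3 GB/s, so the quoted 4 GB/s peak is more than ten times the sustained average, which is part of what drives the need for a large disk cache in front of the tape layer.

```c
/* Back-of-the-envelope check of the LHC storage rates quoted on the slide.
 * Only the 10 PB/year and 4 GB/s figures come from the slide. */
#include <stdio.h>

int main(void)
{
    const double pb_per_year      = 10.0;                  /* tape volume per year */
    const double bytes_per_pb     = 1e15;
    const double seconds_per_year = 365.25 * 24 * 3600;    /* ~3.16e7 s */
    const double peak_gb_per_s    = 4.0;                   /* quoted peak rate */

    double avg_gb_per_s = pb_per_year * bytes_per_pb / seconds_per_year / 1e9;

    printf("average ingest rate: %.2f GB/s\n", avg_gb_per_s);            /* ~0.32 GB/s */
    printf("peak/average ratio : %.1f\n", peak_gb_per_s / avg_gb_per_s); /* ~12x */
    return 0;
}
```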

Slide 12: Requirements for LHC (2)
Various types of workload (access pattern, frequency, amount of data involved, quality of service):
– Data Recording: sequential write; always running; up to 1 GB/s for Alice… (2 GB/s total); quality of service: necessary to avoid losing data
– Data to Tier 1 centers (Tier 0 to Tier 1): sequential read; once per tier; part of the data set
– Reconstruction: sequential read/write; a few times per year; complete data set (up to 2 GB/s as well)
– Analysis: random access

Slide 13: Stager Central Role
(Same architecture diagram as slide 8: application with RFIO API, central services including the name server, VDQM and VMGR on Oracle or MySQL, the tape subsystem with Remote Tape Copy and tpdaemon, and the disk subsystem with rfiod and the disk cache, with the stager as the disk pool manager at the centre.)

Slide 14: Limitations of the current system
The CASTOR stager has a crucial role, but the current version limits CASTOR performance:
– The stager catalogue is limited in size (~ files in cache)
– Disk resources have to be dedicated to each stager, which leads to sub-optimal use of disk resources and difficult configuration management; it is hard to switch disk resources quickly when the workload changes
– Not very resilient to disk server errors
– The current stager design does not scale
– The tape mover API is not flexible enough to allow dynamic scheduling of disk access

Slide 15: Vision...
With clusters of hundreds of disk and tape servers, automated storage management faces more and more of the same problems as CPU cluster management:
– (Storage) resource management
– (Storage) resource sharing
– (Storage) request scheduling
– Configuration
– Monitoring
The stager is the main gateway to all resources managed by CASTOR.
Vision: a Storage Resource Sharing Facility

Slide 16: New CASTOR Stager Architecture (diagram)
The application, using the RFIO/stage API, talks to a Request Handler; requests and the file catalogue are held in a request repository (Oracle or MySQL). A Resource Management Interface connects to third-party schedulers (LSF, Maui) and a third-party policy engine. Migration, Recall and Garbage Collector components interact with the CASTOR tape archive components (VDQM, VMGR, RTCOPY). The disk cache is served by rfiod (disk mover), with separate control (TCP) and data (TCP) paths.

Slide 17: Database Centric Architecture
Set of stateless multithreaded daemons accessing a DB
– Request state stored in the RDBMS
– Support for Oracle and MySQL
– Allows for big stager catalogues with reasonable performance
– Locking/transactions handled by the RDBMS (see the sketch below)
The stager is as scalable as the database it uses
– Use of DB clustering solutions if necessary
Easier administration
– Standard backup procedures
– Fewer problems restarting after a failure
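To make the database-centric idea concrete, here is a minimal sketch of a stateless worker that claims one pending request inside a transaction and lets the database do the locking. The table name, column names and connection parameters are invented for illustration (CASTOR's actual schema is not shown on the slide); the example uses the MySQL C API, one of the two databases the slide mentions.

```c
/* Minimal sketch of a stateless worker in a database-centric design:
 * all request state lives in the RDBMS, and concurrency is handled by
 * transactions and row locks rather than in-memory state.
 * The schema (table "requests", columns id/status) and the connection
 * parameters are illustrative assumptions, not CASTOR's actual schema. */
#include <stdio.h>
#include <stdlib.h>
#include <mysql/mysql.h>

int main(void)
{
    MYSQL *db = mysql_init(NULL);
    if (!mysql_real_connect(db, "localhost", "stager", "secret",
                            "stagerdb", 0, NULL, 0)) {
        fprintf(stderr, "connect failed: %s\n", mysql_error(db));
        return EXIT_FAILURE;
    }

    /* Claim the oldest pending request; SELECT ... FOR UPDATE makes the
     * database serialize competing workers on that row. */
    mysql_query(db, "START TRANSACTION");
    mysql_query(db, "SELECT id FROM requests WHERE status='PENDING' "
                    "ORDER BY id LIMIT 1 FOR UPDATE");

    MYSQL_RES *res = mysql_store_result(db);
    MYSQL_ROW  row = res ? mysql_fetch_row(res) : NULL;
    if (row) {
        char sql[256];
        snprintf(sql, sizeof(sql),
                 "UPDATE requests SET status='RUNNING' WHERE id=%s", row[0]);
        mysql_query(db, sql);
        printf("claimed request %s\n", row[0]);
        /* ... process the request here; a crash before COMMIT leaves the
         * row PENDING, which is what makes the daemon stateless and easy
         * to restart. */
    }
    mysql_query(db, "COMMIT");

    if (res) mysql_free_result(res);
    mysql_close(db);
    return 0;
}
```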

Slide 18: Resource Scheduling
Externalized scheduling interface (a hypothetical sketch of such a plugin interface follows below)
– Currently developed for Maui (version 3.2.4) and LSF 5.1 (Platform Computing)
– Leverages the advanced features of CPU scheduling: fair share, advanced reservations, load balancing, accounting, …
This will allow to:
– Share all disk resources, with a fair share between the LHC experiments
– Exploit all resources at the maximum of their performance (avoid hotspots on disk servers, …)
– Follow the evolution of scheduling systems
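To make the idea of an externalized, pluggable scheduler concrete, here is a hypothetical sketch of what such an interface could look like in C: the stager hands each file-access request to whichever backend (LSF, Maui, or another scheduler) sits behind a small set of function pointers. All names here are invented for illustration; this is neither CASTOR's actual interface nor the real LSF or Maui APIs.

```c
/* Hypothetical sketch of an externalized scheduler plugin interface.
 * None of these names come from CASTOR, LSF or Maui; they only illustrate
 * the idea of hiding the scheduler behind a small, swappable interface. */
#include <stdio.h>

/* A file-access request as the stager might describe it to a scheduler. */
typedef struct {
    const char *castor_path;   /* logical file name */
    const char *experiment;    /* used for fair share between experiments */
    long        size_bytes;
    int         is_write;
} access_request;

/* The pluggable part: any scheduler backend provides these two calls. */
typedef struct {
    const char *name;
    int  (*submit)(const access_request *req);   /* returns a job id, <0 on error */
    void (*cancel)(int job_id);
} scheduler_backend;

/* Trivial stand-in backend that "schedules" everything immediately. */
static int fifo_submit(const access_request *req)
{
    static int next_id = 1;
    printf("[fifo] scheduling %s access to %s for %s\n",
           req->is_write ? "write" : "read", req->castor_path, req->experiment);
    return next_id++;
}
static void fifo_cancel(int job_id) { printf("[fifo] cancel %d\n", job_id); }

static const scheduler_backend fifo_backend = { "fifo", fifo_submit, fifo_cancel };

int main(void)
{
    /* The stager would pick the backend from configuration (LSF, Maui, ...);
     * here we just use the stand-in one. */
    const scheduler_backend *sched = &fifo_backend;

    access_request req = { "/castor/cern.ch/alice/raw/run001.dat", "alice",
                           1L << 30, 0 };
    int id = sched->submit(&req);
    printf("request queued with job id %d via %s backend\n", id, sched->name);
    return 0;
}
```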

Slide 19: Other improvements
Improved request handling
– Request throttling possible (a sketch of one possible throttling scheme follows below)
Improved security
– Strong authentication using GSI or Kerberos
Modified tape mover interface
– Allows for just-in-time scheduling of the disk resources when copying to/from tape
Controlled access to disk resources
– All user access to disk resources will have to be scheduled through the stager, so as to control the I/O on disk servers
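The slide only states that request throttling becomes possible; it does not say how it is implemented. As one possible illustration, here is a minimal token-bucket sketch that a request handler could use to cap the accepted request rate; the rate and burst values are arbitrary examples.

```c
/* Minimal token-bucket sketch for request throttling.
 * This is only one possible scheme; the slide does not specify how the
 * CASTOR request handler throttles, so treat this as an illustration. */
#include <stdio.h>
#include <time.h>

typedef struct {
    double tokens;          /* currently available tokens */
    double rate;            /* tokens added per second (max sustained req/s) */
    double burst;           /* bucket capacity (max burst size) */
    struct timespec last;   /* time of the last refill */
} token_bucket;

static void tb_init(token_bucket *tb, double rate, double burst)
{
    tb->rate   = rate;
    tb->burst  = burst;
    tb->tokens = burst;
    clock_gettime(CLOCK_MONOTONIC, &tb->last);
}

/* Returns 1 if the request may be accepted now, 0 if it should be
 * rejected or queued. */
static int tb_allow(token_bucket *tb)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    double elapsed = (now.tv_sec - tb->last.tv_sec)
                   + (now.tv_nsec - tb->last.tv_nsec) / 1e9;
    tb->last = now;

    tb->tokens += elapsed * tb->rate;
    if (tb->tokens > tb->burst) tb->tokens = tb->burst;
    if (tb->tokens >= 1.0) { tb->tokens -= 1.0; return 1; }
    return 0;
}

int main(void)
{
    token_bucket tb;
    tb_init(&tb, 500.0, 100.0);   /* example: 500 req/s sustained, bursts of 100 */

    int accepted = 0, rejected = 0;
    for (int i = 0; i < 1000; i++)
        tb_allow(&tb) ? accepted++ : rejected++;

    printf("accepted %d, rejected %d\n", accepted, rejected);
    return 0;
}
```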

Slide 20: Conclusion
The current stager at CERN has shown its limitations.
The new CASTOR stager proposal aims for:
– A pluggable framework for intelligent and policy-controlled file access scheduling
– An evolvable storage resource sharing facility framework rather than a total solution
– File access request running/control and local resource allocation delegated to the disk servers
Currently in development
– Proof of concept/prototype implemented