Tape write efficiency improvements in CASTOR
CERN IT Department, CH-1211 Genève 23, Switzerland, DSS (Data Storage Services)

Presentation transcript:

Primary author: MURRAY Steven (CERN)
Co-authors: BAHYL Vlado (CERN), CANCIO German (CERN), CANO Eric (CERN), KOTLYAR Victor (Institute for High Energy Physics (RU)), LO PRESTI Giuseppe (CERN), LO RE Giuseppe (CERN) and PONCE Sebastien (CERN)

The CERN Advanced STORage manager (CASTOR) is used to archive to tape the physics data of past and present physics experiments. The current size of the tape archive is approximately 61 PB. For reasons of physical storage space, all of the tape-resident data in CASTOR are repacked onto higher-density tapes approximately every two years. The performance of writing files smaller than 2 GB to tape is therefore critical, because all of the tape-resident data must be repacked within a period of no more than one year.

Implementing the delayed flushing of the tape-drive data-buffer
- Implemented using immediate tape-marks from version 2 of the SCSI standard (a minimal ioctl sketch is given below, after the architecture descriptions).
- CERN worked with the developer of the SCSI tape-driver for Linux to implement support for immediate tape-marks.
- Support for immediate tape-marks is now available as an official SLC5 kernel patch and has since been merged into the vanilla Linux kernel.

Methodology used when modifying legacy code
- The legacy tape reader/writer is critical for the safe storage and retrieval of data, so modifications to it were kept to a bare minimum.
- It is difficult to test new code within the legacy tape reader/writer, therefore unit tests were used to test the new code separately.
- The CPPUnit unit-testing framework for C++ was used; 189 unit tests have been developed so far (a skeleton test is sketched below).
- The tests range from simple object instantiation through to testing TCP/IP application-protocols.

Results
- Average file size ≈ 290 MB
- Overall increase ≈ ×10
- Efficiency increase ≈ ×3 for the average file size
(A back-of-envelope check of the flush costs behind these numbers is sketched at the end.)

Original architecture before improved write efficiency – 3 flushes per file
Components: legacy tape transfer-manager, drive scheduler, legacy tape reader/writer.
1. Mount tape
2. File info
3. Write header file
4. Flush buffer
5. Write user file
6. Flush buffer
7. Write trailer file
8. Flush buffer
9. Wrote file
3 flushes ≈ 5 seconds ≈ 1.2 GB that could have been written.

Architecture after first deployment – 1 flush per file
Components: legacy tape transfer-manager, drive scheduler, legacy tape reader/writer.
1. Mount tape
2. File info
3. Write header file
4. Write user file
5. Write trailer file
6. Flush buffer
7. Wrote file
1 flush ≈ 1.7 seconds ≈ 400 MB that could have been written (one third of the 3-flush cost above). This required only a minor modification to the legacy tape reader/writer.

Architecture of second deployment – 1 flush per N GB (see the write-loop sketch below)
Components: tape gateway, drive scheduler, protocol bridge, legacy tape reader/writer.
1. Mount tape
2. File info for N files
Loop over the N files (or N GB of data):
  3. Write header file
  4. Write user file
  5. Write trailer file
End loop
6. Flush buffer
7. Wrote N files
- The legacy tape transfer-manager has been replaced by the tape gateway. Unlike the legacy tape transfer-manager, multiple tape gateways can be run in parallel for redundancy.
- Amortised over N files, the seconds spent flushing become negligible.
- The protocol bridge allows old and new installations to co-exist.
- A more efficient bulk protocol is used between the tape gateway and the drives.
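
Sketch 1: immediate tape-marks through the Linux st driver. This is a minimal illustration, not CASTOR source code, of the mechanism the poster describes: MTWEOF writes a synchronous tape-mark (the drive flushes its data-buffer before acknowledging), while MTWEOFI writes an immediate tape-mark (the drive acknowledges at once and flushes later). MTWEOFI is only defined on kernels/headers that carry the support discussed above.

```cpp
#include <sys/ioctl.h>
#include <sys/mtio.h>
#include <cerrno>
#include <cstring>
#include <stdexcept>
#include <string>

// Write `count` tape-marks and wait until the drive has flushed its
// data-buffer to the medium (the traditional, synchronous behaviour).
void writeSyncTapeMarks(const int tapeFd, const int count) {
  struct mtop op;
  op.mt_op = MTWEOF;
  op.mt_count = count;
  if (ioctl(tapeFd, MTIOCTOP, &op) < 0) {
    throw std::runtime_error(std::string("MTWEOF failed: ") +
                             std::strerror(errno));
  }
}

#ifdef MTWEOFI
// Write `count` immediate tape-marks: the drive acknowledges straight away
// and flushes its buffer later, so the host can keep streaming data.
void writeImmediateTapeMarks(const int tapeFd, const int count) {
  struct mtop op;
  op.mt_op = MTWEOFI;
  op.mt_count = count;
  if (ioctl(tapeFd, MTIOCTOP, &op) < 0) {
    throw std::runtime_error(std::string("MTWEOFI failed: ") +
                             std::strerror(errno));
  }
}
#endif
```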
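
Sketch 2: a CPPUnit test in the style the methodology section describes, from the simplest kind mentioned (object instantiation). The class under test, TapeFileHeader, is a hypothetical placeholder, not a name from the CASTOR code base.

```cpp
#include <cppunit/extensions/HelperMacros.h>
#include <cppunit/extensions/TestFactoryRegistry.h>
#include <cppunit/ui/text/TestRunner.h>

// Hypothetical class under test.
class TapeFileHeader {
public:
  explicit TapeFileHeader(const int fseq): m_fseq(fseq) {}
  int fseq() const { return m_fseq; }
private:
  int m_fseq;
};

class TapeFileHeaderTest: public CppUnit::TestFixture {
  CPPUNIT_TEST_SUITE(TapeFileHeaderTest);
  CPPUNIT_TEST(testInstantiation);
  CPPUNIT_TEST_SUITE_END();
public:
  // Simple object-instantiation test, as in the poster's range of tests.
  void testInstantiation() {
    const TapeFileHeader header(1234);
    CPPUNIT_ASSERT_EQUAL(1234, header.fseq());
  }
};
CPPUNIT_TEST_SUITE_REGISTRATION(TapeFileHeaderTest);

int main() {
  CppUnit::TextUi::TestRunner runner;
  runner.addTest(CppUnit::TestFactoryRegistry::getRegistry().makeTest());
  return runner.run() ? 0 : 1; // 0 on success, 1 if any test failed
}
```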
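
Sketch 3: the "1 flush per N GB" write loop of the second deployment. The routine names are hypothetical stand-ins for the real CASTOR code (writeImmediateTapeMarks/writeSyncTapeMarks are as in sketch 1; the header/user/trailer writers are declared but their bodies are elided). The idea shown is the one described above: each file's tape-marks are immediate, and only the last tape-mark before the N-file/N-GB boundary is synchronous, so a single real flush covers the whole batch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the real I/O routines (bodies elided).
void writeHeaderFile(int tapeFd);
void writeUserFile(int tapeFd, std::uint64_t sizeInBytes);
void writeTrailerFile(int tapeFd);
void writeImmediateTapeMarks(int tapeFd, int count); // see sketch 1
void writeSyncTapeMarks(int tapeFd, int count);      // see sketch 1

// Write a batch of files, forcing a real (synchronous) flush of the drive's
// data-buffer only once every flushThresholdBytes instead of 3 times a file.
void migrateBatch(const int tapeFd,
                  const std::vector<std::uint64_t> &fileSizesInBytes,
                  const std::uint64_t flushThresholdBytes) {
  std::uint64_t bytesSinceFlush = 0;
  for (std::size_t i = 0; i < fileSizesInBytes.size(); i++) {
    writeHeaderFile(tapeFd);
    writeImmediateTapeMarks(tapeFd, 1); // drive keeps streaming
    writeUserFile(tapeFd, fileSizesInBytes[i]);
    writeImmediateTapeMarks(tapeFd, 1);
    writeTrailerFile(tapeFd);

    bytesSinceFlush += fileSizesInBytes[i];
    const bool lastFile = (i + 1 == fileSizesInBytes.size());
    if (bytesSinceFlush >= flushThresholdBytes || lastFile) {
      // Fold the flush into this file's final tape-mark: a synchronous
      // tape-mark guarantees everything before it is on the medium, after
      // which the "wrote N files" acknowledgement can safely be sent.
      writeSyncTapeMarks(tapeFd, 1);
      bytesSinceFlush = 0;
    } else {
      writeImmediateTapeMarks(tapeFd, 1);
    }
  }
}
```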
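
Sketch 4: a back-of-envelope check of the flush costs quoted above, using only the poster's own figures. The drive speed is inferred from "3 flushes ≈ 5 s ≈ 1.2 GB", i.e. roughly 240 MB/s. It ignores mounting, positioning and header/trailer overheads, so it illustrates why small files suffer and why batching wins rather than reproducing the exact ×3 and ×10 results.

```cpp
#include <cstdio>

// Print dead time vs writing time and the resulting drive-busy fraction.
static void report(const char *label, double deadSeconds, double writeSeconds) {
  std::printf("%s: %.2f s dead vs %.2f s writing => drive busy %.0f%%\n",
              label, deadSeconds, writeSeconds,
              100.0 * writeSeconds / (writeSeconds + deadSeconds));
}

int main() {
  const double driveSpeedMBs = 1200.0 / 5.0; // ~240 MB/s, inferred as above
  const double flushSeconds  = 5.0 / 3.0;    // ~1.7 s per flush
  const double fileSizeMB    = 290.0;        // average file size (Results)
  const double writeSeconds  = fileSizeMB / driveSpeedMBs;

  report("3 flushes per file   ", 3 * flushSeconds, writeSeconds);
  report("1 flush per file     ", 1 * flushSeconds, writeSeconds);
  // One flush per N files amortises the dead time; e.g. N = 100 (~29 GB):
  report("1 flush per 100 files", flushSeconds / 100.0, writeSeconds);
  return 0;
}
```

With these inputs the busy fraction rises from roughly 20% (3 flushes per file) to about 40% (1 flush per file) and close to 100% once the flush is amortised over a large batch, which is the mechanism behind the poster's efficiency gains.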