1 5/4/05 Fermilab Mass Storage: Enstore, dCache and SRM
Michael Zalokar, Fermilab

2 5/4/05 What are they?
● Enstore
  – In-house; manages files, tape volumes and tape libraries
  – End-user direct interface to files on tape
● dCache
  – Joint DESY/Fermilab project; disk-caching front-end
  – End-user interface to read cached files and to write files to Enstore indirectly via dCache
● SRM
  – Provides a consistent interface to underlying storage systems

3 5/4/05 Software
● 3 production systems of Enstore
  – Run II: D0 and CDF
  – Everyone else: MINOS, MiniBooNE, SDSS, CMS, et al.
● 3 production systems of dCache and SRM
  – CDF
  – CMS
  – Everyone else

4 5/4/05 Requirements
● Scalability
● Performance
● Availability
● Data Integrity

5 5/4/05 PNFS
● Provides a hierarchical namespace for users' files in Enstore.
● Manages file metadata.
● Looks like an NFS-mounted file system from user nodes.
● Stands for “Perfectly Normal File System.”
● Written at DESY.
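Because PNFS behaves like any other NFS mount, the namespace can be browsed with ordinary shell commands; a minimal sketch, with a hypothetical experiment name and mount point:

    # PNFS appears as a normal mounted filesystem on a user node
    ls -l /pnfs/myexperiment/
    mkdir /pnfs/myexperiment/run2005
    # directory entries and file metadata come from PNFS; the file data itself stays in Enstore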

6 5/4/05 Enstore Design
● Divided into a number of server processes
  – Scalability is achieved by spreading these servers across multiple nodes.
  – If a node goes down, we can modify the configuration to run that node's servers on a different node. This increases availability while the broken node is repaired.
● Enstore user interface: encp
  – Similar to the standard UNIX cp(1) command
  – encp /data/myfile1 /pnfs/myexperiment/myfile1
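Reads use the same encp syntax in the opposite direction; a small sketch mirroring the slide's example (the paths are hypothetical):

    # write a local file to tape through the PNFS namespace
    encp /data/myfile1 /pnfs/myexperiment/myfile1
    # read it back from tape to local disk
    encp /pnfs/myexperiment/myfile1 /data/myfile1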

7 5/4/05 Hardware
● Robots
  – 6 StorageTek Powderhorn silos
  – 1 ADIC AML/2
● Tape drives
  – LTO: 9
  – LTO2: 14
  – 9940: 20
  – 9940B: 52
  – DLT (4000 & 8000): 8
● 127 commodity Linux PCs

8 5/4/05 Enstore Monitoring
● Web pages for current server statuses
● Cron jobs
● Plots for resource usage
  – Number of tapes written
  – Number of tape drives in use
  – Number of mounts
  – And much more...
● entv (ENstore TV)

9 5/4/05 Enstore Monitoring (cont.)
● X-axis: time from January 1st, 2005 to the present
● Y-axis: number of gigabytes written
● Includes a summary of tapes written in the last month and last week

10 5/4/05 ENTV
● Real-time animation
● Client nodes
● Tape and drive information
  – Current tape
  – Instantaneous rate

11 5/4/05 By the Numbers
● User data on tape: 2.6 petabytes
● Number of files on tape: 10.8 million
● Number of volumes: ~25,000
● One-day transfer record: 27 terabytes

12 5/4/05 Performance: 27 TB
● Two days of record transfer rates
● CMS Service Challenge in March (shown in red on the plot)
● Normal usage

13 5/4/05 Lessons Learned
● A file transferring without error does not guarantee that everything is fine.
  – At Fermilab's load we see bit-error corruption (see the checksum sketch below).
● Users will push the system to its limits.
  – The record 27 TB transfer days were not even noticed until three days later.
● Having a lot of logs, alarms and plots is not enough; they must also be interpretable.
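One practical response to silent bit errors is to compare checksums before and after the round trip rather than trusting the transfer's return code; a minimal sketch using generic tools (md5sum stands in for Enstore's own CRC machinery, and all paths are hypothetical):

    md5sum /data/myfile1                                    # checksum of the original
    encp /data/myfile1 /pnfs/myexperiment/myfile1           # write to tape
    encp /pnfs/myexperiment/myfile1 /scratch/myfile1.check  # read it back
    md5sum /scratch/myfile1.check                           # should match the original checksum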

14 5/4/05 dCache
● Works on top of Enstore or as a standalone configuration.
● Provides a buffer between the user and tape.
● Improves performance for "popular" files by avoiding a tape read every time a file is needed.
● Scales as nodes (and disks) are added.

15 5/4/05 User Access to Data in dCache
● srm (Storage Resource Manager)
  – srmcp
● gridftp
  – globus-url-copy
● kerberized ftp
● weak ftp
● dcap (native dCache protocol)
  – dccp
● http
  – wget, web browsers
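The access methods above map onto familiar command-line clients; hedged examples, with hypothetical door host names, ports and paths:

    # SRM
    srmcp srm://door.example.gov:8443/pnfs/myexperiment/myfile1 file:///data/myfile1
    # GridFTP
    globus-url-copy gsiftp://door.example.gov:2811/pnfs/myexperiment/myfile1 file:///data/myfile1
    # dcap (native dCache protocol)
    dccp dcap://door.example.gov:22125/pnfs/myexperiment/myfile1 /data/myfile1
    # HTTP
    wget http://door.example.gov:2288/pnfs/myexperiment/myfile1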

16 5/4/05 dCache Deployment
● Administrative node
● Monitoring node
● Door nodes
  – Control-channel communication
● Pool nodes
  – Data-channel communication
  – ~100 pool nodes with ~225 terabytes of disk

17 5/4/05 dCache Performance
Record transfer day of 60 GB. This is for just one dCache system.

18 5/4/05 Lessons Learned
● Use the XFS filesystem on the pool disks.
● Use direct I/O when accessing the files on the local dCache disk.
● Users will push the system to its limits. Be prepared.
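As a rough illustration of what direct I/O means on a pool disk (this is not how dCache itself is configured, just a way to see the effect), GNU dd can bypass the OS page cache with its direct flags; the pool path is hypothetical:

    # read a pool file without polluting the OS page cache
    dd if=/pool/data/myfile1 of=/dev/null bs=1M iflag=direct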

19 5/4/05 Storage Resource Manager
● Provides a uniform interface for access to multiple storage systems via the SRM protocol.
● SRM is a broker that works on top of other storage systems:
  – dCache: runs as a server within dCache.
  – UNIX™ filesystem: standalone.
  – Enstore: in development.
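Because SRM hides the backing store, the same client command can be pointed at an endpoint fronting dCache or one fronting a plain UNIX filesystem; a sketch with hypothetical endpoints and paths:

    # same srmcp invocation, different storage system behind the endpoint
    srmcp srm://dcache-door.example.gov:8443/pnfs/myexperiment/myfile1 file:///data/copy1
    srmcp srm://disk-srm.example.gov:8443/storage/myfile1 file:///data/copy2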

20 5/4/05 CMS Service Challenge
● 50 MB/s sustained transfer rate
  – From CERN, through dCache to tape in Enstore
  – On top of normal daily usage of 200 to 400 MB/s
  – Rate throttled to 50 MB/s
● 700 MB/s sustained transfer rate
  – From CERN to dCache disk

21 5/4/05 Conclusions
● Scalability
● Performance
● Availability
  – Modular design
● Data Integrity
  – Bit errors detected by scans
The requirements are achieved.