CERN Data Services Update
HEPiX 2004 / NeSC Edinburgh
Data Services team: Vladimír Bahyl, Hugo Caçote, Charles Curran, Jan van Eldik, David Hughes, Gordon Lee, Tony Osborne, Tim Smith

Outline
- Data Services Drivers
- Disk Service: migration to Quattor / LEMON; future directions
- Tape Service: media migration; future directions
- Grid Data Services

Data Flows
Tier-0 / Tier-1 for the LHC Data Challenges:
- CMS: DC04 (finished): +80 TB; PCP05 (Autumn): +170 TB
- ALICE: ongoing: +137 TB
- LHCb: ramping up: +40 TB
- ATLAS: ramping up: +60 TB
Fixed Target Programme:
- NA48 at 80 MB/s: +200 TB
- COMPASS at 70 MB/s (peak 120): +625 TB
- nToF at 45 MB/s: +180 TB
- NA60 at 15 MB/s: +60 TB
- Testbeams at 1-5 MB/s (x 5)
Analysis…

Disk Server Functions

Generations
- 0th: Jumbos
- 1st & 2nd: 4U
- 3rd & 4th: 8U

Warranties

Disk Servers: Jan
- EIDE disk servers: commodity storage in a box; 544 TB of disk capacity; 6700 spinning disks
- Storage configuration: HW RAID-1 (mirrored for maximum reliability); ext2 file systems
- Operating systems: RH 6.1, 6.2, 7.2, 7.3, RHES; 13 different kernels
- Application uniformity: CASTOR SW

Quattor-ising
Motivation: scale; uniformity; manageability; automation
- Configuration description (into CDB): HW and SW; nodes and services
- Reinstallation of production machines with minimum service interruption!
- Eliminate peculiarities from CASTOR nodes: MySQL, web servers
- Refocus root control
- Quiescing a disk server is not like draining a batch node!
- Gigabit card gymnastics
- ext2 -> ext3
- Complete (except 10 RH6 boxes for Objectivity)

LEMON-ising
- MSA everywhere: Linux box monitoring and alarms; automatic HW static checks
- Adding CASTOR-server-specific service monitoring
- HW monitoring: lm_sensors (see tape section)
- smartmontools: smartd deployment (kernel issues; firmware bugs); smartctl auto-checks through the 3ware controller (see the sketch below); predictive monitoring
- IPMI investigations, especially remote access: remote reset/power-on/power-off
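As an illustration of what such a smartctl auto-check could look like, here is a minimal Python sketch that polls the SMART health verdict of every disk behind a 3ware controller, roughly what a smartd-style probe does. The device node, port count, and alarm action are assumptions made for the example; only the smartctl invocation (-H together with -d 3ware,N) is documented smartmontools usage.

    import subprocess

    TW_DEVICE = "/dev/twe0"   # assumed 3ware controller device node (driver-dependent)
    NUM_PORTS = 8             # assumed number of disks behind the controller

    def disk_healthy(port):
        # -H prints the overall SMART health verdict; -d 3ware,N is the
        # documented smartmontools syntax for disk N behind a 3ware card.
        result = subprocess.run(
            ["smartctl", "-H", "-d", "3ware,%d" % port, TW_DEVICE],
            capture_output=True, text=True)
        return "PASSED" in result.stdout

    for port in range(NUM_PORTS):
        if not disk_healthy(port):
            print("port %d: SMART health FAILED -> raise alarm" % port)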

Disk Replacement
- Failure rate unacceptably high: 10 months to be believed; 4 weeks to execute
- 1224 disks exchanged (out of 6700), and the cages
- Western Digital, type DUA: head instabilities

Disk Storage Futures
- EIDE commodity storage in a box
- Production systems: HW RAID-1 / ext3
- Pilots (15 production systems): HW RAID-5 + SW RAID-0 / XFS (see Jan Iven's talk next)
- New tenders out: 30 TB SATA in a box; 30 TB external SATA disk arrays
- New CASTOR stager (see Olof's talk)

Tape Service
- 70 tape servers (Linux), (mostly) single FibreChannel-attached drives
- 2 symmetric robotic installations, 5 x STK 9310 silos in each
- Drives
- Media

Tape Server Temperatures
- lm_sensors package: general SMBus access and hardware monitoring (see the sketch below)
- Used to access the LM87 chip: fan speeds; voltages; int/ext temperatures
- ADM1023 chip: int/ext temperatures

Tape Server Temperatures
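A minimal sketch, assuming a generic lm_sensors setup, of the kind of temperature check described above: parse the output of the sensors command and raise an alarm on hot readings. The threshold, the label matching, and the alarm action are illustrative assumptions, not the production LEMON configuration.

    import re
    import subprocess

    TEMP_LIMIT_C = 45.0  # assumed alarm threshold, not the production value

    # 'sensors' prints the LM87/ADM1023 readings that lm_sensors knows about.
    output = subprocess.run(["sensors"], capture_output=True, text=True).stdout
    for line in output.splitlines():
        # Match readings such as "temp1:  +38.0 C" or "M/B Temp: +41.5°C"
        m = re.match(r"(?P<label>[^:]+):\s*\+?(?P<value>-?\d+(\.\d+)?)", line)
        if m and "temp" in m.group("label").lower():
            value = float(m.group("value"))
            if value > TEMP_LIMIT_C:
                print("ALARM: %s reads %.1f C" % (m.group("label").strip(), value))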

Media Migration
- To 9940B (mainly from 9940A)
- 200 GB: the extra capacity avoids unnecessary acquisitions
- Better performance, though hard to benefit from it in the normal chaotic access mode
- Reduced errors; fewer interventions
- 1-2% of A tapes cannot be read (or only extremely slowly) on B drives
- Have not been able to return all A-drives

Tape Service Developments
Removing tails…
- Tracking of all tape errors (18 months)
- Retiring of problematic media
- Proactive retiring of heavily used media (>5000 mounts): repack onto new media
Checksums (illustrated in the sketch below):
- Populated when writing to tape
- Verified when loading back to disk
- 22% coverage already after a few weeks
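The write-then-verify cycle can be pictured with the short sketch below: compute a checksum while the file goes to tape, store it, and recompute it when the file is staged back to disk. The slide does not name the algorithm or chunk size; Adler-32 via Python's zlib is assumed here purely for illustration.

    import zlib

    CHUNK = 1024 * 1024  # read in 1 MiB pieces to bound memory use

    def adler32_of(path):
        checksum = 1  # zlib's documented Adler-32 seed value
        with open(path, "rb") as f:
            while True:
                data = f.read(CHUNK)
                if not data:
                    break
                checksum = zlib.adler32(data, checksum)
        return checksum & 0xFFFFFFFF

    # On write: record the checksum alongside the tape copy...
    stored = adler32_of("/path/to/file/being/archived")
    # ...on read-back: recompute and compare before trusting the staged copy.
    assert adler32_of("/path/to/file/being/archived") == stored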

Water Cooled Tapes!
- A plumbing error!
- 5000 tapes disabled for a few days
- 550 superficially wet
- 152 seriously wet, visually inspected

Tape Storage Futures
- Commodity drive studies: LTO-2 (collaboratively with CASPUR/Valencia); test and evaluate
- High-end drives: IBM 3592; STK NGD
- Other STK offerings: SL8500 robotics and silos; Indigo (managed storage, tape virtualisation)

GRID Data Management
- GridFTP + SRM servers
- (Formerly) standalone / experiment-dedicated: hard to intervene on; not scalable
- New load-balanced 6-node service: castorgrid.cern.ch (as sketched below)
- SRM modifications to support operation behind the load balancer
- GridFTP standalone client
- Retire ftp and bbftp access to CASTOR
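A minimal sketch of what the load-balanced alias looks like from the client side: the alias resolves to several A records, one per node, and rotating DNS answers spreads GridFTP/SRM connections across the cluster. Only the host name comes from the slide; the port and the lookup code are generic standard-library usage.

    import socket

    ALIAS = "castorgrid.cern.ch"  # load-balanced alias from the slide
    GRIDFTP_PORT = 2811           # standard GridFTP control port

    # Each A record behind the alias is one node of the 6-node service;
    # rotating the DNS answers is what spreads client connections over them.
    try:
        infos = socket.getaddrinfo(ALIAS, GRIDFTP_PORT, proto=socket.IPPROTO_TCP)
        addresses = sorted({info[4][0] for info in infos})
        print("%s -> %s" % (ALIAS, ", ".join(addresses)))
    except socket.gaierror as err:
        print("lookup failed: %s" % err)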