Recovery of Lost Files Jiří Chudoba 26.1.2007 Institute of Physics, Prague.

Slides:



Advertisements
Similar presentations
DataTAG WP4 Meeting CNAF Jan 14, 2003 Interfacing AliEn and EDG 1/13 Stefano Bagnasco, INFN Torino Interfacing AliEn to EDG Stefano Bagnasco, INFN Torino.
Advertisements

Help File For User Creation Click the “Course” button for Creating/Add User.
ATLAS T1/T2 Name Space Issue with Federated Storage Hironori Ito Brookhaven National Laboratory.
1 CHEP 2000, Roberto Barbera Tests of data management services in EDG 1.2 ALICE Off-line Week,
Collections Management Museums EMu – Data Cleaning with EMu Data Cleaning with EMu Warren Hindley.
User Help Manual SysCleaner. 1.Click on this screen or wait for few seconds to launch the main window.
Grid Data Management Assaf Gottlieb - Israeli Grid NA3 Team EGEE is a project funded by the European Union under contract IST EGEE tutorial,
Grid and CDB Janusz Martyniak, Imperial College London MICE CM37 Analysis, Software and Reconstruction.
ATLAS GridKa Cloud F2F meeting in Prague Jiří Chudoba Institute of Physics, Prague Monthly Meeting,
Ninth EELA Tutorial for Users and Managers E-infrastructure shared between Europe and Latin America LFC Server Installation and Configuration.
EGEE-II INFSO-RI Enabling Grids for E-sciencE gLite Data Management System Yaodong Cheng CC-IHEP, Chinese Academy.
ATLAS DQ2 Deletion Service D.A. Oleynik, A.S. Petrosyan, V. Garonne, S. Campana (on behalf of the ATLAS Collaboration)
Data management at T3s Hironori Ito Brookhaven National Laboratory.
ATLAS DC2 Pile-up Jobs on LCG Atlas DC Meeting February 2005.
How to Install and Use the DQ2 User Tools US ATLAS Tier2 workshop at IU June 20, Bloomington, IN Marco Mambelli University of Chicago.
Daniela Anzellotti Alessandro De Salvo Barbara Martelli Lorenzo Rinaldi.
DIRAC Review (13 th December 2005)Stuart K. Paterson1 DIRAC Review Exposing DIRAC Functionality.
08/30/05GDM Project Presentation Lower Storage Summary of activity on 8/30/2005.
EGEE-III INFSO-RI Enabling Grids for E-sciencE Nov. 18, EGEE and gLite are registered trademarks gLite Middleware Usage Dusan.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE middleware: gLite Data Management EGEE Tutorial 23rd APAN Meeting, Manila Jan.
FP6−2004−Infrastructures−6-SSA E-infrastructure shared between Europe and Latin America Grid Monitoring Tools Alexandre Duarte CERN.
Jan 31, 2006 SEE-GRID Nis Training Session Hands-on V: Standard Grid Usage Dušan Vudragović SCL and ATLAS group Institute of Physics, Belgrade.
LFC Replication Tests LCG 3D Workshop Barbara Martelli.
June 22, 2007USATLAS T2-T3 DQ2 0.3 SiteServices Patrick McGuigan
DDM Monitoring David Cameron Pedro Salgado Ricardo Rocha.
Stephen Burke – Data Management - 3/9/02 Partner Logo Data Management Stephen Burke, PPARC/RAL Jeff Templon, NIKHEF.
INFSO-RI Enabling Grids for E-sciencE Experiences with LFC and comparison with RNS Erwin Laure Jean-Philippe.
DPM Python tools Ivan Calvet IT/SDC-ID DPM Workshop 10 th October 2014.
INFSO-RI Enabling Grids for E-sciencE Αthanasia Asiki Computing Systems Laboratory, National Technical.
Managing Data DIRAC Project. Outline  Data management components  Storage Elements  File Catalogs  DIRAC conventions for user data  Data operation.
The ATLAS Cloud Model Simone Campana. LCG sites and ATLAS sites LCG counts almost 200 sites. –Almost all of them support the ATLAS VO. –The ATLAS production.
SEE-GRID-SCI Storage Element Installation and Configuration Branimir Ackovic Institute of Physics Serbia The SEE-GRID-SCI.
INFSO-RI Enabling Grids for E-sciencE ATLAS DDM Operations - II Monitoring and Daily Tasks Jiří Chudoba ATLAS meeting, ,
ATLAS GridKa Cloud F2F meeting in Prague Jiří Chudoba Institute of Physics, Prague Prague,
SAM Sensors & Tests Judit Novak CERN IT/GD SAM Review I. 21. May 2007, CERN.
NorduGrid plans and questions for gLite Marko Niinimaki, NorduGrid 3 rd EGEE meeting Athens, April 2005.
SEE-GRID-SCI Hands-On Session: Using Grid Vladimir Slavnic Institute of Physics, Belgrade Serbia The SEE-GRID-SCI initiative.
14-May-2003 AWG FH, JT, JJB DataGrig Barcelona 1 HEP GRID use cases Common GRID use cases F.Harris, J.Templon, J.J Blaising.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Real Life Examples Tickets – Real life examples Mário David LIP - Lisbon.
Site Report: Prague Jiří Chudoba Institute of Physics, Prague WLCG GridKa+T2s Workshop.
David Adams ATLAS ATLAS distributed data management David Adams BNL February 22, 2005 Database working group ATLAS software workshop.
FP7-INFRA Enabling Grids for E-sciencE EGEE Induction Grid training for users, Institute of Physics Belgrade, Serbia Sep. 19, 2008.
Service Availability Monitor tests for ATLAS Current Status Tests in development To Do Alessandro Di Girolamo CERN IT/PSS-ED.
Materials for Report about Computing Jiří Chudoba x.y.2006 Institute of Physics, Prague.
Testing the HEPCAL use cases J.J. Blaising, F. Harris, Andrea Sciabà GAG Meeting April,
Pavel Nevski DDM Workshop BNL, September 27, 2006 JOB DEFINITION as a part of Production.
EGI-InSPIRE RI EGI-InSPIRE EGI-InSPIRE RI VO auger experience with large scale simulations on the grid Jiří Chudoba.
EGEE is a project funded by the European Union under contract IST Experiment Software Installation toolkit on LCG-2
Jiri Chudoba for the Pierre Auger Collaboration Institute of Physics of the CAS and CESNET.
FP6−2004−Infrastructures−6-SSA E-infrastructure shared between Europe and Latin America Data Management Hands-on Juan Eduardo Murrieta.
Replicazione e QoS nella gestione di database grid-oriented Barbara Martelli INFN - CNAF.
12th EELA Tutorial for Users and Managers E-infrastructure shared between Europe and Latin America LFC Server Installation and Configuration.
28 Nov 2007 Alessandro Di Girolamo 1 A “Hands On” overview of the ATLAS Distributed Data Management Disclaimer & Special Thanks Things are changing (of.
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) Algiers, EUMED/Epikh Application Porting Tutorial, 2010/07/04.
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) LFC Installation and Configuration Dong Xu IHEP,
Joe Foster 1 Two questions about datasets: –How do you find datasets with the processes, cuts, conditions you need for your analysis? –How do.
Grid Data Management Assaf Gottlieb Tel-Aviv University assafgot tau.ac.il EGEE is a project funded by the European Union under contract IST
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) gLite Data Management Maha Metawei
FP6−2004−Infrastructures−6-SSA E-infrastructure shared between Europe and Latin America LFC Server Installation and Configuration.
Planning a Disaster Recovery System
Simone Campana CERN IT-ES
gLite Data management system overview
David Adams Brookhaven National Laboratory September 28, 2006
Data Management in Release 2
Stephen Burke, PPARC/RAL Jeff Templon, NIKHEF
Data Management Ouafa Bentaleb CERIST, Algeria
Data services in gLite “s” gLite and LCG.
Data Management in LHCb: consistency, integrity and coherence of data
DIRAC Data Management: consistency, integrity and coherence of data
Status and plans for bookkeeping system and production tools
Presentation transcript:

Recovery of Lost Files Jiří Chudoba Institute of Physics, Prague

Scope Lost physical files on disks/tapes due to hw crashes, human errors,... System Administrator provides a list of lost files  Production files and some users’ files names follow a convention (more than 1)  Files not following a naming convention not covered here

Steps remove from the SE DB (system admin) delete lost entries from an LFC catalogue locate a replica  exists:  replicate  does not exist:  correct dataset (delete lost files)  pass list to prodsys

Update LFC: 1. Find LFN /pnfs/grid.sara.nl/data/atlas/misal1_csc11/misal1_csc PythiaWta unu.simul.HITS.v _tid004341/misal1_csc PythiaWta unu.simul.HITS.v _tid _00380.pool.root.2 lfn:/grid/atlas/dq2/misal1_csc11/HITS/misal1_csc PythiaWtaunu. simul.HITS.v _tid004341/misal1_csc PythiaWtaunu.s imul.HITS.v _tid _00380.pool.root.2 /pnfs/grid.sara.nl/data/atlas/calib0_csc11/calib0_csc J2_pythia_j etjet.simul.HITS.v _tid004283/calib0_csc J2_pythia_j etjet.simul.HITS.v _tid _00474.pool.root.9 /grid/atlas/dq2/calib0_csc11/calib0_csc J2_pythia_jetjet.simul.H ITS.v _tid004283/calib0_csc J2_pythia_jetjet.simul.H ITS.v _tid _00474.pool.root.9

Update LFC lcg-lg --vo atlas lfn:/grid/atlas/dq2/misal1_csc11/HITS/misal1_csc PythiaWtaunu.simul.HITS.v _tid004341/misal1_csc PythiaWtaunu.simul.HITS.v _tid _ pool.root.2 guid:52D6E5C4-B788-DB11-AD6C A052E lcg-uf 52D6E5C4-B788-DB11-AD6C A052E srm://srm.grid.sara.nl/pnfs/grid.sara.nl/data/atlas/misa l1_csc11/misal1_csc PythiaWtaunu.simul.HITS.v _tid004341/misal1_csc PythiaWtaunu.simul. HITS.v _tid _00380.pool.root.2

Find replicas, replicate lcg-lr --vo atlas lfn:/grid/atlas/dq2/misal1_csc11/HITS/misal1_csc Pythi aWtaunu.simul.HITS.v _tid004341/misal1_csc Pyth iaWtaunu.simul.HITS.v _tid _00380.pool.root.2 srm://se2.itep.ru/dpm/itep.ru/home/atlas/dq2/misal1_csc11/HITS/ misal1_csc PythiaWtaunu.simul.HITS.v _tid /misal1_csc PythiaWtaunu.simul.HITS.v _tid _00380.pool.root.2 lcg-rep -t d srm://srm.grid.sara.nl/pnfs/grid.sara.nl/data/atlas/misal1_csc1 1/misal1_csc PythiaWtaunu.simul.HITS.v _tid /misal1_csc PythiaWtaunu.simul.HITS.v _tid _00380.pool.root.2 srm://se2.itep.ru/dpm/itep.ru/home/atlas/dq2/misal1_csc11/HITS/ misal1_csc PythiaWtaunu.simul.HITS.v _tid /misal1_csc PythiaWtaunu.simul.HITS.v _tid _00380.pool.root.2 Simple shell script to loop over all files, all 39 files replicated It was easy because all replicas were in the same cloud

NIKHEF case 3604 files lost list of file names provided /dpm/nikhef.nl/home/atlas/dq2/calib0/calib J2_pythia_jetjet.simul.HITS.v _tid003287/calib J2_pythia_jetjet.simul.HITS.v _tid _00417.pool.root.14 to get LFN: sed 's#/dpm/nikhef\.nl/home#/grid#' lost_files.list > lost_files.lfc.list  For some files need to add dq2: sed 's#/grid/atlas/\([^dq2][^/]*\)/#/grid/atlas/dq2/\1/#' lost_files.lfc.list > lost_files.lfc.with_dq2.list lfc-ls $FN  18 minutes over 3604 files (3.3 Hz), all files found in LFC lcg-lg lfn:$LFN  21 minutes files were unregistered (loop in a shell)

Jiahang’s approach decode DS name and File name: /dpm/nikhef.nl/home/atlas/dq2/calib0/calib J2_pythia_jetjet.simul.HITS.v _tid003287/calib J 2_pythia_jetjet.simul.HITS.v _tid _00417.pool.root.14 find all sites with DS replicas dq2.listFilesInDataset find guid’s for lost files 3 minutes for 3604 files python script checknum.py in CVS fast, but many files were not found in DQ2

Jiahang’s approach (con’t) lfccheck.py  loop over sites (LFC’s) where DS is registered and find registered replicas  lfc_getreplica(‘’,guid,se_host) Remove affected site from DQ2 DS registration Re-subscribe Clean DQ2 db if no replica exists

Work in progress... short shell scripts capable to remove entries from LFC and replicate lost files  slow, but OK for short list if files  not general  not capable to replicate from other Grids more general and faster scripts in python are being developed