Akiya Miyamoto KEK 1 June 2016


ILD MC production
Akiya Miyamoto, KEK, 1 June 2016

Topics:
1. Status of ILD production with ILCDirac
2. Monitoring network performance

Status of ILD MC production with ILCDirac

ILCDirac for ILD: ILDDirac
- ILD MC production for the DBD used non-DIRAC GRID tools. Since the DBD, ILD has been moving to an ILCDirac-based system: LFC -> DIRAC File Catalog (metadata, MC production).
- ILDDirac was developed by C. Calancha (Tino) for ILD MC production using the ILCDirac tools.
- By March 2016, the ILD module for ILCDirac worked well and processed many requests promptly and efficiently.
- Considerable effort went into improving and extending the existing monitoring tools.
- A first test using DD4hep completed successfully.
1 June 2016, Akiya Miyamoto, ECFA LC 2016

ILDDirac - 2
Unfortunately, Tino has left the group. Our tasks are:
(1) Take over Tino's product, merging his tools into the ILCDirac repository.
(2) Get familiar with ILD MC production using ILDDirac, and improve the tools where necessary.
(3) Test ddsim in the ILD production chain.
Current status: for step (1), Tino's code has been committed to an ILCDirac git branch by Andre, and AM is learning, testing, and debugging ILDDirac.

ILD production chain
stdhep (~500 MB) -> stdhep-split -> splitted stdhep files (~50 x 10 MB) -> MC simulation (sim) -> MC reconstruction with overlay (rec, dst) -> merge DSTs (merged-dst) -> replication.
- A DIRAC transformation initiates a job by checking the meta information attached to files: data-driven processing, with automatic retries on failure.
- Meta keys for the inputDataQuery (example: the stdhep-split job): Datatype, Energy, Machine, GenProcessID, GenProcessName, MachineParams, SelectedFile. These are the same as defined for the generator's meta information; some keys may be redundant.
- "SelectedFile" was added for ILDProduction to select a file. Something equivalent to a file name may be more convenient.
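The data-driven selection described above can be sketched as follows. This is only an illustration of the metadata-query idea, not the actual DIRAC transformation API; the file names, LFNs, and metadata values are made up for the example.

```python
# Illustrative sketch (NOT the real DIRAC API): a data-driven
# transformation selects input files by matching the metadata attached
# to each file against its inputDataQuery. All names are hypothetical.

def matching_files(catalog, query):
    """Return LFNs whose metadata satisfies every key of the query."""
    return [lfn for lfn, meta in catalog.items()
            if all(meta.get(k) == v for k, v in query.items())]

# Toy file catalog: LFN -> metadata, using the meta keys from the slide.
catalog = {
    "/ilc/gen/file_001.stdhep": {
        "Datatype": "gensplit", "Energy": "500", "Machine": "ilc",
        "GenProcessName": "ffH", "SelectedFile": 1},
    "/ilc/gen/file_002.stdhep": {
        "Datatype": "gensplit", "Energy": "250", "Machine": "ilc",
        "GenProcessName": "ffH", "SelectedFile": 2},
}

# inputDataQuery for a simulation step: only the 500 GeV file matches.
query = {"Datatype": "gensplit", "Energy": "500", "GenProcessName": "ffH"}
print(matching_files(catalog, query))
```

When a new file with matching metadata is registered, the transformation picks it up automatically, which is what makes the processing data-driven.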

Test production: 500 GeV ffH, H->mumu
- This process was picked because there was a user request.
- Using the DBD software: Mokka v01-14-01-p00, Marlin v01-16-p05.
- 4 polarization combinations, each with 10k events of STDHEP files; 4 STDHEP files -> 200 splitted STDHEP files.
- After some learning, and with help from Andre:
  - MCSimulation: all 200 gen files were simulated successfully (several failed jobs were retried automatically by DIRAC).
  - MCReconstruction: many retries happened. About 20% of the sim files could not be reconstructed even after O(10) retries.

Log info of failed jobs: I108161
- Input: 40 sim files. For MCReconstruction, 203 jobs were submitted before job submission was stopped manually.
- 9 sim files remained unprocessed; 106 rec jobs had been submitted to process these files.
- Segmentation violation in 39 jobs. Nodes: 11 cern.ch, 6 desy.de, 13 gridpp.rl.ac.uk -> segmentation violations even at major sites.
- Marlin exited with non-zero status: 4 jobs at OSG sites, 63 jobs at non-OSG sites.
- Sample error message:
  lcio::Exception: lcio::DataNotAvailableException: LCEventImpl::getCollection: collection not in event: LumiCalCollection
  The LCIO file seems to have a corrupted EventHeader that doesn't list all the collections in the event properly.
- Further investigation is ongoing. ffH, H->mumu may not be a good example (some bug in reconstruction?).
- Unable to create a merged DST yet: one erroneous event could be neglected, but the loss of one file cannot.
- It is not easy to increase the generator statistics with the current scheme.
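Sorting failed-job logs into the failure modes above could be sketched like this. The log lines and category names are hypothetical; only the error strings are modeled on the messages quoted in the slide.

```python
# Minimal sketch of classifying failed-job log lines by error pattern,
# based on the two failure modes seen in this production test.
# The sample log lines below are made up for illustration.
from collections import Counter

def classify(log_line):
    """Map a log line to a coarse failure category."""
    if "segmentation violation" in log_line.lower():
        return "segfault"
    if "DataNotAvailableException" in log_line:
        return "corrupt_event_header"
    if "non-zero status" in log_line:
        return "marlin_exit"
    return "other"

logs = [
    "Marlin: *** Break *** segmentation violation",
    "lcio::DataNotAvailableException: collection not in event: LumiCalCollection",
    "Marlin exited with non-zero status",
]
counts = Counter(classify(line) for line in logs)
print(dict(counts))
```

A per-site breakdown (cern.ch, desy.de, gridpp.rl.ac.uk, ...) would just add the node's domain as a second Counter key.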

Other issues (wish list)
- Generator samples: "generator file as input to sim" is preferred, except for very well-established generator processes, so that the same input feeds ILD and SiD, and full and fast simulation.
- Splitted stdhep files: should be removed when finished. What about the ancestor-descendant relation, and files written on a tape-based SRM?
- Merged-DST creation: include it in ILDProductionChain.
- Prepare web information on the produced samples; automated production is preferred.
- Standard data quality monitoring.
- Additional keys in the DFC may be useful.
- Develop a tool for DDSim and test it.

Monitoring network performance

Grid tools and network performance
- A tool has been developed to monitor the long-term stability of performance: GRID performance can vary day by day, which is not easy for normal users to follow.
- To monitor long-term stability:
  - Submit 4 DIRAC jobs twice a day, at random times.
  - Each job runs on Worker Nodes at KEK, PNNL, DESY, and CERN, and downloads files from the SEs at KEK, PNNL, DESY, and CERN.
- Daily plots at http://www-jlc.kek.jp/~miyamoto/hibi/sumplot
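The core measurement each monitoring job performs is simply timing a fixed-size download. A minimal sketch, with a local file copy standing in for the actual grid SE transfer (the real tool downloads 500 MB files with grid tools; everything here is illustrative):

```python
# Sketch: time the transfer of a file and report the elapsed seconds,
# as the monitoring jobs do for each SE -> WN combination.
# A local copy of a small temporary file stands in for a 500 MB
# grid download; the real tool uses grid storage elements.
import os
import shutil
import tempfile
import time

def timed_copy(src, dst):
    """Copy src to dst and return the elapsed wall-clock time in seconds."""
    t0 = time.perf_counter()
    shutil.copyfile(src, dst)
    return time.perf_counter() - t0

with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "payload.bin")
    with open(src, "wb") as f:
        f.write(os.urandom(1024 * 1024))  # 1 MB stand-in payload
    elapsed = timed_copy(src, os.path.join(d, "copy.bin"))
    print(f"transfer took {elapsed:.3f} s")
```

Recording one such number per SE/WN pair twice a day yields the time series shown in the daily plots.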

Worker Nodes at KEK download 500 MB from a Storage Element
[Plots of file transfer time (sec) vs job run date, with averaged times, for KEK SE -> KEK WN, PNNL SE -> KEK WN, CERN SE -> KEK WN, and DESY SE -> KEK WN. Annotations: a bug in handling job errors; SINET5 service start. Filled circles mark failed copies.]

Downloading to DESY WNs
[Corresponding plots for KEK SE -> DESY WN, PNNL SE -> DESY WN, CERN SE -> DESY WN, and DESY SE -> DESY WN.]

Download to CERN WNs
[Corresponding plots for KEK SE -> CERN WN, PNNL SE -> CERN WN, CERN SE -> CERN WN, and DESY SE -> CERN WN.]

Average file transfer time (unit: sec, 500 MB file, 2016.1.26 - 2016.5.26)

From(SE) \ To(WN)    kek     pnnl    desy    cern
kek                  16.7    125.0   130.2   337.5
pnnl                 59.6    37.2    113.3   474.3
desy                 157.7   94.5    16.3    53.5
cern                 129.4   153.4   33.9    65.5

Notes:
- Asymmetry: PNNL -> KEK is faster than KEK -> PNNL by a factor of 2 (improved since 2015); PNNL -> PNNL and CERN -> CERN are slower than KEK -> KEK and DESY -> DESY.
- KEK to CERN fluctuates a lot and is not as good as CERN to KEK (PNNL to CERN as well).
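Since the table reports average transfer times for a fixed 500 MB file, the corresponding throughput in MB/s follows directly; a quick conversion for two of the measured values:

```python
# Convert the measured 500 MB transfer times to average throughput.
FILE_MB = 500.0

def throughput(seconds):
    """Average transfer rate in MB/s for the 500 MB test file."""
    return FILE_MB / seconds

# e.g. KEK SE -> KEK WN (16.7 s) vs KEK SE -> CERN WN (337.5 s)
print(round(throughput(16.7), 1))   # ~29.9 MB/s, local transfer
print(round(throughput(337.5), 1))  # ~1.5 MB/s, intercontinental
```

The two-orders-of-magnitude spread between the local and intercontinental cases is what makes the day-by-day monitoring worthwhile.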

Summary
- C. Calancha developed the ILD MC production tool with ILCDirac and produced ILD official samples successfully.
- A new team is being formed to take over his work; learning, testing, and improving ILDDirac has started.
- ILDDirac will be used to produce samples for ILD physics studies and detector optimization.
- A network monitoring tool has been developed to monitor day-by-day performance.