INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org Geant4 Physics Validation: Use of the GRID Resources Patricia Mendez Lorenzo CERN (IT-GD)

INFSO-RI-508833 Enabling Grids for E-sciencE
Geant4 Physics Validation: Use of the GRID Resources
Patricia Mendez Lorenzo, CERN (IT-GD) / CNAF
Geant4 Bio-Medical Developments / Geant4 Physics Validation
INFN Genova, July 2005

Outlook
◘ Introduction to the LCG
◘ Geant4 in LCG
◘ First Geant4 Productions
◘ Results and Summary
◘ Future Plans

What is the LCG?
◘ The LHC: generation of 40 million particle collisions (events) per second at the centre of each of the four experiments
◘ Reduced by online computers that filter out a few hundred good events per second
◘ Recorded on disk and magnetic tape: 15 PB/year
◘ This is where the GRID environment comes in

LCG Service Hierarchy
◘ Tier-0 – the accelerator centre
▪ Data acquisition and initial processing
▪ Distribution of data to the different Tiers
◘ Tier-1 – “online” to the data acquisition process → high availability
▪ Managed Mass Storage → grid-enabled data service
▪ Data-heavy analysis
▪ National, regional support
▪ Sites: Canada – TRIUMF (Vancouver); France – IN2P3 (Lyon); Germany – Forschungszentrum Karlsruhe; Italy – CNAF (Bologna); Netherlands – NIKHEF/SARA (Amsterdam); Nordic countries – distributed Tier-1; Spain – PIC (Barcelona); Taiwan – Academia Sinica (Taipei); UK – CLRC (Oxford); US – FermiLab (Illinois), Brookhaven (NY)
◘ Tier-2 – ~100 centres in ~40 countries
▪ Simulation
▪ End-user analysis – batch and interactive

Who is who in LCG?
[Diagram: CERN as Tier-0 at the centre; national Tier-1 centres in Germany, France, Italy, the UK, the USA, the Netherlands, the Nordic countries, Spain, Taiwan and Canada; Tier-2 labs and universities organised in physics and regional groups; Tier-3 physics-department desktops]

LCG in the World (May)
[Map: LCG Grid sites across 34 countries, 8 PetaBytes of storage; 30 sites with 3200 CPUs; Grid3 with 25 universities, 4 national labs and 2800 CPUs]

How does the LCG work?
[Diagram: UI, RB/BDII, CE, WN, SE, LFC]
◘ The user connects to the Grid through the UI
◘ The RB/BDII searches for suitable resources
◘ The job is sent to the batch system of a CE and distributed to the CPUs (WNs)
◘ Outputs are copied to storage resources (SE)
◘ Catalogues (LFC) keep track of the inputs
(A minimal submission example follows below.)
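
To make the chain concrete, here is a minimal, hypothetical sketch of LCG-2 job submission from the UI, written in Python: a small JDL file is produced and handed to edg-job-submit, which sends it through the RB to a matching CE. The JDL attributes are the standard ones; the script name and the wall-clock threshold are invented for illustration.

    # Hypothetical sketch of LCG-2 job submission from a UI.
    # Assumes the edg-job-* commands are on the PATH; all names are illustrative.
    import subprocess, textwrap

    jdl = textwrap.dedent("""\
        Executable    = "run_g4.sh";
        StdOutput     = "std.out";
        StdError      = "std.err";
        InputSandbox  = {"run_g4.sh"};
        OutputSandbox = {"std.out", "std.err"};
        // Match only queues long enough for a full production job (minutes):
        Requirements  = other.GlueCEPolicyMaxWallClockTime > 1440;
        """)
    open("g4.jdl", "w").write(jdl)

    # The RB matches the JDL against the BDII and forwards the job to a CE;
    # the job ID is appended to jobids.txt for later edg-job-status /
    # edg-job-get-output calls.
    subprocess.call(["edg-job-submit", "-o", "jobids.txt", "g4.jdl"])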

Geant4 in LCG
◘ Electromagnetic and hadronic physics are fundamental features that must be properly simulated in Geant4; however, they are extremely CPU demanding
▪ The cost depends on the number of events and their energy:
◦ 1 event of 1 GeV ~ 0.03 sec (2.4 GHz machine)
◦ 1 event of 300 GeV ~ 9-10 sec
◘ Goal during the software validation: compare shower observables between two different Geant4 versions and check for statistically significant changes
◘ Applications in LCG:
▪ First application: December 2004; second application: end of June 2005
▪ A total amount of about 3 years of CPU time (1 GHz machine)
▪ Very small output for the whole production (tens of GB) → a natural case for the GRID

Geant4 in LCG
◘ Samplings:
▪ 7 simplified detectors: FeSci, CuSci, PbSci, CuLAr, PbLAr, WLAr, PbWO4
▪ 7 different particles (8 in the 2nd production): e- (2nd production only), pi+, pi-, k+, k-, k0L, p, n
▪ 23 different beam energies (GeV): 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 80, 100, 120, 150, 180, 200, 250, 300 (plus 1000, never achieved)
▪ 5 physics lists: LHEP, QGSP, QGSC, QGSP_BIC, QGSP_BERT

Geant4 in LCG
◘ Strategy:
▪ First production
◦ Comparison of 7.0.cand01 vs 6.2.p01
◦ During the event production phase, 5635 jobs had to be run for each Geant4 version: 11270 jobs in total (see the cross-check below)
◦ Finally, the statistical tests were applied to each pair of Geant4 version outputs
▪ Second production
◦ Comparison of 7.0.p01 vs 7.1.cand01
◦ During the event production, 6440 jobs had to be run
◦ This time each production job contained the production of both Geant4 versions and the statistical analysis
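
The quoted job counts follow directly from the sampling grid on the previous slide; a quick arithmetic cross-check in Python:

    # Cross-check of the production job counts against the sampling grid.
    calorimeters  = 7    # FeSci, CuSci, PbSci, CuLAr, PbLAr, WLAr, PbWO4
    energies      = 23   # 1 GeV ... 300 GeV (the 1000 GeV point was never achieved)
    physics_lists = 5    # LHEP, QGSP, QGSC, QGSP_BIC, QGSP_BERT

    first  = calorimeters * 7 * energies * physics_lists  # 7 particle types
    second = calorimeters * 8 * energies * physics_lists  # e- added in the 2nd production

    print(first, 2 * first)  # 5635 per Geant4 version -> 11270 jobs in total
    print(second)            # 6440 jobs (both versions run inside one job)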

Geant4 Production in LCG
◘ Stages:
1. Software installation: installation of the Geant4 packages (with all the required additional external packages: PI, AIDA, etc.)
▪ Software provided via a tar file
▪ Installation through jobs using specific LCG tools
▪ Fundamental request for the sites: an area shared between the WNs and a properly defined software installation region
2. Event production:
▪ Jobs sent in bunches of 1227 (similarly in the 2nd production), with the bunches defined by physics list
▪ 5000 events were produced in each job
3. Analysis: statistical tests to perform the comparison between the two Geant4 versions

Geant4 Production in LCG
◘ General characteristics:
▪ VO:
◦ 1st Production: dteam (6 certificates, one as dteamsgm)
◦ 2nd Production: alice (2 certificates, one as alicesgm)
▪ Sites and middleware operating system:
◦ 1st Production: RedHat 7.3
◦ 2nd Production: Scientific Linux
▪ Resources:
◦ 1st Production: own RB+BDII+UI: lxb2006 at CERN
◦ 2nd Production: lxplus resources and 2 BDIIs
▪ All output:
◦ 1st Production: about 30 GB stored at CERN (lxn1183)
◦ 2nd Production: a comparable quantity stored at CERN (lxn1180); the afs Geant4 area at CERN was set up to hold the outputs

Framework developed for Geant4
◘ A general framework was developed, consisting of 3 major tools:
▪ Tool for general and automatic job submission
▪ Tool for event generation at all sites where the software has been installed
▪ Tool for data analysis (not needed during the 2nd production)
◘ First part: tool for job submission. Methodology (sketched below):
▪ Copy and registration of the Geant4 package
◦ A file containing the TURL is created and passed to the WN
▪ Follow-up of the candidate sites able to accept Geant4 jobs
▪ Selection of long queues only
▪ Automatic building of the .jdl files for each long queue
◦ Built from the base jdl proposed by the user, adding the name of the queue the job is submitted to
◦ The software installation tools are used to perform the installation
▪ Submission of these files to each queue
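
A minimal sketch of the per-queue JDL generation described above. lcg-infosites is the period's real discovery tool, but the parsing, the file names and the "long"-queue filter are illustrative, not the production code:

    # Hypothetical sketch: build and submit one JDL per long queue,
    # starting from a user-provided template.
    import subprocess

    # Discover the CEs publishing the dteam VO.
    report = subprocess.check_output(["lcg-infosites", "--vo", "dteam", "ce"]).decode()
    queues = [line.split()[-1] for line in report.splitlines() if "jobmanager" in line]

    template = open("user_template.jdl").read()  # base JDL proposed by the user

    for q in queues:
        if "long" not in q:   # keep only the long queues, as the framework does
            continue
        # Pin the job to this queue (assumes the template carries no
        # Requirements clause of its own).
        jdl = template + '\nRequirements = other.GlueCEUniqueID == "%s";\n' % q
        name = "g4_%s.jdl" % q.replace("/", "_").replace(":", "_")
        open(name, "w").write(jdl)
        subprocess.call(["edg-job-submit", "-o", "jobids.txt", name])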

Framework developed for Geant4
◘ Software installation tool (submitted in the first step to all sites, to install the software; a sketch follows below)
▪ First step:
◦ The tar file is copied from the SE at CERN to the WN
◦ It is untarred and copied to the VO_DTEAM_SW_DIR area
▪ Second step:
◦ Some Geant4 tests are performed to validate the installation
◦ If they succeed, a tag is published in the Information System
▪ Results:
◦ The software installation was tried at 63 sites
◦ 1st Production: 28 sites
◦ 2nd Production: 35 sites
▪ Main problems:
◦ Sites with submission problems
◦ Sites without the VO_DTEAM_SW_DIR area defined, or without a shared area among the WNs
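
The installation job itself can be pictured as follows. lcg-cp and the VO_DTEAM_SW_DIR shared area are the mechanisms named above; the LFN, the test script and the tag name are invented, and the tag-publication arguments (via lcg-ManageVOTag, the LCG-2 tool for VO software tags) are an assumption:

    # Hypothetical body of the installation job, run on a WN of each site.
    import os, subprocess, sys

    sw_dir = os.environ["VO_DTEAM_SW_DIR"]   # shared software area of the site

    # 1) Copy the tar file from the SE at CERN to the WN (the LFN is invented)
    subprocess.check_call(["lcg-cp", "--vo", "dteam",
                           "lfn:/grid/dteam/geant4/g4-install.tar",
                           "file:%s/g4-install.tar" % os.getcwd()])

    # 2) Untar it into the shared area
    subprocess.check_call(["tar", "-xf", "g4-install.tar", "-C", sw_dir])

    # 3) Validate the installation with some Geant4 tests (script name invented)
    if subprocess.call([os.path.join(sw_dir, "geant4", "run_tests.sh")]) != 0:
        sys.exit(1)

    # 4) On success, publish a tag in the Information System (arguments illustrative)
    subprocess.call(["lcg-ManageVOTag", "-vo", "dteam",
                     "--add", "-tag", "VO-dteam-geant4"])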

Framework developed for Geant4
◘ Second part: tool for the production
Strategy:
▪ Only long queues are used to run the production
▪ All outputs (hbook files) are stored at CERN
Methodology (sketched below):
▪ Geant4 provides its own code to perform the event production
▪ A Python script for each combination of particle, energy, physics list and calorimeter is created by the framework from a template provided by Geant4
▪ One jdl is generated per job, containing the code provided by Geant4 (the same for all jobs) plus the framework-generated script, which changes for each job
▪ All jdl files are submitted to all sites holding the Geant4 installation
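
The per-job script generation reduces to a loop over the configuration space, with the output name encoding the full configuration (as the next slide stresses, this is what makes the later comparison possible). A sketch with hypothetical template placeholders:

    # Hypothetical sketch: one production script per
    # (particle, energy, physics list, calorimeter) combination.
    import itertools

    particles     = ["e-", "pi+", "pi-", "k+", "k-", "k0L", "p", "n"]  # 2nd production
    energies_gev  = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60,
                     80, 100, 120, 150, 180, 200, 250, 300]
    physics_lists = ["LHEP", "QGSP", "QGSC", "QGSP_BIC", "QGSP_BERT"]
    calorimeters  = ["FeSci", "CuSci", "PbSci", "CuLAr", "PbLAr", "WLAr", "PbWO4"]

    template = open("g4_template.py").read()   # template provided by Geant4

    for part, e, pl, calo in itertools.product(particles, energies_gev,
                                               physics_lists, calorimeters):
        # The output name encodes the full configuration, so the later
        # version-to-version comparison can pair files by name alone.
        stem = "%s_%dGeV_%s_%s" % (part, e, pl, calo)
        script = (template.replace("@PARTICLE@", part)
                          .replace("@ENERGY@", str(e))
                          .replace("@PHYSLIST@", pl)
                          .replace("@CALO@", calo)
                          .replace("@OUTPUT@", stem + ".hbook"))
        open(stem + ".py", "w").write(script)   # 6440 scripts in total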

Framework developed for Geant4
◘ Results (first and second production):
▪ An hbook file containing 5000 events is created whenever the production succeeds
▪ The file name is built by the framework and encodes the particle type, the energy, the physics list and the calorimeter (important for performing the comparison later)
▪ The hbook file is copied and registered to a disk at CERN
▪ During the 2nd production, a tar file containing several files was created whenever the job succeeded; it was retrieved to the afs area delivered for this aim, then copied and registered to the grid
◘ Around 4508 jobs (two physics lists for both Geant4 versions) were run in less than 2 weeks at 28 sites, with an efficiency of about 87%
◘ The results of the 2nd production are provided by Alberto Ribon

Framework developed for Geant4
◘ Before the analysis stage the outputs have to be checked (1st production only):
▪ Dealing with about 5000 outputs is not an easy task (a sketch of the bookkeeping follows below)
◦ A tool prints to a file all the LFNs that a 100% efficiency would have produced (reference file)
◦ Outputs are retrieved (std.out files only); for a succeeded job, the file contains the name of the LFN
◦ A 2nd tool checks all std.out files looking for the succeeded jobs; the corresponding GUIDs and LFNs are stored (test file) and compared with the information in the reference file
▪ At this point it was more important for us to analyse the successful jobs than to understand the cause of the unsuccessful ones (BUT THIS PROCEDURE IS WRONG!)
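
The bookkeeping amounts to a set difference between expected and observed LFNs; a sketch, assuming the reference-file layout above and that a succeeded job prints its LFN to std.out:

    # Hypothetical sketch of the output bookkeeping for the 1st production.
    import glob, re

    # Reference file: one LFN per line, everything a 100%-efficient
    # production would have registered.
    expected = set(open("reference_lfns.txt").read().split())

    # Scan the retrieved std.out files for the LFNs of succeeded jobs.
    found = set()
    for path in glob.glob("outputs/*/std.out"):
        for line in open(path):
            m = re.search(r"(lfn:\S+)", line)
            if m:
                found.add(m.group(1))

    open("test_lfns.txt", "w").write("\n".join(sorted(found)))
    print("succeeded: %d / %d" % (len(found), len(expected)))
    print("missing:", sorted(expected - found)[:10], "...")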

Framework developed for Geant4
◘ Third part: tool for the analysis (1st production only). Methodology (sketched below):
▪ Search for the successful outputs common to both Geant4 versions
▪ Each pair of successful outputs is copied into a local area and analysed with the Geant4 statistical tools
▪ Finally the copies are removed from the local area
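
Since the file names encode the configuration, pairing outputs of the two versions is a simple key join; a sketch in which the LFN lists, the local naming and the stat_test command are all assumptions:

    # Hypothetical sketch: pair version-A and version-B outputs, analyse, clean up.
    import os, subprocess

    def by_config(lfns):
        # "lfn:/grid/dteam/g4/pi+_10GeV_QGSP_PbWO4.hbook" -> keyed by base name
        return dict((os.path.basename(l), l) for l in lfns)

    a = by_config(open("lfns_6.2.p01.txt").read().split())
    b = by_config(open("lfns_7.0.cand01.txt").read().split())

    for key in sorted(set(a) & set(b)):   # successful outputs common to both versions
        subprocess.check_call(["lcg-cp", "--vo", "dteam", a[key],
                               "file:%s/A_%s" % (os.getcwd(), key)])
        subprocess.check_call(["lcg-cp", "--vo", "dteam", b[key],
                               "file:%s/B_%s" % (os.getcwd(), key)])
        subprocess.call(["./stat_test", "A_" + key, "B_" + key])  # Geant4's own tools
        os.remove("A_" + key)   # clean the local area again
        os.remove("B_" + key)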

Update of the Framework
▪ This framework covered the Geant4 requirements for its first production
▪ It is not adequate for larger productions
◦ Difficult to deal with the outputs and to visualise the results
◘ A complete new tool has been developed for large productions
▪ Flexible enough to be used for any VO and any user application
▪ Most of the improvements concern the handling of the outputs
Documentation: “LCG2 User Guide” (bin/index.cgi?var=eis/docs)
Download:

Update of the Framework
◘ The new framework consists mainly of two tools:
▪ Tool to perform the automatic job submission
▪ Tool to retrieve and handle the corresponding output
1. Automatic job submission. Overview: given a user's jdl, this tool performs the following actions:
◦ It lists all sites able to run the jdl provided by the user
◦ It automatically creates a jdl file based on the one provided by the user
◦ It submits the newly created jdl containing the user application(s)
Moreover, it creates a subdirectory (defined by the user) containing the list of sites where the jobs have been submitted, the corresponding jdls and the job IDs

Update of the Framework
Additional features:
◦ The user can define the queues where the jobs are submitted; these queues are checked to see whether they fit the job requirements
◦ Requested LFN files can be included; the corresponding TURLs are searched for and written to a file passed in the InputSandbox to the WN
2. Retrieval and handling of the outputs (a sketch follows below)
◦ The 2nd tool checks the status of the jobs from the job IDs included in the directory given by the user
◦ It provides the following output:
    The job run in ramses.dcic.ups.es:2119/jobmanager-torque-dteam is in status: Scheduled
    The job run in grid01.phy.ncu.edu.tw:2119/jobmanager-torque-dteam is in status: running
    The job run in scaic10.scai.fraunhofer.de:2119/jobmanager-torque-dteam is in status: over
◦ The user is then asked whether to retrieve the output to the destination he has previously chosen
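
A sketch of the status loop behind that output, using the period's real edg-job-status and edg-job-get-output commands; the one-job-ID-per-file directory layout and the report parsing are assumptions:

    # Hypothetical sketch of the output-retrieval tool's status check.
    import glob, re, subprocess

    for idfile in sorted(glob.glob("submission_dir/*.jobid")):
        report = subprocess.check_output(["edg-job-status", "-i", idfile]).decode()
        # Pull the destination CE and the current state out of the status report
        dest   = re.search(r"Destination:\s+(\S+)", report)
        status = re.search(r"Current Status:\s+(\S+)", report)
        if dest and status:
            print("The job run in %s is in status: %s"
                  % (dest.group(1), status.group(1)))
        # For finished jobs, offer to retrieve the output sandbox
        if status and status.group(1).startswith("Done"):
            if input("Retrieve this output? [y/n] ").strip() == "y":
                subprocess.call(["edg-job-get-output", "-i", idfile,
                                 "--dir", "retrieved_outputs"])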

Update of the Framework
Additional features:
▪ It is possible to visualise the outputs on the web
▪ An html report is provided showing the files chosen by the user

Summary and Conclusions
◘ Satisfactory implementation of the Geant4 code in the LCG
▪ (Hopefully this is the beginning of a long friendship)
▪ A Geant4-LCG paper has been submitted to the SC05 conference in the USA
◘ The LCG deployment team is quite interested in including the Geant4 code in our own test suites
▪ We hope this collaboration will help us as well
◘ You cannot keep on working by borrowing other VOs:
▪ We have to create your VO (VO = geant4) as soon as possible
▪ This should be done before the next Geant4 production
◘ Please come to us earlier before each production!
▪ Two weeks is not enough to make a good production
▪ We have to understand the failed jobs in all cases, and this needs time