James Cunha Enabling Grid Computer for HEP Babar Team at University of Manchester Resources:

Presentation transcript:

Human resource strategy

Physicists: Roger, George, John, Jenny, Mark, Marta, Christina, Ming, Nick, Mitch, Andy (11 worker loads)
- Goals: HEP, frontiers of physics, …
- They do not care about computers, the grid, or the popcorn machine: if it is available, they use it.

Guinea pig: James (2 worker loads *)
- Goal: integration and support

Computeers: Andrews, Alessandra, Mike, Chris, Sabah (3 worker loads)
- Goals: new technologies, new technologies, new technologies, …

Total demand: 16 worker loads

* Jobs with 5 events instead of millions.

Resources strategy

Columns: Before June | September 2004

PCs: General interactive use, SLAC terminal (BaBar software) | General interactive use, SLAC terminal (BaBar software), BaBar software CM2/Monte Carlo
Production (40 machines, 80 CPUs):
- Test bed: 10 CPUs, LCG2
  - BaBar software CM2
  - Monte Carlo
  - Grid application development
- Production: 70 CPUs, LCG2
  - only CE/WN
  - exclusive non-BaBar use
Know-how: Workbook (physics) | Workbook (physics), A to Z BaBar Computing

Grid Test Bed

Software: 850 packages.
Tau datasets: range between 60 files (1 GB) and 150 files (1 GB).
Total: 4,000 GB, ~10,000 files.

Analysis submission to Grid

Single command: ./easygrid dataset_name

Performs handler management and submission. The software is based on a state machine:
- Verify that skimData is available. If not, run BbkDatasetTCL to generate skimData. Each file will be a job.
- Verify whether there are handlers pending.
  - If not: script generation (gera.c) with edg-job-submit and ClassAds, then script execution. Nest for submission policy and optimisation.
  - If yes: verify job status. When all jobs have ended, recover the results into the user folder.

(Prototype)
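The state machine on this slide can be sketched roughly as follows. This is an illustrative Python sketch, not the EasyGrid code itself (the slide only names a C generator, gera.c). The `Action` names, the `next_action` function and its arguments are all hypothetical, and it assumes each invocation of the command performs one decision step.

```python
from enum import Enum


class Action(Enum):
    # Hypothetical labels for the steps named on the slide.
    GENERATE_SKIMDATA = "generate skimData"
    SUBMIT_JOBS = "generate and run submission script"
    WAIT = "jobs still pending, try again later"
    RECOVER_RESULTS = "recover results into user folder"


def next_action(skimdata_available, handlers, statuses):
    """Pick the next step for one invocation of the driver.

    skimdata_available: True if the skimData file list already exists.
    handlers: list of job handles saved by a previous submission.
    statuses: dict mapping each handle to its last reported status.
    """
    if not skimdata_available:
        return Action.GENERATE_SKIMDATA
    if not handlers:
        # No pending handlers: nothing was submitted yet.
        return Action.SUBMIT_JOBS
    if all(statuses.get(h) == "Done" for h in handlers):
        return Action.RECOVER_RESULTS
    return Action.WAIT
```

Because the decision depends only on what is already on disk (skimData, saved handles) and on the reported statuses, the same single command can be re-run at any time, which matches the repeated `./easygrid` invocations in the sessions that follow.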

Generation and submission

    babar]$ ./easygrid SP-1005-Tau11-R14
    Invalid configuration filename: /opt/edg/etc/vomses
    Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
    Enter GRID pass phrase for this identity:
    Creating temporary proxy Done
    Creating proxy Done
    Searching pre selected skimdata.
    Searching previous handlers.
    Handlers not found.
    Submiting to GRID.
    Wait end of process...

Job status

    babar]$ ./easygrid SP-1005-Tau11-R14
    Invalid configuration filename: /opt/edg/etc/vomses
    Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
    Enter GRID pass phrase for this identity:
    Creating temporary proxy Done
    Creating proxy Done
    Searching pre selected skimdata.
    Searching previous handlers.
    Checking if jobs finished.
    ### Handle -> Current Status: Scheduled still pendent.
    ### Handle -> Current Status: Scheduled still pendent.
    4 jobs did not finished ! Try again later.

Job status and recovery

    babar]$ ./easygrid SP-1005-Tau11-R14
    Invalid configuration filename: /opt/edg/etc/vomses
    Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
    Enter GRID pass phrase for this identity:
    Creating temporary proxy Done
    Creating proxy Done
    Searching pre selected skimdata.
    Searching previous handlers.
    Checking if jobs finished.
    ### Handle -> Current Status: Done Exit code: 0
    ### Handle -> Current Status: Done Exit code: 0
    0 jobs did not finished ! Try again later.
    All jobs done. Recovering results in your folder.
    Results in the following folders:
    /home/jamwer/grid_sub/babar/jamwer_foRHhWyeDBnbqA9JkDADLg
    /home/jamwer/grid_sub/babar/jamwer_8DdK3xruxtevNpei3zZbaA
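The "jobs did not finish" count in these sessions can be derived from the per-handle status lines. The following Python sketch shows one way to do that, assuming each handle is reported on its own line in the form `Current Status: <word>` as in the transcript. The function name `count_unfinished` is hypothetical, and this is not how EasyGrid itself is known to compute the figure.

```python
import re

# Assumed line format, taken from the transcript:
#   "### Handle -> Current Status: Scheduled still pendent."
STATUS_RE = re.compile(r"Current Status:\s*(\w+)")


def count_unfinished(output_lines):
    """Return (statuses, unfinished) for a list of output lines.

    statuses: the status word found on each status line, in order.
    unfinished: how many of those statuses are not "Done".
    """
    statuses = []
    for line in output_lines:
        match = STATUS_RE.search(line)
        if match:
            statuses.append(match.group(1))
    unfinished = sum(1 for s in statuses if s != "Done")
    return statuses, unfinished
```

Lines without a status field (proxy creation messages, folder paths) are simply ignored, so the whole session output can be passed in unfiltered.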

Monte Carlo submission to Grid

Single command: ./mcgrid JobName num_copies

Performs handler management and submission. The software is based on a state machine:
- Verify whether there are handlers pending.
  - If not: script generation (geramc.c) with edg-job-submit and ClassAds for each copy, then script execution. Nest for submission policy and optimisation.
  - If yes: verify job status. When all jobs have ended, recover the results into the user folder.

(Prototype)
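As an illustration of "script generation for each copy", the sketch below builds one JDL description per copy and a shell script that calls `edg-job-submit` on each. It is written in Python for brevity; the original generator is a C program (geramc.c) whose contents are not shown in the slides. The executable name `run_mc.sh`, the file naming scheme and the argument layout are hypothetical, and a real submission would probably need further `edg-job-submit` options and site-specific requirements.

```python
def make_jdl(job_name, copy_index, executable="run_mc.sh"):
    """Build a minimal JDL description for one Monte Carlo copy.

    The executable and the sandbox file names are placeholders.
    """
    lines = [
        f'Executable = "{executable}";',
        f'Arguments = "{job_name} {copy_index}";',
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        f'InputSandbox = {{"{executable}"}};',
        'OutputSandbox = {"std.out", "std.err"};',
    ]
    return "\n".join(lines) + "\n"


def make_submission_script(job_name, num_copies):
    """Return (jdl_files, script_text) for num_copies copies of a job.

    jdl_files maps a hypothetical file name to its JDL text;
    script_text submits each JDL file in turn.
    """
    jdl_files = {}
    commands = ["#!/bin/sh"]
    for i in range(1, num_copies + 1):
        filename = f"{job_name}_{i}.jdl"
        jdl_files[filename] = make_jdl(job_name, i)
        commands.append(f"edg-job-submit {filename}")
    return jdl_files, "\n".join(commands) + "\n"
```

For example, `make_submission_script("MCteste", 3)` yields three JDL texts and a script with three submission lines, one per copy.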

MC submission

    mcgrid1]$ ./mcgrid MCteste 3
    Invalid configuration filename: /opt/edg/etc/vomses
    Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
    Enter GRID pass phrase for this identity:
    Creating temporary proxy Done
    Creating proxy Done
    Searching previous handlers.
    Handlers not found.
    Submiting to GRID.
    Wait end of process...

Job status

    mcgrid1]$ ./mcgrid MCteste 3
    Invalid configuration filename: /opt/edg/etc/vomses
    Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
    Enter GRID pass phrase for this identity:
    Creating temporary proxy Done
    Creating proxy Done
    Searching previous handlers.
    Checking if jobs finished.
    ### Handle -> Current Status: Scheduled still pendent.
    ### Handle -> Current Status: Ready still pendent.
    ### Handle -> Current Status: Ready still pendent.
    3 jobs did not finished ! Try again later.

Job status and recovery

    mcgrid1]$ ./mcgrid MCteste 3
    Invalid configuration filename: /opt/edg/etc/vomses
    Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
    Enter GRID pass phrase for this identity:
    Creating temporary proxy Done
    Creating proxy Done
    Searching previous handlers.
    Checking if jobs finished.
    ### Handle -> Current Status: Done Exit code: 0
    ### Handle -> Current Status: Done Exit code: 0
    0 jobs did not finished ! Try again later.
    All jobs done. Recovering results in your folder.
    Results in the following folders:
    /home/jamwer/grid_sub/mcgrid1/jamwer_9WzceoIMEQoTK24a-UvOmw
    /home/jamwer/grid_sub/mcgrid1/jamwer_c4iCB8vioozaGteI9hybIg
    /home/jamwer/grid_sub/mcgrid1/jamwer_L5BD1OE--eckTm5RXkp2nA

Testing submission script

Load range: worker load x number of files
- 16 x 60 files = 960 jobs pending
- 16 x 150 files = 2400 jobs pending

Test with submission script (100 jobs | 1000 jobs; each split into Submission | Result recovery):
    Done
    Aborted | **
    Scheduled | 79
    Fail | 1 ** | 630 * | 630 *

* sslv3 alert handshake failure
** "Please wait job enter the Done status." This never happens!

The Resource Broker is not reliable or robust: sometimes it fails 3 days a week, or takes hours to submit/dispatch to an (empty!) CE.
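The load range on this slide is the product of the total worker load (16, from the human resource slide) and the number of files per dataset (60 to 150), with one job per file. A small Python sketch of that arithmetic, with hypothetical function names:

```python
def pending_jobs(worker_load, files_per_dataset):
    """One job per file, for each unit of worker load."""
    return worker_load * files_per_dataset


def load_range(worker_load, min_files, max_files):
    """Return the (low, high) number of pending jobs."""
    return (pending_jobs(worker_load, min_files),
            pending_jobs(worker_load, max_files))


print(load_range(16, 60, 150))  # (960, 2400)
```

The upper bound of 2400 pending jobs is more than twice the largest test on the slide (1000 jobs), so the tests cover only part of the expected load range.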

Pending infrastructure => course of action

- BaBar software know-how is not available at Manchester => web page and network skills.
- Quality assurance => We are OK, from the benchmark (E x P).
- A real application to perform the complete cycle, acquire know-how, and serve as a grid proof of concept is missing => partnership with physicists.
- CERN does NOT recognise the BaBar community => Let's reduce their priority!
- RB at Manchester => 60 MB of binaries and freedom over policies.
- SE/RC at Manchester => freedom over policies and job submission.
- Mass storage (10 TB) for BaBar purposes => CAP!
- UI in AFS => wide access to Manchester farms.
- Apprenticeship at RAL and later at SLAC (production and experiment) => improve where others fail.
- Configuration for optimal job performance/submission at Tier 2 (1 CE x 50 WN? dCache performance with BaBar software? Why 10 TB if Liverpool bought 80 TB? Electricity bill?) => analyse procedures to improve QoS and reach a better site configuration.
- Update (software and data) and operational policies => operational standards to achieve high QoS.

Target hardware architecture (redundant RB with alternate access)

Target software architecture

Production job submission package

- Operational policies and integration with the RB (application level).
- Recovery of aborted status.
- Resource optimisation.
- Integration with the RC (application level) for development of replica policies.
- Interactive data visualisation (useful?).
- Integration with GridSite (data visualisation, analysis, performance monitoring, and submission).
- Professional version.

Summary

Integrate LCG2 and job submission with BaBar/CM2 at the University of Manchester for tau physics modelling, analysis and MC generation.

We aim to be, soon:
- The largest site in the UK.
- A leader in grid computing and HEP.

Conclusion

- BaBar CM2 is running at Manchester!
- The LCG2 grid is running with a real-world experiment!
- The BaBar submission prototype for the grid is running!
- LCG is not LHC software only! It is BaBar's too.
- We are doing today what will take you years to achieve.
- Let's work together!