BaBar Status Report Chris Brew GridPP16 QMUL 28/06/2006

Outline
3 BaBar Grid Projects:
–Monte Carlo (Simulation) Production
–Skimming
–User Analysis (easyGrid, bbrbsub)
Overall experience with the Grid
Conclusion

Usual Guff
BaBar is a running experiment, situated at SLAC near San Francisco
e+e- collider tuned to investigate CP violation in B physics
Started taking data in 1999/2000; currently has 350 fb-1 of data
Projected to have 1000 fb-1 by the end of 2008

Data Flow
[Diagram: data flow between Tier 0 (SLAC), Tier 1 (RAL), Tier 2s and Large Tier 2s, covering Simulation Production, Skimming, Analysis and Merging]

Simulation Production
Running at M/Cr, RAL, RALPP and B'ham
–Tests at Lancs, Oxford and others
–Still working to add other BaBar sites
–Limited by the need to install the Objectivity DB at each site
Stable running: 500,000,000 events produced, 12% of the worldwide total
New R-GMA based job monitor: status query time down from 45 minutes to 5 minutes
Recent hiatus due to bugs found in the BaBar simulation code, which caused a global halt; production has recently restarted
C. Brew, G. Castelli
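To illustrate the monitoring pattern rather than the real implementation: R-GMA presents published monitoring tuples as a virtual relational table, so a single SQL-style query can replace a slow per-site status poll, which is where the 45-to-5-minute speed-up comes from. The sketch below uses Python's sqlite3 as a stand-in for that virtual table; the SPJobStatus table and its columns are invented, not the actual BaBar SP schema.

```python
# Sketch of the R-GMA monitoring pattern, using sqlite3 as a stand-in for
# R-GMA's virtual, SQL-queried job-status table. Table and column names
# (SPJobStatus, site, state) are illustrative, not the real BaBar schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SPJobStatus (jobId TEXT, site TEXT, state TEXT)")

# Producers at each site would publish one tuple per job into R-GMA.
conn.executemany(
    "INSERT INTO SPJobStatus VALUES (?, ?, ?)",
    [("sp-001", "RAL", "Running"),
     ("sp-002", "RAL", "Done"),
     ("sp-003", "M/Cr", "Failed"),
     ("sp-004", "B'ham", "Running")],
)

# The monitor replaces a per-site poll with one aggregate query:
# a single round trip instead of contacting every site in turn.
for site, state, n in conn.execute(
    "SELECT site, state, COUNT(*) FROM SPJobStatus GROUP BY site, state"
):
    print(f"{site:8s} {state:8s} {n:3d}")
```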

Skimming
New Grid project: process real and simulated data to select ~200 subsamples, defined by the BaBar physics analysis working groups
–Much quicker to run over a skim than over the full data sample
–Skimming includes physics analysis code and saves the results, so the CPU time spent in skimming is regained many times over
Plan is to run at one or more large T2s
If we can get this into production we should be able to recover some of the UK's Common Fund rebate we've lost due to lack of T1 resources
GridPP has funded three months of effort from Will Roethel to further this work
G. Castelli, W. Roethel, C. Brew
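As a minimal sketch of why skimming pays for itself: one pass over the full sample evaluates every skim's selection and saves the output, so each downstream analysis reads only its small subsample instead of re-running the selection. The two selections and event fields below are invented placeholders, not real BaBar working-group skim definitions.

```python
# Sketch only: one pass over the full sample evaluates all skim selections
# and banks the results for reuse. The selections and event fields are
# invented examples, not real BaBar skim definitions (~200 in the real system).
def skim_btodpi(event):
    return event.get("n_charged", 0) >= 4 and event.get("d_mass_window", False)

def skim_jpsi_ks(event):
    return event.get("n_leptons", 0) >= 2 and event.get("ks_candidate", False)

SKIMS = {"BToDPi": skim_btodpi, "JpsiKs": skim_jpsi_ks}

def run_skims(events):
    """Single pass over the full sample, testing every skim per event."""
    outputs = {name: [] for name in SKIMS}
    for event in events:
        for name, selects in SKIMS.items():
            if selects(event):               # analysis-level selection code
                outputs[name].append(event)  # saved result, reused many times
    return outputs

sample = [
    {"n_charged": 5, "d_mass_window": True},
    {"n_leptons": 2, "ks_candidate": True},
    {"n_charged": 2},
]
for name, kept in run_skims(sample).items():
    print(name, "kept", len(kept), "of", len(sample), "events")
```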

Status of Skimming
–Prepare code to be installed on grid: Done
–Modify BaBar framework to read data out of dCache and RFIO: Working, starting load and stability testing
–Develop tools for copying and managing data on Storage Elements: Under development (PhEDEx?)
–Integration with BaBar Task Management software:
  Task DB Creation: Done
  Task List Creation: Works
  Job Creation: Works
  Local Job Submission: Works
  Grid Job Submission: Works
  Job Monitoring: In progress, should be able to reuse code from SP tools
  Job Recovery
  Job Output Checking: In progress
  Data Merging: Not started
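The tools for copying and managing data on Storage Elements were still under development at the time, so the following is only a sketch of what such a wrapper might look like around the standard LCG-2 data-management CLI (lcg-cr, which copies a file to an SE and registers it in the file catalogue). The SE hostname and LFN paths are placeholders.

```python
# Minimal sketch of an SE copy-and-register wrapper around the standard
# LCG data-management CLI (lcg-cr). The SE hostname and LFN paths below
# are placeholders; the real BaBar tools were still under development
# when this talk was given (PhEDEx was one option being considered).
import subprocess

def copy_to_se(local_path, lfn, se="se.example.ac.uk", vo="babar"):
    """Copy a local file to a Storage Element and register it by LFN."""
    cmd = [
        "lcg-cr",
        "--vo", vo,            # virtual organisation
        "-d", se,              # destination Storage Element
        "-l", "lfn:" + lfn,    # logical file name in the file catalogue
        "file://" + local_path,
    ]
    # lcg-cr prints the GUID of the newly registered replica on success.
    return subprocess.check_output(cmd, text=True).strip()

# Example (placeholder paths throughout):
# guid = copy_to_se("/data/skims/BToDPi-001.root",
#                   "/grid/babar/skims/BToDPi-001.root")
```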

User Analysis (easyGrid)
Prototype running on the Manchester testbed (80 CPUs) since Nov 2005 without problems
Real analysis with real data by real users who know nothing about the grid
No errors in easyGrid job submission
No errors in the grid testbed, thanks to installation, configuration and improvements
J. Werner

Many problems encountered moving from the testbed to Production Grid resources:
–Errors in RB, CE, etc. 10% of the time, with a submission rate of less than 4 jobs/second
–Errors in BDII, SE, dCache; the SE fails 40% of jobs (with fewer than 100 jobs in parallel)
–When the SE works, performance is terrible (approx. 8 times longer to run the same software)
–Lack of response to problems from site admins
A serious issue for a typical user analysis, which is made up of CPU-hour jobs
Product development will resume when resources are available and reliable; meanwhile, the easyGrid prototype and the M/Cr testbed will serve users
For more information:

User Analysis (bbrbsub)
Integration of the Simple Job Manager + bbrbsub with grid submission:
–Take the tools already used by analysis users to submit jobs at RAL
–Transparently add RAL -> RAL grid submission (sketched below)
–Add RAL -> M/Cr and M/Cr -> RAL submission capabilities
–Add RAL -> RALPP and M/Cr -> RALPP
Gradually build up full grid functionality:
–Application transport and configuration
–Automatic output recovery
–Job-to-data matching
G. Castelli
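A minimal sketch of the transparent-routing idea: users keep calling the same submit command while a thin wrapper decides whether the job runs on the familiar local batch system or goes out through the grid. The route table, JDL and commands below are illustrative of 2006-era LCG-2 tooling (edg-job-submit, qsub), not the actual bbrbsub/Simple Job Manager internals.

```python
# Sketch of the transparent routing behind the bbrbsub integration:
# same user-facing call, wrapper picks local batch or grid submission.
# Routes, JDL and commands are illustrative of LCG-2-era tooling only.
import subprocess

GRID_ROUTES = {("RAL", "RAL"), ("RAL", "M/Cr"), ("M/Cr", "RAL")}

def submit(script, here="RAL", run_at="RAL"):
    """Same user-facing call; grid vs. local routing is hidden inside."""
    if (here, run_at) in GRID_ROUTES:
        # Grid path: wrap the user's script in a minimal JDL and hand it
        # to the Resource Broker via the LCG-2 command-line tools.
        jdl = ('Executable = "%s";\n'
               'InputSandbox = {"%s"};\n'
               'StdOutput = "stdout"; StdError = "stderr";\n'
               'OutputSandbox = {"stdout", "stderr"};\n') % (script, script)
        with open("job.jdl", "w") as f:
            f.write(jdl)
        return subprocess.call(["edg-job-submit", "job.jdl"])
    # Local path: the batch submission users already rely on at RAL.
    return subprocess.call(["qsub", script])
```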

Overall Grid Experience
Grid is still not reliable (worst test run below). SP running seems to indicate that the Grid isn't getting more reliable, and may be getting less so; long-term efficiency is stuck around 80%:
–RB problems (we have the capability to use multiple RBs, but efficiency drops because of the lack of failover)
–Central LFC problems
–BDII problems: sites drop in and out of the BDII
–SE problems: files randomly fail to upload/download
Could run for 1-2 weeks at a time with minimal intervention; now seems to need daily (or more frequent) interventions
RAL-to-RAL successful job rate: Grid <50%, PBS >99%
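The RB point above is essentially a missing-failover problem: the tools could be pointed at several Resource Brokers, but a submission that hit a dead broker simply failed rather than being retried elsewhere. Below is a minimal sketch of the retry loop that was missing, assuming one UI configuration file per broker passed via edg-job-submit's --config option; the paths are hypothetical.

```python
# Sketch of the missing Resource Broker failover noted above: try each
# known RB in turn instead of failing the job when the first is down.
# Config paths are hypothetical; edg-job-submit is the LCG-2-era submit
# command, assumed here to take a per-broker UI config via --config.
import subprocess

# One UI configuration file per Resource Broker (hypothetical paths).
RB_CONFIGS = ["/etc/ui/rb01.conf", "/etc/ui/rb02.conf"]

def submit_with_failover(jdl_path):
    for config in RB_CONFIGS:
        rc = subprocess.call(["edg-job-submit", "--config", config, jdl_path])
        if rc == 0:
            return True   # accepted by this broker
        # Broker down or rejecting: fall through and try the next one.
    return False          # every broker failed; report, don't lose the job
```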

Conclusions
BaBar has made good progress on moving its three main offline compute-intensive processes to the Grid
Monte Carlo generation is in production; significant progress has been made in skimming and user analysis
There are many things we like about the grid
We are adapting the BaBar software framework to integrate better with the grid: the dependence on Objectivity will be removed, and we are adding the ability to read data directly from Storage Elements
However, reliability and ease of use are still big issues