Ganga Status Update
Will Reece, Imperial College London



Page 2: Outline
User Statistics
User Experiences
New Features in 4.3.0
Upcoming Features
Reference Manual
Testing Tools
Summary

Page 3: User Statistics
557 Unique Users Since Jan 1, 2007 → ~110 per Week
113 LHCb Users, ~25 Unique per Week

Page 4: User Experiences
Feedback from Active LHCb Users
–Helps prioritize features
Tells us what Needs Improvement…
–…and what is already good!
Mailing Lists are a Good Source
Will Look at Some Case Studies

Page 5: Robert Lambert
Used Gauss to Generate 70m Events
–Studying final-state asymmetries → custom decay
–Needed precision across 10 Pt bins
Compared Custom Decay with DC06
Used Ganga and DIRAC → ~4000 Jobs
–2 Years of CPU Time!
Very Happy with DIRAC Success Rate
Ganga Front-end: “Really Easy!”
Likes SplitByFiles (but Replica Issues)
Wants Merge of Subjobs

Page 6: Eduardo Rodrigues
Toy MC Used for γ Sensitivity Studies
–Bs → Dsπ, Bs → DsK channels
–Needed large data set → Used Ganga and LCG
Uses ROOT and RooFit → Root App
–Ran ~3000 toy experiments
–Each experiment takes 2-3 hours → 1 year of CPU!
–Had some problems with LCG → Planning to use Dirac
Using PyROOT for e.g. Simplified Studies
–Root App and LCG Backend with standard python modules
Has had good experience with both LSF and the Grid

Page 7: Mitesh Patel
Uses Ganga to Study Small Backgrounds
B± → (D0/D̄0)(πK, KK, ππ)K± (LHCB …)
–Looking at suppressed (10⁻⁷) decays to measure γ
Bd → K*μμ as a New Physics Probe (LHCB …)
–Uses the full sample (b → …, b → … and b → c…) to ntuple
Likes Splitters but Would Like More Warnings
Has Submitted 1000s of Jobs
Benefited from Developer Support
More Examples Would be Nice

Page 8: New in 4.3.0
GNU GPL License
Sun Grid Engine Support
Core Updates
–Oracle backend for remote repository
–Subjob access to job repository optimized
DIRAC Support for Root Application
PyROOT
–Run python jobs using the ROOT libraries
Gaudi Updates: ROOT Map files
Many Bugfixes → Improved Stability!
–Testing framework

Page 9: Ganga Goes GPL
4.3.0 is the First GPL Release
–Aim is to protect the project
Applies to Future Releases
Ganga is Used Commercially
–A clear license is needed

Page 10: SGE Backend Now Supported
Sun Grid Engine Support Added
–Common batch system
Can Use the Following Applications
–Executable
–Root
–Any Gaudi

Page 11: DIRAC Submission for ROOT
Submit Jobs Using ROOT to DIRAC
–Uses new functionality in DIRAC v2r13
DIRAC Recommended for Remote ROOT Jobs
–Improved reliability
–Superior job debugging info
–Excellent job monitoring
DIRAC is the LHCb Standard for Distributed Analysis

Page 12: PyROOT Support
ROOT Provides Python Bindings
–Python is quick and easy to write → Productive!
Ganga Now Supports Its Use in the Root App
Need the Correct Python Version for ROOT
–Determined Automatically
LHCb Configuration: uses LCG versions
–/afs/cern.ch/sw/lcg/external/
–Can be controlled in the .gangarc file

Page 13: (no transcript text)

Page 14: PyROOT Support
Root Documentation Updated
–help(Root) in Ganga

Page 15: Gaudi Updates – ROOT Map
ROOT Map used to Auto-load Libraries
–Found via CMT
Now Preparing for 4.3.x
–Expect new LHCb Functionality in 4.3.2

Page 16: Upcoming Features
Features planned for 4.3.x or 4.4:
–Framework for Job Merging (merge text and ROOT files)
–Job Slices
–LFC-Aware Splitter for Gaudi (caching for datasets)
–Summary Printing of Objects
–Improved Credential Management

Page 17: Merging of Jobs and Subjobs
Jobs may have Many Subjobs
Hand Merge?
–Time Consuming and Error Prone → Automate
Merge Subjobs
–Combines subjob output
Can Run on Master Job Completion…
…or from the Command Line
Merging Text and ROOT Files Supported
–What else is needed?
Can Merge Lists of Jobs

Page 18: Automatic Merge
Attach a Merge Object to a Job
–Merge run on completion

Page 19: Command Line Merge
Create a List of Jobs to Merge
–Will recursively merge subjobs
Run the Merge on the Command Line
Support for Job Slices in Ganga 4.4

Page 20: Types of Merge
TextMerger – Concatenates Text
–Unordered, but adds headers
RootMerger – Combines ROOT Files
–Uses hadd → Adds histograms and trees
MultipleMerger – Chains Merge Objects
SmartMerger – Merges by Extension
–Associations in the .gangarc file

Page 21: Job Slices
Change the Semantics of the jobs Object
–Support slices → jobs[-1], jobs[0:5]
–Index by Job ID → use __call__, e.g. jobs(45)
Allow Job Operations on Slices
–copy, fail, kill, peek, remove, resubmit, submit
A Job's Subjobs are also a Job Slice
Can Create a Job Slice with select
–select(time='yesterday')
–select(status='failed')
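The slice semantics above can be pictured with a small registry class. `FakeJob` and `JobRegistry` are hypothetical stand-ins for Ganga's Job and jobs objects, illustrating only the indexing behaviour the slide describes.

```python
class FakeJob:
    """Stand-in for a Ganga Job: just an id and a status."""

    def __init__(self, id, status):
        self.id, self.status = id, status


class JobRegistry:
    """Sketch of the 'jobs' object: list-style slices, __call__
    lookup by job id, and select() to build sub-slices."""

    def __init__(self, jobs):
        self._jobs = list(jobs)

    def __getitem__(self, index):
        # jobs[-1] returns a job; jobs[0:5] returns another slice
        result = self._jobs[index]
        return JobRegistry(result) if isinstance(index, slice) else result

    def __call__(self, job_id):
        # jobs(45): index by job id, not by position
        for j in self._jobs:
            if j.id == job_id:
                return j
        raise KeyError(job_id)

    def select(self, status=None):
        # e.g. jobs.select(status='failed') returns a new slice
        keep = [j for j in self._jobs if status is None or j.status == status]
        return JobRegistry(keep)

    def __len__(self):
        return len(self._jobs)
```

Operations like kill or resubmit would then loop over `self._jobs`, which is what makes bulk actions on a slice natural.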

Page 22: LFC-Aware Splitter for Gaudi
Gaudi Provides SplitByFiles
–Splits a job into subjobs, each with a subset of the data files
Data Files are not Available at all Sites
–Some subjobs are unrunnable
DIRAC v2r14 Allows Query of the LFC
–Sort files by location → optimal splitting
New DiracSplitter
–Splits files by file location; must use LFNs
–Protects against mistyped file names → Error
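A minimal sketch of location-aware splitting, assuming the replica catalogue has already been queried into a plain dict (the `replicas` mapping stands in for the LFC/DIRAC lookup; this is not the actual DiracSplitter):

```python
from collections import defaultdict


def split_by_location(replicas, files_per_subjob=2):
    """Group files by the set of sites holding a replica, so every
    subjob reads only co-located data, then chunk each group into
    subjob-sized lists.

    replicas: {lfn: ["SITE1", ...]} -- stand-in for an LFC query.
    A file with no replica (e.g. a mistyped LFN) is an error,
    mirroring the slide's protection against typos."""
    groups = defaultdict(list)
    for lfn, sites in replicas.items():
        if not sites:
            raise ValueError("no replica found for %r (mistyped LFN?)" % lfn)
        groups[tuple(sorted(sites))].append(lfn)
    subjobs = []
    for sites, lfns in groups.items():
        for i in range(0, len(lfns), files_per_subjob):
            subjobs.append({"sites": list(sites),
                            "files": lfns[i:i + files_per_subjob]})
    return subjobs
```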

Page 23: Performance of LFC Replica Query
Last SW Week → DIRAC v2r13: LFC Query Slow
–~0.5s per file → 5 min for 600 files
DIRAC v2r14: Bulk Query
–Much Improved Performance: a factor of 10 faster
–30s for 600 files
Thanks to the DIRAC Team!
(Chart: DIRAC v2r13 single query vs. DIRAC v2r14 bulk query)

Page 24: Performance of LFC Replica Query
Further Speed-Up Needed?
–Multithreaded query is worse
–Limited by the LFC
–Queue system used?
Use Replica Caching
–Cache stored per file
–Cache date stored
Users Query with a Dataset
–updateReplicaCache()
DiracSplitter Still Slow
–Will print a time estimate at the start
(Chart: error bars show σ of 5 measurements; 1397 unique files queried)
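The per-file cache with a stored query date can be sketched as follows. `ReplicaCache` and its `query` callable are assumptions standing in for the dataset's cache and the bulk LFC lookup; only the caching logic from the slide is illustrated.

```python
import time


class ReplicaCache:
    """Per-file cache of replica locations, storing the query date so
    stale entries can be refreshed (sketch; 'query' stands in for the
    bulk LFC lookup in DIRAC v2r14)."""

    def __init__(self, query, max_age_seconds=24 * 3600):
        self._query = query          # callable: [lfn, ...] -> {lfn: sites}
        self._max_age = max_age_seconds
        self._cache = {}             # lfn -> (timestamp, sites)

    def replicas(self, lfns):
        """Return {lfn: sites}, querying only files that are missing
        from the cache or older than max_age_seconds."""
        now = time.time()
        stale = [l for l in lfns
                 if l not in self._cache
                 or now - self._cache[l][0] > self._max_age]
        if stale:
            for lfn, sites in self._query(stale).items():
                self._cache[lfn] = (now, sites)
        return {l: self._cache[l][1] for l in lfns}

    def update_replica_cache(self, lfns):
        """Force a refresh, in the spirit of updateReplicaCache()."""
        self._cache = {k: v for k, v in self._cache.items() if k not in lfns}
        return self.replicas(lfns)
```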

Page 25: Printing Summary of Objects
Printing is Verbose
–E.g. a Job object with many subjobs
Summary as Default
–Lists show their length
–Objects define their own summary
Get the Full Print → full_print(j)
–Same on object attributes
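The summary-versus-full printing behaviour can be sketched with a `__repr__` that abbreviates long lists and a separate `full_print` helper. The names mirror the slide, but the implementation is illustrative, not Ganga's.

```python
class Summarised:
    """Sketch of an object that prints a short summary by default."""

    def __init__(self, name, subjobs):
        self.name = name
        self.subjobs = subjobs

    def __repr__(self):
        # Default view: lists show only their length, as on the slide
        return "%s(%r, subjobs=<%d entries>)" % (
            type(self).__name__, self.name, len(self.subjobs))


def full_print(obj):
    """Verbose view: walk every attribute, including full lists."""
    lines = ["%s:" % type(obj).__name__]
    for key, value in sorted(vars(obj).items()):
        lines.append("  %s = %r" % (key, value))
    return "\n".join(lines)
```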

Page 26: (no transcript text)

Page 27: Improved Credential Management
Ganga Manages Credentials That Expire
–AFS Token, Grid Proxy
Expiring Tokens Affect the Ganga Session
Ganga May Not Clean Up Services on Exit
Introducing InternalService Objects
–Ensures correct clean-up
–Services are not used when expired
Alert Users Before Credentials Expire
Ganga Shuts Down Gracefully
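One way to picture an InternalService guard: a wrapper that refuses calls once the credential has expired, warns shortly before expiry, and shuts the wrapped service down exactly once. All names here are illustrative assumptions, not Ganga's actual API.

```python
import time


class CredentialExpired(Exception):
    pass


class InternalService:
    """Sketch: wrap a service so it is never used after its
    credential expires and is always cleaned up on exit."""

    def __init__(self, service, expiry_time, warn_margin=300):
        self._service = service
        self._expiry = expiry_time      # absolute time the credential dies
        self._warn_margin = warn_margin  # seconds before expiry to warn
        self._stopped = False

    def call(self, method, *args):
        remaining = self._expiry - time.time()
        if self._stopped or remaining <= 0:
            raise CredentialExpired("credential expired; service disabled")
        if remaining < self._warn_margin:
            print("warning: credential expires in %ds" % remaining)
        return getattr(self._service, method)(*args)

    def shutdown(self):
        """Idempotent clean-up, safe to call on session exit."""
        if not self._stopped:
            self._stopped = True
            if hasattr(self._service, "close"):
                self._service.close()
```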

Page 28: Upcoming Feature – Remote Workspaces
Roaming Ganga Profile
Store the Workspace Remotely
–Access input and output files anywhere
–Work across multiple machines
Local Cache Created on Demand
Currently at the Prototyping Stage
–Exciting new functionality!
Release Schedule is Uncertain

Page 29: The Ganga Reference Manual
Aim is to Show Ganga Help Online
–Same information as help in Ganga
Documentation Generated from Source
Have a Prototype Online
–Missing documentation to be filled in → on-going!
Manual will be Generated with Each Release
Feedback on Documentation Appreciated
–Let us know if anything is not clear

Page 30: (no transcript text)

Page 31: Testing Tools
Use a Test Framework
–Based on unittest
Reports with Each Release
Helps Find Bugs!
Now Collect Coverage
–Use the Figleaf Library
–Should improve testing
–Identifies untested code

Page 32: (no transcript text)

Page 33: The LHCb Distributed Analysis Mailing List
Replaces the Current List for LHCb Users
–Can sign up at …
Encourages a User Community
–Less support burden for developers!

Page 34: Summary
User Statistics: 557 Unique Users in ’07
Ganga is the de facto Grid front-end tool for LHCb
Ganga has New Features in 4.3.0
–Dirac Handler for Root, PyROOT Support, etc.
Interesting Features Upcoming
–Merge framework, DiracSplitter
Reference Manual Coming Soon