1 Ganga Status Update Will Reece

Will Reece - Imperial College London

2 Outline
User Statistics
User Experiences
New Features in 4.3.0
Upcoming Features
Reference Manual
Testing Tools
Summary

3 User Statistics
557 Unique Users Since Jan 1, ~110 per Week
113 LHCb Users, ~25 Unique per Week
http://gangamon.cern.ch:8888/
[Plot label: 25 Users]

4 User Experiences
Feedback from Active LHCb Users
  – Helps prioritize features
Tells us what Needs Improvement…
  – …and what is already good!
Mailing Lists are a Good Source
Will Look at Some Case Studies

5 Robert Lambert
Used Gauss to Generate 70m Events
  – Studying final state asymmetries → custom decay
  – Needed 10^-3 precision across 10 P_t bins
Compared Custom Decay with DC06
Used Ganga and DIRAC → ~4000 Jobs
  – 2 Years of CPU Time!
Very Happy with DIRAC Success Rate
Ganga Front-end – “Really Easy!”
Likes SplitByFiles (but Replica Issues)
Wants Merge of Subjobs

6 Eduardo Rodrigues
Toy MC Used for γ Sensitivity Studies
  – B_s → D_s π, B_s → D_s K channels
  – Needed large data set → Used Ganga and LCG
Uses ROOT and RooFit → Root App
  – Ran ~3000 toy experiments
  – Each experiment takes 2-3 hours → 1 year CPU!
  – Had some problems with LCG → Planning to use DIRAC
Using PyROOT for e.g. Simplified Studies
  – Root App and LCG Backend with standard python modules
Has had good experience both with LSF and Grid

7 Mitesh Patel
Uses Ganga to Study Small Backgrounds
B± → (D0/D0bar)(Kπ, KK, ππ)K± (LHCB-2006-066)
  – Looking at suppressed (10^-7) decays to measure γ
B_d → K*μμ as New Physics Probe (LHCB-2007-038)
  – Uses full sample b  , b   and b  c   to ntuple
Likes Splitters but Would Like More Warnings
Has Submitted 1000s of Jobs
Benefited from Developer Support
More Examples Would be Nice

8 New in 4.3.0
GNU GPL License
Sun Grid Engine Support
Core Updates
  – Oracle backend for remote repository
  – Subjob access to job repository optimized
DIRAC Support for Root Application
PyROOT
  – Run python jobs using the ROOT libraries
Gaudi Updates: ROOT Map files
Many Bugfixes → Improved Stability!
  – Testing framework
http://ganga.web.cern.ch/ganga/release/4.3.0/

9 Ganga Goes GPL
4.3.0 is First GPL Release
  – Aim is to protect project
Applies to Future Releases
Ganga Used Commercially
  – Clear license needed
http://www.gnu.org/licenses/gpl.html

10 SGE Backend Now Supported
Sun Grid Engine Support Added
  – Common batch system
Can Use Following Applications
  – Executable
  – Root
  – Any Gaudi

11 DIRAC Submission for ROOT
Submit Jobs Using ROOT to DIRAC
  – Uses new functionality in DIRAC v2r13
DIRAC Recommended for Remote ROOT Jobs
  – Improved reliability
  – Superior job debugging info
  – Excellent job monitoring
DIRAC is LHCb Standard for Distributed Analysis

12 PyROOT Support
ROOT Provides Python Bindings
  – Python is quick and easy to write → Productive!
Ganga Now Supports Use in Root App
Need Correct Python Version for ROOT
  – Determined Automatically
LHCb Configuration: uses LCG versions
  – /afs/cern.ch/sw/lcg/external/
  – Can be controlled in .gangarc file

14 PyROOT Support
Root Documentation Updated
  – help(Root) in Ganga

15 Gaudi Updates – ROOT Map
ROOT Map used to Auto-load Libraries
  – Found via CMT
Now Preparing for 4.3.x
  – Expect new LHCb Functionality in 4.3.2

16 Upcoming Features
Framework for Job Merging
  – Merge text and ROOT files
Job Slices
LFC Aware Splitter for Gaudi
  – Caching for Datasets
Summary Printing of Objects
Improved Credential Management
Features planned for 4.3.x or 4.4:
https://twiki.cern.ch/twiki/bin/view/ArdaGrid/GangaIndex#GangaFourFour

17 Merging of Jobs and Subjobs
Jobs may have Many Subjobs
Hand Merge?
  – Time Consuming and Error Prone → Automate
Merge Subjobs
  – Combines subjob output
Can Run on Master Job Completion…
…or from Command Line
Merging Text and ROOT Files Supported
  – What else is needed?
Can Merge Lists of Jobs

18 Automatic Merge
Attach Merge Object to Job
  – Merge run on completion

19 Command Line Merge
Create List of Jobs to Merge
  – Will recursively merge subjobs
Run Merge on Command Line
Support Job Slices in Ganga 4.4

20 Types of Merge
TextMerger – Concatenate Text
  – Unordered, but adds headers
RootMerger – Combines ROOT Files
  – Uses hadd → Adds histograms and trees
MultipleMerger – Chain Merge Objects
SmartMerger – Merge by Extension
  – Associations in .gangarc file
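
The text and extension-based merge types can be illustrated without Ganga itself. The following is a minimal sketch in plain Python, not Ganga's implementation: the names `merge_text`, `pick_merger` and `EXTENSION_MAP` are hypothetical, and the extension table only stands in for the associations kept in the .gangarc file.

```python
import os
import tempfile

# Hypothetical extension table, standing in for the .gangarc associations.
EXTENSION_MAP = {".txt": "TextMerger", ".log": "TextMerger", ".root": "RootMerger"}


def pick_merger(filename):
    """Choose a merger name from the file extension (SmartMerger-style)."""
    ext = os.path.splitext(filename)[1]
    return EXTENSION_MAP.get(ext)


def merge_text(paths, out_path):
    """Concatenate text files, writing a header line before each one."""
    with open(out_path, "w") as out:
        for path in paths:
            out.write("# ---- %s ----\n" % os.path.basename(path))
            with open(path) as src:
                text = src.read()
            out.write(text)
            # Keep each file's content on its own lines in the merged output.
            if text and not text.endswith("\n"):
                out.write("\n")


def demo():
    """Merge two small, invented subjob outputs in a temporary directory."""
    workdir = tempfile.mkdtemp()
    paths = []
    for i, text in enumerate(["events: 10\n", "events: 12"]):
        path = os.path.join(workdir, "stdout_%d.txt" % i)
        with open(path, "w") as handle:
            handle.write(text)
        paths.append(path)
    merged = os.path.join(workdir, "merged.txt")
    merge_text(paths, merged)
    with open(merged) as handle:
        return handle.read()
```

The header line per input file is what makes an unordered concatenation still traceable back to the subjob that produced each part.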

21 Job Slices
Change Semantics of jobs Object
  – Support slices → jobs[-1], jobs[0:5]
  – Index by Job ID → use __call__ e.g. jobs(45)
Allow Job Operations on Slices
  – copy, fail, kill, peek, remove, resubmit, submit
Job Subjobs also a Job Slice
Can Create Job Slice with select
  – select(time='yesterday')
  – select(status='failed')
https://twiki.cern.ch/twiki/bin/view/ArdaGrid/GangaJobIndexingSlices
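
The difference between positional indexing and lookup by job ID can be sketched with a toy container. This is an illustration under stated assumptions, not the Ganga job registry: the classes `Job` and `JobSlice` below are hypothetical stand-ins, and `select` only matches attributes by equality (a time-based selection such as 'yesterday' is not modelled).

```python
class Job:
    """Toy job record with an id and a status (not the Ganga Job class)."""

    def __init__(self, job_id, status):
        self.id = job_id
        self.status = status

    def __repr__(self):
        return "Job(%d, %r)" % (self.id, self.status)


class JobSlice:
    """Toy container: [] indexes by position, () looks up by job id."""

    def __init__(self, jobs):
        self._jobs = list(jobs)

    def __len__(self):
        return len(self._jobs)

    def __getitem__(self, index):
        # A slice of a slice is again a JobSlice; an integer gives one job.
        if isinstance(index, slice):
            return JobSlice(self._jobs[index])
        return self._jobs[index]

    def __call__(self, job_id):
        for job in self._jobs:
            if job.id == job_id:
                return job
        raise KeyError("no job with id %d" % job_id)

    def select(self, **attrs):
        """Return the jobs whose attributes equal all the given values."""
        keep = [j for j in self._jobs
                if all(getattr(j, k, None) == v for k, v in attrs.items())]
        return JobSlice(keep)

    def ids(self):
        return [j.id for j in self._jobs]


# Job ids need not match positions, e.g. after earlier jobs were removed.
jobs = JobSlice([Job(40, "completed"), Job(45, "failed"),
                 Job(46, "running"), Job(50, "failed")])
```

Here `jobs[1]` and `jobs(45)` return the same job, which shows why a separate call syntax is useful once IDs and positions diverge.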

22 LFC Aware Splitter for Gaudi
Gaudi Provides SplitByFiles
  – Splits job into subjobs with subset of data files
Data Files not Available in all Sites
  – Some subjobs are unrunnable
DIRAC v2r14 Allows Query of LFC
  – Sort files by location → optimal splitting
New DiracSplitter
  – Splits files by file locations. Must use LFNs
  – Protects against mistyped file names → Error
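
One way to picture location-aware splitting is to group files by the set of sites that hold them, then cut each group into subjobs. The sketch below is an assumption about the general idea, not the DiracSplitter code: `split_by_location`, the `max_files` parameter, the LFNs and the site lists are all invented for illustration.

```python
from collections import defaultdict


def split_by_location(replicas, max_files=2):
    """Group LFNs that share the same set of sites, then chunk each group.

    replicas maps each LFN to the sites holding a copy. An LFN with no
    known replica raises an error instead of producing an unrunnable subjob.
    """
    groups = defaultdict(list)
    for lfn in sorted(replicas):
        sites = replicas[lfn]
        if not sites:
            raise ValueError("no replica found for %s" % lfn)
        groups[frozenset(sites)].append(lfn)

    subjobs = []
    # Sort groups by their site names so the result is deterministic.
    for sites in sorted(groups, key=sorted):
        lfns = groups[sites]
        for start in range(0, len(lfns), max_files):
            subjobs.append((sorted(sites), lfns[start:start + max_files]))
    return subjobs


# Hypothetical catalogue content; the LFNs and site names are invented.
example = {
    "/lhcb/example/a.dst": ["CERN", "RAL"],
    "/lhcb/example/b.dst": ["CERN", "RAL"],
    "/lhcb/example/c.dst": ["CERN", "RAL"],
    "/lhcb/example/d.dst": ["CNAF"],
}
```

Every subjob produced this way can run at any site in its list, and a mistyped file name surfaces as an error at split time rather than as a failed subjob later.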

23 Performance of LFC Replica Query
Last SW Week
  – DIRAC v2r13: LFC Query Slow
  – ~0.5s per file → 5min for 600 files
DIRAC v2r14: Bulk Query
  – Much Improved Performance
  – Factor of 10 faster
  – 30s for 600 files
Thanks to DIRAC Team!
[Plot: DIRAC v2r13 Single Query vs DIRAC v2r14 Multiple Query]
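
The quoted timings are consistent with a simple linear cost per file, which can be checked with a few lines of arithmetic. The linear model and the helper name `estimate_seconds` are assumptions made for this check; only the 0.5 s, 600 files and 30 s figures come from the slide.

```python
def estimate_seconds(n_files, seconds_per_file):
    """Linear estimate: total time grows with the number of files."""
    return n_files * seconds_per_file


SINGLE = 0.5         # seconds per file, DIRAC v2r13 figure from the slide
BULK = 30.0 / 600    # seconds per file, from 30 s for 600 files (v2r14)

single_total = estimate_seconds(600, SINGLE)   # about 300 s, i.e. 5 minutes
bulk_total = estimate_seconds(600, BULK)       # about 30 s
speedup = single_total / bulk_total            # about a factor of 10
```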

24 Performance of LFC Replica Query
Further Speed Up Needed?
  – Multithreaded query worse
  – Limited by LFC
  – Queue system used?
Use Replica Caching
  – Cache stored per file
  – Cache date stored
Users Query with Dataset
  – updateReplicaCache()
DiracSplitter Still Slow
  – Will print time estimate at start
[Plot: error bars show σ of 5 measurements; 1397 unique files queried]
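
A per-file cache with a stored date might look like the following sketch. It is a guess at the general shape only: the class `ReplicaCache`, the `max_age` parameter and the injected `lookup` function are hypothetical, and nothing here reflects how `updateReplicaCache()` is actually implemented.

```python
import time


class ReplicaCache:
    """Per-file cache of replica locations, each entry stamped with a time."""

    def __init__(self, lookup, max_age=3600.0, clock=time.time):
        self._lookup = lookup      # bulk query: list of LFNs -> {lfn: sites}
        self._max_age = max_age    # seconds before an entry counts as stale
        self._clock = clock
        self._entries = {}         # lfn -> (sites, time stored)

    def update(self, lfns):
        """Query only missing or stale files, in one bulk call.

        Returns the number of files that had to be queried.
        """
        now = self._clock()
        stale = [lfn for lfn in lfns
                 if lfn not in self._entries
                 or now - self._entries[lfn][1] > self._max_age]
        if stale:
            for lfn, sites in self._lookup(stale).items():
                self._entries[lfn] = (sites, now)
        return len(stale)

    def sites(self, lfn):
        """Return the cached sites for an LFN (KeyError if never queried)."""
        return self._entries[lfn][0]
```

Passing the clock and the lookup function in as arguments keeps the sketch testable without a real file catalogue.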

25 Printing Summary of Objects
Printing Verbose
  – E.g. Job object with many subjobs
Summary as Default
  – Lists show length
  – Objects define own summary
Get Full Print
  – full_print(j)
  – Same on object attributes
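
The idea of a short default printout with an explicit verbose form can be sketched in plain Python. The classes `SummaryList` and `ToyJob` and the helper `full_repr` are hypothetical; `full_repr` plays a role similar to `full_print` but returns a string instead of printing, and it is not Ganga code.

```python
class SummaryList(list):
    """A list whose default printout shows only its length."""

    def __repr__(self):
        return "<list of %d items>" % len(self)


class ToyJob:
    """Toy object with a short default summary (not a Ganga class)."""

    def __init__(self, job_id, subjobs=()):
        self.id = job_id
        self.subjobs = SummaryList(subjobs)

    def __repr__(self):
        return "ToyJob(id=%d, subjobs=%r)" % (self.id, self.subjobs)


def full_repr(obj):
    """Return the verbose form, expanding lists that are normally summarised."""
    if isinstance(obj, SummaryList):
        return "[" + ", ".join(full_repr(item) for item in obj) + "]"
    if isinstance(obj, ToyJob):
        return "ToyJob(id=%d, subjobs=%s)" % (obj.id, full_repr(obj.subjobs))
    return repr(obj)
```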

27 Improved Credential Management
Ganga Manages Credentials That Expire
  – AFS Token, Grid Proxy
Expiring Tokens Affect Ganga Session
Ganga May Not Clean-Up Services on Exit
Introducing InternalService Objects
  – Ensures correct clean-up
  – Services not used when expired
Alert Users Before Credentials Expire
Ganga Shuts Down Gracefully
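
The behaviour described here (warn before expiry, stop using a service once its credential has expired) can be sketched as follows. This is only an illustration of the pattern: `Credential`, `GuardedService` and the `warn_below` threshold are hypothetical names and do not describe the InternalService objects themselves.

```python
import time


class Credential:
    """Toy credential with an expiry time (e.g. an AFS token or grid proxy)."""

    def __init__(self, name, expires_at, clock=time.time):
        self.name = name
        self.expires_at = expires_at
        self._clock = clock

    def time_left(self):
        return max(0.0, self.expires_at - self._clock())

    def is_valid(self):
        return self.time_left() > 0


class GuardedService:
    """Runs work only while its credential is valid; warns close to expiry."""

    def __init__(self, credential, warn_below=600.0):
        self.credential = credential
        self.warn_below = warn_below   # seconds of validity that trigger a warning
        self.warnings = []
        self.stopped = False

    def run(self, action):
        if self.stopped or not self.credential.is_valid():
            self.shutdown()
            return None
        if self.credential.time_left() < self.warn_below:
            self.warnings.append("%s expires soon" % self.credential.name)
        return action()

    def shutdown(self):
        # Clean-up hook: mark the service as stopped so it is not used again.
        self.stopped = True
```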

28 Upcoming Feature – Remote Workspaces
Roaming Ganga Profile
Store Workspace Remotely
  – Access input and output files anywhere
  – Work across multiple machines
Local Cache Created on Demand
Currently at Prototyping Stage
  – Exciting new functionality!
Release Schedule is Uncertain

29 The Ganga Reference Manual
Aim is to Show Ganga Help Online
  – Same information as help in Ganga
Documentation Generated from Source
Have Prototype Online
  – Missing documentation to be filled in → on-going!
Manual will be Generated with Release
Feedback on Documentation Appreciated
  – Let us know if anything is not clear
http://ganga.web.cern.ch/ganga/user/GPI/

31 Testing Tools
Use Test Framework
  – Based on unittest
Reports with Release
Helps Find Bugs!
Now Collect Coverage
  – Use Figleaf Library
  – Should improve testing
  – Identifies untested code
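
For readers unfamiliar with unittest, a test case in that style looks like the sketch below. The function under test, `split_into_chunks`, and the test names are invented for illustration and are not part of the Ganga test framework; coverage collection is not shown.

```python
import io
import unittest


def split_into_chunks(items, size):
    """Function under test: split a list into consecutive chunks."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [items[i:i + size] for i in range(0, len(items), size)]


class SplitTest(unittest.TestCase):
    def test_even_split(self):
        self.assertEqual(split_into_chunks([1, 2, 3, 4], 2), [[1, 2], [3, 4]])

    def test_remainder_goes_in_last_chunk(self):
        self.assertEqual(split_into_chunks([1, 2, 3], 2), [[1, 2], [3]])

    def test_bad_size_is_rejected(self):
        with self.assertRaises(ValueError):
            split_into_chunks([1], 0)


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SplitTest)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite)
```

The result object returned by `run_tests()` reports how many tests ran and whether any failed, which is the kind of information a per-release test report would summarise.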

33 The LHCb Distributed Analysis Mailing List
Replaces Current List for LHCb Users
  – project-ganga@cern.ch → lhcb-distributed-analysis@cern.ch
  – Can sign up at http://simba2.cern.ch
Encourages User Community
  – Less support burden for developers!
https://mmm.cern.ch/public/archive-list/l/lhcb-distributed-analysis/

34 Summary
User Statistics: 557 Unique Users in ’07
Ganga is de facto Grid front end tool for LHCb
Ganga has New Features in 4.3.0
  – DIRAC Handler for Root, PyROOT Support, etc.
Interesting Features Upcoming
  – Merge framework, DiracSplitter
Reference Manual Coming Soon
http://ganga.web.cern.ch/ganga/

