Introduction to Alexander Richards. Thanks to Mike Williams, ICL, for much of the slide content.

Alex Richards, ICL
Outline
- Introduction
- Distributed LHCb
- Ganga
- Summary

Introduction
LHCb takes O(100) MB/s and expects to collect O(1/2) PB in [...]. LHCb is a super bit factory (not to be confused with a Super B factory).

Integrated Luminosity

Resource Usage over the last month
Over 1,400 CPU years! Over 800 TB!

What is the Grid?
- The Grid is a collection of computing resources located at sites around the world; it consists of computing elements (CEs) and storage elements (SEs).
- Only a single login is required to access the system. After identification, security is handled by the system.
- There are several flavors of the Grid; however, in LHCb we use only the LHC Computing Grid (LCG).

Grid resources
- Your grid certificate gives you a unique identity on the Grid (2 files in your .globus directory).
- By joining a ‘Virtual Organisation’ (VO), you gain access to the resources available to the VO*.
- By sending a grid proxy along with your Grid jobs, you allow computers to act on your behalf for a limited time. This lets your jobs run at LCG sites and read files from, and write files to, LCG SEs.
- If your proxy expires while some of your jobs are running on the LCG, the jobs will continue to run; however, you will not be able to access the results without renewing your grid proxy.
* You all should have joined the LHCb VO!
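
The proxy behaviour described above can be pictured with a small toy model. This is only a sketch: `ToyProxy` and `can_retrieve_output` are hypothetical names invented for illustration, not part of any grid middleware, and the 12-hour lifetime is an arbitrary assumption.

```python
from datetime import datetime, timedelta


class ToyProxy:
    """Hypothetical stand-in for a grid proxy: valid only for a limited time."""

    def __init__(self, created, lifetime):
        self.expires_at = created + lifetime

    def is_valid(self, now):
        # The proxy can act on the user's behalf only before it expires.
        return now < self.expires_at

    def renew(self, now, lifetime):
        # Renewing simply starts a fresh validity window.
        self.expires_at = now + lifetime


def can_retrieve_output(proxy, now):
    # Running jobs are unaffected by expiry, but fetching results needs a valid proxy.
    return proxy.is_valid(now)


start = datetime(2000, 1, 1, 9, 0)             # arbitrary illustrative timestamp
proxy = ToyProxy(start, timedelta(hours=12))   # assumed 12-hour lifetime
later = start + timedelta(hours=13)            # one hour after expiry
```

In this model, retrieving output at `later` fails until `proxy.renew(later, ...)` is called, which mirrors the slide's point that results stay out of reach until the proxy is renewed.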

LHCb Division of Labor
- GANGA is a user-friendly frontend that handles job definition and management for LHCb users. GANGA's main goal is to ensure that users can efficiently access all available resources (local, batch, grid, etc.).
- DIRAC is the workload/data management system (WMS/DMS) for LHCb. It does the heavy lifting for all distributed analysis (DA) in LHCb. DIRAC's main goal is to ensure that the VO uses its resources efficiently and to enforce job prioritisation.

The DIRAC WMS/DMS
Distributed Infrastructure with Remote Agent Control
DIRAC provides us with the following benefits (not an exhaustive list):
- Job monitoring via web portal.
- DIRAC's many failover mechanisms greatly increase user success rates.
- User & production jobs happily coexist.
- Having only one central task queue means that the VO's highest-priority jobs always run first.
- And, of course, all of the behind-the-scenes work the DIRAC team does investigating problems with sites, production jobs, etc.
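
The idea that a single central task queue lets the highest-priority jobs run first can be sketched with a priority queue. This is an illustrative toy, not DIRAC code: `CentralTaskQueue`, the job names and the priority values are all assumptions made for the example.

```python
import heapq
import itertools


class CentralTaskQueue:
    """Toy central queue: one heap shared by all user and production jobs."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps submission order

    def add(self, job_name, priority):
        # heapq is a min-heap, so the priority is negated to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._order), job_name))

    def next_job(self):
        # Return the highest-priority waiting job, or None if the queue is empty.
        if not self._heap:
            return None
        _, _, job_name = heapq.heappop(self._heap)
        return job_name


queue = CentralTaskQueue()
queue.add("user-analysis", priority=5)
queue.add("production-reco", priority=10)
queue.add("user-test", priority=1)
run_order = [queue.next_job() for _ in range(3)]
```

Because every job passes through the same heap, no site-local queue can reorder work; `run_order` starts with the job that was given the largest priority value.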

DIRAC Job Monitoring

What is GANGA?
GANGA provides a complete analysis environment for LHCb and greatly simplifies the user experience (the topic of the rest of this talk). Thus, the vast majority of LHCb users choose to use GANGA for most tasks.
N.B. You can use DIRAC directly; however, the DIRAC team actually prefers that you use GANGA unless you really know what you're doing.

Efficient Usage of Computer Resources (Users)
Users (should) want:
- Development on their laptop/desktop; full analysis utilizing all available resources (wherever they are).
- To get results quickly and easily.
- A familiar and consistent user interface for all resources.
Users don’t want:
- To know all of the details about the Grid or other resources.
- To learn yet another tool to access a resource.
- To have to reconfigure their application to run on different resources.

GANGA
The GANGA mantra: Configure once, run anywhere!
GANGA was developed to meet the needs of ATLAS & LHCb for a grid user interface and is now used by many other groups as well.

GANGA
GANGA statistics:
- 41% LHCb, 49% ATLAS, 10% other
- ~100 unique users / day last month (ATLAS, LHCb & other)

GANGA Features
GANGA handles the complete life cycle of a job:
Build → Configure → Split → Submit → Monitor → Merge
GANGA does the following (and much more) for the user:
- Builds/compiles applications
- Configures jobs, including building input sandboxes, to run on user-specified backends
- Submits jobs locally, to batch systems and to the grid
- Monitors jobs and updates the user on any status changes
- Automatically retrieves output when jobs complete
- Merges output (if requested)
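
The life cycle above can be written down as an ordered list of stages. The sketch below is a toy model only: the `Stage` enum and `plan_lifecycle` function are hypothetical names, and treating the split step as optional is an assumption (the slide only states that merging happens if requested).

```python
from enum import Enum


class Stage(Enum):
    """The six life-cycle stages named on the slide, in order."""
    BUILD = 1
    CONFIGURE = 2
    SPLIT = 3
    SUBMIT = 4
    MONITOR = 5
    MERGE = 6


def plan_lifecycle(split=True, merge=True):
    # Walk the stages in definition order, skipping the optional ones.
    skipped = set()
    if not split:
        skipped.add(Stage.SPLIT)   # assumption: a single job needs no splitter
    if not merge:
        skipped.add(Stage.MERGE)   # merging happens only if requested
    return [stage.name for stage in Stage if stage not in skipped]


full_plan = plan_lifecycle()
simple_plan = plan_lifecycle(split=False, merge=False)
```

A single unsplit job therefore goes through only build, configure, submit and monitor, while a large analysis uses all six stages.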

GANGA LHCb Features
Loading the LHCb plug-in adds the following features to GANGA:
- DIRAC backend and ability to contact the DIRAC server
- Many built-in DIRAC-based methods, e.g. Dirac().checkSites()
- Automatic collection of user-modified LHCb software for the sandbox
- Input data site-based job splitting (DiracSplitter)
- LHCb data file (DST) merger (DSTMerger)
- Automatic output file discovery (from application options)
- Ability to check out and build LHCb software packages
- Etc. (too many to list them all here)
The automatic features are truly that; i.e., the user is often not even aware of them. E.g., many users forget to add their output to the GANGA job definition for LHCb applications. GANGA notices this and automatically adds the output for them (ignorance is bliss).

Running GANGA
Since version 5.4.0, GANGA has been part of the LHCb software framework; thus, to set up the environment, do:
    SetupProject Ganga
To run GANGA interactively (~50% of usage), do:
    ganga
To run GANGA on a script (~50% of usage), do:
    ganga your-script.gpi
To run the GANGA GUI (~0% of usage), do:
    ganga --gui

The GANGA Prompt and Configuration
GANGA is written in Python and has an enhanced Python prompt (IPython) that supports:
- Python syntax
- Shell commands
- TAB completion, scrolling through your history, etc.
It's similar to working on the command line, except that Python syntax is valid and TAB completion works for Python objects, methods, variables, etc.
GANGA allows the user to configure many of its settings. To permanently change a setting (i.e., to change it for the current and future sessions), simply edit it in your .gangarc file. Settings can also be viewed/changed in the current session by accessing the config object (these changes are not persisted).
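
The difference between a persistent setting and a session-only change can be illustrated with a small stand-alone sketch. It assumes an INI-style settings file; the section name `Example`, the option name `setting` and the helper `load_settings` are hypothetical and are not taken from a real .gangarc file or from the GANGA config object.

```python
import configparser

# Hypothetical file contents standing in for a user's settings file.
SETTINGS_FILE_TEXT = """
[Example]
setting = from-file
"""


def load_settings(text):
    # Parse INI-style text into a plain nested dict (one dict per section).
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


# Session 1: load the file, then override a value in memory only.
session_one = load_settings(SETTINGS_FILE_TEXT)
session_one["Example"]["setting"] = "session-only"

# Session 2: a fresh load sees the file value again, because the
# in-memory override was never written back to the file.
session_two = load_settings(SETTINGS_FILE_TEXT)
```

Only an edit to the file text itself would show up in both sessions, which is the behaviour the slide describes for permanent changes.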

GANGA Jobs
GANGA jobs are handled by the Job object.

Job Basics
To create a GANGA job, simply do:
    In [1]: j = Job()
You can then edit its properties (application, backend, etc.); thus, you can configure the job to do what you want.
To submit the job to whatever backend you've chosen to run on, do:
    In [2]: j.submit()
GANGA will monitor the job and let you know when it's done. When it's done, it'll also automatically collect the output you wanted back.
N.B. Once a job is submitted, you cannot modify most of its properties (there are very good reasons for this).
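
The rule that a submitted job can no longer be reconfigured can be mimicked with a toy class. This is a sketch of the idea only: `ToyJob` is a hypothetical stand-in, not the GANGA `Job` object, and the set of locked properties and the status strings are assumptions.

```python
class ToyJob:
    """Hypothetical job whose configuration freezes once it is submitted."""

    _LOCKED_AFTER_SUBMIT = {"application", "backend"}  # assumed locked properties

    def __init__(self):
        # Bypass the guard while the object is being set up.
        object.__setattr__(self, "status", "new")
        object.__setattr__(self, "application", None)
        object.__setattr__(self, "backend", None)

    def __setattr__(self, name, value):
        # Refuse changes to configuration properties after submission.
        if self.status != "new" and name in self._LOCKED_AFTER_SUBMIT:
            raise AttributeError(f"cannot modify '{name}' after submission")
        object.__setattr__(self, name, value)

    def submit(self):
        object.__setattr__(self, "status", "submitted")


j = ToyJob()
j.backend = "Local"        # allowed: the job is still new
j.submit()
try:
    j.backend = "Dirac"    # rejected: the job has already been submitted
    modified = True
except AttributeError:
    modified = False
```

Freezing the configuration keeps the stored job definition consistent with what actually ran, which is one plausible reading of the "very good reasons" mentioned on the slide.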

Applications/Backends
GANGA/LHCb supports the following types of applications:
- Executable (binaries, scripts, etc.)
- Root (ROOT macros, PyROOT scripts)
- Gaudi-type applications (GaudiPython, Brunel, Moore, DaVinci, Panoptes, Gauss, Boole, Bender, Vetra)
GANGA/LHCb supports the following backends:
- Interactive (foreground on client node)
- Local (background on client node)
- Batch (LSF at CERN; SGE, PBS, Condor at other sites)
- Dirac (the Grid)

Splitting/Merging
Users often want to run a large number of similar jobs. GANGA makes this easy.
GANGA/LHCb supports the following splitters:
- Input data (SplitByFiles, DiracSplitter)
- Gaudi-app (GaussSplitter, OptionsFileSplitter)
- General (GenericSplitter, ArgSplitter)
GANGA/LHCb supports the following mergers:
- TextMerger (text files)
- RootMerger (ROOT files)
- DSTMerger (DST files)
- General (SmartMerger, CustomMerger)
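
The split-then-merge pattern can be sketched in a few lines of plain Python. The functions `split_by_files` and `merge_text` below are hypothetical helpers written for this illustration; they are not the GANGA SplitByFiles or TextMerger classes, and the file names are invented.

```python
def split_by_files(files, files_per_job):
    """Divide a list of input files into chunks, one chunk per subjob."""
    if files_per_job < 1:
        raise ValueError("files_per_job must be at least 1")
    return [files[i:i + files_per_job] for i in range(0, len(files), files_per_job)]


def merge_text(outputs):
    """Concatenate the text output of each subjob, in subjob order."""
    return "\n".join(outputs)


input_files = ["a.dst", "b.dst", "c.dst", "d.dst", "e.dst"]  # invented names
subjobs = split_by_files(input_files, files_per_job=2)

# Pretend each subjob reports how many files it processed.
subjob_outputs = [f"processed {len(chunk)} files" for chunk in subjobs]
merged = merge_text(subjob_outputs)
```

Five input files with two files per job give three subjobs, the last of which handles the single leftover file.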

Example Job
To run the DaVinci tutorial in GANGA, I'd simply do:
    In [1]: j = Job()
    In [2]: j.application = DaVinci(version=' ')
    In [3]: j.application.optsfile = [' /DaVinciTutorial1.py', ' /Bs2JPsiPhi.py']
    In [4]: j.backend = Interactive()
    In [5]: j.outputsandbox = ['DVHistos 1.root']
    In [6]: j.submit()
To run on the Grid, we'd simply do j.backend = Dirac(). GANGA will automatically collect all of your modified files and send them with the job. Yes, it's really that easy!

Example Job
GANGA will tell you the status of the jobs: it'll update you whenever a job changes state. You can also check directly by doing j.status. Once the jobs are complete, GANGA will download the output automatically (and merge it if needed).
You can check the output of a job by doing, e.g.:
    In [7]: j.peek()
    total X
    -rw-r--r-- 1 you z5 X Jan 5 10:00 DVHistos 1.root
    -rwxr-xr-x 1 you z5 X Jan 5 10:00 stdout
... or open a shell running ROOT with the file loaded by doing:
    In [8]: j.peek('DVHistos 1.root')
... or specify the program you want to use:
    In [8]: j.peek('stdout','less')

Datafiles and Datasets in GANGA LHCb
GANGA doesn't just handle jobs; it also deals with data files & data sets, with full support for logical & physical files including downloading, uploading, replicating, removing, obtaining metadata and replicas, etc.
    job.inputdata = browseBK()
Bookkeeping queries can also be persisted in a BKQuery object and updated at any time without the need for the GUI or web interfaces.

GANGA Help
Help is available for GANGA:
- Interactively in GANGA via the help function
    In [9]: help(BKQuery)
- Online via the GANGA manuals and GANGA/LHCb FAQ:
- Via the mailing list
For Python help, see

Look Ahead…
GANGA has an active group of developers who are constantly supporting, bug-fixing and upgrading its existing functionality, while always looking to implement new features that the community requests.
There are many projects for upcoming enhancements to GANGA functionality. Release candidates are:
- Application preparing: offers a speed increase and disk space optimisation for many similar jobs
- “Tasks” framework: a completely rewritten framework for “bookkeeping”-query-based analysis job submission and management

Look Ahead - Tasks Framework
Within the Tasks framework we define transforms just as we define Jobs, except that we can add a BKQuery object.
    In [1]: tr = LHCbAnalysisTransform(application=DaVinci(), backend=Dirac())
    In [2]: tr.query = BKQuery('favourite data set')
We can then attach these transforms to a Task object, either individually or by cloning the application setup from the first transform to cover multiple datasets.
    In [3]: t = LHCbAnalysisTask()
    In [4]: t.appendTransform(tr)
    In [5]: t.addQuery(tr, [BKQuery(''), BKQuery('')])
Once the required number of transforms has been added, one per bookkeeping query, we can set the task running:
    In [6]: t.run()

Look Ahead - Tasks Framework
We can now update the task at any time to take account of new data files in the dataset, as well as files that have been removed.
    In [7]: t.update()
Creation of new jobs and resubmission of failed or out-of-date jobs is taken care of automatically by the framework. Just as with regular jobs, we can use splitters and mergers to exploit the parallel nature of the grid.
We can get a visual representation of tasks using:
    In [8]: t.overview()
As with jobs, tasks have their own registry, accessed with:
    In [9]: tasks
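
What a task update has to work out can be sketched as a comparison of two file lists. This is a toy illustration under stated assumptions: `plan_update` is a hypothetical function, not part of the Tasks framework, and the file names are invented.

```python
def plan_update(already_processed, current_dataset):
    """Compare the files a task has handled with the dataset as it is now."""
    processed = set(already_processed)
    current = set(current_dataset)
    return {
        "new": sorted(current - processed),       # files needing new jobs
        "removed": sorted(processed - current),   # files whose jobs are out of date
    }


handled = ["run1.dst", "run2.dst", "run3.dst"]          # invented file names
dataset_now = ["run2.dst", "run3.dst", "run4.dst"]
plan = plan_update(handled, dataset_now)
```

In this picture, files listed under `new` would lead to new jobs and files under `removed` would mark existing results as out of date, which corresponds to the two cases the slide mentions.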

Final Note 1 - Persistence

Final Note 2 – Correct Usage of Computer Resources
Testing/Debugging:
Full Running:

Summary
- The LCG provides LHCb users with a massive amount of CPU power and disk space.
- GANGA allows users to run jobs locally, on batch systems and on the Grid in a seamless way.
- GANGA is written in Python; its syntax is easy to understand.
- GANGA/LHCb provides a number of specific tools for running LHCb jobs wherever resources are available.
Try getting started with the “hands on” GANGA/LHCb tutorial: