LHCb Development. Glenn Patrick, Rutherford Appleton Laboratory. GridPP9, 4th February 2004.


1 LHCb Development. Glenn Patrick, Rutherford Appleton Laboratory.

2 LHCb - Reminder. Subdetectors: VELO, RICH1, RICH2, tracking stations (inner and outer), magnet, calorimeters, muon system. The detector is ~20 m long, weighs ~4,000 tonnes and has 1.2M electronic channels. (Diagram: detector layout, with B meson and anti-B meson production.)

3 LHCb GridPP Development. LHCb development has been taking place on three fronts: MC production control and monitoring (Gennady Kuznetsov, RAL); data management (Carmine Cioffi, Oxford; Karl Harrison, Cambridge); GANGA (Alexander Soroko, Oxford; Karl Harrison, Cambridge). All developed in tandem with the LHCb Data Challenges.

4 Data Challenge DC03. A "physics" data challenge, used to redesign and optimise the detector: 65M events processed; distributed over 19 different centres; averaged 830,000 events/day, equivalent to 2,300 × 1.5 GHz computers; 34% processed in the UK at 7 different institutes; all data written to CERN. (Diagram: VELO, TT, RICH1, RICH2.)

5 The LHCb Detector. Changes were made for material reduction and L1 trigger improvement: reduced number of layers for M1 (4 → 2); reduced number of tracking stations behind the magnet (4 → 3); no tracking chambers in the magnet; no B-field shielding plate; full Si station; reoptimised RICH1 design; reduced number of VELO stations (25 → 21).

6 "Detector" TDRs completed; only the Computing TDR remains.

7 Data Challenge 2004. A "computing" data challenge, April – June 2004: produce 10 × more events; at least 50% to be done via LCG; store data at the nearest Tier-1 (i.e. RAL for UK institutes); try out distributed analysis; test the computing model and write the Computing TDR. Requires a stable LCG2 release with SRM interfaced to the RAL DataStore.

8 DC04: UK Tier-2 Centres. NorthGrid: Daresbury, Lancaster, Liverpool, Manchester, Sheffield. SouthGrid: Birmingham, Bristol, Cambridge, Oxford, RAL PPD. ScotGrid: Durham, Edinburgh, Glasgow. LondonGrid: Brunel, Imperial, QMUL, RHUL, UCL.

9 DIRAC Architecture. DIRAC components: User Interface API, Workload Management, Data Management, Metadata Catalogue, File Catalogue, Computing Element, Storage Element, Job Provenance, Package Manager, Accounting, Information Service, Authentication, Authorisation, Auditing, Grid Monitoring. Other project components: AliEn, LCG, … Resources: LCG, LHCb production sites.

10 MC Control Status (Gennady Kuznetsov). DIRAC: Distributed Infrastructure with Remote Agent Control. A control toolkit breaking the production workflow down into components (modules, steps). To be deployed in DC04. SUCCESS!

11 DIRAC v1.0, original scheme: "pull" rather than "push". Agents at each site (A, B, C, D) get jobs from the central production service; the production service is backed by the monitoring service (monitoring info) and the bookkeeping service (bookkeeping data).
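The "pull" scheme above can be sketched in a few lines of Python. This is an illustrative toy, not the DIRAC code: all class and method names (`ProductionService`, `SiteAgent`, `poll`) are hypothetical, and the real service adds matching, monitoring and bookkeeping.

```python
# Toy sketch of pull scheduling: site agents ask the central
# production service for work when they have free capacity,
# instead of the service pushing jobs out to sites.

class ProductionService:
    """Central queue of pending jobs (stand-in for the DIRAC service)."""
    def __init__(self, jobs):
        self.queue = list(jobs)

    def get_job(self, site_capacity):
        # Hand out a job only if the requesting site has a free slot.
        if self.queue and site_capacity > 0:
            return self.queue.pop(0)
        return None

class SiteAgent:
    """Agent running at a production site; pulls jobs into local slots."""
    def __init__(self, name, slots):
        self.name, self.slots, self.running = name, slots, []

    def poll(self, service):
        job = service.get_job(self.slots - len(self.running))
        if job is not None:
            self.running.append(job)
        return job

service = ProductionService(["job-1", "job-2", "job-3"])
agent_a = SiteAgent("Site A", slots=2)
agent_b = SiteAgent("Site B", slots=1)

# Each poll is initiated by the site, never by the centre.
pulled = [agent_a.poll(service), agent_b.poll(service), agent_a.poll(service)]
```

The design point is that the centre never needs an up-to-date picture of every site: a site that is down simply stops polling, and its share of the queue flows to the sites that are still asking.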

12 Components – MC Control (Gennady Kuznetsov). Hierarchy: Module → Step → Workflow → Job → Production. Levels of usage: 1. Module – programmer; 2. Step – production manager; 3. Workflow – user/production manager. The module is the basic component of the architecture. Each step generates a job as a Python program. This structure allows the Production Manager to construct any algorithm as a combination of modules.
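The Module → Step → Workflow layering can be sketched as nested containers that each know how to emit Python source. This is a hedged illustration with invented names, not the production toolkit's API:

```python
# Sketch of the composition: a Module carries a fragment of Python
# code, a Step groups module instances, and a Workflow chains steps
# into one generated job program.

class Module:
    """Basic component: a named fragment of Python code."""
    def __init__(self, name, code):
        self.name, self.code = name, code

class Step:
    """A step is an ordered list of module instances."""
    def __init__(self, name, modules):
        self.name, self.modules = name, modules

    def generate(self):
        # Each step emits its modules' code as one Python program.
        return "\n".join(m.code for m in self.modules)

class Workflow:
    """A workflow chains steps; generating it yields the job source."""
    def __init__(self, name, steps):
        self.name, self.steps = name, steps

    def generate_job(self):
        return "\n".join(s.generate() for s in self.steps)

simulate = Module("Simulate", "print('simulate')")
digitise = Module("Digitise", "print('digitise')")
wf = Workflow("MCProduction",
              [Step("Sim", [simulate]), Step("Digi", [digitise])])
job_source = wf.generate_job()
```

This mirrors the usage levels on the slide: a programmer writes modules, the production manager assembles them into steps, and users combine steps into workflows.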

13 Module Editor (Gennady Kuznetsov). Shows the Python code of a single module (which can contain many classes), the module name, a description and the module variables. Stored as an XML file.

14 Step Editor (Gennady Kuznetsov). Shows the step name, a description, definitions of modules, instances of modules, the step variables, and the variables of the currently selected instance. Stored as an XML file in which all modules are embedded.

15 Workflow Editor (Gennady Kuznetsov). Shows the workflow name, a description, step definitions, step instances, the workflow variables, and the variables of the currently selected step instance. Stored as an XML file.

16 Job Splitting (Gennady Kuznetsov). Step → Workflow Definition → Job → Production. The input value for job splitting is a Python list object. Each top-level element of this list is applied to the workflow definition, propagates through the code, and generates a single element of the production (one or several jobs).
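The splitting rule above, one production element per top-level list entry, can be sketched as a single function. Names here are illustrative assumptions, not the real toolkit's interface:

```python
# Sketch of list-driven job splitting: every top-level element of the
# input Python list is substituted into the workflow definition to
# produce one element of the production.

def split_production(workflow_template, input_list):
    """Expand one workflow definition into per-element job descriptions."""
    return [workflow_template.format(item=item) for item in input_list]

# Hypothetical event-range list; each tuple becomes one job.
template = "run simulation for events {item}"
jobs = split_production(template, [(1, 500), (501, 1000), (1001, 1500)])
```

Because the split is driven by an ordinary list, any structure that can be expressed as list elements (run numbers, event ranges, file names) can drive the production without changing the workflow definition itself.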

17 Future: Production Console. Once an agent has received a workflow, the Production Manager has no control over any function in a remote centre; the local manager must perform all configuration and interventions at each individual site. The plan is to develop a "Production Console" providing extensive control and monitoring functions for the Production Manager: monitoring and configuring remote agents, and data replication control. This is an intrusive system, so Grid security mechanisms must be addressed and a robust environment provided.

18 DIRAC v1.0 Architecture. (Diagram, centred on the Production Manager.)

19 DIRAC v2.0 WMS Architecture. Based on a central queue service (the Production Service); data can also be stored remotely.

20 Data Management Status (Carmine Cioffi). A file catalogue browser for POOL; integration of the POOL persistency framework into GAUDI → a new EventSelector interface. SUCCESS!

21 Main Panel, LFN Mode Browsing. The POOL file catalogue provides the LFN ↔ PFN association; the browser lets the user interact with the catalogue via a GUI, and can save a list of LFNs for the job sandbox. GUI elements: list of LFNs; tabs for LFN/PFN mode selection; list of PFNs associated with the LFN selected on the left sub-panel; buttons to read the next and previous bunch of files from the catalogue; write-mode selection; import of a catalogue fragment; catalogue reload; a view of the metadata schema, with the possibility to change it; a listing of all metadata values in the catalogue; the list of selected files; search and filter text bars.
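The LFN ↔ PFN association the browser exposes can be pictured as a one-to-many mapping. The class below is a toy stand-in, not the POOL API, and the file names and protocols are invented for illustration:

```python
# Toy model of a POOL-style file catalogue: one logical file name
# (LFN) maps to one or more physical replicas (PFNs).

class FileCatalogue:
    def __init__(self):
        self.lfn_to_pfns = {}

    def register(self, lfn, pfn):
        # Registering the same LFN again adds a replica.
        self.lfn_to_pfns.setdefault(lfn, []).append(pfn)

    def replicas(self, lfn):
        return self.lfn_to_pfns.get(lfn, [])

cat = FileCatalogue()
# Hypothetical replicas of one dataset file at CERN and RAL.
cat.register("lhcb/dc04/run001.dst", "castor://cern.ch/run001.dst")
cat.register("lhcb/dc04/run001.dst", "srm://ral.ac.uk/run001.dst")
replicas = cat.replicas("lhcb/dc04/run001.dst")
```

Jobs refer to data only by LFN; resolving the LFN to the nearest PFN at run time is what makes it possible to store data at the nearest Tier-1, as DC04 plans.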

22 Main Panel, PFN Mode Browsing. In PFN mode, files are browsed much as in Windows Explorer: folders are shown on the left sub-panel and their contents on the right. A sub-menu offers three operations on the selected file. The write-mode button opens the WrFCBrowser frame, allowing the user to write to the catalogue.

23 Write Mode Panel. Operations: register a PFN, add a PFN replica, delete a PFN, add an LFN, remove an LFN, add a metadata value, rollback, commit, and show the actions performed.

24 PFN register frame; a frame to show and change the metadata schema of the catalogue; a frame for setting the metadata values.

25 Frames showing the metadata values of the PFN "Myfile", the attribute values of the PFN, and the list of selected files.

26 GAUDI/POOL Integration. Benefits from the investment in LCG: parts of Gaudi can be retired, reducing maintenance. A new interface for the LHCb EventSelector has been designed and implemented. Selection criteria: one or more "datasets" (e.g. a list of runs, or a list of files matching given criteria); one or more "EventTagCollections" with extra selection based on tag values; one or more physical files. The result of an event selection is a virtual list of event pointers.
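The key idea, that a selection resolves to a virtual list of event pointers rather than to the event data itself, can be sketched with a generator. This is an illustration of the concept only, not the Gaudi interface; the tag layout and file names are invented:

```python
# Sketch of event selection: datasets plus an optional tag cut
# yield (file, event_index) pointers, never the events themselves.

def select_events(datasets, tag_collection=None, tag_cut=None):
    """Yield (file, event_index) pointers matching the selection."""
    for filename, n_events in datasets:
        for i in range(n_events):
            if tag_collection is not None and tag_cut is not None:
                # Apply the extra selection based on tag values.
                if not tag_cut(tag_collection.get((filename, i), 0.0)):
                    continue
            yield (filename, i)

# Hypothetical tag collection: one float tag per (file, event).
tags = {("run1.raw", 0): 3.1, ("run1.raw", 1): 0.2, ("run2.raw", 0): 1.2}
pointers = list(select_events([("run1.raw", 2), ("run2.raw", 1)],
                              tag_collection=tags,
                              tag_cut=lambda v: v > 1.0))
```

Because the result is lazy, a selection over millions of events costs nothing until a consumer actually dereferences a pointer and reads the event.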

27 Physicist's View of Event Data. Gaudi bookkeeping links collection sets (e.g. B → ππ candidates (Phy), B → J/Ψ(μ+μ−) candidates) to event tag collections (Tag 15: 0.3, Tag 22: 1.2, …, Tag M8: 3.1), datasets (event 1, event 2, …), and files (RAW2-1/1/2008, RAW3-22/9/2007, RAW4-2/2/2008, …); each file holds events 1 … N.

28 Future: Data to Metadata. The file catalogue holds only a minimal amount of metadata. LHCb deploys a separate "bookkeeping" database service to store the metadata for datasets and event collections, based on a central ORACLE server at CERN with a query service through an XML-RPC interface. This is not scalable, particularly for the Grid, and a completely new metadata solution is required. An ARDA-based system will be investigated; it corresponds to the ARDA Job Provenance DB and Metadata Catalogue. It is vital that this development is optimised for LHCb and synchronised with the data challenges.

29 DIRAC Metadata: Data Production Information Flow. The Production Manager selects defaults and builds a new configuration; production jobs are described by Job.xml; information flows between the Configuration, Bookkeeping and File Catalogue services and Data Production until production is done.

30 Metadata: Data Analysis User Job Information Flow. The user picks up the default configuration, modifies defaults (Job.opts) and selects input data; information flows between the Configuration, Bookkeeping and File Catalogue services and DIRAC.

31 LHCb GANGA Status (Alexander Soroko, Karl Harrison; plus Alvin Tan, Janusz Martyniak). A user Grid interface, shared between LHCb, ATLAS and BaBar. First prototype released in April 2003; to be deployed for the LHCb 2004 Data Challenge. SUCCESS!

32 GANGA for LHCb. GANGA will allow the LHCb user to perform standard analysis tasks: data queries; configuration of jobs, defining the job splitting/merging strategy; submitting jobs to the chosen Grid resources; following the progress of jobs; retrieval of job output; job bookkeeping.
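The task list above amounts to a job lifecycle: configure, split, submit, monitor, retrieve. The sketch below illustrates that lifecycle with invented names; it is not the GANGA API, and the options file and dataset names are hypothetical:

```python
# Toy job lifecycle in the spirit of the GANGA task list:
# configure a job, split it over its input data, submit the
# subjobs, and track their status for bookkeeping.

class AnalysisJob:
    def __init__(self, options, input_data):
        self.options, self.input_data = options, input_data
        self.status, self.output = "new", None

    def split(self, n):
        # Simple splitting strategy: one subjob per slice of input files.
        chunk = max(1, len(self.input_data) // n)
        return [AnalysisJob(self.options, self.input_data[i:i + chunk])
                for i in range(0, len(self.input_data), chunk)]

    def submit(self):
        self.status = "submitted"

    def complete(self, result):
        self.status, self.output = "completed", result

# Hypothetical analysis: one options file, four input files, two subjobs.
job = AnalysisJob("Bd2pipi.opts", ["f1.dst", "f2.dst", "f3.dst", "f4.dst"])
subjobs = job.split(2)
for sj in subjobs:
    sj.submit()
```

Keeping the splitting strategy as a method on the job object is one way to let users swap strategies per analysis, which is the role the strategy database plays in the GANGA design.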

33 GANGA User Interface. The Job Factory (Job Registry class) builds a Ganga job object from: the database of standard job options and the Job Options Editor; the strategy database (splitting scripts) and strategy selection; data selection (input/output files); and job requirements (LSF resources, etc.). The job (JDL file, job options file, job script) is submitted from the local client through the gatekeeper to the Grid/batch system; worker nodes return monitoring info and job output, with file transfer to a Storage Element.

34 Software Bus. The user has access to the functionality of Ganga components through a GUI and CLI, layered one over the other above a Software Bus; the Software Bus is itself a Ganga component implemented in Python. Components used by Ganga fall into three categories: Ganga components of general applicability, or core components (to the right in the diagram): Job Definition, Job Registry, Job Handling, File Transfer; Ganga components providing specialised functionality (to the left): BaBar Job Definition and Splitting, Gaudi/Athena Job Options Editor, Gaudi/Athena Job Definition; external components (at the bottom): Python Native, Python Root, Gaudi Python, PyCMT, PyAMI, PyMagda.

35 4th February 2004GRIDPP935 GUIs Galore

36 DIRAC WMS Architecture, with GANGA. (Diagram.)

37 Future Plans. Refactorisation of Ganga, with submission on a remote client. Motivation: ease integration of external components; facilitate multi-person, distributed development; increase customisability/flexibility; make it simpler for GANGA components to be used externally. 2nd GANGA prototype ~ April 2004. (Diagram: database of standard job options, job-options editor, templates and knowledge base; dataset catalogue and selection; strategy database (splitter algorithms) and strategy selection; user requirements, database of job requirements and derived requirements (JDL, ClassAds, LSF resources, etc.); job factory generating XML descriptions of multiple jobs into a job collection; dispatcher and scheduler, with proxy, remote-client and Grid/batch-system schedulers for LSF, PBS, EDG, USG, NorduGrid, DIAL, DIRAC and other back-ends; an agent that runs/validates jobs, with software and component caches fed by a software/component server, on local and remote clients and execution nodes.)

38 Future: GANGA. Develop into a generic front-end capable of submitting a range of applications to the Grid. This requires a central core and a modular structure (started with the version 2 refactorisation) to allow new frameworks to be plugged in. Enable GANGA to be used in a complex analysis environment over many years by many users: hierarchical structure, import/export facility, schema evolution, etc. Interact with multiple Grids (e.g. LCG, NorduGrid, EGEE…), keeping pace with the development of Grid services and synchronising with ARDA developments. Interactive analysis? ROOT, PROOF.

