ATLAS Databases: An Overview, Athena use of Geometry/Conditions DB, and Conditions Metadata
Elizabeth Gallas - Oxford
ATLAS-UK Distributed Computing Tutorial, Edinburgh, UK – March 21-22, 2011

Slide 2: Outline
 Motivation: Databases
 Overview of ATLAS Databases
 Databases of Athena-based analysis interest
   Geometry Database
   Conditions Database
   … and how they are made accessible on the grid
 COMA (Conditions Metadata)
   Selected/derived Run/LB-wise Conditions/Configuration in relational format
   Data Periods in COMA
   Other COMA reports
 Summary and Conclusions

Slide 3: Motivation: Database use in ATLAS
 ATLAS "data" falls into two broad categories:
   Event-wise data: stored in files (RAW, ESD, AOD, TAG, …)
     Files know something about themselves, but also carry 'metadata' pointers to the bigger picture
   Non-event-wise data: stored in databases
     Enables construction of the 'bigger picture'
     Important information needed at our fingertips, usually by diverse clients
 Database Management Systems (DBMS) provide:
   persistent storage, for large and small collections of data of varied complexity, in data structures that provide access flexibility
   a powerful query language for data entry, modification and retrieval
   transaction management: the appearance of isolation, while allowing multi-user simultaneous access

Slide 4: Overview – Oracle usage in ATLAS
Oracle is used extensively at every stage of data taking, processing and analysis. Some of the more common applications:
 Configuration
   PVSS – Detector Control System (DCS) configuration & monitoring
   Trigger – trigger configuration (online and simulation data)
   OKS – configuration databases for the TDAQ
   Geometry – detector description
 File and job management
   T0 – Tier-0 processing
   DQ2/DDM – distributed file and dataset management
   Dashboard – monitoring of jobs and data movement on the ATLAS grid
   PanDA – workload management: production & distributed analysis
 Conditions data (non-event data for offline analysis)
   Conditions Database
   [POOL files in DDM (referenced from the Conditions DB)]
 "Metadata" == data about data
   AMI (ATLAS Metadata Interface) – dataset metadata
   COMA (COnditions MetadatA) – configuration/conditions metadata
   TAGs (not an acronym) – event-level metadata

Slide 5: What does your Athena job need?
What does every Athena job need?
 1. Data (events)
 2. Database (geometry, conditions)
 3. Efficient I/O (sometimes across a network), CPU
 4. (A purpose and) a place for output
The next slides give more details about Geometry and Conditions:
 What they contain
 How Athena accesses them
 How they are distributed for access on the grid
 User interfaces, documentation, and help
(And what does everyone need? 1. Food 2. Water 3. Love 4. A place for output)

Slide 6: Geometry Database
 Relational DB: primary numbers for the ATLAS detector description
   All data for building the GeoModel description in a single place
   Primary numbers stored in data tables (leaf nodes), organized by subsystem (branches)
 Tagging (versioning) at various levels
   Locked tags define a distinct detector description
   Globally tagged/locked at higher levels, associated with software releases
   The evolution of geometry tags is set up such that each new tag is compatible with older releases
 Location and distribution:
   Master copy: Oracle server at CERN
   Up to now: a copy of the entire database is dumped into an SQLite file and delivered to sites using DB Release technology with each software release
   Future: a more diverse distribution model is being tested (Frontier)
   Update: see V. Tsulaia's talk in the upcoming Software/Computing workshop
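For the curious, the SQLite replica delivered with a DB Release can be inspected directly with standard tools. A minimal sketch, assuming a local replica file; the file name and the table/column names (HVS_TAG2NODE, TAG_NAME, LOCKED) are hypothetical stand-ins for the HVS (Hierarchical Versioning System) schema and should be checked against the actual replica:

    # Minimal sketch: inspect a local SQLite replica of the Geometry DB.
    # The file name is a placeholder, and the table/column names below
    # (HVS_TAG2NODE, TAG_NAME, LOCKED) are hypothetical -- verify them
    # against the real replica (e.g. with ".schema" in the sqlite3 shell).
    import sqlite3

    conn = sqlite3.connect("geomDB_sqlite")   # replica from the DB Release
    cur = conn.cursor()

    # List locked (frozen, hence reproducible) geometry tags
    for (tag,) in cur.execute(
            "SELECT TAG_NAME FROM HVS_TAG2NODE WHERE LOCKED = 'Y'"):
        print(tag)

    conn.close()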

Slide 7: Geometry DB Browser
[Screenshot of the web-based Geometry DB browser]

Slide 8: "Conditions"
 "Conditions" – a general term for information which is not 'event-wise', reflecting the conditions or state of a system. Conditions are valid for an 'interval of validity' (IOV), ranging from very short to infinite.
 IOVs can be expressed as a range in timestamps or in Run/LumiBlock numbers.
 Any conditions data needed for offline processing and/or analysis must be stored in the ATLAS Conditions Database (aka COOL) or in its referenced POOL files (DDM).
[Diagram: sources feeding the ATLAS Conditions Database: ZDC, DCS, TDAQ, OKS, LHC, DQ]
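To make the IOV mechanics concrete, here is a small, self-contained Python sketch (an illustration of the concept, not the COOL API). As in COOL, a Run/LumiBlock pair can be packed into a single 63-bit key, run number in the upper 32 bits:

    # Each payload is valid over a half-open interval [since, until);
    # the last interval may be open-ended ("until infinity").
    from bisect import bisect_right

    INFINITY = 2**63 - 1
    iovs = [                      # (since, until, payload), sorted by 'since'
        (0,   100,      "calib_v1"),
        (100, 250,      "calib_v2"),
        (250, INFINITY, "calib_v3"),
    ]

    def find_payload(point, iovs):
        """Return the payload whose interval of validity contains 'point'."""
        starts = [since for since, _, _ in iovs]
        i = bisect_right(starts, point) - 1
        if i >= 0 and point < iovs[i][1]:
            return iovs[i][2]
        return None               # point falls in a gap between IOVs

    print(find_payload(180, iovs))               # -> calib_v2
    run, lb = 152166, 42
    print(find_payload((run << 32) | lb, iovs))  # Run/LB key -> calib_v3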

Slide 9: Conditions DB infrastructure in ATLAS
 Relies on considerable infrastructure: COOL, CORAL, Athena (developed by ATLAS and CERN IT) – a generic schema design which can store, accommodate and deliver a large amount of data for a diverse set of subsystems.
 IOV ('interval of validity') database in relational DB tables
   Data organized into folders and foldersets
     by schema (subdetector)
     by instance (for real data and MC)
   Stores data 'inline', but can hold references to external POOL files (managed by DDM)
 Athena / Conditions DB
   Data map to transient C++ objects, accessible to Athena at run time through the Transient Store
 COOL tag (version) – distinct sets of conditions, making specific computations reproducible
 Used at many stages of data taking and analysis: from online calibrations, alignment and monitoring, to offline processing, further calibrations and alignment, reprocessing, analysis, luminosity and data quality
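Conditions can also be read directly (outside Athena) through PyCool, the Python binding of COOL. A hedged sketch, assuming a local SQLite slice: the connection string, folder path, COOL tag and payload field name are placeholders, and the exact calls should be checked against the COOL documentation for your release:

    from PyCool import cool

    dbSvc = cool.DatabaseSvcFactory.databaseService()
    # Open read-only (second argument); COMP200 was the real-data
    # instance name in this era.
    db = dbSvc.openDatabase("sqlite://;schema=mycond.db;dbname=COMP200", True)

    folder = db.getFolder("/MYDET/MYCALIB")          # hypothetical folder
    objs = folder.browseObjects(0, cool.ValidityKeyMax,
                                cool.ChannelSelection.all(),
                                "MYCALIB-TAG-01")    # hypothetical COOL tag
    for obj in objs:
        print(obj.since(), obj.until(), obj.payload()["value"])
    objs.close()
    db.closeDatabase()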

Slide 10: Conditions: User interfaces
 Command line interface:
 Conditions TAG Browser:

Slide 11: Oracle distribution of Conditions data
 Oracle stores a huge amount of essential data 'at our fingertips'
   But ATLAS has many… many… many… fingers
   which may be looking for anything from the oldest to the newest data
 Conditions in Oracle – master copy at Tier-0
   Replicated to many Tier-1 sites
 Jobs running at Oracle sites (direct access) perform well
 But direct Oracle access on the grid from remote sites:
   Even after tuning, direct access requires many back-and-forth network transactions – the RTT (Round Trip Time) multiplies … SLOW
   Cascade effect: jobs hold connections longer, preventing new jobs from starting
 Use alternative technologies, especially over the WAN (Wide Area Network): "caching" Conditions from Oracle when possible
[Simplified diagram: Online CondDB → offline master CondDB at Tier-0 → Tier-1 replicas; calibration updates cross the isolation cut between the computer centre (Tier-0 farm) and the outside world]

Slide 12: Technologies for Conditions "caching"
 "DB Release": a system of files containing all the data 'needed'
   Used in reprocessing campaigns and for MC processing/analysis
   Includes:
     SQLite replicas: a "mini" Conditions DB with specific folders, IOV range and COOL tag (a 'slice' – a small subset of all rows in the Oracle tables)
     The associated POOL files and a PFC (file catalog)
 "Frontier": store query results in a web cache
   Developed by Fermilab (used by CDF, further refined for CMS)
   Basic idea: Frontier/Squid servers located at or near the Oracle RAC
     negotiate transactions between grid jobs and the Oracle DB
     reduce the load on Oracle by caching the results of repeated queries
     reduce the latency observed when connecting to Oracle over the WAN
   Additional Squid servers at remote sites help even more
   Used by default for user analysis jobs (picture on next slide)
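The 'slice' idea can be shown schematically. In production this is done by dedicated COOL tools writing a genuine SQLite file; the dictionary-based sketch below, with made-up folder and tag names, only illustrates what is kept: the chosen folders, one COOL tag, and the IOVs overlapping the requested run range.

    def slice_conditions(source, folders, tag, run_range):
        """source: {folder: {tag: [(since, until, payload), ...]}}"""
        lo, hi = run_range
        out = {}
        for f in folders:
            rows = source.get(f, {}).get(tag, [])
            # keep only the IOVs overlapping [lo, hi]
            out[f] = [(s, u, p) for (s, u, p) in rows if s <= hi and u >= lo]
        return out

    master = {"/MYDET/MYCALIB": {"TAG-01": [(1, 99, "a"), (100, 200, "b")]}}
    print(slice_conditions(master, ["/MYDET/MYCALIB"], "TAG-01", (150, 160)))
    # -> {'/MYDET/MYCALIB': [(100, 200, 'b')]}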

Slide 13: Conditions DB access via Frontier
Frontier for distributed database access – used by default for user analysis jobs.
Main components:
 Frontier server
   Communicates directly with the Oracle server
   Includes data caching
   Provides data to Squids
 Squid
   Communicates with the Frontier server over HTTP
   Caches retrieved data locally for its clients
ATLAS: Frontier in operation since late 2009
 Frontier servers at the Tier-1 sites involved in replication
 ~60 Squids all over the world: mostly at Tier-2s, some at Tier-3s too
[Diagram: Tier-1 / Tier-2 deployment of Frontier servers and Squids]
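The caching principle behind Frontier/Squid reduces to a few lines: identical read-only queries are answered from a cache with a time-to-live instead of travelling back to Oracle. A conceptual sketch only (the real system speaks HTTP and caches at several levels); the class, backend and TTL value are invented for illustration:

    import time

    class QueryCache:
        """Toy stand-in for a Frontier/Squid layer in front of a database."""
        def __init__(self, backend, ttl_seconds=600):
            self.backend = backend    # callable that really queries the DB
            self.ttl = ttl_seconds
            self._store = {}          # query string -> (expiry time, result)

        def query(self, sql):
            now = time.time()
            hit = self._store.get(sql)
            if hit and hit[0] > now:  # fresh cached copy: no round trip
                return hit[1]
            result = self.backend(sql)            # miss: one trip to Oracle
            self._store[sql] = (now + self.ttl, result)
            return result

    cache = QueryCache(lambda sql: "rows for: " + sql)
    cache.query("SELECT ...")   # miss -> goes to the backend
    cache.query("SELECT ...")   # hit  -> served from cache, no DB load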

Slide 14: DB access in Athena
 Athena applications access the conditions and geometry DBs using the LCG software libraries POOL, COOL and CORAL
 This allows transparent use of the various technologies (Oracle, SQLite, Frontier/Squid)

Slide 15: Tips for Users (1)
 Which global Conditions and Geometry tags should you use? Autoconfigure your job:
   Have the job read the global tags from its input file (ESD, AOD)
   In job options:

       from RecExConfig.RecFlags import rec
       rec.AutoConfiguration = ['everything']

   In job transforms: command line parameter autoConfiguration=everything
(Slide: V. Tsulaia)
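Put together, a job-options fragment using autoconfiguration might look like the sketch below (run with athena.py). The input file name is a placeholder, and the flag idioms are those of release-16-era Athena, so details may differ in other releases:

    # Point the job at its input; the geometry and conditions tags are then
    # read from the file's own metadata rather than set by hand.
    from AthenaCommon.AthenaCommonFlags import athenaCommonFlags
    athenaCommonFlags.FilesInput.set_Value_and_Lock(["myAOD.pool.root"])

    from RecExConfig.RecFlags import rec
    rec.AutoConfiguration = ['everything']   # geometry, conditions, etc.

    # Standard reconstruction top options (include() exists only inside
    # the athena.py job-options environment)
    include("RecExCommon/RecExCommon_topOptions.py")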

Slide 16: Tips for Users (2)
 How do I configure my environment to access:
   Frontier/Squid?
   Conditions payload POOL files?
   the DB Release for geometry (and MC conditions if needed)?
 All of that is done for you automatically… just sit back and enjoy the ride!
(Slide: V. Tsulaia)

Slide 17: Tips for Users (3)
If things go wrong and the problem seems to be related to database access, useful information is on the TWiki:
 Athena DB Access:
 COOL Troubles:
 Atlas DB Release:
These TWiki documents should help you narrow down the problem, and then you will be in a position to
 either ask your site admin,
 or write to Database Operations.
(Slide: V. Tsulaia)

Slide 18: Conclusions: Databases and DB access from Athena
 Databases are used extensively in ATLAS, at every stage of data taking, processing and analysis
   Scratch the surface of many interactive user applications and you will find a database!
 I have attempted to give an overview of the issues and considerations in DB access from Athena
   The need to provide database information, in a variety of access patterns, with potentially widely varying data volumes, to diverse clients, makes Athena access to ATLAS non-event-wise databases (Conditions and Geometry) complex.
   Supporting different technologies allows us to optimally meet the various needs.
 A lot of effort has gone into making DB access for user analysis as transparent as possible
 More details can be found:
   in V. Tsulaia's slides from the Software Workshop in Tbilisi, Oct 26, 2010
   on various TWiki pages