
Conditions and configuration metadata for the ATLAS experiment

E J Gallas (1), S Albrand (2), J Fulachier (2), F Lambert (2), K E Pachal (1), J C L Tseng (1), Q Zhang (3)

(1) Department of Physics, Oxford University, Denys Wilkinson Building, Keble Road, Oxford OX1 3RH, United Kingdom
(2) Laboratoire de Physique Subatomique et Corpusculaire, Université Joseph Fourier Grenoble 1, CNRS/IN2P3, INPG, 53 avenue des Martyrs, 38026 Grenoble, France
(3) Argonne National Laboratory, High Energy Physics Division, Building 360, 9700 S. Cass Avenue, Argonne, IL 60439, United States of America

1. Introduction to COMA

The COMA system (Conditions/Configuration Metadata for ATLAS) has been developed to make globally important run-level metadata more readily accessible. It is based on a relational database storing directly extracted, refined, reduced, and derived information from system-specific data sources, as well as information from non-database sources. This information facilitates a variety of unique dynamic interfaces and enhances the functionality of other systems.

COMA is one of the three dedicated metadata repositories in ATLAS, fitting between the AMI and TAG databases, which store metadata at the dataset and event levels, respectively. A generalization of the data sources of each of these metadata repositories is shown below:

[Diagram: Conditions, Trigger, Tier 0, TAG Catalog, DDM, ProdSys, and "other" sources feeding the AMI DB, COMA DB, TAG DB, and TAG files]

COMA data sources include:
– Conditions database: a wide variety of configuration information and measured conditions at the Run and Luminosity Block (LB) levels. An ATLAS Run is an interval of data taking (generally lasting many hours) with a fixed configuration; selected configurations are allowed to change at the sub-Run (LB) level.
– Trigger database: trigger-specific information not readily accessible or available via the Conditions database.
– Tier-0 and TAG Catalog databases: information about the processing of Runs and for the filtering of Runs into COMA: which Runs are of "analysis" interest and available in the TAG database.
– AMI database: the AMI and COMA systems work symbiotically to store a variety of information about collections of Runs and their processing, making the data available to both systems. For Monte Carlo datasets, COMA gets keys from AMI that identify the trigger configurations in the Trigger database used for MC simulation.
– Other: information from a variety of non-database sources: TWiki and other documentation, text and XML files, and human entry.

2. The AMI Framework Task Server

Data is entered by a set of specialized tasks controlled by the task server. Information arrives from the different sources in a chaotic way, so the task server imposes time-sharing:
– One cannot allow a sudden peak of finishing production tasks to let a backlog of input from Tier-0 develop.
– "Little and very often" is best.
Some tasks must also store a "stop point", which is usually the data-source timestamp of the last successful AMI read (see the sketch after section 4 below).

3. Overview of the COMA Schema

T0M (Tier-0 Management) determines which datasets go to AMI; AMI looks for new entries every 60 seconds.

4. Production System: Timestamp mechanism

AMI reads everything greater than the stored lastUpdateTime and TaskNumber:
– The reader (AMI) must decide what is relevant.
– "Secure programming" means being ready for surprises.
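To make the stop-point and timestamp mechanism of sections 2 and 4 concrete, the following is a minimal sketch in Python of a polling task that persists its stop point. The table names, column names, and sqlite3 persistence are illustrative assumptions, not the actual AMI implementation.

    import sqlite3
    import time

    POLL_INTERVAL = 60  # seconds; AMI looks for new input every 60 s (section 3)

    def ensure_tables(source, state):
        # Stand-in schemas so the sketch is self-contained; the real AMI
        # task server reads from the production database, not sqlite.
        source.execute("CREATE TABLE IF NOT EXISTS production_tasks "
                       "(task_number INTEGER, update_time REAL, payload TEXT)")
        state.execute("CREATE TABLE IF NOT EXISTS stop_point "
                      "(last_update_time REAL, task_number INTEGER)")
        if state.execute("SELECT COUNT(*) FROM stop_point").fetchone()[0] == 0:
            state.execute("INSERT INTO stop_point VALUES (0, 0)")
        source.commit()
        state.commit()

    def process(payload):
        pass  # placeholder for the real task-specific handling

    def poll_once(source, state):
        last_time, last_task = state.execute(
            "SELECT last_update_time, task_number FROM stop_point").fetchone()
        # Read everything greater than lastUpdateTime and TaskNumber
        # ("little and very often").
        rows = source.execute(
            "SELECT task_number, update_time, payload FROM production_tasks "
            "WHERE update_time > ? OR (update_time = ? AND task_number > ?) "
            "ORDER BY update_time, task_number",
            (last_time, last_time, last_task)).fetchall()
        for task_number, update_time, payload in rows:
            try:
                process(payload)        # the reader decides what is relevant
            except Exception as exc:    # "secure programming": expect surprises
                print("skipping unusable input:", exc)
                continue
            # Persist the stop point only after a successful read, so a
            # crash replays input instead of losing it.
            state.execute("UPDATE stop_point SET last_update_time = ?, "
                          "task_number = ?", (update_time, task_number))
            state.commit()

    if __name__ == "__main__":
        source = sqlite3.connect("source.db")  # stand-in for the production DB
        state = sqlite3.connect("state.db")    # stand-in for AMI's task state
        ensure_tables(source, state)
        while True:
            poll_once(source, state)
            time.sleep(POLL_INTERVAL)

Saving the stop point only after a successful read means a crash re-reads rows rather than losing them, so the reader must tolerate duplicates as well as surprises.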
5. DDM: Publish/Subscribe

Registration and deletion of data are signalled using the ActiveMQ/STOMP "publish/subscribe" protocol. It is very reliable, with only a few problems when the production of messages peaks (a minimal listener sketch appears after section 8 below).

6. COMA: Symbiosis

AMI and COMA "think" they are part of the same application:
– Parts of COMA were rendered "AMI compliant".
– COMA has benefited from the AMI infrastructure, in particular pyAMI, the web service client (see the second sketch after section 8).
– AMI writes some aggregated quantities into the COMA DB.
– AMI has benefited from access to the Conditions data.

7. Is AMI loading scalable?

Insertion into AMI takes longer than pure SQL insert operations on a database because of:
– many coherence checks,
– the derivation of quantities, etc.
Although there is spare capacity almost all of the time, backlogs have been observed from time to time, usually when massive numbers of finished production jobs arrive within a short period; the limiting factor is the speed of treating ActiveMQ messages. Some obvious optimisations have still not been attempted, so AMI loading is scalable in the medium term.

8. Some Examples of Derived Quantities

– Complete dataset provenance. (T0 & ProdDB)
– Number of files and events available in the dataset, updated every time fresh information arrives. (T0, ProdDB, DDM)
– Production status. (T0, ProdDB, DDM)
– Average, minimum, and maximum cross section recorded for the simulated events, transported down the production chain. (ProdDB)
– Lost luminosity blocks in reprocessed data. (T0, ProdDB, DDM)
– Run-period reprocessing errors. (ProdDB & COMA)
– Datasets in run periods. (COMA)

A sketch of one such derived-quantity aggregation appears as the third sketch below.
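To illustrate the publish/subscribe protocol of section 5, here is a minimal sketch of an ActiveMQ/STOMP listener in Python using the stomp.py client (version 8.x signatures). The broker host, credentials, topic name, and message layout are hypothetical; they are not taken from the actual DDM setup.

    import json
    import time

    import stomp

    class DatasetListener(stomp.ConnectionListener):
        # Handles registration and deletion messages published by DDM.
        def on_message(self, frame):
            event = json.loads(frame.body)  # message layout is an assumption
            action = event.get("action")
            if action == "registration":
                print("register dataset:", event.get("dataset"))
            elif action == "deletion":
                print("delete dataset:", event.get("dataset"))

        def on_error(self, frame):
            # Peaks in message production can cause occasional problems,
            # so broker errors are logged rather than treated as fatal.
            print("broker error:", frame.body)

    conn = stomp.Connection([("mq.example.org", 61613)])  # hypothetical broker
    conn.set_listener("", DatasetListener())
    conn.connect("ami", "secret", wait=True)              # hypothetical credentials
    conn.subscribe(destination="/topic/ddm.datasets", id=1, ack="auto")

    while True:          # frames are delivered asynchronously by the listener
        time.sleep(60)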

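Section 6 credits pyAMI, the AMI web service client, for much of the symbiosis. A minimal sketch of issuing an AMI command through pyAMI follows; the client class is real, but the command name and argument are indicative only, since the available commands and call signatures have varied between pyAMI versions.

    import pyAMI.client

    # Connect to the ATLAS AMI endpoint (a valid grid proxy or AMI
    # account is normally required; authentication is not shown here).
    client = pyAMI.client.Client("atlas")

    # Command and argument are illustrative; consult the AMI command
    # reference for the real interface.
    result = client.execute(
        ["GetDatasetInfo", "-logicalDatasetName=data12_8TeV.SomeDataset"],
        format="text")
    print(result)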

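Finally, the derived quantities of section 8 are aggregations recomputed whenever fresh information arrives. The sketch below shows one of them, the average, minimum, and maximum cross section over a production chain, against an invented schema; the real AMI tables are not documented in this transcript.

    import sqlite3

    conn = sqlite3.connect(":memory:")  # stand-in for the AMI DB
    conn.executescript("""
        CREATE TABLE dataset (
            name          TEXT PRIMARY KEY,
            chain_id      INTEGER,   -- identifies the production chain
            cross_section REAL       -- recorded for the simulated events
        );
        INSERT INTO dataset VALUES ('mc12.evgen.000001', 1, 1.20);
        INSERT INTO dataset VALUES ('mc12.recon.000001', 1, 1.25);
        INSERT INTO dataset VALUES ('mc12.evgen.000002', 2, 0.80);
    """)

    # Derived quantity: average, min, and max cross section per chain,
    # transported down the production chain (recomputed on fresh input).
    for chain_id, avg_xs, min_xs, max_xs in conn.execute(
            "SELECT chain_id, AVG(cross_section), MIN(cross_section), "
            "MAX(cross_section) FROM dataset GROUP BY chain_id"):
        print(f"chain {chain_id}: avg={avg_xs:.3f} min={min_xs} max={max_xs}")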

