ATLAS Distributed Computing Tutorial Tags: What, Why, When, Where and How? Mike Kenyon University of Glasgow.

1 ATLAS Distributed Computing Tutorial Tags: What, Why, When, Where and How? Mike Kenyon University of Glasgow

2 Tags What are tags? Why have them? When are they produced? Where are they? How can they be used?

3 What are Event Tags? Event-level metadata: summary information about events, with a “pointer” to the corresponding event data in AOD/ESD/RDO format –Useful for selecting events for physics analysis –Should be no bigger than 1 kB per event (~1% of AOD size)
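
The size budget on this slide can be turned into a quick estimate. The sketch below is only arithmetic on the two figures quoted above (1 kB per event, about 1% of AOD size); the event count of one million is an arbitrary assumption for illustration, not a number from the tutorial.

```python
# Rough storage estimate from the figures quoted on the slide.
TAG_BYTES_PER_EVENT = 1_000   # upper bound from the slide: 1 kB per event
AOD_TO_TAG_RATIO = 100        # "~1% AOD size" means AOD is ~100x larger


def sample_sizes(n_events):
    """Return (tag_bytes, aod_bytes) for a sample of n_events events."""
    tag_bytes = n_events * TAG_BYTES_PER_EVENT
    aod_bytes = tag_bytes * AOD_TO_TAG_RATIO
    return tag_bytes, aod_bytes


# Hypothetical sample of one million events (assumed for illustration).
tag_bytes, aod_bytes = sample_sizes(1_000_000)
print(f"tags: {tag_bytes / 1e9:.1f} GB, AOD: {aod_bytes / 1e9:.1f} GB")
```

Under these assumptions a selection can scan about 1 GB of Tags instead of about 100 GB of AOD, which is the motivation given on the next slide.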

4 Why have Event Tags? To make physicists’ lives easier and analysis faster –Allows you to exclude uninteresting events from the data sample used for analysis without searching through AOD/ESD files –Samples of specific interest to an analysis can be extracted into a smaller set of files for repeated running –Provides a global view of the data, useful for data mining. Tags are not meant to be analysed directly

5 Tag Use Cases Some physicist use cases: –Using official Tags with a query in job options –Using a local Tag “database” for preliminary analysis –Using the global Tag database to look for events –Using the global Tag database to build an input list for Athena jobs

6 What do they look like? The LCG POOL infrastructure is used to store Tags –Hence use of “collection” terminology. They exist in 2 forms: –ROOT files –Relational database (MySQL and Oracle). Why keep 2 forms? –ROOT files useful for local work –DB useful for queries, global view of data. Tag content: collection information + event information

7 Tag Content Collection Information –Collection ID, AOD/ESD/RDO references. Global Event Quantities –Event no., run no., no. of tracks, missing ET, etc. Trigger Decisions. Electrons, Photons, Muons –Number, PT, etc. Jets, Taus –Number, PT, etc.
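
To make the content list above concrete, here is a minimal sketch of one tag record as a Python dataclass. Every field name, and the example values, are hypothetical and chosen only to mirror the categories on the slide; the real attribute names have to be looked up with the collection tools.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EventTag:
    """Illustrative event tag; all field names are hypothetical."""

    # Collection information
    collection_id: str
    aod_ref: str                 # "pointer" back to the event in an AOD file
    # Global event quantities
    run_number: int
    event_number: int
    n_tracks: int
    missing_et_gev: float
    # Trigger decisions, keyed by an assumed trigger name
    trigger_decisions: Dict[str, bool] = field(default_factory=dict)
    # Per-object summaries: one transverse momentum per object
    electron_pt_gev: List[float] = field(default_factory=list)
    jet_pt_gev: List[float] = field(default_factory=list)

    @property
    def n_electrons(self) -> int:
        return len(self.electron_pt_gev)


# Made-up example record.
tag = EventTag(
    collection_id="example_collection",
    aod_ref="example_aod_file#42",
    run_number=1,
    event_number=42,
    n_tracks=37,
    missing_et_gev=25.0,
    trigger_decisions={"example_trigger": True},
    electron_pt_gev=[31.5, 22.0],
    jet_pt_gev=[60.2],
)
print(tag.n_electrons, tag.missing_et_gev)
```

The point of the layout is that a tag holds only a few numbers per event plus a reference, which is how it stays within the size budget.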

9 When are Tags produced? Written to ROOT files at Tier 0 during AOD production – “Explicit collections”. Data then imported into central relational database (Oracle at CERN). Database replicated to Tier 1 and lower –Oracle where available; MySQL otherwise. Users can create their own tag files

11 Sample Queries General Collection Information –How many events in collection A? –What are the names and types of Tag attributes? –What production task(s) produced these Tags? Content Queries –Give me all events with at least 2 electrons and missing ET > 10 GeV which are ‘good for physics’ Summary Queries –Give me the number of events for some content query –Give me sum of the luminosity for some content query
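
The three kinds of query above can be sketched against a toy relational table. This example uses Python's built-in sqlite3 module as a stand-in for the real Oracle/MySQL Tag database; the table name, column names and rows are all invented for illustration and do not reflect the actual Tag schema.

```python
import sqlite3

# In-memory toy database standing in for the relational Tag database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tags ("
    " run_number INTEGER, event_number INTEGER,"
    " n_electrons INTEGER, missing_et_gev REAL,"
    " good_for_physics INTEGER)"
)
rows = [
    (1, 1, 2, 25.0, 1),
    (1, 2, 1, 40.0, 1),
    (1, 3, 3, 12.0, 0),
    (2, 1, 2, 8.0, 1),
    (2, 2, 4, 15.0, 1),
]
conn.executemany("INSERT INTO tags VALUES (?, ?, ?, ?, ?)", rows)

# General collection information: event count and attribute names/types.
total = conn.execute("SELECT COUNT(*) FROM tags").fetchone()[0]
attributes = [(r[1], r[2]) for r in conn.execute("PRAGMA table_info(tags)")]

# Content query: at least 2 electrons, missing ET > 10 GeV, good for physics.
selection = "n_electrons >= 2 AND missing_et_gev > 10 AND good_for_physics = 1"
selected = conn.execute(
    f"SELECT run_number, event_number FROM tags WHERE {selection}"
    " ORDER BY run_number, event_number"
).fetchall()

# Summary query: number of events passing the same content query.
n_selected = conn.execute(
    f"SELECT COUNT(*) FROM tags WHERE {selection}"
).fetchone()[0]

print(total, attributes)
print(selected, n_selected)
```

The same selection string serves both the content query and the summary query, which is the pattern the slide describes: a summary is an aggregate over some content query.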

12 How can Tags be used? Collection tools; Athena; Tag Navigator Tool (TNT)

13 Collection Tools To use Tags in Athena, you need to know what the attributes are. POOL Collection tools can be used for this –Can copy collections, append collections, print list of files used, etc –Allows queries on the input collections. See Tutorial Exercises, part 1

14 Tags in Athena Both ROOT and Relational Tags can be read directly from Athena. You need a file catalogue to find the AOD files, and an Athena version matching the one used to produce the Tags. One can also produce private ROOT Tags from AOD. Focus here is on reading, rather than building, Tags

15 Local Tag Files with Athena jobOptions for event selection look like:

16 Remote Tag Database with Athena Not many Tags available in central database yet –This constrains the exercises somewhat, but we can at least illustrate the principles. jobOptions must include lines like:
EventSelector.InputCollections = ['rome_4312_merge_H12_140_gamgam_AOD_tags']
EventSelector.Connection = 'oracle://atlas_tags/atlas_tags_rome'
EventSelector.CollectionType = 'ExplicitRAL'

17 Tag Navigator Tool (TNT) A utility which aims to allow ATLAS physicists to use the Tag database for analysis. Runs a query on the database and outputs a local ROOT collection. Divides this into a number of sub-collections. Submits user jobs to LCG, one per sub-collection. Output files can be registered as a new DQ2 dataset
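
The "divide into sub-collections, one job per sub-collection" step can be sketched as follows. This is not TNT's actual code or splitting policy, which the slides do not describe; it is a minimal illustration assuming contiguous, near-equal chunks, with a hypothetical function name and made-up (run, event) pairs.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_subcollections(events: Sequence[T], n_sub: int) -> List[List[T]]:
    """Split events into at most n_sub contiguous, near-equal chunks."""
    if n_sub < 1:
        raise ValueError("n_sub must be at least 1")
    # Never create more chunks than there are events (but at least one).
    n_sub = min(n_sub, len(events)) or 1
    base, extra = divmod(len(events), n_sub)
    chunks: List[List[T]] = []
    start = 0
    for i in range(n_sub):
        # The first `extra` chunks take one additional event each.
        size = base + (1 if i < extra else 0)
        chunks.append(list(events[start:start + size]))
        start += size
    return chunks


# Hypothetical query result: (run number, event number) pairs.
selected = [(1, 1), (1, 7), (2, 3), (2, 9), (3, 4)]
subcollections = split_into_subcollections(selected, 2)

# One job description per sub-collection; submission itself is not modelled.
jobs = [{"job_id": i, "events": chunk} for i, chunk in enumerate(subcollections)]
for job in jobs:
    print(job)
```

Keeping the chunks contiguous means events from the same run tend to land in the same job, which is one plausible design choice; a real tool might instead group by input file.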

18 What’s there now? There is still a lot of work to be done to get an efficient Tag system running –Currently running performance / scalability tests on central database. Need Tags to be produced and loaded into database as a matter of course. Tag database from Rome workshop is still there, now awaiting Tags from Streaming Tests

19 And finally… Tags will become ever more useful as real data appears. Infrastructure is still being developed. Wednesday’s exercises are aimed at familiarisation with ideas and methods

