ATLAS Analysis Overview Eric Torrence University of Oregon/CERN 10 February 2010 Atlas Offline Software Tutorial.


1 ATLAS Analysis Overview Eric Torrence University of Oregon/CERN 10 February 2010 Atlas Offline Software Tutorial

2 Outline
This talk gives a broad outline of how analyses within ATLAS should be done for first data. Technical details will be covered by other speakers.
- ATLAS data types and locations
- Derived samples (dAOD/dESD)
- Data sample selection
- Luminosity

3 AMFY Task Force
Analysis Model for the First Year (AMFY) - final report October 2009
J. Boyd, A. Gibson, S. Hassani, R. Hawkings, B. Heinemann, S. Paganis, J. Stelzer, T. Wengler
See the talk by Thorsten Wengler at the Barcelona ATLAS Week: http://indico.cern.ch/materialDisplay.py?contribId=73&sessionId=11&materialId=slides&confId=47256
Or the final AMFY report: http://cdsweb.cern.ch/record/1223952

4 Analysis Data Formats
- RAW - specialized detector studies (e.g. alignment)
- ESD - detector performance studies
- AOD - starting point for physics analyses (~1/10 of ESD)
All events are processed from RAW to AOD.
Processing is separated by trigger-based data STREAM.
Physics streams: Egamma, Jet - Tau - MET, Muon, Minbias, Bphysics
Streams are inclusive, e.g. all electron triggers are written to the e/gamma stream.
[Diagram: RAW, ESD, AOD]

5 Derived Samples
Derived samples (dESD, dAOD) aim to reduce sample size further by removing events (skimming) and/or removing info (slimming/thinning).
Many versions/variants are possible for specific uses.
Skimming is based on offline quantities and/or trigger.
- dAOD (Physics DPD), e.g. diMuon (2 mu, pT > 15 GeV)
- dESD (Perf. DPD), e.g. eGamma (CaloCells near e/gamma triggers)
Currently popular: dESD_COLLCAND
[Diagram: RAW, specialised RAW samples, ESD, AOD, AODFix, dESD, dAOD; production levels: Central Production, Group Production, Small Group/User]
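To make the difference between skimming and slimming concrete, here is a minimal Python sketch on toy events. The event layout (plain dicts with `muon_pt` and `calo_cells` keys) and the function names are hypothetical and are not an ATLAS data format or tool; only the diMuon criterion (2 mu, pT > 15 GeV) is taken from the slide.

```python
# Toy illustration only: events are plain dicts, not real AOD objects.
events = [
    {"run": 1, "lb": 1, "muon_pt": [22.0, 17.5], "calo_cells": [0.1, 0.4]},
    {"run": 1, "lb": 1, "muon_pt": [30.0], "calo_cells": [0.2]},
    {"run": 1, "lb": 2, "muon_pt": [16.0, 12.0, 19.0], "calo_cells": [0.3]},
]

def skim_dimuon(events, pt_min=15.0, n_min=2):
    """Skimming: drop whole events that fail the selection."""
    return [e for e in events
            if sum(pt > pt_min for pt in e["muon_pt"]) >= n_min]

def slim(events, keep=("run", "lb", "muon_pt")):
    """Slimming: keep every event but drop information that is not needed."""
    return [{k: e[k] for k in keep} for e in events]

skimmed = skim_dimuon(events)   # fewer events, same content per event
derived = slim(skimmed)         # same events, less content per event
print(len(events), "->", len(derived), "events; keys:", sorted(derived[0]))
```

Note that the run and LB numbers are kept through the slimming step, since later bookkeeping needs them.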

6 User Data
- ESD, AOD, dESD - stored on the Grid in ATLAS space; production managed centrally
- dAOD, ntuples - stored on group disk or in local user space; managed by groups/users
Each group has a grid space manager and a production manager. Good to find out who these people are...
[Diagram: RAW, ESD, AOD, AODFix, dESD, dAOD, ntuple; Performance Analysis, Physics Analysis]

7 Databases and Metadata
The analysis model assumes any Athena job (ESD/dESD, AOD, dAOD) needs access to conditions data (e.g. COOL).
- Detector geometry, conditions, alignments
- Data quality and trigger configuration metadata
- InSituPerformance information
- Dataset information (AMI)
- TAG database (TAG)
Even a purely ntuple-based analysis may need some of this.
Potential bottleneck, hassle for end users.
Running on MC is typically easier than on data.
Be aware of external files referenced by the DB...
[Diagram: Oracle server at nearest Tier-1; Tier-2, Tier-3, desktop, laptop]

8 Database Access
Primary information source: Oracle server (Tier 1).
Don’t want all users to hammer away on these machines.
Solution: FroNTier server/squid cache.
Being tested/deployed now across ATLAS - need operational experience.
Topic for a future tutorial!
[Diagram labels: Client machine (ATHENA, IOVDbSvc, COOL, CORAL, FroNTier client, squid); Server site (squid, Tomcat, FroNTier servlet, Oracle DB server); Oracle, sqlite, MySQL; Get File From ROOT Via POOL; ATLAS, LCG; Query/Reply Path; David Front]

9 Selecting your data
Time range/machine conditions
- Collision energy: 7 TeV
- All 2010 data approved for Summer conferences
Detector/offline data quality (DQ flags)
- Data periods with good electrons in barrel
- Data with pixels turned on
Trigger configuration
- EF_e20_loose active
- Both e20 and mu10 active and unprescaled
All analyses must define their data sample.
Key ingredient in defining luminosity.

10 Luminosity Blocks
ATLAS runs are subdivided into Luminosity Blocks (~1 min).
The LB is the atomic unit for selecting a data sample.
Most conditions are mapped to specific luminosity blocks
- Trigger prescales (can only change at an LB boundary)
- Data Quality flags (mapped to LBs offline)
Luminosity can only be determined for a specific set of runs/luminosity blocks.
Specifying your data sample functionally means specifying a list of runs/LBs to analyze, aka a GoodRunList (GRL).
[Diagram: Run 165789, Lumi Blocks 1 to 10]
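Conceptually, a GoodRunList is a map from run number to accepted luminosity-block ranges, and event selection is a membership test against it. The sketch below assumes a hypothetical in-memory structure and function names; run 165789 appears on the slide, but the LB ranges are invented for illustration.

```python
# Hypothetical in-memory GoodRunList: run number -> inclusive LB ranges.
# Run 165789 appears on the slide; the ranges are made up for illustration.
good_run_list = {
    165789: [(1, 4), (7, 10)],
}

def in_grl(grl, run, lb):
    """Return True if (run, lb) falls in one of the accepted LB ranges."""
    return any(first <= lb <= last for first, last in grl.get(run, []))

def select_events(events, grl):
    """Keep only events recorded in a good luminosity block."""
    return [e for e in events if in_grl(grl, e["run"], e["lb"])]

events = [
    {"run": 165789, "lb": 2},
    {"run": 165789, "lb": 5},   # LB 5 is outside the accepted ranges
    {"run": 165790, "lb": 1},   # run is not in the list at all
]
selected = select_events(events, good_run_list)
print(selected)
```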

11 Data Quality Flags
DQ flags are simple indicators of data quality, but they come from many, many sources (I counted 102...)
- Detectors (divided by barrel, 2 endcaps)
- Trigger (by slice)
- Offline combined performance (e/mu/tau/jet/MET/...)
Physics analyses should not arbitrarily choose DQ flags.
Combined performance groups will define a recommended set of DQ flag criteria for physics objects (e.g. barrel electrons or forward jets) - soon with virtual flags.
Users in their working group will decide which set of (virtual) DQ flags is needed for each analysis, which can then be used to generate a GoodRunList.
Standard lists will be centrally produced by the DQ group.
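One way to picture how a set of DQ flags turns into a list of good luminosity blocks is sketched below. Everything here is an assumption made for illustration: the flag names, the colour values, and the idea of modelling a virtual flag as a named set of required primary flags are hypothetical and do not describe the actual DQ database or tools.

```python
# Hypothetical per-LB DQ flags for one run; flag names and values are invented.
dq_flags = {
    1: {"pixel": "green", "lar_barrel": "green", "egamma_barrel": "green"},
    2: {"pixel": "green", "lar_barrel": "red", "egamma_barrel": "yellow"},
    3: {"pixel": "green", "lar_barrel": "green", "egamma_barrel": "green"},
}

# A "virtual flag" modelled here as a named set of required primary flags.
virtual_flags = {
    "barrel_electrons": ("pixel", "lar_barrel", "egamma_barrel"),
}

def good_lbs(flags_by_lb, required, ok=("green",)):
    """Return the sorted LBs where every required flag has an accepted value."""
    return sorted(lb for lb, flags in flags_by_lb.items()
                  if all(flags.get(name) in ok for name in required))

lbs = good_lbs(dq_flags, virtual_flags["barrel_electrons"])
print(lbs)
```

Keeping the flag combination in one named definition, rather than in each analysis, is what lets a working group change the recommendation in a single place.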

12 Reprocessing
Data processed at Tier 0 (right off the detector) have preliminary calibrations and first-pass DQ flags.
Any physics results must start with reprocessed data
- Updated calibrations
- Consistent release/bug fixes
DQ flags are re-evaluated, tagged and locked after each reprocessing - flags are fixed, but you must use the correct tag!
Important to always access DQ information from the COOL database, using the tag appropriate to your data processing version, or else you won’t get consistent results.
Best to use officially generated (static) Good Run Lists appropriate for your reprocessed data.

13 Luminosity Calculation
In ATLAS, Lumi is calculated for a specific set of LBs and includes LB-dependent corrections for
- Trigger prescales
- L1 trigger deadtime
The user-derived ε_sel must contain all other event-dependent efficiencies, including
- Unprescaled trigger efficiency (vs. pT, for example)
- Skimming efficiency (in dAOD production)
- TAG selection efficiency (in TAG-based analyses)
- Event selection cuts
σ = (N_sel - N_bgd) / (ε_sel × Lumi)
Final luminosity can only be calculated after the full GRL and trigger are specified.
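The split between LB-dependent corrections (inside Lumi) and event-dependent efficiencies (inside the selection efficiency) can be sketched numerically. The per-LB numbers below are invented, and the way the corrections enter (delivered luminosity times L1 live fraction, divided by the trigger prescale) is an assumption made for illustration, not a description of how LumiCalc is implemented.

```python
# Toy per-LB records; all numbers are invented for illustration.
# Assumption: each LB's delivered luminosity is scaled by the L1 live
# fraction and divided by the trigger prescale.
lumi_blocks = [
    {"lb": 1, "delivered_nb": 0.50, "live_fraction": 0.98, "prescale": 1.0},
    {"lb": 2, "delivered_nb": 0.52, "live_fraction": 0.95, "prescale": 1.0},
    {"lb": 3, "delivered_nb": 0.48, "live_fraction": 0.97, "prescale": 2.0},
]

def corrected_lumi(blocks):
    """Sum the prescale- and deadtime-corrected luminosity over LBs (nb^-1)."""
    return sum(b["delivered_nb"] * b["live_fraction"] / b["prescale"]
               for b in blocks)

def cross_section(n_sel, n_bgd, eff_sel, lumi):
    """sigma = (N_sel - N_bgd) / (eff_sel * Lumi); in nb if lumi is in nb^-1."""
    if eff_sel <= 0 or lumi <= 0:
        raise ValueError("efficiency and luminosity must be positive")
    return (n_sel - n_bgd) / (eff_sel * lumi)

lumi = corrected_lumi(lumi_blocks)
sigma = cross_section(n_sel=120, n_bgd=20, eff_sel=0.4, lumi=lumi)
print(f"Lumi = {lumi:.4f} nb^-1, sigma = {sigma:.1f} nb")
```

Here `eff_sel` stands for the product of all event-dependent efficiencies listed on the slide (trigger efficiency, skimming, TAG selection, event selection cuts).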

14 Example Physics Analysis Models
Direct AOD/dAOD
- User-submitted Grid job runs over all AODs or group-generated dAOD/dESD samples
- Final user ntuples produced directly
TAG-based analysis
- Start with a TAG-selected sample (see TAG tutorial)
- Saves having to run over large datasets
Ntuple-based analysis
- Start with a general-purpose group ntuple where not all data selection criteria have been applied
- Probably most complicated, but tools support this also
Will concentrate on the direct AOD/dAOD case (the subject of the following slides); other cases are not so different.

15 AOD/dAOD analysis
A pre-run query based on input parameters already defines the GRL (instantiated as an XML file).
atlas-runquery (or the TAG browser) is currently the best way to do this - eventually most users will start with pre-made lists.
Can find the ‘expected’ luminosity already without running a single job.
Needs COOL access, but can be run just once; XML GRL files can be transferred to your local area/laptop.
[Diagram: pre-run query. Data sample definition (time range, DQ query, trigger, ...) -> RunQuery (COOL) -> GRL (XML file) -> LumiCalc (COOL) -> expected luminosity]
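Since the GRL is just an XML file, it can be read with standard tools once it has been copied locally. The sketch below uses Python's standard `xml.etree.ElementTree`; the XML layout (`LumiBlockCollection`, `Run`, and `LBRange` with `Start`/`End` attributes) and the LB ranges are assumptions made for illustration, so check a real file produced by atlas-runquery before relying on them.

```python
import xml.etree.ElementTree as ET

# Assumed GRL layout, for illustration only; a real file may differ.
GRL_XML = """
<LumiRangeCollection>
  <NamedLumiRange>
    <LumiBlockCollection>
      <Run>165789</Run>
      <LBRange Start="1" End="4"/>
      <LBRange Start="7" End="10"/>
    </LumiBlockCollection>
  </NamedLumiRange>
</LumiRangeCollection>
"""

def parse_grl(xml_text):
    """Return {run: sorted list of LB numbers} from a GRL XML string."""
    grl = {}
    root = ET.fromstring(xml_text.strip())
    for coll in root.iter("LumiBlockCollection"):
        run = int(coll.findtext("Run"))
        lbs = grl.setdefault(run, set())
        for rng in coll.findall("LBRange"):
            # Start and End are treated as inclusive bounds.
            lbs.update(range(int(rng.get("Start")), int(rng.get("End")) + 1))
    return {run: sorted(lbs) for run, lbs in grl.items()}

grl = parse_grl(GRL_XML)
print(grl)
```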

16 AOD/dAOD analysis
No automatic tool (yet) to convert a GRL to an AOD DataSet (although TAG does this for you).
Using standard tools, you will get an output GRL - very useful to compare against the input for cross-checks.
A copy of the GRL will also be saved to your output ntuple if made with the PAT Ntuple Dumper*.
[Diagram: Input GRL (XML file), AMI, DataSet (AOD files), distributed analysis on the Grid (Ganga/pAthena jobs), output merge, Output GRL (XML file), Ntuple with GRL]

17 AOD/dAOD analysis
Comparing input and output XML files is very useful to check for job failures and dataset consistency.
Output shows exactly which LBs were analyzed
- run LumiCalc directly on the ntuple (with COOL access)
- extract the XML file, copy it to CERN, and run LumiCalc there
[Diagram: bookkeeping. Input GRL (XML file) compared with Output GRL (XML file) and Ntuple GRL (XML file); LumiCalc (COOL) on the output or ntuple GRL gives the observed luminosity]
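The input/output comparison amounts to a set difference over (run, LB) pairs. The following is a minimal sketch, assuming both GRLs have already been read into dicts of the form `{run: [LB, ...]}`; the function names and the example numbers are hypothetical and are not part of any ATLAS tool.

```python
def missing_lbs(input_grl, output_grl):
    """Return {run: sorted LBs} requested in the input but absent from the output."""
    missing = {}
    for run, lbs in input_grl.items():
        lost = set(lbs) - set(output_grl.get(run, ()))
        if lost:
            missing[run] = sorted(lost)
    return missing

def unexpected_lbs(input_grl, output_grl):
    """LBs present in the output that were never requested in the input."""
    return missing_lbs(output_grl, input_grl)

# Invented example: LBs 6-8 never made it to the output (e.g. a failed job).
input_grl = {165789: [1, 2, 3, 4, 5, 6, 7, 8]}
output_grl = {165789: [1, 2, 3, 4, 5]}

missing = missing_lbs(input_grl, output_grl)
unexpected = unexpected_lbs(input_grl, output_grl)
print("missing:", missing)
print("unexpected:", unexpected)
```

A non-empty `missing` result points to failed jobs or incomplete datasets; a non-empty `unexpected` result points to a dataset that does not match the requested sample.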

18 Sparse Data Problem
Even if an LB selects no events, the LB still contributes to luminosity.
The GRL selection tool outputs LB metadata (also in the ntuple) even when no events are selected.
Derived (dAOD) samples must also obey this requirement.
Must merge all job output to correctly include all LBs considered - working to include this in the standard distributed analysis merger.
[Diagram: three Grid jobs. Job 1: input LB 1-3, output 1 event. Job 2: input LB 4-5, output 0 events. Job 3: input LB 6-8, crashed. Total LBs analyzed: 1-5]
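The job example on this slide can be reproduced with a small merge over per-job LB metadata. The record layout and function name below are hypothetical, and the run number is borrowed from an earlier slide purely for illustration; the point is that the LBs of a job with zero selected events are still merged, while the LBs of a crashed job are not.

```python
# Toy job outputs mirroring the slide; job 3 crashed and returned nothing.
job_outputs = [
    {"job": 1, "run": 165789, "lbs": [1, 2, 3], "n_events": 1},
    {"job": 2, "run": 165789, "lbs": [4, 5], "n_events": 0},  # no events, LBs still count
    None,                                                      # job 3 (LB 6-8) crashed
]

def merge_outputs(outputs):
    """Merge LB metadata from all finished jobs, even those with zero events."""
    merged_lbs = {}
    total_events = 0
    for out in outputs:
        if out is None:          # crashed job: its LBs were not analyzed
            continue
        merged_lbs.setdefault(out["run"], set()).update(out["lbs"])
        total_events += out["n_events"]
    return {run: sorted(lbs) for run, lbs in merged_lbs.items()}, total_events

analyzed, n_events = merge_outputs(job_outputs)
print(analyzed, n_events)
```

Dropping job 2 because it produced no events would remove LBs 4-5 from the merged list and bias the luminosity low relative to what was actually analyzed.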

19 Other use cases
Easiest to make the entire selection (complete GRL specification) before launching the AOD/dAOD job.
For many reasons, people may start with samples with partial (or no) GRL specifications applied
- skimmed group dAODs
- group ntuples
- TAG-selected data*
The scheme still works, as long as samples have been produced with the standard tools - metadata about all LBs contributing to the sample is still present and consistent.
Users must still apply the final GRL specification (including trigger) to create a sample with well-defined luminosity.
Tools also work at the ntuple level, but you must save the LB number.

20 Conclusions
The analysis model for the first year has undergone considerable discussion and development recently.
There may still be some rough edges.
A general scheme for simple cases now exists.
Work is ongoing to support more advanced use cases.
Tools are available to make this easier for the end user - many more technical details shown today.
There are many ways to screw this up; use central tools and central GRLs as much as possible.
If tools don’t seem to do what you want, let a developer know!

