Presentation transcript:

ATLAS Peter van Gemmeren (ANL) for many

xAOD

• Replaces AOD and DPD
  – Single EDM: no T/P conversion when writing, but versioning
    – The transient EDM is a typedef of the most recent persistent EDM (see the sketch below)
    – Versioning similar to the old ‘_p’ convention, but with ‘_v’ suffixes
  – Limited support for schema evolution
  – Readable in Athena and outside of Athena with a small number of libraries loaded
  – xAOD files are browsable in TBrowser without EDM libraries loaded
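A minimal sketch of the ‘_v’ typedef convention described above, using a hypothetical xAOD::MyParticle class rather than an actual ATLAS EDM type: the transient name is only a typedef to the latest persistent version, so writing needs no T/P conversion, and reading older versions relies on schema evolution.

```cpp
// Hypothetical illustration of the xAOD '_v' versioning convention.
namespace xAOD {
   class MyParticle_v1 { /* first persistent layout */ };
   class MyParticle_v2 { /* current persistent layout */ };

   // Transient EDM name: always a typedef of the most recent persistent version.
   typedef MyParticle_v2 MyParticle;
}
```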

The xAOD design

• The new xAOD object has 3 components:
  1. xAOD interface class
     – Front end for the user, a proper C++ object
     – May be without any data members
  2. Auxiliary Store – static data container
     – Predefined C++ type (similar to egDetails…)
     – Has a dictionary
  3. Auxiliary Store – dynamic extension
     – Dynamic structure, attributes can be added at any time
     – Extension to DataVector
     – No dictionary: needs special handling in the persistency layer! (A simplified illustration follows below.)
       » The (former POOL) RootStorageSvc will assign a TBranch to each attribute.
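A simplified, self-contained illustration of this three-part split (not the ATLAS implementation): an interface class with no payload data members, a static auxiliary store with predefined members, and a dynamic extension keyed by attribute name, which the persistency layer would map to one TBranch per attribute. All class and member names here are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// 2. Static aux store (predefined members, would have a dictionary) plus
// 3. dynamic extension (attributes addable at any time, no dictionary).
struct MyParticleAuxContainer {
   std::vector<float> pt;
   std::vector<float> eta;
   std::map<std::string, std::vector<float>> dynamicAttributes;
};

// 1. Interface class: the user-facing front end, holding no payload itself.
class MyParticle {
public:
   MyParticle(const MyParticleAuxContainer *store, std::size_t index)
      : m_store(store), m_index(index) {}
   float pt() const  { return m_store->pt[m_index]; }
   float eta() const { return m_store->eta[m_index]; }
   // Access to a dynamically added attribute by name.
   float auxdata(const std::string &name) const {
      return m_store->dynamicAttributes.at(name)[m_index];
   }
private:
   const MyParticleAuxContainer *m_store; // data lives in the aux store
   std::size_t m_index;
};
```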

ROOT browsing

(Slide shows only a screenshot of an xAOD file opened in ROOT; no transcribed text.)
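A minimal sketch of how such browsing can be reproduced, following the earlier claim that xAOD files are browsable in TBrowser without EDM libraries; the file name is a placeholder.

```cpp
#include "TBrowser.h"
#include "TError.h"
#include "TFile.h"

// Open an xAOD file and browse it interactively; no EDM libraries are loaded.
void browse_xaod() {
   TFile *f = TFile::Open("xAOD.pool.root", "READ"); // placeholder file name
   if (!f || f->IsZombie()) {
      ::Error("browse_xaod", "could not open input file");
      return;
   }
   new TBrowser("b", f); // inspect the event tree and its auxiliary branches
}
```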

Performance metrics

• The xAOD EDM is done and will have some influence on performance, especially I/O.
• ROOT provides another layer of optimization that is crucial to I/O performance and has not yet been studied for the xAOD (we did this for the AOD some years ago and saw good results):
  – Write optimization parameters such as streaming/splitting, basket size / AutoFlush, compression… (a sketch of these knobs follows below)
  – Read optimization with TTreeCache
• The goal of the performance tests will be to find good settings.
• The xAOD format is used at different stages of the workflow:
  – Primary xAOD (replacement of the AOD) as output of reconstruction.
  – Derived xAOD (~replacement of the D3PD) as output of the TF2 derivation framework.
  – Expect the use cases (and storage layout) to differ between primary and derived xAOD.
• And don't forget upstream consequences of the xAOD, e.g. ESD/AOD shared data types!
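A hedged sketch of the write-side ROOT knobs mentioned above (compression level, AutoFlush, basket size, per-branch splitting). File, tree and branch names as well as the numeric values are illustrative, not the ATLAS production settings.

```cpp
#include "TFile.h"
#include "TTree.h"

void write_settings_sketch() {
   TFile f("out.root", "RECREATE");
   f.SetCompressionLevel(6);              // compression level for all baskets in this file

   TTree tree("CollectionTree", "xAOD-like tree");
   float pt = 0.f;
   tree.Branch("pt", &pt, "pt/F");        // split level is chosen per Branch() call for objects
   tree.SetAutoFlush(10);                 // flush (and size baskets for) every 10 entries
   tree.SetBasketSize("*", 16 * 1024);    // cap the basket size of all branches

   for (int i = 0; i < 100; ++i) {        // toy fill loop
      pt = 1.f * i;
      tree.Fill();
   }
   f.Write();
}
```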

Branches, Baskets and Compression for xAOD

• The primary xAOD has 2,119 Branches/Leaves:
  – 204 for core and other objects
  – 1,066 for concrete Auxiliary store objects
    – 50 objects, currently fully split*, but [c|sh]ould be streamed member-wise (at least for primary data)
  – 849 for dynamic Auxiliary store attributes
    – Unfortunately these cannot be reduced.
• Each Leaf has its own basket for compression; the basket sizes were optimized with AutoFlush = 10, i.e. to hold a small number of events.
  – Good for primary data and event-wise reading.
  – Virtual memory needed for 10 events of decompressed data: 4 MB for the dynamic store, 33.5 MB for the concrete store.
• The compression factor for typical data is 3–4; anything much higher than that indicates redundant data and wastes CPU and memory.
  – 12 branches with a compression factor > 100! Needs fixing (e.g. each takes ~0.5 MB VMEM), but the number seems to have come down. (A per-branch survey sketch follows below.)
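A sketch of how such anomalously compressible branches can be spotted: loop over the top-level branches of the event tree and print those whose compression factor exceeds a threshold. The file and tree names are placeholders for an xAOD file.

```cpp
#include <cstdio>
#include "TBranch.h"
#include "TFile.h"
#include "TObjArray.h"
#include "TTree.h"

// Report top-level branches compressing much better than the typical 3-4x,
// which hints at redundant content in the payload.
void branch_compression_survey(const char *fname = "xAOD.pool.root",
                               const char *tname = "CollectionTree") {
   TFile *f = TFile::Open(fname);
   if (!f || f->IsZombie()) return;
   TTree *t = nullptr;
   f->GetObject(tname, t);
   if (!t) return;

   TObjArray *branches = t->GetListOfBranches();   // top-level branches only
   for (int i = 0; i < branches->GetEntriesFast(); ++i) {
      TBranch *b = static_cast<TBranch *>(branches->At(i));
      const Long64_t zip = b->GetZipBytes();       // bytes on disk
      const Long64_t tot = b->GetTotBytes();       // bytes before compression
      const double factor = zip > 0 ? double(tot) / double(zip) : 0.;
      if (factor > 100.)
         std::printf("%-60s compression factor %.1f\n", b->GetName(), factor);
   }
}
```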

Potential Concerns, or Opportunities for Enhancement :-)

• Too many Branches could hurt read speed!
  – Without a cache, or during the cache learning phase, each branch costs a separate disk read.
  – Caching somewhat counteracts StoreGate's on-demand retrieval.
• Large Baskets consume lots of VMEM.
  – Surprised to find that even for a small AutoFlush of 10 events, the total basket size was 40 MB!
• Highly compressible data will cause large Baskets, using up memory and CPU time.
  – Especially for downstream/analysis xAOD, where the AutoFlush setting will be larger.
  – For combined ntuple writing they also seemed responsible for large jumps in memory consumption.
• Changes to the xAOD have a significant effect on the ESD and need to be studied there as well.

Priorities for future ROOT Enhancements: Caching

• ATLAS data (especially the xAOD) has to efficiently support multiple use cases:
  – Ranging from selective row-wise access (e.g. AthenaMP), to sequential row-wise access (e.g. production, TF2 derivation), to many-column access (e.g. analysis) and few-column access (e.g. histogramming).
• Caching is an important ingredient for performing all of these operations on the same data file, and ATLAS has seen huge benefits from TTreeCache (a usage sketch follows below).
  – Work by DavidS on enabling the cache by environment variable will help maximize the benefits.
• Ideas to enhance the caching capabilities in ROOT:
  – Read: hierarchical cache:
    – Long columns for the few event/entry selection attributes in the TTree, shorter columns for the rest of the accessed branches.
    – Smart (object-aware) caching?
  – Write: AutoFlush and a cache for writing.
    – IlijaV had provided a better basket-size algorithm, which should be evaluated / implemented.
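A minimal read-side sketch of the TTreeCache usage referred to above. Cache size, learning-phase length and the file/tree names are placeholders; an analysis would typically register only the branches it actually reads instead of "*".

```cpp
#include "TFile.h"
#include "TTree.h"

void read_with_cache(const char *fname = "xAOD.pool.root") {
   TFile *f = TFile::Open(fname);
   if (!f || f->IsZombie()) return;
   TTree *t = nullptr;
   f->GetObject("CollectionTree", t);
   if (!t) return;

   t->SetCacheSize(30 * 1024 * 1024);   // 30 MB TTreeCache
   t->SetCacheLearnEntries(10);         // short learning phase
   t->AddBranchToCache("*", kTRUE);     // or only the branches this job reads
   const Long64_t n = t->GetEntries();
   t->SetCacheEntryRange(0, n);

   for (Long64_t i = 0; i < n; ++i) {
      t->GetEntry(i);                   // baskets now arrive via prefetched cache reads
   }
   f->Close();
}
```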

Priorities for future ROOT Enhancements: Compression

• Disk storage is already the biggest ATLAS computing expense and the main cost driver in any future upgrade. Decisions on reducing data content are unavoidable, but painful because of their impact on physics (and/or workflow).
• Compression is important for ATLAS to reduce redundancy in the data (current data has a compression factor of about 3–4).
  – ROOT's Double32_t is being investigated to help restore some of the storage efficiency lost by dropping the T/P layer (see the sketch below).
    – Some double attributes in the ATLAS EDM should be handled as Double32_t.
    – Need equivalent support for Float16_t as well.
  – Support for a varint-like data type.
  – ATLAS has not yet systematically studied LZMA versus the current ROOT-provided compression.
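A hedged sketch of the two ROOT features named above: a Double32_t member whose precision is limited via the standard trailing-comment range syntax (this only takes effect once a dictionary is generated and the object is streamed), and a file written with LZMA instead of the default algorithm. The class, its ranges and the file name are hypothetical, not ATLAS EDM definitions or settings.

```cpp
#include "Compression.h"
#include "Rtypes.h"
#include "TFile.h"

// Hypothetical class with reduced-precision members; the trailing comments are
// read by rootcling when generating the dictionary.
class PackedPoint {
public:
   Double32_t fPhi;    //[-3.1416,3.1416,16] packed into 16 bits over this range
   Double32_t fEnergy; //[0,0,14] stored as a float with a 14-bit mantissa
};

void write_lzma_sketch() {
   TFile f("lzma.root", "RECREATE");
   f.SetCompressionAlgorithm(ROOT::kLZMA); // instead of the default zlib
   f.SetCompressionLevel(5);
   // ... book a TTree with PackedPoint branches and fill as usual ...
   f.Write();
   f.Close();
}
```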

Error handling and related topics

• Trying to improve robustness and error handling wherever possible, to support intelligent retry, and to ensure that errors do not go undetected/unreported.
  – The latter is one component of a broader effort to improve monitoring and information flow among ATLAS layers.
• ATLAS is relying increasingly on robust wide-area data access for analysis.
• Have seen cases of undetected I/O errors:
  – Andy H has pointed out examples of ROOT not checking for errors reported by plugin (xrootd) layers.
  – And ATLAS code does not always check GetEntry() return codes.
• There are circumstances in which an error in one of N user-submitted jobs might go unnoticed unless the user reads the log files.
• Possibly deep stack of components:
  – Example: xrootd reporting to ROOT reporting to APR (the former POOL layer) reporting to EventLoop or EventSelector or something else reporting to the transform reporting to the pilot reporting to …
• Aside: ROOT tutorials and other examples tend not to check GetEntry() return codes (a checked-read sketch follows below).
  – What is standard practice here?
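A small sketch of checking the GetEntry() return code, since TTree::GetEntry() returns the number of bytes read, 0 if the entry does not exist, and -1 on an I/O error. The function name is illustrative; how the failure is propagated is up to the calling framework.

```cpp
#include "TError.h"
#include "TTree.h"

// Read one entry and report I/O problems instead of silently continuing.
bool read_entry_checked(TTree *tree, Long64_t entry) {
   const Int_t nbytes = tree->GetEntry(entry);
   if (nbytes < 0) {
      ::Error("read_entry_checked", "I/O error reading entry %lld", entry);
      return false;
   }
   if (nbytes == 0) {
      ::Warning("read_entry_checked", "entry %lld does not exist", entry);
      return false;
   }
   return true;
}
```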

Outlook

• ATLAS switched to the new xAOD data model, combining AOD and DPD:
  – Simpler data model
  – More branches
  – No custom compression (T/P layer)
• Important future enhancements in the areas of:
  – Caching
  – Compression
  – Error handling / robustness
• ATLAS is willing to provide effort to help with the development of some of the new features, where possible.