Statistical feature extraction, calibration and numerical debugging Marian Ivanov.

Outlook
  Motivation
  Requirements
  TTreeStream and TTreeSRedirector classes
  Conclusion

Reconstruction
  Statistical data analysis
    Transformation of the space of measurement to the space of physical observables
  Reconstruction is an iterative process
    Reconstruction algorithm itself
      TPC simplified example: raw data -> clusters -> tracks -> V0, kinks
    Development of the reconstruction algorithms
      Starting from simplified models (iteration 0) -> { adding new information -> extracting new features -> new parameterizations -> } towards MIP algorithms

Feedback
  Feedback is necessary in each iteration step
  Feedback: working / not working
    Standard tools: segmentation violation, printf, debugger, memory profiler, memory checker
    Not sufficient for statistical algorithms
  Statistical algorithms have to be debugged in a statistical way
  Feedback is needed in the multidimensional space of observables
    Decomposition and localization of the problem with respect to the observables (efficiency, resolution, ...)
    Better integral characteristics follow as a consequence (feedback iteration 0)
    Standard tools: ROOT tree player, histogramming package, statistical package (extended functionality needed; under development, ROOT, MI)
    ALICE event display (with extended functionality)

Effective development of reconstruction algorithms
  Time for reconstruction algorithm development
    Reconstruction algorithm coding
    Test algorithm coding (numerical debugging)
    Statistical feature extraction (calibration, alignment) algorithm coding
  Time consumption for all three algorithms
    Debugging of the test algorithms
    Feedback data analysis, comparisons

Where do we spend time? (0)
  Reconstruction algorithm coding takes <<1% of development time
  Coding of other algorithms (numerical debugging, feature extraction, calibration)
    Implementation of loops over heterogeneous containers: AliESDs, TreeKine, TreeHits, TreeTR
    Current default approach in the AliRoot framework: copy and paste
    Debugging of the test algorithms

Where do we spend time? (1)
  Statistical analysis (tests and feature extraction, calibration)
  Data access: loops over heterogeneous containers (n^2, n^3 problem)
  >99% of time
  Statistical data analysis
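The cost of looping over heterogeneous containers can be illustrated with a minimal sketch. It assumes that reconstructed tracks and Monte Carlo particles share an integer label; the types RecTrack and McParticle and the function matchByLabel are hypothetical names for illustration, not AliRoot classes. Indexing one container by label turns the nested n^2 loop into a single pass.

```cpp
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical record types for illustration; not AliRoot classes.
struct McParticle { int label; double pt; };
struct RecTrack   { int label; double pt; };

// Pair each reconstructed track with the MC particle carrying the same label.
// Building the index is O(n) and each lookup is O(1) on average,
// instead of an O(n^2) nested loop over both containers.
std::vector<std::pair<RecTrack, McParticle>>
matchByLabel(const std::vector<RecTrack>& rec, const std::vector<McParticle>& mc) {
    std::unordered_map<int, std::size_t> index;
    for (std::size_t i = 0; i < mc.size(); ++i) index.emplace(mc[i].label, i);

    std::vector<std::pair<RecTrack, McParticle>> matched;
    for (const RecTrack& t : rec) {
        auto it = index.find(t.label);
        if (it != index.end()) matched.emplace_back(t, mc[it->second]);
    }
    return matched;
}
```

Tracks without a matching label are simply skipped here; a real comparison would probably also record them, since unmatched tracks are exactly what an efficiency or fake-rate study needs.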

Requirements
  To speed up reconstruction algorithm development and the calibration process, the test and statistical feature extraction (calibration) algorithms should be:
    Reusable (well tested by a group of users)
    Standardized, supported by the framework
  Supporting data structures
    Uncomplicated data storage
    Fast (multidimensional) data access
    Scalable (easy to include new information, observables)
    Fast and universal query language over data

ROOT framework (0)
  The ROOT framework provides classes with functionality fulfilling our requirements
  Data structures
    Uncomplicated data storage (TTree)
    Fast (multidimensional) data access
      TTree, TChain: optimized for sequential data access (random access is much slower)
    Scalable (easy to include new information, observables)
      Easy to include several branches of information; friend TTrees can be used
    Fast and universal query language over data
      TTreePlayer as a powerful query language, object functionality preserved

ROOT framework (1)
  The ROOT framework provides classes with functionality fulfilling our requirements
  Statistical algorithms
    Standardized, reusable, well tested
    Histogramming package
    Statistical package: base algorithms implemented, ongoing development
    Additional functionality on top of TTrees implemented in the ALICE team (efficiency and resolution calculations)
    Some very important robust algorithms independently implemented (1-dimensional robust spline fit; a multidimensional one is needed but not implemented)
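As an illustration of what an efficiency and resolution calculation involves, here is a minimal sketch in plain C++. It is not the ALICE implementation; TrackRecord, Summary and summarize are hypothetical names, and the resolution is taken as the plain (non-robust) RMS of the residuals around their mean.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-track record: whether the MC track was reconstructed
// and, if so, the residual (reconstructed minus true value).
struct TrackRecord { bool found; double residual; };

struct Summary { double efficiency; double mean; double sigma; };

// Efficiency = found / all; resolution = RMS of residuals of found tracks.
Summary summarize(const std::vector<TrackRecord>& records) {
    std::size_t nFound = 0;
    double sum = 0.0, sum2 = 0.0;
    for (const TrackRecord& r : records) {
        if (!r.found) continue;
        ++nFound;
        sum  += r.residual;
        sum2 += r.residual * r.residual;
    }
    Summary s{0.0, 0.0, 0.0};
    if (records.empty() || nFound == 0) return s;
    s.efficiency = static_cast<double>(nFound) / records.size();
    s.mean = sum / nFound;
    const double var = sum2 / nFound - s.mean * s.mean;
    s.sigma = std::sqrt(var > 0.0 ? var : 0.0);  // guard against rounding below zero
    return s;
}
```

A plain RMS is sensitive to outliers, which is one reason the slide asks for robust algorithms; a truncated or median-based estimate would be a natural replacement.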

Numerical debugging (0)
  The possibilities for undetected errors (bugs) in reconstruction and Monte Carlo algorithms are numerous
  A complex system leads to complex calculations
  Errors can be made on many levels
    Logical understanding of the problem
    Typing errors in the programs
    Inconsistent data

Numerical debugging (1)
  The basic principle is to output not only the number we are interested in but also as many other intermediate results as possible, especially those for which we know in advance what answer to expect. Even if we are only interested in the global average of some quantity, print out its dependence on some other interesting quantity. This generally costs little or nothing extra in a big calculation, and may give considerable insight into the system being studied or allow a powerful check of the correctness of the computation. The quantities we should look at will depend on the problem, but the general rule is to examine quantities of interest in more dimensions than is required.
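The principle can be sketched in a few lines of plain C++. The Profile class below is a hypothetical helper, not a ROOT or AliRoot class: besides the global average of a quantity y it keeps the average of y in bins of a second quantity x, so that a dependence hidden by the global number becomes visible.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: mean of y in equal-width bins of x over [lo, hi).
class Profile {
public:
    Profile(std::size_t nbins, double lo, double hi)
        : sum_(nbins, 0.0), n_(nbins, 0), lo_(lo), hi_(hi) {}

    void fill(double x, double y) {
        if (x < lo_ || x >= hi_) return;  // ignore out-of-range entries
        std::size_t bin = static_cast<std::size_t>((x - lo_) / (hi_ - lo_) * sum_.size());
        if (bin >= sum_.size()) bin = sum_.size() - 1;  // guard against rounding
        sum_[bin] += y;
        ++n_[bin];
    }

    // Mean of y in one bin of x (0 if the bin is empty).
    double mean(std::size_t bin) const {
        return n_[bin] ? sum_[bin] / n_[bin] : 0.0;
    }

    // Global mean of y over all filled entries.
    double globalMean() const {
        double s = 0.0;
        std::size_t n = 0;
        for (std::size_t i = 0; i < sum_.size(); ++i) { s += sum_[i]; n += n_[i]; }
        return n ? s / n : 0.0;
    }

private:
    std::vector<double> sum_;     // sum of y per bin
    std::vector<std::size_t> n_;  // number of entries per bin
    double lo_, hi_;
};
```

For example, residuals of +1 for x below 0.5 and -1 above it give a global mean of exactly zero, while the two-bin profile shows the bias immediately.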

TTreeStream & TTreeSRedirector (0)
  Streamer with basic cin streamer and TTree functionality, implemented to speed up the software development process
  Advantages:
    Data structures defined on the fly
    Easy to include new information
    TTree functionality
  Extensively used during development of ITS, TPC, TRD and TOF reconstruction and alignment
  Code committed to the CVS
  Examples and test functions: TTreeStream::Test() and TTreeSRedirector::Test() in TTreeStream.cxx

TTreeStream & TTreeSRedirector (1)
  Example:
  Create the redirector associated with a file (testredirector.root)
    TTreeSRedirector *pmistream = new TTreeSRedirector("testredirector.root");
    TTreeSRedirector &mistream = *pmistream;
  Create the tree with the identifier specified by the first argument
    The layout is specified by the sequence of arguments
    The tree identifier has to be specified as the first argument
    If the tree and layout were already defined, the consistency is checked
    If the data are consistent, the given tree is filled
    The name of a branch can be specified using a string with = at the end
    If the string is not specified, the automatic convention B0, B1, ... Bn is used
    mistream << "TreeIdentifier" << "i=" << i << "ch=" << ch << "f=" << f << "po=" << po << "\n";
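The streaming rules listed above can be mimicked without ROOT. The following sketch is an illustrative stand-in, not the TTreeStream implementation: DebugStream is a hypothetical class that stores rows in memory instead of a TTree, accepts only arithmetic values, and signals an inconsistent layout by throwing an exception (how the real class reports it is not stated in the slides).

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <vector>

// Illustrative stand-in for the streaming idea; NOT the TTreeStream implementation.
class DebugStream {
public:
    // The first string of a row is the table identifier, a string ending
    // in '=' names the next value, and "\n" closes the row.
    DebugStream& operator<<(const char* s) {
        const std::string str(s);
        if (str == "\n") {
            endRow();
        } else if (table_.empty()) {
            table_ = str;
        } else if (!str.empty() && str.back() == '=') {
            pendingName_ = str.substr(0, str.size() - 1);
        } else {
            throw std::invalid_argument("unexpected string: " + str);
        }
        return *this;
    }

    // Arithmetic values; unnamed ones get the automatic names B0, B1, ...
    template <typename T,
              typename = typename std::enable_if<std::is_arithmetic<T>::value>::type>
    DebugStream& operator<<(T value) {
        if (table_.empty()) throw std::logic_error("value streamed before identifier");
        names_.push_back(pendingName_.empty() ? "B" + std::to_string(names_.size())
                                              : pendingName_);
        pendingName_.clear();
        values_.push_back(static_cast<double>(value));
        return *this;
    }

    const std::vector<std::string>& layout(const std::string& table) const {
        return layouts_.at(table);
    }
    const std::vector<std::vector<double>>& rows(const std::string& table) const {
        return rows_.at(table);
    }

private:
    void endRow() {
        if (table_.empty()) throw std::logic_error("row closed before identifier");
        const std::string table = table_;
        const std::vector<std::string> names = names_;
        const std::vector<double> values = values_;
        table_.clear(); pendingName_.clear(); names_.clear(); values_.clear();

        auto it = layouts_.find(table);
        if (it == layouts_.end()) {
            layouts_[table] = names;               // layout defined on the fly
        } else if (it->second != names) {
            throw std::runtime_error("inconsistent layout for " + table);
        }
        rows_[table].push_back(values);            // fill only consistent rows
    }

    std::string table_, pendingName_;
    std::vector<std::string> names_;
    std::vector<double> values_;
    std::map<std::string, std::vector<std::string>> layouts_;
    std::map<std::string, std::vector<std::vector<double>>> rows_;
};
```

With this sketch, `s << "TreeIdentifier" << "i=" << i << "ch=" << ch << f << "\n";` defines a three-column layout named i, ch and B2 on the first call and appends a row on each later call with the same layout.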

TRD real life example (0)

  AliTRDtracker::FindTracklet(AliTRDtrack *track)
  {
    //
    // algorithm ...
    if (DebugMode || AlignmentMode)
      cstream << "tracklet" <<
        "track.=" << track <<             // track parameters
        "tany=" << tany <<                // tangent of the local track angle
        "xmean=" << xmean <<              // xmean - reference x of tracklet
        "tilt=" << h01 <<                 // tilt angle
        "nall=" << nall <<                // number of foundable clusters
        "nfound=" << nfound <<            // number of found clusters
        "clfound=" << clfound <<          // total number of found clusters in road
        "mpads=" << mpads <<              // mean number of pads per cluster
        "plane=" << plane <<              // plane number
        "road=" << road <<                // the width of the used road
        "graph0.=" << &graph0 <<          // x - y = dy for closest cluster
        "graph1.=" << &graph1 <<          // x - y = dy for second closest cluster
        "graphy.=" << &graphy <<          // y position of the track
        "graphz.=" << &graphz <<          // z position of the track
        "fCl.=" << &array0 <<             // closest cluster
        "fCl2.=" << &array1 <<            // second closest cluster
        //…….//
        "angle0=" << angle[0] <<          // angle deviation in iteration number 0
        "sangle0=" << sangle[0] <<        // sigma of angular deviation in iteration number 0
        "angleb=" << angle[bestiter] <<   // angle deviation in the best iteration
        "sangleb=" << sangle[bestiter] << // sigma of angle deviation in the best iteration
        //
        "expectederr=" << expectederr <<  // expected error of cluster position
        "\n";
  }

TRD real life example - Analysis (1)
  .L AliGenInfo.C+
  .L AliESDComparisonMI.C+
  .L AliTRDComparison.C+
  MakeTree();
    Connect MC information with information retrieved during reconstruction (if MC information is available)
  MakeComparison
  Comp.DrawPoolsY(MC cuts, quality cuts)

TOF real life example (0)

  Float_t AliTOFtracker::GetLinearDistances(AliTOFtrack *track, AliTOFcluster *cluster, Float_t distances[5])
  // algorithm ...
  if (DebugMode || AlignmentMode) {
    cstream << "Tracks" <<
      "TOF.=" << track <<
      "Cx=" << cpos0[0] <<
      "Cy=" << cpos0[1] <<
      "Cz=" << cpos0[2] <<
      "Dist=" << k <<
      "Dist0=" << distances[0] <<
      "Dist1=" << distances[1] <<
      "Dist2=" << distances[2] <<
      "TDC=" << tdc <<
      "\n";
  }

TRD and TOF real life example
  To get access to the same information using the standard schema:
    Define the data structure (~1000 lines of code) and place it somewhere
      Hundreds of data classes needed
    Global trees have to be defined at some moment, to make it possible to access them in member function code
    SetBranchAddress necessary
  No problems with the AliRoot Code Checker and Smell checker
    Access to rough data is, according to the Smell checker, very suspicious (bad design)

Conclusion
  To speed up and improve reconstruction algorithms, standardized tools for numerical debugging and feature extraction have to be implemented as an integral part of the framework
  TTreeStream and TTreeSRedirector are implemented as a first attempt at some standardization