Anaphe OO Libraries for Data Analysis using C++ and Python


1 Anaphe OO Libraries for Data Analysis using C++ and Python
Anaphe OO Libraries for Data Analysis using C++ and Python
Andreas Pfeiffer, CERN IT/API
CERN 5-Dec-2001 Andreas Pfeiffer, CERN/IT-API

2 Andreas Pfeiffer, CERN/IT-API, andreas.pfeiffer@cern.ch
Outline
  Motivation
  AIDA - Abstract Interfaces for Data Analysis
  Anaphe Components (C++)
  Lizard: Interactive Data Analysis (Python)
  Software quality control
  Summary

3 LHC Computing Challenge
4 experiments will create a huge amount of data
  >1 PetaByte/year for each experiment!
  1 PetaByte = 10^15 bytes
    = 1,000 TeraBytes
    = 20,000 Redwood tapes
    = 100,000 dual-sided DVD-RAM disks
    = 1,500,000 sets of the Encyclopaedia Britannica (w/o photos)
Need lots of CPU power to reconstruct/analyse
  about 1000 PC boxes per experiment (2005 ones!) of today’s boxes (dual P-III 800 MHz)
Complex data models
Reconstruction s/w is also used for online filtering
  needs high-quality s/w in order not to waste beam time
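The storage equivalents on this slide can be cross-checked with a little arithmetic. The sketch below uses only the counts quoted on the slide; the per-medium capacities it derives are implied values, not figures stated in the talk.

```python
# Cross-check of the slide's storage equivalents for 1 PetaByte.
PETABYTE = 10**15  # bytes, as stated on the slide
TERABYTE = 10**12
GIGABYTE = 10**9

# Counts quoted on the slide as equivalent to 1 PetaByte.
equivalents = {
    "TeraBytes": 1_000,
    "Redwood tapes": 20_000,
    "dual-sided DVD-RAM disks": 100_000,
}


def implied_capacity_bytes(count: int) -> int:
    """Capacity per unit implied by 'count units hold 1 PetaByte'."""
    return PETABYTE // count


implied = {name: implied_capacity_bytes(n) for name, n in equivalents.items()}

for name, capacity in implied.items():
    print(f"{name}: {capacity / GIGABYTE:g} GB each")
```

The implied capacities are only as precise as the rounded counts on the slide.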

4 Lifetime of LHC software = 25 yrs
Timeline figure: WWW; SPS 1969; LEP 1989; W and Z 1983; LEP ends 2000; XML 1.0 1997; Linux V 0.01 1991; C++ 1985; Ethernet standard; IBM PC 1981; K&R C 1978; Unix V6 first public version 1975; Java 1995; Intel Pentium 1992

5 Technology (R)Evolution
10 yrs major cycle length (HW, SW, OS)
  ~12 evolutionary changes in the market
  1 revolutionary change
  towards greater diversity
  don’t forget changes of requirements
Consequences
  s/w written today most probably will be rewritten tomorrow
  we must anticipate changes

6 Anaphe: what it is
Modular (OO/C++) replacement of CERNLIB functionality for use in HEP experiments
  memory management
  I/O
  foundation classes
  histogramming
  minimizing/fitting
  visualization
  interactive data analysis
Trying to use standards wherever possible
Trying to re-use existing class libraries
This talk will not cover detector simulation (GEANT-4)

7 Anaphe Components

8 AIDA
Abstract Interfaces for Data Analysis

9 The AIDA project
AIDA project (Abstract Interfaces for Data Analysis) was initiated at the HepVis’99 workshop in Orsay
Presently active: mainly developers from existing packages
  Tony Johnson (JAS)
  Andreas Pfeiffer (Lizard/Anaphe)
  Guy Barrand (OpenScientist)
  Mark Dönszelmann (Wired)
  Developers from LHCb/Gaudi

10 Abstract Interfaces
Abstract Interfaces
  only pure virtual methods, inheritance only from other A.I.
  components use other components only through their A.I.
  defines a kind of “protocol” for a component
Maximize flexibility and re-use of packages
  allow each component to develop independently
  re-use of existing packages to implement components reduces start-up time significantly
De-couple implementation of a component from its use

11 Architectural issue: Components (I)
Identify components by functionality
Define “protocol” using Abstract Interfaces
Emphasize separation of different aspects for each component
Example: Histogram
  statistical entity (density distribution of a physics quantity)
  view of a “collection of data points” (which can be a density distribution but also a detector efficiency curve)
  command to manipulate/store/plot/fit/...
“User’s view” is different from “implementor’s (developer’s) view”
  separate Abstract Interfaces for both aspects

12 Use of Components with Abstract Interfaces
User code uses only Interface classes
  IHistogram1D* hist = histoFactory->create1D("track quality", 100, 0., 10.);
Actual implementations are selected at run-time
  loading of shared libraries
No change at all to user code, but keep freedom to choose implementation
Diagram: User Code -> Histo-IF (Histo-Impl. 1, Histo-Impl. 2) and Fitter-IF (Fitter-Impl. X, Fitter-Impl. Y)
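The idea on this slide can be sketched in Python with abstract base classes. This is a minimal illustration, not the AIDA or Anaphe API: the method names `fill` and `entries`, both implementation classes, and the name-based registry (standing in for the loading of shared libraries) are assumptions made for the example.

```python
from abc import ABC, abstractmethod


class IHistogram1D(ABC):
    """Abstract interface: the only histogram type user code refers to."""

    @abstractmethod
    def fill(self, x: float, weight: float = 1.0) -> None: ...

    @abstractmethod
    def entries(self) -> int: ...


class IHistogramFactory(ABC):
    @abstractmethod
    def create1D(self, title: str, nbins: int, lo: float, hi: float) -> IHistogram1D: ...


class ListHistogram1D(IHistogram1D):
    """Hypothetical implementation 1: dense bin contents in a list."""

    def __init__(self, title, nbins, lo, hi):
        self.title, self.nbins, self.lo, self.hi = title, nbins, lo, hi
        self._bins = [0.0] * nbins
        self._entries = 0

    def fill(self, x, weight=1.0):
        if self.lo <= x < self.hi:  # out-of-range values are ignored here
            index = int((x - self.lo) / (self.hi - self.lo) * self.nbins)
            self._bins[min(index, self.nbins - 1)] += weight
            self._entries += 1

    def entries(self):
        return self._entries


class DictHistogram1D(IHistogram1D):
    """Hypothetical implementation 2: sparse bin contents in a dict."""

    def __init__(self, title, nbins, lo, hi):
        self.title, self.nbins, self.lo, self.hi = title, nbins, lo, hi
        self._bins = {}
        self._entries = 0

    def fill(self, x, weight=1.0):
        if self.lo <= x < self.hi:
            index = int((x - self.lo) / (self.hi - self.lo) * self.nbins)
            index = min(index, self.nbins - 1)
            self._bins[index] = self._bins.get(index, 0.0) + weight
            self._entries += 1

    def entries(self):
        return self._entries


class ListHistogramFactory(IHistogramFactory):
    def create1D(self, title, nbins, lo, hi):
        return ListHistogram1D(title, nbins, lo, hi)


class DictHistogramFactory(IHistogramFactory):
    def create1D(self, title, nbins, lo, hi):
        return DictHistogram1D(title, nbins, lo, hi)


# Stand-in for run-time loading of shared libraries: pick a factory by name.
_REGISTRY = {"list": ListHistogramFactory, "dict": DictHistogramFactory}


def load_histogram_factory(name: str) -> IHistogramFactory:
    return _REGISTRY[name]()


def user_analysis(histo_factory: IHistogramFactory) -> int:
    """User code: written once, against the interfaces only."""
    hist = histo_factory.create1D("track quality", 100, 0.0, 10.0)
    for value in (0.5, 2.5, 2.6, 11.0):
        hist.fill(value)
    return hist.entries()


results = {name: user_analysis(load_histogram_factory(name)) for name in _REGISTRY}
print(results)
```

`user_analysis` never names a concrete class, so swapping the implementation only changes the string passed to `load_histogram_factory`.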

13 Across the languages
JAida: C++ access to Java libs
  using C++ proxies implementing the C++ Abstract Interfaces to the Java interfaces
Diagram: C++ user code -> AIDA-IF (C++) -> JAida -> AIDA-IF (Java) -> Java lib

14 XML standards
Started with 1D and 2D Histograms
  aim: easy transfer between applications
Will extend to other data types
  other histos, fits, ntuples, …
Comments/contributions welcome!

15 Anaphe components

16 ‘Layered’ Approach
Basic functionalities (histograms, fitting, etc.) are available as individual C++ class libraries
  Easy to replace one part without throwing away everything
  Objectivity/DB to provide persistence
  HepODBMS library (“insulating layer”, “tags”)
  Histogram library (HTL)
  Fitting libraries (Gemini, HepFitting)
  Graphics libraries (Qt, Qplotter)
Insulate components through Abstract Interfaces
  “wrapper” layer to implement Interfaces in terms of existing libs
Apply s/w quality control tools
  code checking, testing

17 Anaphe Components: Overview

18 Basic 3D Graphic Libraries
OpenGL (basic graphics)
  De-facto industry standard for basic 3D graphics
  Used in CAD/CAE, games, VR, medical imaging
OpenInventor (scene mgmt.)
  OO 3D toolkit for graphics
  Cubes, polygons, text, materials
  Cameras, lights, picking
  3D viewers/editors, animation
  Based on OpenGL/MesaGL

19 2D Graphics libraries
Qt
  multi-platform C++ GUI toolkit
  C++ class library, not wrapper around C libs
  superset of Motif and MFC
  available on Unix and MS Windows
  no change for developer
  commercial but with public domain version
Qplotter
  “add-on” functionality for HEP
  “HIGZ/HPLOT”
Qplotter is a C++ package to produce graphic representation of physics data (such as histograms, scatter plots or curves) both on the screen and as PostScript files. The same package could be used to produce simple 2D drawings (e.g. testbeam setup), but does not directly provide full 3D features such as those of OpenInventor/OpenGL. Qplotter is based on the very popular Qt toolkit and can be freely distributed under the GPL scheme. Users in the High Energy Physics community can think of Qplotter as a replacement of packages such as HIGZ and HPLOT, the graphic subsystem of CERNLIB.

20 Mathematical Libraries
NAG (Numerical Algorithms Group) C Library
  Covers a broad range of functionality
    Linear algebra
    differential equations
    quadrature, etc.
  Special functions of CERNLIB added to Mark-6 release
    mostly for theory and accelerator
  Quality assurance
    extensive testing done by NAG

21 CLHEP - foundation classes
HEP foundation class library
  Random number generators
  Physics vectors
    3- and 4-vectors
  Geometry
  Linear algebra
  System of units
  more packages recently added
will continue to evolve
wwwinfo.cern.ch/asd/lhc++/clhep/

22 Histograms: the HTL package
Histograms are the basic tool for physics analysis
  Statistical information of density distributions
Histogram Template Library (HTL)
  design based on C++ templates
  Modular: separation between sampling and display
  Extensible: open for user defined binning systems
  Flexible: support transient/persistent at the same time
  Open: large use of abstract interfaces
  recent addition: 3D histograms
The Histogram Template Library (HTL) is a C++ class library that provides powerful histogramming functionality. As the name suggests, it exploits the template facility of C++ and is designed to be compact, extensible, modular and performant. As such it only deals with histograms - i.e. binned data - and not unbinned or "Ntuple" data. Furthermore, although simple file-based I/O and "lineprinter" output are supported, it is decoupled from more advanced I/O and visualisation techniques. In the context of LHC++, such capabilities are provided by other components that fully interoperate with HTL.
HTL comes in two flavours:
  Persistent HTL: based on Objectivity/DB for persistence (requires an Objectivity/DB license)
  Transient HTL: very simple text file persistence (free)

23 Fitting and Minimization
Fitting and Minimization Library (FML)
  common OO interface to NAG-C, MINUIT
  based on Abstract Interfaces
    IVector, IModelFunction, …
  fitting as a special case of minimization
    minimize “distance” between data and model
  replacement for HepFitting (and Gemini)
Gemini
  common interface to minimizer engine
  very thin layer
  No changes in user code if the minimizer changes.
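"Fitting as a special case of minimization" can be illustrated with a small sketch: the fit is simply the minimization of a chi-square "distance" between data and model, and the minimizer is passed in as a replaceable engine. This is not the FML or Gemini API; the function names, the one-parameter model and the golden-section minimizer are assumptions chosen to keep the example self-contained.

```python
import math
from typing import Callable, Sequence


def golden_section_minimize(f: Callable[[float], float], lo: float, hi: float,
                            tol: float = 1e-8) -> float:
    """A simple 1D minimizer engine (golden-section search on [lo, hi])."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2.0


def chi2(slope: float, xs: Sequence[float], ys: Sequence[float],
         sigmas: Sequence[float]) -> float:
    """'Distance' between the data and the model y = slope * x."""
    return sum(((y - slope * x) / s) ** 2 for x, y, s in zip(xs, ys, sigmas))


def fit_slope(xs, ys, sigmas, minimizer=golden_section_minimize) -> float:
    """Fitting = minimizing chi2; the minimizer engine is interchangeable."""
    return minimizer(lambda slope: chi2(slope, xs, ys, sigmas), -100.0, 100.0)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
sigmas = [1.0, 1.0, 1.0, 1.0]

fitted = fit_slope(xs, ys, sigmas)
# Closed-form least-squares slope for a line through the origin, as a check.
analytic = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
print(fitted, analytic)
```

Because `fit_slope` takes the minimizer as an argument, replacing the engine does not touch the code that calls `fit_slope`, which is the point the slide makes about Gemini.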

24 Opening bracket: Persistency

25 Object persistency
Two concepts: serial and page I/O
“Sequential access to objects” (streaming)
  good in networking context or serial writes to file(s)
  much like “good old Fortran”
  often perceived to be “simpler” to implement (“<<”, “>>”)
“Navigational access to objects” (buffered)
  I/O on demand for complex data models
  location transparent (for user)
  access to object typically by de-referencing of a smart pointer
  optimized for (random) disk access (disks deliver pages)
  sequential write to file(s) still ok
Both concepts need to take care of changes of the internal structure of the objects (schema evolution)
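The difference between the two access styles can be sketched as follows. This is an illustration under simplifying assumptions, not Objectivity/DB or HepODBMS code: `pickle` stands in for the streaming operators, and the in-memory `ObjectStore` and the `Ref` smart-pointer stand-in are hypothetical names.

```python
import io
import pickle


def stream_write(objects) -> bytes:
    """Streaming: objects are written one after another."""
    buffer = io.BytesIO()
    for obj in objects:
        pickle.dump(obj, buffer)
    return buffer.getvalue()


def stream_read_nth(data: bytes, n: int):
    """Sequential access: every object before the n-th is read as well."""
    buffer = io.BytesIO(data)
    reads = 0
    obj = None
    for _ in range(n + 1):
        obj = pickle.load(buffer)
        reads += 1
    return obj, reads


class ObjectStore:
    """In-memory stand-in for a store addressed by object id."""

    def __init__(self):
        self._pages = {}
        self.reads = 0

    def write(self, oid, obj):
        self._pages[oid] = pickle.dumps(obj)

    def read(self, oid):
        self.reads += 1
        return pickle.loads(self._pages[oid])


class Ref:
    """Smart-pointer stand-in: loads its target on first dereference only."""

    def __init__(self, store, oid):
        self._store, self._oid, self._target = store, oid, None

    def deref(self):
        if self._target is None:
            self._target = self._store.read(self._oid)  # I/O on demand
        return self._target


tracks = [{"id": i, "pt": 10.0 * i} for i in range(5)]

# Streaming: reaching track 3 means reading tracks 0..3.
streamed, stream_reads = stream_read_nth(stream_write(tracks), 3)

# Navigational: the reference fetches only the object it points to.
store = ObjectStore()
for track in tracks:
    store.write(track["id"], track)
ref = Ref(store, 3)
navigated = ref.deref()
ref.deref()  # second dereference is served from memory

print(stream_reads, store.reads)
```

The user of `Ref` does not need to know where the object lives, which is the "location transparent" property listed on the slide.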

26 Architectural Issue: Persistency (“Object-I/O”)
Brings a completely new quality into the design
Objects now have a lifetime
  don’t “delete” until you really are sure you want to
  persistency is a kind of “intended memory leak”
  would like to see no difference between memory and disk
“Layout” of objects may change during (extended) life
  “schema evolution”
  additions/deletions of attributes
  changes of inheritance relations
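One common way to handle the "addition of attributes" case of schema evolution is to store a version number with each object and upgrade old layouts when they are read. The sketch below shows that idea only; it is not how Objectivity/DB implements schema evolution, and the record layout, the `charge` attribute and all function names are hypothetical.

```python
CURRENT_SCHEMA = 2


def upgrade_v1_to_v2(record: dict) -> dict:
    """Version 2 adds a 'charge' attribute (hypothetical); old objects get a default."""
    upgraded = dict(record)  # leave the stored object untouched
    upgraded["charge"] = 0
    upgraded["schema"] = 2
    return upgraded


# One upgrade step per old schema version.
UPGRADES = {1: upgrade_v1_to_v2}


def read_track(record: dict) -> dict:
    """Return the record in the current layout, upgrading step by step if needed."""
    while record["schema"] < CURRENT_SCHEMA:
        record = UPGRADES[record["schema"]](record)
    return record


old_track = {"schema": 1, "pt": 12.5}            # written long ago
new_track = {"schema": 2, "pt": 7.0, "charge": -1}

print(read_track(old_track))
print(read_track(new_track))
```

Changes of inheritance relations, also listed on the slide, are harder and are not covered by this sketch.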

27 Architectural Issue: Persistency (“Object-I/O”) (II)
Objects can be placed (“clustering”)
  de-coupling of logical and physical view of data
Special care needed to ensure consistency in data set
  avoid reading group of objects (tracks, events, ...) for which writing/updating is not (yet) complete
  clean up if only part of the objects are written
  typically taken care of by using transactions
Complications possible in distributed computing
  need to protect disk access now like memory access in the past (“Segmentation violation”)

28 Physical Model and Logical Model
Physical model may be changed to optimise performance
Existing applications continue to work transparently!

29 Object Model
Thanks to Vincenzo Innocente (CMS)

30 Physical clustering
Thanks to Vincenzo Innocente (CMS)

31 Closing bracket: Persistency

32 “Tags”, Ntuples and Events
Tags - a special kind of Ntuple
  Always associated with an underlying persistent store
  Tags may be used to store “ntuple-like” data extracted from all over the event
    minPt, maxEmiss, nJets, nMuon, trigger, …
  Main use: speed up data selection for analysis …
Tag simplifies selection without losing complexity
  Events are more complex than a tree structure (“CWN”)
    lots of cross-references between classes, containers
  Association from the Tag to the Event may be used to navigate to any other part of the Event
    even from an interactive visualization program
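The tag mechanism can be pictured as a small flat record per event that is cheap to scan and that keeps an association back to the full event. The sketch below is a toy model, not the HepODBMS tag classes: the `Tag` and `Event` types, the dict used as event store and the cut are hypothetical, with attribute names borrowed from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """Full event: in reality a web of cross-referenced objects."""
    event_id: int
    jets: list = field(default_factory=list)
    muons: list = field(default_factory=list)


@dataclass
class Tag:
    """Flat, ntuple-like summary with an association back to its event."""
    event_id: int
    nJets: int
    nMuon: int


def make_tag(event: Event) -> Tag:
    return Tag(event.event_id, len(event.jets), len(event.muons))


def select(tags, cut):
    """Selection touches only the small tags, never the full events."""
    return [tag for tag in tags if cut(tag)]


def navigate(tag: Tag, event_store: dict) -> Event:
    """Follow the tag's association to the full event."""
    return event_store[tag.event_id]


events = [
    Event(1, jets=[45.0, 30.0], muons=[12.0]),
    Event(2, jets=[20.0]),
    Event(3, jets=[80.0, 60.0, 25.0], muons=[33.0, 8.0]),
]
event_store = {event.event_id: event for event in events}
tags = [make_tag(event) for event in events]

selected = select(tags, lambda tag: tag.nJets >= 2 and tag.nMuon >= 1)
selected_events = [navigate(tag, event_store) for tag in selected]
print([event.event_id for event in selected_events])
```

Only the events that pass the cut on the tags are ever dereferenced, which is where the speed-up mentioned on the slide comes from.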

33 Anaphe components

34 Anaphe Internals: (Abstract) Interfaces

35 AIDA compliance of Anaphe
Presently (Anaphe 3.x) only AIDA 1.0 compliant
Plan to implement AIDA 2.2 Interfaces by end 2001 (Anaphe 4.x)
  initially as wrappers to existing interfaces/packages
Will maintain 3.x for some time
  ensures stability for users
Development will concentrate on 4.x while AIDA will evolve further
Similar time schedule as
  JAS (Tony Johnson)
  OpenScientist (Guy Barrand) already there

36 Lizard: a tool for Interactive Data Analysis

37 Interactive Data Analysis
Aim: “OO replacement for PAW” (at least)
  analysis of “ntuple-like data” (“Tags”, “Ntuples”, …)
  visualisation of data (Histograms, scatter-plot, “Vectors”)
  fitting of histograms (and other data)
  access to experiment specific data/code
Maximize flexibility and re-use
Foresee customization/integration
  allow use from within experiment’s s/w
Plan for extensions
  “code for now, design for the future”
Ensure maintainability
  use of s/w quality control tools

38 Scripting - why
Typical use of scripting is quite different from programming (reconstruction, analysis, ...)
  history: “go back to where I was before”
  repetition/looping - with “modifiable parameters”
Avoid “one size fits all” or “using power-tool as hammer”
  rapid prototyping in “scripting language”
    quick turn-around times
  performance critical code in “core language”
    exploit richer set of features/functionality (e.g. templates in C++)
Scripting languages are usually less susceptible to changes than “mainstream languages”
  potentially longer lifetimes

39 Python - why
Python - OO (scripting) language
  no “strange $!%-variables”
  sensitive to indentation
  Easier for users than Java
Lots of user supplied modules available and ready for use
  scientific, numerics, graphics, GUI, network, OS, games, DBs, …
  example: Parnassus
    Totals: 1173 items in 49 categories.
Also usable in Java (Jython)
  used in JAS for scripting
  minimize changes needed within AIDA compliant environments

40 Python - how
SWIG to (semi-)automatically create connection to chosen scripting language
  allows flexibility to choose amongst several scripting languages
    Python, Perl, Tcl, Guile, Ruby, (Java) …
Very easy to use
  swig -c++ -python -shadow -c myClass.h
  create shared lib from myClass.cpp and myClass_wrap.c
  start python and import myClass to use it
Very easy to extend
  simply inherit from “swiggified” class in python
  modifications can later be fed back into C++
    performance, type safety, special language features (templates), …

41 PAW -> Lizard translation
Ntuple projection, Lizard
  lizard --useHBook
  :-) nt = ntm.findNtuple("higgscand.hbk::cands")
  :-) nplot1D(nt, "mass", "quality=5 && cut > 198")
Ntuple projection, PAW
  pawX11
  paw> h/file 1 higgscand.hbk
  paw> nt/pl 10.mass quality=5.and.cut>198
Assuming file higgscand.hbk contains ntuple with number 10 and title cands
The Lizard cut can be any valid C++ expression

43 Lizard: History and Present Status
Started after CHEP-2000
Full version out since June 2001
  “PAW like” analysis functionality
  plus: on-demand loading of compiled code using shared libraries
    gives full access to experiment’s analysis code and data
  based on Abstract Interfaces
    flexible and extensible
“License free” version since Sep. 2001
  HBook for RWNtuples and Histogram storage
  Minuit as minimizer engine

44 Users and Collaborations
AIDA spoken here!
  IGUANA (CMS visualization)
  GAUDI (LHCb/HARP) framework
  ATHENA (Atlas) framework
  Analyzer modules in Geant 4
  JAS
  Open Scientist
  …you?

45 Software quality control

46 Software quality control
Using tools for testing/checking has started
  Insure++, CodeWizard
Package dependencies: Ignominy
  Set of perl and shell scripts by Lassi Tuura (CMS)
Ignominy scans…
  Make dependency data produced by the compilers (*.d files)
  Source code for #includes (resolved against the ones actually seen)
  Shared library dependencies (“ldd” output)
  Defined and required symbols (“nm” output)
And maps…
  Source code and binaries into packages
  #include dependencies into package dependencies
  Unresolved/defined symbols into package dependencies
ignominy: dishonour, disgrace, shame; infamy; the condition of being in disgrace, etc. (Oxford English Dictionary)
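One of the scans listed above, mapping #include dependencies into package dependencies, can be sketched in a few lines. This is not Ignominy (which is a set of perl and shell scripts); the regular expression, the rule that the first path component names the package, and the sample file names are assumptions made for the illustration.

```python
import re
from collections import defaultdict

# Matches '#include <header>' and '#include "header"' at the start of a line.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def includes_of(source_text: str) -> list:
    return INCLUDE_RE.findall(source_text)


def package_of(path: str) -> str:
    """Assumption: the first path component names the package."""
    return path.split("/")[0]


def package_dependencies(sources: dict) -> dict:
    """Map {file path: source text} to {package: set of packages it includes from}."""
    known_files = set(sources)
    deps = defaultdict(set)
    for path, text in sources.items():
        for header in includes_of(text):
            if header not in known_files:
                continue  # resolve only against files actually seen
            if package_of(header) != package_of(path):
                deps[package_of(path)].add(package_of(header))
    return dict(deps)


# Hypothetical source tree, for illustration only.
sources = {
    "HTL/Histo1D.h": "#include <vector>\n",
    "FML/Fitter.h": '#include "HTL/Histo1D.h"\n#include <cmath>\n',
    "Lizard/Analyzer.cpp": '#include "FML/Fitter.h"\n#include "HTL/Histo1D.h"\n',
}

deps = package_dependencies(sources)
print(deps)
```

System headers such as `<vector>` drop out because they are not among the files seen, mirroring the "resolved against the ones actually seen" remark on the slide.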

47 Ignominy Analysis of Anaphe
Distribution of tools and utilities for LHC era physics
  Combination of commercial, free and HEP software
  Claims to be a toolkit
Seems to live up to its toolkit claims
  Good work on modularity
  Clean design is evident in many places
  Dependency diagrams often split naturally into functional units
Thanks to Lassi Tuura (CMS)

48 Package Metrics
Size = total amount of source code (not normalised across projects!)
ACD = average component dependency (~ libraries linked in)
CCD = sum of single-package component dependencies over whole release
  Indicates testing/integration cost
NCCD = measure of CCD compared to a balanced binary tree
  A good toolkit’s NCCD will be close to 1.0
  < 1.0: structure is flatter than a binary tree (= independent packages)
  > 1.0: structure is more strongly coupled (vertical or cyclic)
Aim: NCCD ~ 1 for given software/functionality
Thanks to Lassi Tuura (CMS)
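The metrics on this slide can be computed from a package dependency graph. The sketch below assumes the definitions popularised by John Lakos, which the slide's wording appears to follow: each package's dependency count includes the package itself, and the reference CCD of a balanced binary tree of N packages is (N+1)*log2(N+1) - N. The example graphs are made up.

```python
import math


def reachable(graph: dict, start: str) -> set:
    """The package itself plus everything it depends on, directly or indirectly."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


def ccd(graph: dict) -> int:
    """Cumulative component dependency: sum of per-package dependency counts."""
    return sum(len(reachable(graph, node)) for node in graph)


def acd(graph: dict) -> float:
    """Average component dependency."""
    return ccd(graph) / len(graph)


def ccd_balanced_binary_tree(n: int) -> float:
    """Reference CCD for a balanced binary tree of n packages."""
    return (n + 1) * math.log2(n + 1) - n


def nccd(graph: dict) -> float:
    """CCD normalised to the balanced-binary-tree reference."""
    return ccd(graph) / ccd_balanced_binary_tree(len(graph))


# Three made-up structures with 7 packages each; every package is a key.
names = list("abcdefg")
tree = {"a": ["b", "c"], "b": ["d", "e"], "c": ["f", "g"],
        "d": [], "e": [], "f": [], "g": []}
flat = {name: [] for name in names}                                   # independent
ring = {name: [names[(i + 1) % 7]] for i, name in enumerate(names)}   # cyclic

for label, graph in (("tree", tree), ("flat", flat), ("ring", ring)):
    print(label, ccd(graph), round(acd(graph), 2), round(nccd(graph), 2))
```

The tree lands on NCCD = 1, the independent packages fall below 1 and the cycle rises well above 1, matching the three cases listed on the slide.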

49 Metrics: NCCD vs Cycles
Plot: NCCD (“spaghetti index”) vs cycles for toolkits & frameworks
  points: ATLAS, ROOT, ORCA, G4, COBRA, Anaphe, IGUANA
  annotation: Includes Fortran
  NCCD ~ 1.0: good toolkit; < 1.0: indep. packages; > 1.0: strongly-coupled
Thanks to Lassi Tuura (CMS)

50 Future enhancements
Access to other implementations of components
  HBOOK CWNtuples
  Reading of ROOT (> V3.0) files
    similar to Tony Johnson’s (Java) RootIO package
  AIDA Ntuple/Histo store
    optimized for Ntuples, Histograms as (compressed) XML
Communication with Java tools/packages (JAS, Wired) via AIDA
Adding other “scripting” languages
  Perl, Tcl, cint?

51 Challenge: Distributed Computing
Motivation
  move code to data
  parallel analysis
Techniques
  services via AI
  late binding
  plug-in architecture
End-user (Lizard)
  look-and-feel of local analysis
R&D started and first prototype available soon
  CORBA based

52 Summary
The architecture of Anaphe shows some important items for flexible and modular data analysis:
  weak coupling between components through use of Abstract Interfaces
  basic functionality is covered by individual C++ class libraries
  emphasis on usability and maintainability
Major criteria are flexibility, extensibility and interoperability
  recent example: GEANT-4 examples (based on AIDA)
Lizard is an Interactive Data Analysis Tool based on Anaphe components and the Python scripting language (through SWIG)
  Lizard is young but has a very solid base in mature Anaphe libraries
  real plug-in structure
Software quality control is important
  tools help to optimize dependencies / minimize maintenance effort

53 More information
cern.ch/Anaphe
cern.ch/Anaphe/Lizard
aida.freehep.org/
cern.ch/DB
wwwinfo.cern.ch/asd/lhc++/clhep/

54 Additional slides

55 Analysis of Geant4
Fairly large C++ project
Very fine-grained (and multi-level) package structuring
Seems quite clean from the preliminary analysis
  Fine package subdivision helps in many ways but makes analysis and code understanding more complicated
  One subsystem seems strongly coupled and needs attention
  Need to study the use of the internal command system
Thanks to Lassi Tuura (CMS)

56 Analysis of ROOT
ROOT developers have done a formidable job of breaking binary (shared library) dependencies, but…
For example: by static analysis, nothing seems to use the postscript package directly (no incoming dependencies), but there is this code:
  void TPad::Print (const char *filename, Option_t *option)
  {
    […]
    TVirtualPS *psave = gVirtualPS;
    if (gROOT->LoadClass("TPostScript","Postscript")) return;
    gROOT->ProcessLineFast("new TPostScript()");
    gVirtualPS->Open(psname,pstype);
    gVirtualPS->SetBit(kPrintingPS);
    […]
  }
Taking these and global objects into account makes the dependency diagrams very different
Sign of fast growth? Need a “next evolutionary step”?
So “coherent” that replacing parts could get painful…
Thanks to Lassi Tuura (CMS)

57 Analysis of ROOT…
Dependency diagrams: Binary only; Binary + Source + Logical = Real
Thanks to Lassi Tuura (CMS)

58 Metrics: NCCD vs ACD
Plot of toolkits & frameworks: ATLAS, ROOT, ORCA, G4, COBRA, IGUANA, Anaphe
Thanks to Lassi Tuura (CMS)

59 Metrics: NCCD vs Size
Plot of toolkits & frameworks: ATLAS, ROOT, ORCA, G4, COBRA, IGUANA, Anaphe
Thanks to Lassi Tuura (CMS)

60 Metrics: NCCD vs AID
Plot of toolkits & frameworks: ATLAS, ROOT, ORCA, COBRA, G4, Anaphe, IGUANA
Thanks to Lassi Tuura (CMS)

61 Metrics: Packages vs Size
Plot of toolkits & frameworks: ATLAS, ORCA, G4, COBRA, IGUANA, Anaphe, ROOT
Thanks to Lassi Tuura (CMS)

62 Metrics: Packages vs Size
Plot of toolkits & frameworks: ATLAS, ORCA, G4, COBRA, IGUANA, Anaphe, ROOT
Thanks to Lassi Tuura (CMS)

63 Example script (ntuple)
# get list of names of all tuples from tuplemanager
ntm.listTuples()
nt1 = ntm.findNtuple("Charm1")  # retrieve tuple by name
# create 1D histos to project into
h1 = hm.create1D(10, "mass", 100, 0., 5000.)
h2 = hm.create1D(20, "mass for pt1>10", 100, 0., 5000.)
# project the attribute "MASS" into histo h1 without cut ("")
nt1.project1D(h1, "", "MASS")
# project the attribute "MASS" into histo h2 with cut ("PT1>10")
nt1.project1D(h2, "PT1>10", "MASS")

