B.Mascialino, A.Pfeiffer, M.G.Pia, A.Ribon, P.Viarengo


1 STATISTICAL TOOLKIT
B.Mascialino, A.Pfeiffer, M.G.Pia, A.Ribon, P.Viarengo. Geant4 Workshop, Catania, October 4th-9th 2004

2 Goodness-of-Fit testing
A project to develop a statistical comparison system.
Provide tools for the statistical comparison of distributions: equivalent reference distributions, experimental measurements, data from reference sources, and functions deriving from theoretical calculations or fits.
Qualitative evaluation and quantitative evaluation.
Use cases for the comparison of distributions: goodness-of-fit testing, detector monitoring, simulation validation, reconstruction vs. expectation, regression testing, physics analysis.
Detector monitoring checks whether the behavior stays constant over more than one run.

3 Architectural guidelines
The project adopts a solid architectural approach: to offer the functionality and the quality needed by the users, to be maintainable over a large time scale, and to be extensible, to accommodate future evolutions of the requirements.
Component-based approach: to facilitate re-use and integration in different frameworks.
AIDA: adopt a (HEP) standard, with no dependence on any specific analysis tool.

4 Software process guidelines
Unified Software Development Process, specifically tailored to the project: practical guidance and tools from the RUP, both rigorous and lightweight, mapping onto ISO 15504.
Guidance from ISO 15504.
Incremental and iterative life cycle model (spiral approach).

5 Requirement traceability
User requirements elicited, analysed and formally specified: functional (capability) and non-functional (constraint) requirements.
User Requirements Document available from the web site.
Requirement traceability: Requirements, Design, Implementation, Test & test results, Documentation.

6

7 It is externally distributed with PI
The algorithms are specialised on the kind of distribution (binned/unbinned). Every algorithm has been rigorously tested!
The Toolkit is downloadable from the web:

8 Chi-squared test
Applies to binned distributions. It can also be useful for unbinned distributions, but the data must first be grouped into classes.
It cannot be applied if the theoretical frequency in any class is < 5. In that case, one could merge contiguous classes until the minimum theoretical frequency is reached; otherwise one could use Yates' formula.
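The class-merging rule above can be sketched as follows. This is a minimal illustration in plain Python, not the toolkit's implementation: the function names `merge_low_bins` and `pearson_chi2`, the threshold parameter `min_expected`, and the sample counts are all assumptions made for the example. Yates' formula is not covered here.

```python
def merge_low_bins(observed, expected, min_expected=5.0):
    """Merge contiguous bins until every merged bin has expected >= min_expected.

    Hypothetical helper for illustration; bins are merged left to right.
    """
    merged_obs, merged_exp = [], []
    acc_obs = acc_exp = 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs = acc_exp = 0.0
    # Fold a leftover low-count tail into the last merged bin.
    if acc_exp > 0.0 or acc_obs > 0.0:
        if merged_exp:
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


def pearson_chi2(observed, expected):
    """Pearson statistic sum((O - E)^2 / E) and degrees of freedom (bins - 1).

    The bins - 1 count assumes the expected counts are normalised to the
    observed total and that no other parameter was fitted.
    """
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1


# Made-up counts: the first and last bins have expected frequency < 5.
observed = [2, 3, 12, 20, 9, 4]
expected = [1.5, 4.0, 11.0, 19.0, 10.5, 4.0]

obs_m, exp_m = merge_low_bins(observed, expected)
chi2, dof = pearson_chi2(obs_m, exp_m)
print(obs_m, exp_m, round(chi2, 4), dof)
```

Merging trades resolution for validity of the chi-squared approximation, which is the loss of information the later slides give as a reason to prefer unbinned tests.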

9 More sophisticated algorithms
For unbinned distributions: Kolmogorov-Smirnov test, Goodman approximation of the KS test, Kuiper test.
These are supremum statistics: they compare the empirical distribution functions of the original distributions through their maximum distance Dmn.
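A minimal sketch of the two supremum statistics, assuming two unbinned samples given as plain lists. The function names are illustrative and not taken from the toolkit. The Kolmogorov-Smirnov distance is the largest absolute difference between the two empirical distribution functions; the Kuiper statistic adds the largest positive and the largest negative difference.

```python
def edf_differences(x, y):
    """Differences F_m(t) - G_n(t) between the two empirical distribution
    functions, evaluated at every distinct pooled value t."""
    xs, ys = sorted(x), sorted(y)
    m, n = len(xs), len(ys)
    i = j = 0
    diffs = []
    for t in sorted(set(xs) | set(ys)):
        while i < m and xs[i] <= t:
            i += 1
        while j < n and ys[j] <= t:
            j += 1
        diffs.append(i / m - j / n)
    return diffs


def ks_distance(x, y):
    """Two-sample Kolmogorov-Smirnov distance: sup |F_m - G_n|."""
    return max(abs(d) for d in edf_differences(x, y))


def kuiper_distance(x, y):
    """Kuiper statistic: sup (F_m - G_n) + sup (G_n - F_m)."""
    diffs = edf_differences(x, y)
    return max(max(diffs), 0.0) + max(-min(diffs), 0.0)


print(ks_distance([2, 3], [1, 4]), kuiper_distance([2, 3], [1, 4]))
```

In the second sample the spread is larger in both directions, so the Kuiper statistic picks up both deviations while the KS distance only reports the larger one.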

10 More powerful algorithms
For unbinned distributions: Cramer-von Mises test, Anderson-Darling test. These are tests containing a weighting function.
These algorithms are so powerful that we decided to implement their equivalent for binned distributions: Fisz-Cramer-von Mises test, k-sample Anderson-Darling test.
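For illustration, the two-sample Cramer-von Mises statistic can be sketched as below. Instead of taking only the largest difference between the empirical distribution functions, it sums the squared difference over all pooled observations; the Anderson-Darling test additionally weights each term by 1/(F(1-F)), which emphasises the tails. The function name and the handling of ties are assumptions of this sketch, not the toolkit's code.

```python
def cramer_von_mises_two_sample(x, y):
    """Two-sample Cramer-von Mises statistic

        T = m*n / (m+n)^2 * sum over pooled points of (F_m(t) - G_n(t))^2

    where F_m and G_n are the empirical distribution functions.
    """
    xs, ys = sorted(x), sorted(y)
    m, n = len(xs), len(ys)
    i = j = 0
    total = 0.0
    for t in sorted(xs + ys):  # every pooled observation contributes one term
        while i < m and xs[i] <= t:
            i += 1
        while j < n and ys[j] <= t:
            j += 1
        total += (i / m - j / n) ** 2
    return m * n / (m + n) ** 2 * total


print(cramer_von_mises_two_sample([1, 2], [3, 4]))
```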

11 Is χ² the most powerful algorithm?
The power of a test is the probability of correctly rejecting the null hypothesis.
In terms of power: χ² < supremum statistics tests < tests containing a weight function.
Anderson-Darling: high power; sensitive to tails.
χ²: low power; general.
Fisz-Cramer-von Mises: symmetric, right-skewed distributions.
Goodman: medium power; approximation of K-S to χ² test statistics.
Kolmogorov-Smirnov: derives from Kolmogorov statistics.
Kuiper: sensitive to tails and median.
Tiku: converts CvM statistics to a χ².
Talk at IEEE NSS, Rome, October; paper submitted for publication, November 2004.
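The definition of power can be made concrete with a small Monte Carlo sketch: draw many pairs of samples that really do differ, run a test on each pair, and count how often the null hypothesis is rejected. The example below uses the two-sample Kolmogorov-Smirnov test with the usual asymptotic 5% critical value 1.358 * sqrt((m+n)/(m*n)). The Gaussian samples, the shift, the sample size, the number of trials and all function names are assumptions chosen for illustration; this is not the power study referred to in the slides.

```python
import math
import random


def ks_distance(x, y):
    """Two-sample Kolmogorov-Smirnov distance: sup |F_m - G_n|."""
    xs, ys = sorted(x), sorted(y)
    m, n = len(xs), len(ys)
    i = j = 0
    d = 0.0
    for t in sorted(xs + ys):
        while i < m and xs[i] <= t:
            i += 1
        while j < n and ys[j] <= t:
            j += 1
        d = max(d, abs(i / m - j / n))
    return d


def ks_rejects(x, y, c_alpha=1.358):
    """Asymptotic two-sample KS decision at roughly the 5% level."""
    m, n = len(x), len(y)
    return ks_distance(x, y) > c_alpha * math.sqrt((m + n) / (m * n))


def estimate_rejection_rate(shift, n=50, trials=300, seed=1):
    """Fraction of trials in which KS rejects 'same parent distribution'
    for two Gaussian samples whose means differ by `shift`."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rng.gauss(shift, 1.0) for _ in range(n)]
        rejections += ks_rejects(x, y)
    return rejections / trials


size = estimate_rejection_rate(0.0)   # null hypothesis true: false-alarm rate
power = estimate_rejection_rate(1.0)  # null hypothesis false: estimated power
print(size, power)
```

Repeating the same loop with a different test statistic in place of `ks_distance` (and its own critical value) is one way to compare the power of two algorithms on a given pair of distributions.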

12

13 Feedback from users is welcome!
GPL License.

14 User Documentation
Download, Installation, User Guide, Statistics Reference Guide.

15 User's point of view
Simple user layer: only deal with AIDA objects and the choice of comparison algorithm. The user is completely shielded from both statistical and computing complexity.
The user extracts the algorithm by writing one line of code; the toolkit returns the statistical result.
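The idea of a thin user layer, where the only decision is which algorithm to plug in, can be sketched as follows. Everything here is hypothetical: the `Comparator` class, its `distance` method and the two algorithm functions are invented for illustration and are not the toolkit's actual interface, which works on AIDA objects rather than plain lists.

```python
def ks_distance(x, y):
    """Two-sample Kolmogorov-Smirnov distance for unbinned data."""
    xs, ys = sorted(x), sorted(y)
    m, n = len(xs), len(ys)
    i = j = 0
    d = 0.0
    for t in sorted(xs + ys):
        while i < m and xs[i] <= t:
            i += 1
        while j < n and ys[j] <= t:
            j += 1
        d = max(d, abs(i / m - j / n))
    return d


def chi2_two_histograms(a, b):
    """Chi-squared distance between two histograms with identical binning
    and equal totals: sum((a_i - b_i)^2 / (a_i + b_i)), empty bins skipped."""
    return sum((ai - bi) ** 2 / (ai + bi) for ai, bi in zip(a, b) if ai + bi > 0)


class Comparator:
    """Hypothetical user layer: holds one algorithm and exposes one call."""

    def __init__(self, algorithm):
        self._algorithm = algorithm

    def distance(self, first, second):
        return self._algorithm(first, second)


# Selecting the algorithm is the single line the user has to write.
unbinned = Comparator(ks_distance)
binned = Comparator(chi2_two_histograms)

print(unbinned.distance([2, 3], [1, 4]))
print(binned.distance([10, 20], [20, 10]))
```

Keeping the algorithm behind a single call is what lets new tests be added without changing user code.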

16 Examples of practical applications

17 Microscopic validation of physics
Compared: NIST, Geant4 Standard, Geant4 LowE. Chi-squared test, p=1.
Geant4 simulations are statistically comparable with reference data (NIST database). Ready for regression testing.
Thanks to Susanna Guatelli.

18 X-ray fluorescence spectrum in Iceland basalt
Test beam at Bessy, Bepi-Colombo mission. X-ray fluorescence spectrum in Iceland basalt (EIN=6.5 keV); plot axes: Energy (keV) vs. Counts.
Very complex distributions: χ² is not appropriate (< 5 entries in some bins; physical information would be lost if rebinned).
Anderson-Darling: p>0.05. Experimental measurements are comparable with Geant4 simulations.
Thanks to Alfonso Mantero.

19 Medical physics: IMRT treatment
Kolmogorov-Smirnov test. Thanks to Michela Piergentili.
Distance range, test statistic, p-value:
-84 to -60 mm, 0.385, 0.23
-59 to -48 mm, 0.27, 0.90
-47 to 47 mm, 0.43, 0.19
48 to 59 mm, 0.30, 0.82
60 to 84 mm, 0.40, 0.10
Distance range, test statistic, p-value:
-56 to -35 mm, 0.26, 0.89
-34 to -22 mm, 0.43, 0.42
-21 to 21 mm, 0.38, 0.08
22 to 32 mm, 0.98
33 to 36 mm, 0.57, 0.13

20 Conclusions
This is a new, up-to-date, easy-to-handle and powerful tool for statistical comparison in particle physics.
Rigorous software process, to contribute to the quality of the product.
Component-based architecture, OO methods + generic programming, to ensure openness to evolution, maintainability and ease of use.
It is the first tool supplying such a variety of sophisticated and powerful statistical tests in HEP.
AIDA interfaces allow its integration in any other concrete data analysis tool.
Applications in: HEP, astrophysics, medical physics, …
We invite anyone to use it!

21 Future developments
Power comparison among algorithms. Extension to theoretical functions. Extensions to bidimensional distributions.

