Physics data management tools: computational evolutions and benchmarks. Mincheol Han, Chan-Hyeung Kim, Lorenzo Moneta, Maria Grazia Pia, Hee Seo.


1 Physics data management tools: computational evolutions and benchmarks. Mincheol Han (1), Chan-Hyeung Kim (1), Lorenzo Moneta (2), Maria Grazia Pia (3), Hee Seo (1). Affiliations: (1) Hanyang University, Korea; (2) CERN, Switzerland; (3) INFN Sezione di Genova, Italy. SNA + MC 2010: Joint International Conference on Supercomputing in Nuclear Applications + Monte Carlo 2010.

2 Physics data libraries. Data libraries: collections of experimental or theoretical tabulations of physics quantities, e.g. cross sections, form factors, nuclear and atomic parameters. Distributed by data centres: RSICC (ORNL), NEA, NIST… Essential ingredient of Monte Carlo simulation: use established data; speed up simulation w.r.t. using analytical formulae; common background for different Monte Carlo systems. Examples: ENDF/B, ENSDF, JENDL, CENDL, BROND, EEDL, EPDL, EADL…

3 Dealing with physics data. Data management: load (and store) data; retrieve data; use data, either directly or by interpolation. Loading: usually in the simulation initialization phase, or on demand. Retrieving: in the course of the simulation (usually at each step), so it can be a source of significant overhead.
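The "retrieve, then use by interpolation" step can be sketched as follows. This is a minimal illustration, not the Geant4 implementation: the `Tabulation` struct and the `interpolate` function are hypothetical names, and linear interpolation with clamping at the table edges is an assumption chosen for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tabulated data set: energies sorted in ascending order,
// with one tabulated value per energy.
struct Tabulation {
    std::vector<double> energy;
    std::vector<double> value;
};

// Linear interpolation at energy e; values outside the table are clamped
// to the first or last tabulated value (an assumption of this sketch).
double interpolate(const Tabulation& t, double e) {
    assert(!t.energy.empty() && t.energy.size() == t.value.size());
    if (e <= t.energy.front()) return t.value.front();
    if (e >= t.energy.back()) return t.value.back();
    // Binary search for the first tabulated energy that is not less than e.
    auto hi = std::lower_bound(t.energy.begin(), t.energy.end(), e);
    const std::size_t i = static_cast<std::size_t>(hi - t.energy.begin());
    const double x0 = t.energy[i - 1], x1 = t.energy[i];
    const double y0 = t.value[i - 1], y1 = t.value[i];
    return y0 + (y1 - y0) * (e - x0) / (x1 - x0);
}
```

Because a lookup of this kind runs at each simulation step, the cost of the search and of the interpolation arithmetic is what the retrieval benchmarks measure.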

4 Original design in Geant4 (electromagnetic data, Livermore library). Composite pattern: handle different data collections transparently (data for materials, data for atoms, data for shells). Strategy pattern: handle interchangeable interpolation algorithms transparently.
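The combination of the two patterns can be sketched as below. All class names (`Interpolation`, `DataSet`, `ShellDataSet`, `CompositeDataSet`) are hypothetical and do not reproduce the Geant4 classes; the two-point leaf and the rule that a composite sums its components are assumptions made to keep the example short.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Strategy: interchangeable interpolation algorithm (hypothetical interface).
class Interpolation {
public:
    virtual ~Interpolation() = default;
    virtual double calculate(double x, double x0, double x1,
                             double y0, double y1) const = 0;
};

class LinearInterpolation : public Interpolation {
public:
    double calculate(double x, double x0, double x1,
                     double y0, double y1) const override {
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0);
    }
};

// Composite: leaves and collections share one interface.
class DataSet {
public:
    virtual ~DataSet() = default;
    virtual double findValue(double energy) const = 0;
};

// Leaf: two tabulated points, enough to show delegation to the strategy.
class ShellDataSet : public DataSet {
public:
    ShellDataSet(double e0, double e1, double v0, double v1,
                 std::shared_ptr<const Interpolation> algo)
        : e0_(e0), e1_(e1), v0_(v0), v1_(v1), algo_(std::move(algo)) {
        assert(e1_ > e0_ && algo_);
    }
    double findValue(double energy) const override {
        return algo_->calculate(energy, e0_, e1_, v0_, v1_);
    }
private:
    double e0_, e1_, v0_, v1_;
    std::shared_ptr<const Interpolation> algo_;
};

// Composite node: an atom made of shells, or a material made of atoms.
// Summing the components is an assumption of this sketch.
class CompositeDataSet : public DataSet {
public:
    void add(std::unique_ptr<DataSet> component) {
        components_.push_back(std::move(component));
    }
    double findValue(double energy) const override {
        double sum = 0.0;
        for (const auto& c : components_) sum += c->findValue(energy);
        return sum;
    }
private:
    std::vector<std::unique_ptr<DataSet>> components_;
};
```

Client code only sees `DataSet::findValue`, whether the object is a single shell or a whole material; the price is one or more virtual calls per retrieval.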

5 Can we improve it? Geant4 physics on a diet: leaner software design; improve computational performance; enhance clarity and transparency; facilitate testing; ease maintenance. CHEP 2009 R&D on physics models: M.G. Pia et al., "Design and performance evaluations of generic programming techniques in a R&D prototype of Geant4 physics". Monte Carlo + CHEP 2010 R&D on physics data: prototype to evaluate candidate solutions quantitatively. This talk: selection of preliminary results. Final and complete results will be documented in a dedicated publication.

6 Test set-up. Test case: Livermore library data. EEDL (Evaluated Electron Data Library): ionisation, Bremsstrahlung. EPDL97 (Evaluated Photon Data Library): Compton and Rayleigh scattering, photoelectric effect, pair and triplet production. EADL (Evaluated Atomic Data Library): atomic parameters. Computing environment: Geant4 9.4-beta + G4EMLOW 6.13; Intel® Core™ Duo CPU E8500 with 3.16 GHz processor, 4 GB RAM, Linux SLC5, gcc 4.3.5 compiler; Intel® CPU U4100 with 1.30 GHz processor, 2 GB RAM, MS Windows XP SP3, MSVC++9 C++ compiler (with SP1). Load test: loading data for a number of elements between 1 and 100; each experiment repeated 100 times, the whole series repeated 10 times. Retrieve test: finding the data associated with a randomly chosen atomic number; finding procedure repeated 10^6 times, whole experiment repeated 10 times.
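A retrieve test of this kind could be timed roughly as follows. This is a sketch under assumptions: the function name `timeRetrievalsMs`, the fixed seed, and the use of `std::chrono` and `std::mt19937` are illustrative choices, not the harness used by the authors.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>

// Calls lookup(Z) nLookups times with Z drawn uniformly from [1, maxZ]
// and returns the elapsed wall-clock time in milliseconds.
// Note: the cost of the random number generation is included in the timing.
template <typename Lookup>
double timeRetrievalsMs(Lookup&& lookup, std::size_t nLookups, int maxZ,
                        unsigned seed = 12345u) {
    assert(maxZ >= 1);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> pickZ(1, maxZ);
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < nLookups; ++i) {
        lookup(pickZ(rng));
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Following the set-up on the slide, one would call this with 1000000 lookups and repeat the whole measurement 10 times to estimate the spread.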

7 Data structure. Improve the physical design of the data library itself: large tabulations split into individual files (one per element). [Figure: excitation data; time (ms) to load data vs. number of elements present in the experimental set-up; original data vs. split data]

8 Data structure. Large physics tabulations require large memory allocation for storing the data, and time to load them into memory and to search through them. Are all the data really necessary? Reduce the amount of data: given three consecutive tabulated points A, B and C, suppress B if it can be interpolated with the same accuracy based on A and C. [Figure: Compton scattering functions, number of data for each element reduced; time (ms) to load data vs. number of elements present in the experimental set-up; original data vs. reduced data]
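The data-reduction idea can be sketched as a single greedy pass. The names `Point` and `thin`, the linear interpolation, and the relative tolerance criterion are assumptions of this sketch; the slide does not state which interpolation or accuracy criterion was used.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point {
    double x;  // e.g. energy
    double y;  // e.g. tabulated value
};

// Greedy thinning: drop an interior point B when linear interpolation
// between the last kept point A and the next point C reproduces B.y
// within the relative tolerance relTol. The end points are always kept.
// Only B itself is checked, not the points dropped earlier.
std::vector<Point> thin(const std::vector<Point>& in, double relTol) {
    assert(relTol >= 0.0);
    if (in.size() < 3) return in;
    std::vector<Point> out;
    out.push_back(in.front());
    for (std::size_t i = 1; i + 1 < in.size(); ++i) {
        const Point a = out.back();
        const Point& b = in[i];
        const Point& c = in[i + 1];
        const double estimate = a.y + (c.y - a.y) * (b.x - a.x) / (c.x - a.x);
        if (std::fabs(estimate - b.y) > relTol * std::fabs(b.y)) {
            out.push_back(b);  // B carries information: keep it
        }
    }
    out.push_back(in.back());
    return out;
}
```

A production version would probably re-validate all dropped points against the final table, since this pass can accumulate error over long runs of suppressed points.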

9 Use forthcoming C++ features. Current implementation uses STL map for most data, STL vector for a few data types. Evaluated unordered_map (AKA hash map): included in C++0x TR1 (gcc 4.3.x, MSVC). [Figure: pair production cross sections; time (ms) to load data vs. number of elements present in the experimental set-up; STL map vs. unordered_map]
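The container swap can be sketched by making the container a template parameter. The names `DataStore`, `OrderedStore` and `HashedStore` are hypothetical, and the sketch uses the C++11 `std::unordered_map`; with the TR1 implementations mentioned on the slide the class lives in `std::tr1` instead.

```cpp
#include <cassert>
#include <map>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical per-element payload: a tabulation of values.
using ElementData = std::vector<double>;

// Store keyed by atomic number Z. Container is expected to be
// std::map<int, ElementData> or std::unordered_map<int, ElementData>.
template <typename Container>
class DataStore {
public:
    void load(int z, ElementData data) {
        assert(z >= 1);
        table_[z] = std::move(data);
    }
    // Returns nullptr when no data were loaded for this element.
    const ElementData* find(int z) const {
        auto it = table_.find(z);
        return it == table_.end() ? nullptr : &it->second;
    }
private:
    Container table_;
};

// Balanced tree: O(log n) lookup, keys kept in order.
using OrderedStore = DataStore<std::map<int, ElementData>>;
// Hash table: O(1) lookup on average, no ordering.
using HashedStore = DataStore<std::unordered_map<int, ElementData>>;
```

Since both containers expose `find`, `end` and `operator[]`, the rest of the code is unchanged, which makes it straightforward to benchmark one against the other.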

10 Caching pre-calculated data. Recent modification in the Geant4 low energy electromagnetic package: cache pre-calculated log10 data. Credit goes to the current Geant4 low energy electromagnetic group, not to the authors of this talk. The authors of this talk: quantified the time for loading/retrieving; quantified the memory consumption to store the additional (cached) data; reviewed the modified software design and implementation (flaws). Result: ~10% time saving w.r.t. on-the-fly log10 calculation. [Figures: loading and retrieving; time (ms) to load and retrieve data vs. number of elements present in the experimental set-up; original vs. modified]
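The caching idea can be illustrated with a small table that stores the logarithms next to the raw data. The class `CachedLogTable` and its members are hypothetical and do not mirror the Geant4 code; the sketch assumes log-log interpolation and strictly positive tabulated values.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

class CachedLogTable {
public:
    CachedLogTable(std::vector<double> energy, std::vector<double> value)
        : energy_(std::move(energy)), value_(std::move(value)) {
        assert(energy_.size() == value_.size());
        logEnergy_.reserve(energy_.size());
        logValue_.reserve(value_.size());
        // Pay for log10 once at load time instead of at every retrieval.
        for (std::size_t i = 0; i < energy_.size(); ++i) {
            assert(energy_[i] > 0.0 && value_[i] > 0.0);
            logEnergy_.push_back(std::log10(energy_[i]));
            logValue_.push_back(std::log10(value_[i]));
        }
    }

    // Log-log interpolation in the bin between tabulated points i and i+1.
    // Only log10(e) is computed on the fly; the table logs come from the cache.
    double logLogInterpolate(std::size_t i, double e) const {
        assert(i + 1 < energy_.size() && e > 0.0);
        const double slope = (logValue_[i + 1] - logValue_[i]) /
                             (logEnergy_[i + 1] - logEnergy_[i]);
        return std::pow(10.0, logValue_[i] + slope * (std::log10(e) - logEnergy_[i]));
    }

    // Number of extra doubles held by the cache (the memory cost).
    std::size_t cachedDoubles() const {
        return logEnergy_.size() + logValue_.size();
    }

private:
    std::vector<double> energy_, value_;
    std::vector<double> logEnergy_, logValue_;
};
```

The trade-off is visible in `cachedDoubles`: the cache holds as many values as the original table, so the stored data roughly double in size in exchange for fewer `log10` calls per retrieval.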

11 Generic programming techniques. Polymorphic behavior of data sets and interpolation algorithms through dynamic binding is not necessary at runtime. OOAD iteration: preliminary design. Templates eliminate the overhead due to the virtual table associated with inheritance, contributing to improved execution speed.
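Replacing dynamic binding with templates can be sketched with a policy class. The names `LinearPolicy`, `LogLogPolicy` and `TemplatedDataSet` are hypothetical, and the two-point data set is a simplification; the sketch only shows how the interpolation algorithm becomes a compile-time choice.

```cpp
#include <cassert>
#include <cmath>

// Interpolation policies: plain structs with a static function,
// no base class and no virtual table.
struct LinearPolicy {
    static double calculate(double x, double x0, double x1,
                            double y0, double y1) {
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0);
    }
};

struct LogLogPolicy {
    static double calculate(double x, double x0, double x1,
                            double y0, double y1) {
        const double slope = std::log10(y1 / y0) / std::log10(x1 / x0);
        return y0 * std::pow(x / x0, slope);
    }
};

// The interpolation algorithm is bound at compile time, so the call
// below can be resolved statically and inlined by the compiler.
template <typename InterpolationPolicy>
class TemplatedDataSet {
public:
    TemplatedDataSet(double e0, double e1, double v0, double v1)
        : e0_(e0), e1_(e1), v0_(v0), v1_(v1) {
        assert(e1_ > e0_);
    }
    double findValue(double energy) const {
        return InterpolationPolicy::calculate(energy, e0_, e1_, v0_, v1_);
    }
private:
    double e0_, e1_, v0_, v1_;
};
```

The design gives up the ability to switch algorithms at runtime, which the slide argues is not needed for data sets and interpolation algorithms.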

12 Effect of prototype design: loading. The extent of the improvement depends on the characteristics of the data. Original design: STL vectors, load all elements. [Figures: Rayleigh scattering form factors and Bremsstrahlung cross sections; time (ms) to load data vs. number of elements present in the experimental set-up; original vs. prototype]

13 Effect of prototype design: retrieving. [Figures: pair production cross sections and Bremsstrahlung spectrum data; time (ms) to retrieve data vs. number of elements present in the experimental set-up; original design, prototype design, prototype design + unordered_map]

14 Use vectors! Some data sets in the original design do not require the use of STL map: they can be efficiently managed by using STL vectors, and it is not worthwhile to move them to unordered_map. [Figure: Rayleigh scattering form factors; time (ms) to retrieve data vs. number of elements present in the experimental set-up; original design, prototype design (map), prototype design (unordered_map)]
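One way a vector can replace a map for such data sets is to index it directly by atomic number. The slide does not say how the vectors were organised, so this is only one possible reading; the class name `ElementTable` and the convention of leaving slot 0 unused are assumptions of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Per-element data stored in a vector indexed by atomic number Z.
// Slot 0 is left unused so that Z can be used as the index directly.
class ElementTable {
public:
    explicit ElementTable(std::size_t maxZ) : data_(maxZ + 1) {}

    void load(std::size_t z, std::vector<double> values) {
        assert(z >= 1 && z < data_.size());
        data_[z] = std::move(values);
    }

    // Direct indexing: no tree traversal and no hashing.
    // An empty vector means that no data were loaded for this element.
    const std::vector<double>& find(std::size_t z) const {
        assert(z >= 1 && z < data_.size());
        return data_[z];
    }

private:
    std::vector<std::vector<double>> data_;
};
```

With keys that are small, dense integers such as atomic numbers from 1 to 100, direct indexing costs one bounds-checked array access, which leaves little for a hash map to improve on.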

15 Conclusions. Prototype R&D on Geant4 physics data management. Investigated: data structure; software design; use of C++0x TR1 features. Results: leaner software; improved performance. RD44 (1994-1998), the Geant4 R&D phase: cutting edge technology, rigorous software development process. Geant4 would profit from reenacting an R&D phase to exploit new technology with the same spirit of scientific openness and rigorousness as RD44. Same conclusions at CHEP 2009 regarding physics modeling. Acknowledgment: thanks to CERN Directorate for support.

