Profiling study of Geant4-GATE VRTs discussion Nicolas Karakatsanis Alexander Howard Geant4 Users Workshop Medical/GATE Parallel Session 13 th September.

Slides:



Advertisements
Similar presentations
Lectures 6&7: Variance Reduction Techniques
Advertisements

Geant4 v9.2p02 Speed up Makoto Asai (SLAC) Geant4 Tutorial Course.
Advanced Neutron Spectrometer (ANS) Geant4 Simulations
Stefan Roesler SC-RP/CERN on behalf of the CERN-SLAC RP Collaboration
From Quark to Jet: A Beautiful Journey Lecture 1 1 iCSC2014, Tyler Dorland, DESY From Quark to Jet: A Beautiful Journey Lecture 1 Beauty Physics, Tracking,
Other GEANT4 capabilities Event biasing Parameterisation (fast simulation) Persistency Parallelisation and integration in a distributed computing environment.
Parameterized Object Sensitivity for Points-to Analysis for Java Presented By: - Anand Bahety Dan Bucatanschi.
Implicit Capture Overview Jane Tinslay, SLAC March 2007.
Summary of Medical Physics Parallel Sessions and Discussions Joseph Perl Stanford Linear Accelerator Center Geant4 Workshop Hebden Bridge, UK 19 September.
Cross Section Biasing & Path Length Biasing Jane Tinslay, SLAC March 2007.
Adapted from G G Daquino, Geant4 Workshop, Catania, 5 Oct 2004 Event Biasing in G4: Why and what? Key User Requirements J. Apostolakis, CERN 19 March 2007.
Bruce Faddegon, UCSF Inder Daftari, UCSF Joseph Perl, SLAC
Alex Howard - Event Biasing Mini-Workshop - SLAC Geometrical Event biasing and Variance Reduction – Talk 2 Alex Howard, CERN Event Biasing Mini-Workshop,
Kirov A S, MSKCC Overview of Geant4 Use and Issues in Imaging: Emission Tomography (PET and SPECT) Assen S. Kirov Department of Medical Physics Memorial.
Highlights of latest developments ESA/ESTEC Makoto Asai (SLAC)
Geant4 Event Biasing Jane Tinslay, SLAC May 2007, Geant4 v8.2.p01.
J. Tinslay 1, B. Faddegon 2, J. Perl 1 and M. Asai 1 (1) Stanford Linear Accelerator Center, Menlo Park, CA, (2) UC San Francisco, San Francisco, CA Verification.
1 Work Plan/Summary I By 25 th May 2007… Class clean-up – biasing and scoring… What to remove? (for v9.0) –Biasing VScorer and associated classes (in transportation.
Interaction Forcing Overview Jane Tinslay, SLAC March 2007.
Leading Particle Biasing Overview Jane Tinslay, SLAC March 2007.
P HI T S Advanced Lecture (II): variance reduction techniques to improve efficiency of calculation Multi-Purpose Particle and Heavy Ion Transport code.
1 Stratified sampling This method involves reducing variance by forcing more order onto the random number stream used as input As the simplest example,
1 Geant4 Physics Based Event Biasing Jane Tinslay, SLAC March 2007, Geant4 v8.2p01.
Alex Howard, CERN - Event Biasing Overview – Hebden Bridge 12 th September Event biasing – An Overview Alex Howard, CERN Event Biasing Overview.
Variance reduction A primer on simplest techniques.
Special Issues on Neutrino Telescopy Apostolos G. Tsirigotis Hellenic Open University School of Science & Technology Particle and Astroparticle Physics.
RF background, analysis of MTA data & implications for MICE Rikard Sandström, Geneva University MICE Collaboration Meeting – Analysis session, October.
S. E. Tzamarias The project is co-funded by the European Social Fund & National Resources EPEAEK-II (PYTHAGORAS) KM3Net Kick-off Meeting, Erlangen-Nuremberg,
Sergey Ananko Saint-Petersburg State University Department of Physics
Other GEANT4 capabilities Event biasing Parameterisation (fast simulation) Scoring Persistency Parallelisation and integration in a distributed computing.
Monte Carlo Comparison of RPCs and Liquid Scintillator R. Ray 5/14/04  RPCs with 1-dimensional readout (generated by RR) and liquid scintillator with.
Geant4 review from the aspect of a GATE developer and user Nicolas Karakatsanis.
Geant4 Event Biasing Marc Verderi, LLR (Heavily copied from Jane Tinslay, SLAC) June 2007.
Optimising Cuts for HLT George Talbot Supervisor: Stewart Martin-Haugh.
Detector Simulation on Modern Processors Vectorization of Physics Models Philippe Canal, Soon Yung Jun (FNAL) John Apostolakis, Mihaly Novak, Sandro Wenzel.
Computing Performance Recommendations #13, #14. Recommendation #13 (1/3) We recommend providing a simple mechanism for users to turn off “irrelevant”
IEEE Nuclear Science Symposium and Medical Imaging Conference Short Course The Geant4 Simulation Toolkit Sunanda Banerjee (Saha Inst. Nucl. Phys., Kolkata,
17-19 Oct, 2007Geant4 Japan Oct, 2007Geant4 Japan Oct, 2007Geant4 Japan 2007 Geant4 Collaboration.
Hadronic Physics II Geant4 Users’ Tutorial CERN February 2010 Gunter Folger.
Alex Howard - Event Biasing Geant4 Users - Lisbon Event biasing and Variance Reduction - Geometrical Alex Howard, CERN Geant4 Users Workshop, Lisbon.
Simulating Differential Dosimetry M. E. Monville1, Z. Kuncic2,3,4, C. Riveros1, P. B.Greer1,5 (1)University of Newcastle, (2) Institute of Medical Physics,
New software library of geometrical primitives for modelling of solids used in Monte Carlo detector simulations Marek Gayer, John Apostolakis, Gabriele.
Spacecraft Environment & Protection Group GEANT4 Workshop, Noordwijk, Sep 1999 Radioactive Decay Process and Data P Truscott and F Lei Space Department.
G4GeneralParticleSource Class: Developed by ESA as the space radiation environment is often quite complex in energy and angular distribution, and requires.
Alex Howard, ETH, Zurich 13 th September 2012, 17 th Collaboration Meeting, Chartres 1 Geometrical Event Biasing Facility Alex Howard ETH, Zurich Geometrical.
Bremsstrahlung Splitting Overview Jane Tinslay, SLAC March 2007.
Profiling study of Geant4-GATE Discussion of suggested VRTs Nicolas Karakatsanis, George Loudos, Arion Chatziioannou Geant4-GATE technical meeting 12 th.
Geant4 CPU performance : an update Geant4 Technical Forum, CERN, 07 November 2007 J.Apostolakis, G.Cooperman, G.Cosmo, V.Ivanchenko, I.Mclaren, T.Nikitina,
Status of BDSIM Simulation L. Nevay, S. Boogert, H. Garcia-Morales, S. Gibson, R. Kwee-Hinzmann, J. Snuverink Royal Holloway, University of London 17 th.
Status of Sirene Maarten de Jong. What?  Sirene is yet another program that simulates the detector response to muons and showers  It uses a general.
P. Rodrigues, A. Trindade, L.Peralta, J. Varela GEANT4 Medical Applications at LIP GEANT4 Workshop, September – 4 October LIP – Lisbon.
Alex Howard, CERN - Event Biasing Overview – Hebden Bridge 13 th September Event biasing and Variance Reduction Alex Howard, CERN Event Biasing.
Pedro Arce Introducción a GEANT4 1 GAMOS tutorial RadioTherapy Exercises Pedro Arce Dubois CIEMAT
Physics Performance. EM Physics: Observations Two apparently independent EM physics models have led to user confusion: –Different results for identical.
Maria Grazia Pia, INFN Genova and CERN1 Geant4 highlights of relevance for medical physics applications Maria Grazia Pia INFN Genova and CERN.
1 Giuseppe G. Daquino 26 th January 2005 SoFTware Development for Experiments Group Physics Department, CERN Background radiation studies using Geant4.
MONTE CARLO TRANSPORT SIMULATION Panda Computing Week 2012, Torino.
Geant4 Simulation for KM3 Georgios Stavropoulos NESTOR Institute WP2 meeting, Paris December 2008.
MCS overview in radiation therapy
CHEP ’06 GEANT4E 1 GEANT4E: Error propagation for track reconstruction inside the GEANT4 framework Pedro Arce (CIEMAT) CHEP 2006, Mumbai, 13-17th February.
Alex Howard, ETHZ – Biasing Requirements – SLAC 21 st September Biasing Requirements and Interfaces Alex Howard ETH, Zurich Biasing Space User Requirements.
variance reduction techniques to improve efficiency of calculation A
Status of Biasing Development
variance reduction techniques to improve efficiency of calculation B
Data Analysis in Particle Physics
Other GEANT4 capabilities
The Hadrontherapy Geant4 advanced example
G4GeneralParticleSource Class:
Microdosimetric Distributions for a Mini-TEPC due to Photon Radiation
Lesson 4: Application to transport distributions
Presentation transcript:

Profiling study of Geant4-GATE VRTs discussion Nicolas Karakatsanis Alexander Howard Geant4 Users Workshop Medical/GATE Parallel Session 13 th September 2007, Hebden Bridge, UK

Contents Introduction Introduction Simulation efficiency with voxelized phantoms Simulation efficiency with voxelized phantoms Compressed Voxel Parameterization Compressed Voxel Parameterization Profiling study - Conclusions Profiling study - Conclusions Geometry VRTs so far Geometry VRTs so far Discussion of possible new geometry VRTs Discussion of possible new geometry VRTs Suggested actions Suggested actions

Introduction GATE performance degrades when voxelized maps (attenuation or emission distributions) are used GATE performance degrades when voxelized maps (attenuation or emission distributions) are used Simulation times become prohibitive for clinical or pre-clinical simulation studies Simulation times become prohibitive for clinical or pre-clinical simulation studies Profiling studies indicate a performance bottleneck on the Geant4 navigator function Profiling studies indicate a performance bottleneck on the Geant4 navigator function Collaboration between G4 and GATE developers to optimize the efficiency Collaboration between G4 and GATE developers to optimize the efficiency

Simulation efficiency with voxelized phantoms Geant4-GATE features (our analysis) Geant4-GATE features (our analysis) Deal with each voxel as a different material region Deal with each voxel as a different material region At least one step is forced in each material region boundary At least one step is forced in each material region boundary Therefore Therefore G4 stepping implementation can be one of most important (if not the most important) factors that affects simulation speed G4 stepping implementation can be one of most important (if not the most important) factors that affects simulation speed when using large voxelized phantoms when using large voxelized phantoms Possible solution Possible solution If the material in one voxel is the same as that in its neighbouring voxel, If the material in one voxel is the same as that in its neighbouring voxel, Treat those two voxels as one region (no region boundary between those two voxels) Treat those two voxels as one region (no region boundary between those two voxels) Check that condition for all voxels Check that condition for all voxels

Simulation efficiency with voxelized phantoms Possible solution (..continue) Possible solution (..continue) Create an option in stepping function Create an option in stepping function to test if the material in next region is the same as current one. to test if the material in next region is the same as current one. If yes If yes no boundary is applied here and no boundary is applied here and a normal step is taken a normal step is taken A flag can turn on/off the previous option A flag can turn on/off the previous option

Compressed Voxel Parameterization Solution developed in GATE Solution developed in GATE Application of a “compressed” voxel parameterization Application of a “compressed” voxel parameterization generate a phantom where voxel size is variable generate a phantom where voxel size is variable All adjacent voxels of the same material are fused together to form the largest possible rectangular voxel. All adjacent voxels of the same material are fused together to form the largest possible rectangular voxel. Compressed parameterization uses Compressed parameterization uses less memory and also less memory and also less CPU cycles less CPU cycles It is possible to exclude regions in the phantom from being compressed through the use of an "exclude list" of materials It is possible to exclude regions in the phantom from being compressed through the use of an "exclude list" of materials Discussion Discussion Initial purpose of implementation Initial purpose of implementation less memory usage – larger phantoms can be simulated less memory usage – larger phantoms can be simulated Side-effects Side-effects less CPU cycles usage - reduce the simulation time but only by maybe 30% less CPU cycles usage - reduce the simulation time but only by maybe 30% Need for a better speed-up factor Need for a better speed-up factor Investigation of a more efficient way to track particles in a voxelized geometry Investigation of a more efficient way to track particles in a voxelized geometry Contact: Richard Taschereau: Contact: Richard Taschereau:

Profiling studies Profiling studies using GNU tools Profiling studies using GNU tools Callgrind Callgrind Simulation set-up Simulation set-up Biograph6 scanner Biograph6 scanner 1 sec acquisition time 1 sec acquisition time Activity level -> 10kBq Activity level -> 10kBq Compare flat profiles and call graphs of following cases Compare flat profiles and call graphs of following cases NEMA cylindrical analytical phantom NEMA cylindrical analytical phantom NCAT thorax voxelized phantom (using default voxel parameterization) NCAT thorax voxelized phantom (using default voxel parameterization) NCAT thorax voxelized phantom (using compressed voxel parameterization) NCAT thorax voxelized phantom (using compressed voxel parameterization)

Callgrind study analytical NEMA phantom

Callgrind study voxelized NCAT phantom (default parameterization)

Callgrind study voxelized NCAT phantom (compressed parameterization)

Callgrind study Comparison of time-performance cost Total time cost refers to the total time required for the execution of the simulation Absolute cost values are presented with arbitrary units The total cost increases by 814% and 517% when a voxelized phantom and a compressed phantom is used respectively

Callgrind study Comparison of time-performance cost G4EMDataSet::FindValue exhibits one of the highest time inclusive costs (absolute values) among the G4functions which are called by the G4SteppingManager Inclusive time cost is the time spent on the function itself and all of its callees G4LogLogInterpolation::Calculate exhibits one of the highest time self costs (absolute values) among the G4functions which are called by the G4EMDataSet::FindValue Self time cost is the time spent only on the function itself

Callgrind study Comparison between default and compressed parameterization G4ParameterisedNavigation:: LevelLocate exhibits one of the highest time inclusive costs (absolute values) among the functions of the G4Navigator class This function is located at libG4navigator.so and calls most of the functions used by the navigator G4ParameterisedNavigation:: IdentifyAndPlaceSolid determines the parameterization type applied for each voxel Compressed voxel implementation adds up the inclusive time cost of this function, because it adjusts the voxel dimensions depending on its material attribute

Callgrind study - Call graph NEMA analytical phantom G4EMDataSet::FindValue function spends 30.86% of the total cost by completing it’s calls to callees functions and by returning calls made to this function from other caller functions.

Callgrind study - Call graph NCAT voxelized phantom (default parameterization) G4ParameterisedNavigation::LevelLocate function spends 27.88% of the total cost by by completing it’s calls to callees functions and by returning calls made to this function from other caller functions.

Callgrind study - Call graph NCAT voxelized phantom (compressed parameterization) G4ParameterisedNavigation::LevelLocate function spends 31.10% of the total cost by by completing it’s calls to callees functions and by returning calls made to this function from other caller functions.

Profiling study conclusions In the case of voxelized phantoms In the case of voxelized phantoms Significant degradation of the G4-GATE performance Significant degradation of the G4-GATE performance 8 times slower with default parameterization 8 times slower with default parameterization 5 times slower with compressed parameterization 5 times slower with compressed parameterization G4NavigationLevelRep::G4NavigationLevelRep exhibits the highest self time cost (max. 7.44%) G4NavigationLevelRep::G4NavigationLevelRep exhibits the highest self time cost (max. 7.44%) G4ParameterisedNavigation::LevelLocate exhibits the highest inclusive time cost among functions called by the G4steppingManager G4ParameterisedNavigation::LevelLocate exhibits the highest inclusive time cost among functions called by the G4steppingManager Compressed voxel parameterization achieves speed-up of 30-40% only Compressed voxel parameterization achieves speed-up of 30-40% only Need for more efficient version of stepping manager and navigator optimized for G4 medical applications Need for more efficient version of stepping manager and navigator optimized for G4 medical applications In the case of analytical phantoms In the case of analytical phantoms G4EMDataSet::FindValue (located at libG4emlowenergy.so) appears to cause the most important time cost G4EMDataSet::FindValue (located at libG4emlowenergy.so) appears to cause the most important time cost Primer caller of the G4 function with the highest combination of self and inclusive time cost (G4LogLogInterpolation::Calculate) in the case of analytical phantoms Primer caller of the G4 function with the highest combination of self and inclusive time cost (G4LogLogInterpolation::Calculate) in the case of analytical phantoms Discussion between G4 and GATE developer communities Discussion between G4 and GATE developer communities Guidance on which EM options used for different medical applications Guidance on which EM options used for different medical applications

Alex Howard, CERN - Event Biasing Overview – Hebden Bridge 13 th September Event Biasing in Geant4 PHYSICS BASED BIASING: Built in biasing options –Primary particle biasing –Radioactive decay biasing –General hadronic leading particle biasing –Hadronic cross section biasing User defined biasing –G4WrapperProcess See release documentation for detailed instructions on how to implement the various biasing techniques GEOMETRICAL BIASING: Importance Sampling Weight Window

Alex Howard, CERN - Event Biasing Overview – Hebden Bridge 13 th September Primary Particle Biasing Use case: –Increase number of high energy particles in cosmic ray spectrum Increase number of primary particles generated in a particular phase space region of interest –Weight of primary particle modified as appropriate General implementation provided by G4GeneralParticleSource class –Bias position, angular and energy distributions

Alex Howard, CERN - Event Biasing Overview – Hebden Bridge 13 th September User Defined Biasing: G4WrapperProcess Implement user defined biasing through G4WrapperProcess –A process itself, I.e, inherits from G4VProcess –Wraps an existing process  By default, function calls are forwarded to existing process –Non-invasive way to manipulate the behaviour of a process To use: –Subclass G4WrapperProcess and override appropriate methods, e.g, PostStepDoIt –Register sublcass with process manager in place of existing process –Register existing process with G4WrapperProcess

Alex Howard, CERN - Event Biasing Overview – Hebden Bridge 13 th September Splitting factor = 100No splitting Scoring Geometry Uniform Bremsstrahlung Splitting

Alex Howard, CERN - Event Biasing Overview – Hebden Bridge 13 th September Geometric Biasing * Importance sampling technique * Weight window technique The purpose of geometry based event biasing is to save computing time by sampling less often the particle histories entering “less important” geometry regions, and more often in more “important” regions.

Alex Howard, CERN - Event Biasing Overview – Hebden Bridge 13 th September Importance sampling technique Importance sampling acts on particles crossing boundaries between “ importance cells ”. The action taken depends on the importance value assigned to the cell. In general, a track is either split or plays Russian roulette at the geometrical boundary depending on the importance value assigned to the cell. I=1I=2 Survival probability (P) is defined by the ratio of importance value. P = I post / I pre The track weight is changed to W/P. X Splitting a track ( P > 1 )  E.g. creating two particles with half the ‘ weight ’ if it moves into volume with double importance value. W=1 W=0.5 Russian-roulette (P < 1 ) in opposite direction  E.g. Kill particles according to the survival probability (1 - P). W=0.5 W=1 P = 0.5 P = 2

Alex Howard, CERN - Event Biasing Overview – Hebden Bridge 13 th September The Weight Window Technique The weight window technique is a weight-based algorithm – generally used together with other techniques as an alternative to importance sampling: –It applies splitting and Russian roulette depending on space (cells) and energy –User defines weight windows in contrast to defining importance values as in importance sampling A weight window may be specified for every cell and for several energy regions: space-energy cell. Apply in combination with other techniques such as cross-section biasing, leading particle and implicit capture, or combinations of these. Apply in combination with other techniques such as cross-section biasing, leading particle and implicit capture, or combinations of these. Upper Energy Lower weight E Lower Weight D C B A Upper Energy Lower weight Survival weight Upper weight Kill/Survive Split

Alex Howard, CERN - Event Biasing Overview – Hebden Bridge 13 th September The weight window technique (continue) Checks the particle weight –Compare the particle weight with a ‘ window ’ of weights defined for the current energy-space cell –Play splitting or roulette in case if it is outside, resulting in 0 or more particles ‘ inside ’ the window  E.g. WL is a lower weight bound of a cell. CU and CS are upper limit factor and survival factor, respectively. W > WL*CU Split track W < WL*WL Roulette WL WL*CS WL*CU P = W / (WL*CS)

Alex Howard, CERN - Event Biasing Overview – Hebden Bridge 13 th September Coupled Transportation Since release 8.2 coupled transportation has been included This is a generic form of parallel navigation Geometrical biasing was migrated/copied to this formalism –No longer duplicate mass and parallel classes/samplers –Switch between mass and parallel world is through transportation assignation –Examples also migrated – release 9.0 Examples also run for charged particles (and should handle magnetic fields)

Alex Howard, CERN - Event Biasing Overview – Hebden Bridge 13 th September Scoring/Tallies – brief overview

Alex Howard, CERN - Event Biasing Overview – Hebden Bridge 13 th September Physics biasing –Existing physics based biasing fragmented –Identify missing biasing methods & variations between methods in other Monte Carlo codes  Implicit capture  General cross section biasing  Interaction forcing  Path length biasing  Advanced bremsstrahlung splitting  Leading particle biasing –Look at developing dedicated framework to provide general physics biasing in analogy with geometrical biasing  Manipulating physics processes/lists

VRTs for geometry so far Variance Reduction Techniques (VRTs) Variance Reduction Techniques (VRTs) Geometrical Event Biasing Geometrical Event Biasing Importance sampling Importance sampling Photon splitting Photon splitting Russian roulette Russian roulette Weight window and energy sampling Weight window and energy sampling A number of VRT implementations are already available in Geant4, but A number of VRT implementations are already available in Geant4, but Some of the Geant4 classes can be customized more Some of the Geant4 classes can be customized more Make efficient use of the toolikit architecture of G4 Make efficient use of the toolikit architecture of G4 Design of a special navigator customized for voxel geometries Design of a special navigator customized for voxel geometries Further classes implementing VRTs could be developed Further classes implementing VRTs could be developed Improve of the geometry definition transported from GATE to G4 Improve of the geometry definition transported from GATE to G4

Discussion of possible new geometry VRTs Alternative photon splitting technique for G4 Alternative photon splitting technique for G4 If the first photon of the annihilation pair is detected If the first photon of the annihilation pair is detected => second photon splits into multiple photons with equal weights and total weight sum of 1 => second photon splits into multiple photons with equal weights and total weight sum of 1 Degree of splitting depends on the probability of detection Degree of splitting depends on the probability of detection Probability is dependent on axial position and emission angle Probability is dependent on axial position and emission angle Therefore geometrical importance sampling is based on axial position and emission angle Therefore geometrical importance sampling is based on axial position and emission angle Therefore prior knowledge of the geometry of the detector systems is required Therefore prior knowledge of the geometry of the detector systems is required High detection probability => photon splits into a small number of secondary photons High detection probability => photon splits into a small number of secondary photons Low detection probability => photon splits into a large number of secondary photons Low detection probability => photon splits into a large number of secondary photons Speed up of the technique (applied in published MCs for scatter correction – Holdsworth et al ) Speed up of the technique (applied in published MCs for scatter correction – Holdsworth et al ) 3 – 4 times increase in efficiency 3 – 4 times increase in efficiency Additional photon transport algorithms could be implemented Additional photon transport algorithms could be implemented Delta scattering including energy discrimination Delta scattering including energy discrimination Flags implementation Flags implementation user can easily activate or not the various VRT options depending on the requirements of its application user can easily activate or not the various VRT options depending on the requirements of its application

Suggested actions Geant4/GATE collaboration targets Geant4/GATE collaboration targets Guidance on which EM parameters should be used for different medical applications (J. Perl’s suggestion) Guidance on which EM parameters should be used for different medical applications (J. Perl’s suggestion) physics processes, range cuts, step sizes physics processes, range cuts, step sizes Profiling using these guidances and suitable initialization Profiling using these guidances and suitable initialization Use of profiling results for determination of the G4 classes need to be optimized Use of profiling results for determination of the G4 classes need to be optimized Customization of G4Navigator for voxel geometries Customization of G4Navigator for voxel geometries Validation of the efficiency gain achieved Validation of the efficiency gain achieved is the speed/accuracy trade-off linear? is the speed/accuracy trade-off linear? Publication of the new customized libraries Publication of the new customized libraries