Distributed Simulation with Geant4

Preliminary results of the LowE / DIANE joint project
Jakub T. Mościcki, CERN/IT
Credits also to: Alfonso Mantero, INFN Genova

History
Parallelization of Geant4 simulation is a joint project between Geant4, DIANE and Anaphe.
DIANE is an R&D project in IT/API to study distributed analysis and simulation and to create a prototype; it was initiated in early 2001 with very limited resources.
Anaphe is an analysis project supported by IT; it provides the analysis framework for HEP.
The pilot programme includes a G4 simulation which produces AIDA/Anaphe histograms.
The collaboration started in late spring 2002.

Sequential Geant4 Simulation
The goal of the simulation: optimize the detectors used for x-ray fluorescence emission from Mercury's crust, in the context of Hermes, Bepi Colombo ESA mission.
This requires high statistics, and therefore many events.
20 million events take ~3 hours.
Up to 100 million events might be useful; the estimated time is ~16 hours.

Parallel Geant4 Simulation
Increase performance:
  shift from batch to semi-interactive simulation;
  speed up the analysis cycle;
  generate more events and debug the simulation faster.
Move from sequential to parallel simulation:
  preserve reproducibility of the results;
  minimize deployment overhead when moving from sequential to parallel simulation, both in terms of time and the amount of code/expertise one must invest.

Performance Increase

Benchmarking environment
Parallel cluster configuration: lxplus, 70 Red Hat 6.1 nodes:
  7 Intel STL2 (2 x PIII 1 GHz, 512 MB)
  31 ASUS P2B-D (2 x PIII 600 MHz, 512 MB)
  15 Celsius 620 (2 x PIII 550 MHz, 512 MB)
  the rest: Kayak (2 x PIII 450 MHz, 128 MB)
Reference sequential machine: pcgeant2 (2 x Xeon 1700 MHz, 1 GB)

Benchmarking Caveat
Non-exclusive access to interactive machines: 'load-noise' background and unpredictable load peaks.
Different CPU and RAM on the nodes.
AFS is used to fetch physics config data.
Attempts to remove the noise:
  repeat simulations many times to get the correct mean;
  work at night and off-peak hours (what about US people using CERN computing facilities?);
  etc.
Conclusion: results should be taken with caution and are approximate.

Structure of the simulation
Initialization phase (constant):
  load ~10-15 MB of physics tables, config data etc.;
  reference sequential machine: ~4 minutes (user time);
  cluster nodes: ~5-6 minutes.
beamOn ~ f(event number):
  small job: 1-5 million events;
  medium job: between the small and big ranges;
  big job: > 50 million events.
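The two-phase structure above can be turned into a rough timing model. The sketch below is only illustrative: it assumes identical workers (which the benchmarking cluster was not), takes the midpoint of the quoted 5-6 minutes as the initialization time, and derives an event rate from the "20 million events in ~3 hours" figure. The function names are hypothetical.

```python
# Assumed constants, derived from figures quoted on the slides.
INIT_S = 330.0  # midpoint of the quoted 5-6 minutes on cluster nodes
# "20 million events ~ 3 hours", minus ~4 minutes of initialization
EVENTS_PER_S = 20e6 / (3 * 3600 - 240)


def job_time(n_events, n_workers, init_s=INIT_S, rate=EVENTS_PER_S):
    """Wall-clock seconds: constant initialization plus beamOn split evenly over workers."""
    return init_s + n_events / (n_workers * rate)


def efficiency(n_events, n_workers, init_s=INIT_S, rate=EVENTS_PER_S):
    """Normalized efficiency: T(1) / (N * T(N)); 1.0 would be perfect scaling."""
    t_one = job_time(n_events, 1, init_s, rate)
    t_many = job_time(n_events, n_workers, init_s, rate)
    return t_one / (n_workers * t_many)


for n_events in (1e6, 20e6, 100e6):
    minutes = job_time(n_events, 70) / 60
    print(f"{n_events:>12.0f} events on 70 workers: "
          f"{minutes:6.1f} min, efficiency {efficiency(n_events, 70):.2f}")
```

In this model the initialization term does not shrink with the number of workers, so it dominates small jobs and limits their efficiency.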

Scalability test (job time)

Normalized efficiency

Benchmarking (comments)
Results are approximate (scaling factors for different CPU speeds), but seem to be in agreement with expectations.
The move from batch to semi-interactive simulation is feasible.
Small jobs do not gain so much because of the large constant initialization time.

Problems & solutions
Time of job execution = slowest machine, or the most loaded one at the moment.
We often had to wait a long time for the last worker to finish.
Possible solution: use a larger number of smaller workers; fast machines run workers sequentially many times, but...
The constant initialization time is rather important: initialize once, beamOn many times... to be checked.
If this problem is solved, we may move towards more interactive simulation.
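The trade-off described above can be explored with a small scheduling sketch. It compares one equal share per machine against many small tasks pulled by whichever machine is free, with the initialization paid either once per machine or once per task. The machine speeds and all function names are hypothetical and not taken from the benchmark.

```python
import heapq


def static_makespan(n_events, speeds, init_s):
    """One equal share per machine: the job ends when the slowest machine ends."""
    share = n_events / len(speeds)
    return max(init_s + share / s for s in speeds)


def pull_makespan(n_events, speeds, init_s, n_tasks, init_once=True):
    """Many small tasks; each machine pulls the next task as soon as it is free."""
    task = n_events / n_tasks
    # Heap of (time at which the machine becomes free, machine index).
    start = init_s if init_once else 0.0
    free_at = [(start, i) for i in range(len(speeds))]
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(n_tasks):
        t, i = heapq.heappop(free_at)
        t += task / speeds[i]
        if not init_once:
            t += init_s  # initialization repeated for every task
        finish = max(finish, t)
        heapq.heappush(free_at, (t, i))
    return finish


speeds = [1000.0, 1000.0, 450.0]  # events per second, assumed
print("static        :", static_makespan(3e6, speeds, 300.0))
print("pull, init x1 :", pull_makespan(3e6, speeds, 300.0, 100))
print("pull, init xN :", pull_makespan(3e6, speeds, 300.0, 100, init_once=False))
```

Under these assumptions, smaller tasks only pay off when a worker initializes once and then runs beamOn many times; repeating the initialization for every task cancels the gain.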

From sequential to parallel simulation

Reproducibility
Initial seed of the random engine:
  make sure that every parallel simulation starts with a seed uniquely determined by the job's initial seed;
  the number of times the engine is used depends on the initial seed.
Make sure that correlations between the workers' seeds are avoided.
Our solution: use two uncorrelated random engines:
  one to generate a table of initial seeds (one seed for each worker);
  another for the simulation inside the worker.
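A minimal sketch of the two-engine idea follows. It is not the project's code: SplitMix64 stands in for the engine that fills the seed table, Python's Mersenne Twister (`random.Random`) stands in for the engine used inside each worker, and the "simulation" is a placeholder sum. All function names are hypothetical.

```python
import random

MASK64 = (1 << 64) - 1


def seed_table(job_seed, n_workers):
    """Engine 1 (SplitMix64): one initial seed per worker, fixed by the job's seed."""
    state = job_seed & MASK64
    seeds = []
    for _ in range(n_workers):
        state = (state + 0x9E3779B97F4A7C15) & MASK64
        z = state
        z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
        z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
        seeds.append(z ^ (z >> 31))
    return seeds


def run_worker(seed, n_events):
    """Engine 2 (Mersenne Twister): placeholder for the simulation inside a worker."""
    engine = random.Random(seed)
    return sum(engine.random() for _ in range(n_events))


def run_job(job_seed, n_workers, events_per_worker):
    """Run all workers; the result is fully determined by the three arguments."""
    seeds = seed_table(job_seed, n_workers)
    return [run_worker(s, events_per_worker) for s in seeds]


print(run_job(42, 4, 1000) == run_job(42, 4, 1000))  # same inputs, same results
```

Each worker's stream depends only on its own entry in the table, so the order in which workers happen to run does not affect the result. Changing the number of workers or the events per worker does change which random numbers each event sees.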

Reproducibility
Parameters which need to be fixed to reproduce the simulation:
  total number of events;
  initial seed;
  ... but also: number of workers and number of events per worker.

Minimizing deployment overhead

Ease of use
User-friendliness: the G4 simulation developer should not need to fight with irrelevant technical problems when moving from sequential to parallel G4 simulation.
As non-intrusive as possible: minimize necessary code changes in the original simulation.
Good separation of the subsystems:
  the G4 simulation does not need to know that it runs in parallel;
  the distributed framework (DIANE) does not need to care about what actually is being simulated (see slide 20).

What is DIANE?
R&D project in IT/API:
  semi-interactive parallel analysis for LHC;
  middleware technology evaluation & choice: CORBA, MPI, Condor, LSF...;
  also see how to integrate API products with GRID;
  prototyping (focus on ntuple analysis).
Time scale and resources:
  Jan 2001: start (< 1 FTE);
  June 2002: running prototype exists, with sample ntuple analysis with Anaphe and event-level parallel Geant4 simulation.
Distributed = Parallel + Remote

What is DIANE?
Framework for parallel cluster computation:
  application-oriented: master-worker model common in HEP applications;
  application-independent: apps dynamically loaded in a plugin style, with callbacks to applications via abstract interfaces;
  component-based: subsystems and services packaged into component libraries; the core architecture uses CORBA and CCM (CORBA Component Model).
Integration layer between applications and the GRID:
  environment and deployment tools.
Grid-enabled framework for HEP applications:
  this framework will be a Grid component, via a gateway that understands Grid/JDL;
  the framework uses lower-level Grid components: authentication, security, load balancing.
Distribution aspects:
  parallel cluster computation: "institute" or "workgroup" level (Tier 1-3), local computing center;
  remote analysis: geographically unlimited.

Master/Worker model
Applications share the same computation model, so they also share a big part of the framework code.
But they have different non-functional requirements: CPU vs IO intensive, semi-interactive vs batch, etc.
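The shared computation model can be sketched as a master loop that talks to any application only through an abstract callback interface. The interface below (`split`, `execute`, `merge`) and the toy application are hypothetical illustrations, not DIANE's actual API; a real framework would also dispatch tasks to remote workers rather than run them in a local loop.

```python
from abc import ABC, abstractmethod


class Application(ABC):
    """Hypothetical callback interface between the framework and an application."""

    @abstractmethod
    def split(self, n_workers):
        """Return one task description per worker."""

    @abstractmethod
    def execute(self, task):
        """Run one task on a worker and return its partial result."""

    @abstractmethod
    def merge(self, results):
        """Combine the partial results on the master."""


class EventCounting(Application):
    """Toy application: pretends to simulate events and counts them."""

    def __init__(self, n_events):
        self.n_events = n_events

    def split(self, n_workers):
        base, extra = divmod(self.n_events, n_workers)
        return [base + (1 if i < extra else 0) for i in range(n_workers)]

    def execute(self, task):
        return task  # a real worker would run the simulation for `task` events

    def merge(self, results):
        return sum(results)


def run_master(app, n_workers):
    """Framework side: knows nothing about what the application computes."""
    tasks = app.split(n_workers)
    results = [app.execute(t) for t in tasks]  # stands in for remote dispatch
    return app.merge(results)


print(run_master(EventCounting(10), 3))
```

Because `run_master` depends only on the abstract interface, an ntuple analysis and a simulation could reuse the same master code while differing in what `execute` does.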

What DIANE is not
DIANE is not:
  a replacement for a GRID and its services;
  a hardwired analysis toolkit.
Distributed = Parallel + Remote

DIANE and GRID
DIANE as a GRID computing element:
  via a gateway that understands Grid/JDL;
  Grid/JDL must be able to describe parallel jobs/tasks.
DIANE as a user of (low level) Grid services:
  authentication, security, load balancing...;
  and profit from existing 3rd party implementations;
  the Python environment is a rapid prototyping platform and may provide a convenient connection between DIANE and the Globus Toolkit via the pyGlobus API.
Parallel part: appropriate hardware infrastructure; the RCS is a cluster somehow involved in the Grid.
Remote part: no a priori limitations on the distance between the RCS and the user.

Architecture Overview
Layering: abstract middleware interfaces and components.
Plugin-style application loading.

Conclusions
Prototype deployment of G4-DIANE:
  significant performance improvement is possible;
  scalability tests: 140 million events, 70 nodes in the cluster, 1 hour total parallel execution.
Putting together DIANE and G4 is fairly easy: done in several days...
DIANE may bridge G4 to the GRID world without necessarily waiting for a fully-fledged GRID infrastructure to become available.
