Presentation is loading. Please wait.

Presentation is loading. Please wait.

A task-based implementation for GeantV

Similar presentations


Presentation on theme: "A task-based implementation for GeantV"— Presentation transcript:

1 A task-based implementation for GeantV
Joel Fuentes Andrei Gheata    

2 Introduction GeantV is a project that aims at developing a high performance detector simulation system integrating fast and full simulation. This work corresponds to an implementation of a task approach for GeantV using Intel Threading Building Blocks (TBB). Intel TBB is a well-known library that provides different tools to manage tasks, concurrent data structures and parallel algorithms such as parallel for, pipeline, etc. Previous implementation with TBB helped as a starting point.

3 Intel Threading Building Blocks (TBB)
• Each thread has its own ready pool, which is a lists of tasks. • A task goes into each pool when it is allocated. • Each thread steals tasks from other pools when necessary.

4 TBB Task Scheduler Problem Solution Oversubscription Fair scheduling
High overhead Load imbalance Scalability One TBB thread per hardware thread Non-preemptive unfair scheduling Programmer specifies tasks, not threads Work-stealing balances load Specify tasks and how to create them, rather than threads

5 task model of GeantV spawn Feeder task spawn Transport task
Transport task may be further split into subtasks spawn Feeder task Reads from file a number of events. Invokes the concurrent basketizer service Transport task Transports one basket for one step spawn reuse tracks keeping locality output transported tracks inject particle input dequeue basket spawn Basketizer(s) concurrent service injects full baskets enqueue basket Basket queue concurrent service Scoring This is a user task reading track info and creating ”hits” event finished? command dump all your baskets inspect I/O task Write data (hits, digits, kinematics) on disk Garbage collector Forces partially filled baskets into the basket queue to boost concurrency Flow control task event finished? queue empty? spawn event finished? queue empty? spawn Digitizer task This is a user task working on “hits” data

6 Implementation Tasks implemented: InitialTask FlowControllerTask
FeederTask TransportTask Tasks are described using C++ classes that contain the class tbb:task as the base class Task operations are implemented in virtual method execute() New parallel tasks are launched using the spawn(task *t) Once a task is scheduled for execution by the runtime TBB library, the execute() method of the task is called in a non-preemptive manner, completing the execution of the task. Additional classes implemented: ThreadData TaskMgrTBB All the new classes that represent the task-based approach were packed on a shared library called Geant_tbb.

7 How to run in TBB mode? - Install TBB. See for details. - Build the GeantV project for TBB mode by setting the parameters: $ -DUSE_TBB=ON -DTBBROOT=/your/path/to/TBB - Execute runApp in TBB mode by setting the flag: $ ./runApp -i 1

8 Experimental Results Static Threads mode TBB mode
Model name: Intel(R) Xeon(R) CPU E GHz CPU(s): 32 On-line CPU(s) list: Thread(s) per core: 2 NUMA node0 CPU(s): ,16-23 NUMA node1 CPU(s): ,24-31 Static Threads mode runApp -e 10 -u 4 -i 1 -t 16 === Transported: 4870 primaries/ tracks, total steps: , snext calls: , phys steps: , mag. field steps: 0, small steps: 939 bdr. crossings: RT= s, CP=31.08s TBB mode runApp -e 10 -u 4 -t 16 === Transported: 4870 primaries/ tracks, total steps: , snext calls: , phys steps: , mag. field steps: 0, small steps: 916 bdr. crossings: RT= s, CP=54.26s nthreads=16 speed-up= efficiency=

9 Experimental Results Static Threads mode TBB mode
Model name: Intel(R) Xeon(R) CPU E GHz CPU(s): 32 On-line CPU(s) list: Thread(s) per core: 2 NUMA node0 CPU(s): ,16-23 NUMA node1 CPU(s): ,24-31 Static Threads mode runApp -e 10 -u 4 -t 16 === Transported: 4870 primaries/ tracks, total steps: , snext calls: , phys steps: , mag. field steps: 0, small steps: 939 bdr. crossings: RT= s, CP=31.08s TBB mode runApp -e 10 -u 4 -i 1 -t 16 === Transported: 4870 primaries/ tracks, total steps: , snext calls: , phys steps: , mag. field steps: 0, small steps: 916 bdr. crossings: RT= s, CP=54.26s nthreads=16 speed-up= efficiency=

10 Conclusions A first implementation of a task-based approach for GeantV using TBB was deployed. The new approach provides more flexibility and connectivity to other task based multi-threaded frameworks. Some overhead for the TBB mode compares to the static thread approach has still to be fully understood and addressed. possible reason: task initialization There are still more tasks to implement: Scoring Task, I/O Task, Digitizer Task, and so on. Directory with the Task-based implementation: Report about this work submitted to GSoC16:

11 Thank you


Download ppt "A task-based implementation for GeantV"

Similar presentations


Ads by Google