
Multi Process I/O Peter Van Gemmeren (Argonne National Laboratory (US))


1 Multi Process I/O Peter Van Gemmeren (Argonne National Laboratory (US))

2 • With multicore in mind, what would we do differently in next-gen infrastructure?
• Knowing from the outset that we will eventually merge files, what would we do differently?
– from many jobs
– from multiple AthenaMP processes or threads
– look at TMemFile and consider its implications
• Similarly for input, what can be done to support efficient event sharing among many jobs?

3 Multi Event Read Context
This is really more about AthenaMP, but of course AthenaMP is all about I/O…
• Currently AthenaMP schedules processing of single events (round robin or queue) to event workers.
– OK for bytestream, which can read / decompress a single event (row-wise).
– Not so good for ROOT data, which collects objects of the same type for compression (column-wise). However, in data written since release 17, only a small number of events (5 or 10) get combined into the same basket, and that number is known (it depends only on the format).
• A different scheme is needed for reading RDO, ESD, AOD, to allow chunks of events that are compressed together to be processed by the same worker.
– Expected to speed up reading by 20–50%.
– Better utilization of memory (no duplicated buffers on different workers).
• A small chunk size of only 5 or 10 events should not really upset load balancing.
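A minimal sketch of the chunked scheduling idea, in plain Python rather than AthenaMP code. The function names (`chunk_events`, `assign_round_robin`, `baskets_touched`) are hypothetical and chosen for illustration only; the sketch assumes baskets hold a fixed, known number of consecutive events. It compares how many baskets each worker would have to decompress under single-event versus chunked round-robin scheduling.

```python
from collections import defaultdict


def chunk_events(n_events, chunk_size):
    """Split event indices [0, n_events) into consecutive chunks."""
    return [list(range(start, min(start + chunk_size, n_events)))
            for start in range(0, n_events, chunk_size)]


def assign_round_robin(chunks, n_workers):
    """Hand whole chunks to workers in round-robin order."""
    assignment = defaultdict(list)
    for i, chunk in enumerate(chunks):
        assignment[i % n_workers].append(chunk)
    return dict(assignment)


def baskets_touched(assignment, basket_size):
    """Sum, over all workers, the number of distinct baskets each must read."""
    total = 0
    for chunks in assignment.values():
        baskets = {event // basket_size for chunk in chunks for event in chunk}
        total += len(baskets)
    return total


# Assumed toy numbers: 40 events, 10 events per basket, 4 workers.
single = assign_round_robin(chunk_events(40, 1), 4)    # one event at a time
chunked = assign_round_robin(chunk_events(40, 10), 4)  # one basket at a time

print(baskets_touched(single, 10))   # every worker reads every basket
print(baskets_touched(chunked, 10))  # every basket is read by one worker only
```

In this toy setup, single-event scheduling makes each of the four workers decompress all four baskets, while chunked scheduling has each basket decompressed exactly once. The actual gain in AthenaMP would depend on basket layout and worker timing, which this sketch does not model.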

4 Multi File Processing
Again about AthenaMP
• Silly:
• AthenaMP jobs got a sizeable event throughput penalty, because the initialize() / execute() ratio is too high.
• So, process more events per job, from several input files.
• But then:
– The output file gets too large. That is the merged output file (the worker output will still be small).
– No longer the typical 1 input to 1 output file correspondence. May create a metadata headache.
• Create worker groups that share processing of the same input file and merge output only within these groups.
– Preserves 1-1 file correspondence, with well-sized output files.
– May need load balancing thoughts. File processing times may vary greatly.
– Could use the number of events to determine the number of workers per group (input file).
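The last point, sizing worker groups by event count, could be sketched as follows. This is an illustration under stated assumptions, not AthenaMP's scheduler: `workers_per_file` and the file names are hypothetical, every input file gets at least one worker, and spare workers are handed out greedily to the file with the most events per worker.

```python
def workers_per_file(event_counts, n_workers):
    """Assign workers to input files roughly in proportion to event counts.

    event_counts: dict mapping input file name -> number of events.
    Returns a dict mapping input file name -> number of workers in its group.
    """
    if n_workers < len(event_counts):
        raise ValueError("need at least one worker per input file")

    # Every input file gets its own group with one worker to start with.
    groups = {name: 1 for name in event_counts}

    # Give each remaining worker to the group with the highest load per worker.
    for _ in range(n_workers - len(event_counts)):
        busiest = max(groups, key=lambda name: event_counts[name] / groups[name])
        groups[busiest] += 1
    return groups


# Hypothetical input files and event counts.
counts = {"a.root": 1000, "b.root": 5000, "c.root": 2000}
print(workers_per_file(counts, 8))
```

Each group would then merge only its own workers' output, which keeps the 1-1 input to output file correspondence. Event count is only a proxy for processing time, so the load balancing caveat noted above still applies.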

5 TMemFile
• ROOT's way of combining/merging several TTrees into the same output TTree.
• Biggest obstacle for ATLAS / APR: Will not return a valid entry number (afaik),
– so we cannot easily create an external Token.
– Will mix up event ordering. Not nice, but something we have to live with in MP anyway.
• Not clear whether type-specific function calls could be sufficient to merge metadata objects.
• And what about tree synchronization?
– We have several trees and it would be nice (not required) to keep them parallel.

