1 Distributed Processing of Future Radio Astronomical Observations
Ger van Diepen
ASTRON, Dwingeloo / ATNF, Sydney
ADASS 2007, 24-9-2007

2 Contents
Introduction
Data Distribution
Architecture
Performance issues
Current status and future work

3 Data Volume in future telescopes
LOFAR:
  37 stations (666 baselines), growing to 63 stations (1953 baselines)
  128 subbands of 256 channels (32768 channels)
  666 * 32768 * 4 * 8 bytes/sec = 700 MByte/sec
  A 5 hour observation gives 12 TBytes
ASKAP (spectral line observation):
  45 stations (990 baselines)
  32 beams of 16384 channels each
  990 * 32 * 16384 * 4 * 8 bytes / 10 sec = 1.6 GByte/sec
  A 12 hour observation gives 72 TBytes
One day of observing > the entire world radio archive
ASKAP continuum: 280 GBytes (64 channels)
MeerKAT is similar to ASKAP
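The small, self-contained program below simply reproduces the back-of-the-envelope arithmetic above (4 polarisations, 8 bytes per visibility sample, the integration times quoted on the slide); it is an illustration only, not part of any telescope software.

#include <cstdio>

int main() {
    // LOFAR: 666 baselines, 32768 channels, 4 polarisations, 8 bytes, 1 s dumps
    double lofarRate = 666.0 * 32768 * 4 * 8;               // bytes per second
    double lofar5h   = lofarRate * 5 * 3600;                 // 5 hour observation
    std::printf("LOFAR: %.0f MByte/sec, %.1f TBytes in 5 h\n",
                lofarRate / 1e6, lofar5h / 1e12);

    // ASKAP spectral line: 990 baselines, 32 beams, 16384 channels, 10 s dumps
    double askapRate = 990.0 * 32 * 16384 * 4 * 8 / 10.0;    // bytes per second
    double askap12h  = askapRate * 12 * 3600;                 // 12 hour observation
    std::printf("ASKAP: %.1f GByte/sec, %.1f TBytes in 12 h\n",
                askapRate / 1e9, askap12h / 1e12);
    return 0;
}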

4 Key Issues
                  Traditionally               Future
Size              Few GBytes                  Several TBytes
Processing time   Weeks-months                < 1 day
Mode              Interactive                 Automatic pipeline
Archived?         Always                      Some data
Where             Desktop                     Dedicated machine
IO                Many passes through data    Only few passes possible
Package used      AIPS, Miriad, Casa, ...     ?

5 Data Distribution
Visibility data need to be stored in a distributed way
Limited use for parallel IO
Too many data to share across the network
Bring the processes to the data, NOT the data to the processes

6 Data Distribution
The distribution must be efficient for all purposes (flagging, calibration, imaging, deconvolution)
Process locally where possible and exchange as little data as possible
Loss of a data partition should not be too painful
Spectral partitioning seems the best candidate (see the sketch below)
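A minimal sketch of what such a spectral partitioning could look like, assuming contiguous channel ranges are simply divided over the workers; the helper name and layout are made up for illustration and are not the LOFAR/CONRAD code.

#include <cstdio>
#include <vector>

struct ChannelRange { int first; int last; };   // inclusive channel range for one worker

// Divide nChannels contiguous channels as evenly as possible over nWorkers.
std::vector<ChannelRange> partitionChannels(int nChannels, int nWorkers) {
    std::vector<ChannelRange> parts;
    int base  = nChannels / nWorkers;
    int extra = nChannels % nWorkers;           // the first 'extra' workers get one more
    int start = 0;
    for (int w = 0; w < nWorkers; ++w) {
        int len = base + (w < extra ? 1 : 0);
        parts.push_back({start, start + len - 1});
        start += len;
    }
    return parts;
}

int main() {
    // e.g. 32768 LOFAR channels spread over 64 workers -> 512 channels each
    for (const ChannelRange& r : partitionChannels(32768, 64))
        std::printf("channels %5d - %5d\n", r.first, r.last);
    return 0;
}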

7 Architecture
Connection types: Socket, MPI, Memory, DB
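To illustrate the idea of interchangeable connection types, here is a hedged sketch (class names are assumptions, not the framework's actual API) of a single Connection abstraction with an in-process back-end; socket-, MPI- and database-backed variants would expose the same interface.

#include <cstddef>
#include <cstring>
#include <vector>

// All traffic between master and workers goes through this one abstraction.
class Connection {
public:
    virtual ~Connection() = default;
    virtual void write(const void* buf, std::size_t nbytes) = 0;
    virtual std::size_t read(void* buf, std::size_t maxBytes) = 0;
};

// In-process back-end, useful on a desktop or in tests; a SocketConnection,
// MPIConnection or DBConnection would offer the same interface on a cluster.
class MemoryConnection : public Connection {
public:
    void write(const void* buf, std::size_t nbytes) override {
        const char* p = static_cast<const char*>(buf);
        queue_.insert(queue_.end(), p, p + nbytes);
    }
    std::size_t read(void* buf, std::size_t maxBytes) override {
        std::size_t n = queue_.size() < maxBytes ? queue_.size() : maxBytes;
        std::memcpy(buf, queue_.data(), n);
        queue_.erase(queue_.begin(), queue_.begin() + n);
        return n;
    }
private:
    std::vector<char> queue_;
};

int main() {
    MemoryConnection conn;
    conn.write("normal equations", 16);          // master -> worker (or vice versa)
    char buf[17] = {};
    conn.read(buf, 16);
    return 0;
}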

8 Data Processing
A series of steps has to be performed on the data (solve, subtract, correct, image, ...)
The master gets the steps from a control process (e.g. Python)
If possible, a step is sent directly to the appropriate workers
Some steps (e.g. solve) need iteration:
  Substeps are sent to the workers
  Replies are received and forwarded to other workers
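A toy version of that dispatch loop, with made-up step names and without any real communication layer, just to show the control flow described above:

#include <iostream>
#include <string>
#include <vector>

// Forward a simple step to all workers. In the real framework this would be
// serialised and pushed over each worker's connection (socket, MPI, memory, DB).
void sendToWorkers(const std::string& step, int nWorkers) {
    std::cout << "step '" << step << "' -> " << nWorkers << " workers\n";
}

// Steps like solve need iteration: substeps go to the workers and their replies
// are gathered and forwarded (expanded in the calibration sketch further on).
void runIterativeSolve(int nWorkers) {
    std::cout << "iterative solve across " << nWorkers << " workers\n";
}

int main() {
    // The control process (e.g. a Python script) would supply this plan.
    std::vector<std::string> plan = {"flag", "solve", "subtract", "correct", "image"};
    const int nWorkers = 64;
    for (const std::string& step : plan) {
        if (step == "solve")
            runIterativeSolve(nWorkers);    // needs substeps and replies
        else
            sendToWorkers(step, nWorkers);  // sent directly to the workers
    }
    return 0;
}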

9 Calibration Processing
Solving non-linearly:
do {
  1: get normal equations
  2: send equations to solver
  3: get solution
  4: send solution
} while (!converged)
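The runnable toy below follows the same loop on a deliberately tiny problem (fitting a single gain factor by least squares over two data partitions). The names and the one-parameter model are illustrative only, not the BBS/CONRAD solver API; the point is that only the small normal-equation sums travel between workers and solver, never the visibilities themselves.

#include <cstddef>
#include <cstdio>
#include <vector>

// Normal equations for a one-parameter least-squares fit: sum(m*m) and sum(m*d).
struct NormalEquations { double mtm = 0, mtd = 0; };

// Worker side: accumulate normal equations over the local data partition only.
NormalEquations accumulate(const std::vector<double>& model,
                           const std::vector<double>& data) {
    NormalEquations ne;
    for (std::size_t i = 0; i < model.size(); ++i) {
        ne.mtm += model[i] * model[i];
        ne.mtd += model[i] * data[i];
    }
    return ne;
}

int main() {
    // Two "workers", each holding its own partition of (model, observed) data.
    std::vector<std::vector<double>> model = {{1, 2, 3}, {4, 5}};
    std::vector<std::vector<double>> data  = {{2, 4, 6}, {8, 10}};

    double gain = 1.0;
    bool converged = false;
    do {
        NormalEquations total;                          // 1: get normal equations
        for (std::size_t w = 0; w < model.size(); ++w) {
            NormalEquations ne = accumulate(model[w], data[w]);
            total.mtm += ne.mtm;                        // 2: send equations to solver
            total.mtd += ne.mtd;
        }
        double newGain = total.mtd / total.mtm;         // 3: get solution
        converged = (newGain - gain) * (newGain - gain) < 1e-12;
        gain = newGain;                                 // 4: send solution to workers
    } while (!converged);

    std::printf("fitted gain = %.3f\n", gain);          // expect 2.000
    return 0;
}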

10 Performance: IO
Distributed IO, yet 24 minutes to read 72 TBytes once
IO should be asynchronous to avoid an idle CPU
Which storage to use is a deployment decision:
  Local disks (RAID)
  SAN or NAS
Sufficient IO bandwidth to all machines is needed
Calibration and imaging are run repeatedly, so the data will be accessed multiple times
BUT operate on chunks of data (work domains) to keep the data in memory while performing many steps on them
Possibly store the data in multiple resolutions
Tiling for efficient IO with different access patterns
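One common way to get the asynchronous IO mentioned above is simple read-ahead: the next work domain is loaded in the background while the current one is processed. The sketch below uses std::async for that; the chunk contents and sizes are made up, and real code would read tiled visibility data instead.

#include <future>
#include <vector>

using WorkDomain = std::vector<float>;   // one chunk of visibilities that fits in memory

// Stand-in for a tiled read of chunk 'index' from the locally stored data.
WorkDomain readWorkDomain(int index) {
    return WorkDomain(1 << 20, static_cast<float>(index));
}

// Stand-in for the many steps (flag, solve, subtract, ...) performed on the chunk.
double processWorkDomain(const WorkDomain& wd) {
    double sum = 0;
    for (float v : wd) sum += v;
    return sum;
}

int main() {
    const int nChunks = 8;
    // Start reading chunk 0 in the background.
    std::future<WorkDomain> next = std::async(std::launch::async, readWorkDomain, 0);
    double total = 0;
    for (int i = 0; i < nChunks; ++i) {
        WorkDomain current = next.get();      // waits only if IO is slower than the CPU
        if (i + 1 < nChunks)                   // kick off the read of the next chunk
            next = std::async(std::launch::async, readWorkDomain, i + 1);
        total += processWorkDomain(current);  // CPU works while the next chunk loads
    }
    (void)total;                               // result unused in this sketch
    return 0;
}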

11 Performance: Network
Process locally where possible
Send as little data as possible (normal equations are small matrices)
Overlap operations, e.g. form the normal equations for the next work domain while the solver solves the current one

12 Performance: CPU
Parallelisation (OpenMP, ...)
Vectorisation (SSE instructions)
Keep data in the CPU cache as much as possible, so use smallish data arrays
Optimal layout of data structures
Keep intermediate results if they do not change
Reduce the number of operations by reducing the resolution
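As a small, hedged illustration of the first three points: the loop below works on flat, contiguous arrays, can be auto-vectorised, and is spread over the cores with OpenMP (compile with -fopenmp; without it the pragma is simply ignored). The array sizes and the residual computation are an example only.

#include <cstdio>
#include <vector>

int main() {
    const long n = 1 << 22;                   // ~4M samples in flat, contiguous arrays
    std::vector<float> model(n, 1.0f), data(n, 3.0f), residual(n);
    const float gain = 2.0f;

    // Contiguous arrays and a simple loop body let the compiler vectorise (SSE/AVX),
    // while OpenMP spreads the iterations over the cores.
    #pragma omp parallel for
    for (long i = 0; i < n; ++i)
        residual[i] = data[i] - gain * model[i];

    std::printf("residual[0] = %.1f\n", residual[0]);   // expect 1.0
    return 0;
}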

13 Current status
The basic framework has been implemented and is used in LOFAR and CONRAD calibration and imaging
Can be deployed on a cluster or a supercomputer (or a desktop)
Tested on a SUN cluster, Cray XT3, IBM PC cluster, MacBook
A resource DB describes the cluster layout and the data partitioning, hence the master can derive which processor should process which part of the data

14 Parallel processed image (Tim Cornwell)
Runs on ATNF's Sun cluster "minicp": 8 nodes, each node = 2 * dual core Opterons, 1 TB, 12 GB
Also on the Cray XT3 at WASP (Perth, WA)
Data simulated using AIPS++
Imaged using the CONRAD synthesis software (new software using casacore, running under OpenMPI)
Long integration continuum image: 8 hours integration, 128 channels over 300 MHz, single beam
Used 1, 2, 4, 8, 16 processing nodes for the calculation of the residual images
Scales well, but must scale up a hundredfold or more...

15 Future work
More work is needed on robustness:
  Discard a partition when its processor or disk fails
  Move it to another processor if possible (e.g. when replicated)
Store data in multiple resolutions?
Use master-worker in flagging and deconvolution
A worker can use accelerators like GPGPU, FPGA, Cell (maybe through RapidMind)
A worker can itself be a master, to make use of a BG/L in a PC cluster

16 Future work
Extend to image processing (few TBytes):
  Source finding
  Analysis
  Display
VO access?

17 Thank you
Joint work with people at ASTRON, ATNF, and KAT
More detail in the next talk about LOFAR calibration
See the poster about the CONRAD software
Ger van Diepen, diepen@astron.nl

