Low-Frequency Pulsar Surveys and Supercomputing Matthew Bailes.


1 Low-Frequency Pulsar Surveys and Supercomputing Matthew Bailes

2 Outline: Baseband Instrumentation; MultiBOB; MWA survey vs PKSMB survey; Data rates; CPU times; Low-Frequency Pulsar Monitoring; The Future Supercomputers

3 Pulsar “Dedispersion” Incoherent
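As a rough illustration of what incoherent dedispersion involves, the sketch below computes the cold-plasma dispersion delay between two frequencies and then shifts each detected filterbank channel by its delay before summing. The constant 4.148808 ms GHz^2 cm^3/pc is the standard dispersion constant; the function names and the list-of-lists filterbank layout are assumptions made for this example and do not come from the slides.

```python
# Standard cold-plasma dispersion constant, in ms GHz^2 cm^3 / pc.
K_DM_MS = 4.148808


def dispersion_delay_ms(dm, f_lo_mhz, f_hi_mhz):
    """Delay (ms) of f_lo relative to f_hi for a DM given in pc/cm^3."""
    f_lo = f_lo_mhz / 1000.0  # MHz -> GHz
    f_hi = f_hi_mhz / 1000.0
    return K_DM_MS * dm * (f_lo ** -2 - f_hi ** -2)


def incoherent_dedisperse(filterbank, freqs_mhz, dm, tsamp_ms):
    """Shift each channel by its delay relative to the top channel, then sum.

    filterbank: one list of detected power samples per channel
                (hypothetical layout chosen for this sketch).
    """
    f_ref = max(freqs_mhz)
    nsamp = len(filterbank[0])
    out = [0.0] * nsamp
    for chan, f in zip(filterbank, freqs_mhz):
        # Whole-sample shift: this is what makes the method "incoherent";
        # smearing inside each channel is left uncorrected.
        shift = round(dispersion_delay_ms(dm, f, f_ref) / tsamp_ms)
        for i in range(nsamp - shift):
            out[i] += chan[i + shift]
    return out


if __name__ == "__main__":
    # Illustrative numbers only: DM = 100 pc/cm^3 across 1200-1400 MHz.
    print(dispersion_delay_ms(100.0, 1200.0, 1400.0), "ms")
```

The residual smearing within each channel is what coherent dedispersion removes, at the cost of working on the raw voltages.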

4 Coherent Dedispersion: Unresolved on µs timescales. From young or millisecond pulsars. Power-law distribution of energies. PSR J

5 Pulsar Timing (Kramer et al.)

6 CPSR2 Timing (Hotan, Bailes & Ord)

7 Swinburne Baseband Recorders etc
1998: Canadian S2 to computer (16 MHz x 2); 100K system + video tapes
2000: CPSR, 20 MHz x 2 + DLT7000 drives x : CPSR2, 128 MHz x 2 + real-time supercomputer (60 cores)
2006: DiFX (Deller, Tingay, Bailes & West); Software Correlator (ATNF adopted)
2007: APSR, 1024 MHz x 2 + real-time supercomputer (160 cores)
2008: MultiBOB, 13 x 1024 ch x 64 µs + fibre core supercomputer

8 dspsr software: Mature. Delivers < 100 ns timing on selected pulsars. Total power estimation every 8 µs with RFI excision. Write a “loader”. Can do: giant pulse work; pulsar searching (coherent filterbanks); pulsar timing/polarimetry; interferometry with pulsar gating.

9 PSRDADA (van Straten), psrdada.sourceforge.net: Generic UDP data capture system (APSR/MultiBOB). Ring Buffer(s): can attach threads to fold/dedisperse etc; hierarchical buffers; shares available CPU resources/disk; web-based control/monitoring. Free! + hooks to dspsr & psrchive.
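To make the ring-buffer idea concrete, here is a minimal single-process sketch in which a writer (standing in for the UDP capture) fills a fixed set of slots and an attached reader thread drains them. This is not the PSRDADA API: the `RingBuffer` class, `run_demo`, and the `None` end-of-data sentinel are all hypothetical names invented for the example.

```python
import threading


class RingBuffer:
    """Fixed number of slots; writer blocks when full, reader when empty."""

    def __init__(self, nslots):
        self.slots = [None] * nslots
        self.head = 0   # next slot to write
        self.tail = 0   # next slot to read
        self.count = 0  # number of filled slots
        self.cond = threading.Condition()

    def write(self, block):
        with self.cond:
            while self.count == len(self.slots):
                self.cond.wait()  # buffer full: wait for the reader
            self.slots[self.head] = block
            self.head = (self.head + 1) % len(self.slots)
            self.count += 1
            self.cond.notify_all()

    def read(self):
        with self.cond:
            while self.count == 0:
                self.cond.wait()  # buffer empty: wait for the writer
            block = self.slots[self.tail]
            self.tail = (self.tail + 1) % len(self.slots)
            self.count -= 1
            self.cond.notify_all()
            return block


def run_demo(nblocks=100, nslots=4):
    """Write nblocks blocks of 8 samples; the reader sums every sample."""
    ring = RingBuffer(nslots)
    totals = []

    def consumer():
        total = 0
        while True:
            block = ring.read()
            if block is None:  # hypothetical end-of-data sentinel
                break
            total += sum(block)  # stand-in for fold/dedisperse work
        totals.append(total)

    reader = threading.Thread(target=consumer)
    reader.start()
    for i in range(nblocks):
        ring.write([i] * 8)
    ring.write(None)
    reader.join()
    return totals[0]


if __name__ == "__main__":
    print(run_demo())
```

The point of the fixed slot count is back-pressure: if processing falls behind, the writer stalls instead of memory growing without bound.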

10 APSR: Takes 8 Gb/s voltages. Forms: 16 x 128 channels (with coherent dedispersion); 4 Stokes, umpteen pulsars; real-time fold to DM=250 pc/cc. O(100) Ops/sample: sustaining >>100 Gflops. ~100K computers. June MHz 4bits 768 MHz 2bits
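The 8 Gb/s and ">>100 Gflops" figures can be cross-checked with a few lines of arithmetic. The sketch below assumes Nyquist-sampled real voltages (two samples per Hz of bandwidth), 1024 MHz x 2 polarisations as quoted for APSR on slide 7, and 2 bits per sample; the bit depth and the function name `apsr_rates` are assumptions for this example, not values confirmed by the slide.

```python
def apsr_rates(bandwidth_hz=1024e6, npol=2, bits_per_sample=2,
               ops_per_sample=100):
    """Back-of-envelope input data rate and compute load.

    bits_per_sample=2 is an assumption; ops_per_sample=100 follows the
    slide's O(100) Ops/sample.
    """
    samples_per_s = 2 * bandwidth_hz * npol  # Nyquist real sampling
    return {
        "gbit_per_s": samples_per_s * bits_per_sample / 1e9,
        "gflops": samples_per_s * ops_per_sample / 1e9,
    }


if __name__ == "__main__":
    print(apsr_rates())
```

Under these assumptions the input rate comes out close to 8 Gb/s and the compute load at a few hundred Gflops, consistent with the slide's order-of-magnitude statements.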

11 Coherent Dedispersion BW/time [plot: BW vs year; points annotated (100K) and (300K)]

12 Coherent Dedispersion Now “trivial” FFT ease ~ B -2 / 3

13 MultiBOB: High Resolution Universe Survey (PALFA of the South). Werthimer’s iBOB boards: 1024 channels, down to 10 µs sampling; two pols. FPGA coding hard… so use software gain equalizer/summer. ~5 MB/s per beam. 1 Gb/s fibre to Swinburne (>1000 km fibre). Real time searching!

14 New PKS MB Survey:
Bailes: 13 beams, 9 minutes/pointing, 1024 channels, 300 MHz BW, 64 µs sampling, +/- 15 deg
Kramer: 13 beams, 70 minutes/pointing, 1024 channels, 300 MHz BW, 64 µs sampling, +/- 3.5 deg
Johnston: 13 beams, 4.5 minutes/pointing, 1024 channels, 300 MHz BW, 32 µs sampling, the rest

15 MWA: Samples: takes (24 x 1.3 MHz = 32 MHz) x 2 x 512; “just” 32 GB/s (64 Gsamples/s). FFTs it: (5 N log2 ops/pt = 2.2 Tflops). X-multiplies & adds: (512)*256*B*4 = 16 TMACs
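The MWA numbers on this slide can be approximately reproduced as follows. The sketch assumes Nyquist real sampling, 4 bits per sample, and a 128-point FFT; those three values are inferred here because they make the arithmetic land near the quoted 32 GB/s and 2.2 Tflops, and they are not stated on the slide. The function name `mwa_rates` is likewise made up for the example.

```python
import math


def mwa_rates(bandwidth_hz=32e6, npol=2, ntiles=512,
              bits_per_sample=4, nfft=128):
    """Rough sample rate, byte rate, FFT load and correlator load.

    bits_per_sample and nfft are assumptions chosen to match the slide.
    """
    samples_per_s = 2 * bandwidth_hz * npol * ntiles  # Nyquist sampling
    bytes_per_s = samples_per_s * bits_per_sample / 8
    # An N-point FFT costs about 5 N log2(N) flops, i.e. 5 log2(N) per point.
    fft_flops = 5 * math.log2(nfft) * samples_per_s
    # Slide's correlator estimate: 512 * 256 tile pairs, 4 pol products.
    macs_per_s = ntiles * (ntiles // 2) * bandwidth_hz * 4
    return samples_per_s, bytes_per_s, fft_flops, macs_per_s


if __name__ == "__main__":
    samples, nbytes, flops, macs = mwa_rates()
    print(f"{samples / 1e9:.1f} Gsamples/s, {nbytes / 1e9:.1f} GB/s")
    print(f"{flops / 1e12:.2f} Tflops FFT, {macs / 1e12:.1f} TMACs")
```

The results differ from the slide's rounded figures by a few percent because the slide rounds 65.5 Gsamples/s down to 64.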

16 Sensitivity: ~3-5x PKS; 32 vs 288 MHz; 350 vs 25 K; 700 vs 0.6 deg^2 (folded factor)

17 PKS vs MWA: G ~ 3-5x better; T_sys ~ 14x worse ?; B^1/2 ~ 3x worse; Flux ~ 25x better (1400 vs 200 MHz); t^1/2 ~ 32x better. ~ Parity. Single pulse work ~ comparable. Coherent search ~ 32x improvement! But: there is a limit to the time you can observe a pulsar! 4m vs 144m -> 5x deeper.
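The individual factors on this slide follow from the radiometer equation, in which S/N scales as G * S * sqrt(B * t) / T_sys. The sketch below plugs in the values from slides 16 and 17; the spectral index of -1.65 is an assumption chosen because it reproduces the quoted ~25x flux gain between 1400 and 200 MHz, and the gain ratio of 4 is simply the midpoint of the quoted 3-5x. The function name `mwa_over_pks` is hypothetical.

```python
import math


def mwa_over_pks(gain_ratio=4.0, tsys_mwa=350.0, tsys_pks=25.0,
                 bw_mwa=32.0, bw_pks=288.0, f_mwa=200.0, f_pks=1400.0,
                 spectral_index=-1.65, fov_mwa=700.0, fov_pks=0.6):
    """Return (per-pointing ratio, ratio including dwell time).

    Ratios are MWA/PKS signal-to-noise. spectral_index is an assumed
    typical pulsar value, not taken from the slides.
    """
    tsys = tsys_pks / tsys_mwa                  # ~1/14
    bandwidth = math.sqrt(bw_mwa / bw_pks)      # ~1/3
    flux = (f_mwa / f_pks) ** spectral_index    # ~25
    per_pointing = gain_ratio * tsys * bandwidth * flux
    # For a fixed total survey time, dwell per pointing scales with the
    # field of view, so S/N gains sqrt(FoV ratio).
    dwell = math.sqrt(fov_mwa / fov_pks)
    return per_pointing, per_pointing * dwell


if __name__ == "__main__":
    single, survey = mwa_over_pks()
    print(f"per pointing: {single:.2f}x, with dwell time: {survey:.0f}x")
```

With these inputs the per-pointing ratio is of order unity, which matches the slide's "parity" for single-pulse work, and the survey advantage comes almost entirely from the dwell-time term.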

18 Scattering, b=0: 1, 10, 100, 1000 ms

19 Scattering, b=5 deg: 1, 10, 50, 100 ms

20 b=30: 0.5, 1 ms

21 Search instrumentation? [flow diagram; stage labels: Volts, FX, Spectra, Visibilities, Grid (uv), 2D FFT, FBanks, Dedisp, Spectra, FFT, Fold, Pulsars; rate labels include 36 GB/s, 200 GB/s and <1 bit/s; boxes marked “Correlator” and “Us”]

22 Search Timings: 36,000 “coherent beams”, (768m/4m = 192)^2. 36 gigapixels/s. Dedisperse/CPU core: gigapixel/120s; 36 x 120 = 4320 cores = 500 machines = 250 kW. N_FFT = 36,000 * 1024 (DMs)/8192 = 4608 FFTs/sec. Seek (3s / 8192 x 1024 pt FFT): 14,000 cores ~ 1800 machines = MW. (M$/yr)
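The core counts on this slide are straightforward to reproduce. The sketch below takes the slide's benchmarks (120 core-seconds to dedisperse a gigapixel, 3 core-seconds per 8192 x 1024-point FFT search) and an assumed 8 cores per machine, based on the dual quad-core nodes described on slide 23. The function name `search_cost` and its parameter names are made up for this example.

```python
import math


def search_cost(nbeams, gigapixels_per_s, n_dm=1024, t_obs_s=8192,
                dedisp_core_s_per_gigapixel=120, fft_core_s=3,
                cores_per_machine=8):
    """Cores needed to keep up in real time with dedispersion and FFT search.

    cores_per_machine=8 is an assumption (2 quad-core CPUs per node).
    """
    dedisp_cores = gigapixels_per_s * dedisp_core_s_per_gigapixel
    # One long FFT per beam per DM trial, every t_obs_s seconds.
    ffts_per_s = nbeams * n_dm / t_obs_s
    seek_cores = ffts_per_s * fft_core_s
    machines = math.ceil((dedisp_cores + seek_cores) / cores_per_machine)
    return {
        "dedisp_cores": dedisp_cores,
        "ffts_per_s": ffts_per_s,
        "seek_cores": seek_cores,
        "machines": machines,
    }


if __name__ == "__main__":
    print("full array:", search_cost(192 ** 2, 36))
    print("32x32 tiles:", search_cost(1024, 1))
```

For the full array this gives 4320 dedispersion cores and about 13,800 search cores, in line with the slide. For the 32x32 tile case on slide 25 the same arithmetic gives 384 search cores, slightly above the 378 quoted there.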

23 Swinburne: The Green Machine. Installed May/June 2007; 185 Dell PowerEdge 1950 nodes; 2 quad-core processors (Clovertown: Intel Xeon 64-bit 2.33 GHz); 16 GB RAM; 1 TB disk -> 300 TB total; 1640 cores/14 Tflops; dual channel gigabit ethernet; CentOS Linux OS; job queue submission; 20 Gb infiniband (Q1 2008); 83 kW vs. 130 kW cooling. Machines: ~1.2M. Fuel: ~100K/yr

24 Search Times: Depend only upon: Npixels x Nchans x Tsamp^-1. Requires: no acceleration trials. PSR J: in 8192s, small width from acceleration

25 Search Timings (32x32 tiles): >1024 “coherent beams”. 36->1 gigapixels/s. Dedisperse/core: gigapixel/120s; 120 = 120 cores = 15 machines = 7 kW. N_FFT = 1024 * 1024 (DMs)/8192 (s/FFT) = 128 FFTs/sec. Seek (3s / (8192 x 1024) pt FFT): 378 cores ~ 50 machines = 25 kW.

26 RRATs: Log N - Log S (helps with long pointings…). 1000 x integration time. Maybe good RRAT finder.

27 Monitoring: Monitoring?

28 Monitoring:

29 Build Your Own Telescope? May be cheaper to build dedicated PSR telescope than attempt to process everything from existing telescopes! 32x32 tile (2D FFT - 1D FFT - dedisperse - FFT): ~2M telescopes; ~2M “beamformer/receivers”; ~1M correlator; ~1M supercomputer; ~1M construction; total ~7-8M

30 Next-Gen Supercomputers (IO or Tflops?): Infiniband 20 Gb (40 Gb): 288 port switch; ~10 Tb/s IO capacity (1-2K/node). Teraflop CPU capacities/node (140 Gflops now). Teraflop server or Tflop GPU? 10 GB/s vs 76 GB/s. Power (0.1 W/$): 2M = 200 kW

31 Architecture (2011??): [diagram; labels: two 288-port 40 Gb/s switches, 144 Tflops, 300K, ~1M, FX]

32 Summary: Strong motivation for multiple (~100) tied-array beams (PSRs/deg^2). Surveys only possible with compact configurations (at present). Future supercomputers may allow search even with MWA-like telescopes.

