Implementation of a streaming database management system on a Blue Gene architecture for measurement data processing. Erik Zeitler, Uppsala data base lab.


1 Implementation of a streaming database management system on a Blue Gene architecture for measurement data processing. Erik Zeitler, Uppsala data base lab, www.it.uu.se/research/group/udbl

2 Looking out into space: Use large radio telescopes!

3 Problem: Size matters. We have hit the limit.

4 Use many large radio telescopes? Augment the measurements using signal processing → they act together as a HUGE telescope. Look in one direction only. Expensive…

5 Solution: Use a huge number of small antennas. This enables new scientific applications (and challenges). Broadband, multi-direction receivers.

6 Scientific applications: Re-ionization epoch (the first 10^5 years, hydrogen forming); deep extragalactic surveys (to boldly go…); transient sources: all-sky surveys of gamma bursts, flare stars, supernovae; ultra-high-energy cosmic rays; pulsars.

7 Antennas, antennas, antennas… Broadband radio receiver, 80…300 MHz, 3 dimensions. Produces 0.9 Gbps raw data. Central site + 20 outstations located within a circular area, diameter 350 km → 13 × 10^3 antennas.

8 System overview: Antennas; basic beam forming (FPGAs); network (GbE, 10GbE); central processing facility (Linux clusters, IBM Blue Gene/L); off-line analysis (PCs, workstations, Blue Gene).

9 System overview

10 Central processing tasks: FFT; signal correlation; calibration; RFI mitigation (noise from human activities); stratosphere plasma; subtracting known objects; transient analysis; peak detection.

11 Computing challenges: Multiple incoming data streams (20 Tbps); multiple experiments; complex computations; demand for rapid reconfiguration of computing systems. Use case: on-line transient analysis.

12 Central processing facilities. On-line processing: Linux cluster (buffering); lightweight BG/L (beam), 6 racks → 6144 compute nodes + 96 I/O nodes. Off-line processing: Linux clusters, SAN, GRID, …

13 Blue Gene: Dataflow supercomputer. LLNL installation: 64 racks (65536 CPUs) → 70 TFLOPS in a footprint the size of a tennis court.

14 BG/L architecture. I/O node: 2x PPC440 @ 700 MHz; Linux; each I/O node coordinates 64 compute nodes; 512 MB RAM. Compute node: 2x PPC440 @ 700 MHz; single-threaded lightweight OS; typically 1 CPU for computation and 1 CPU for communication; 512 MB RAM.
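The node counts on slides 12 and 14 can be cross-checked with a little arithmetic. The following is a minimal Python sketch, not part of the original presentation: the constants come from the slides, while the function name `io_nodes_needed` and the derived nodes-per-rack figure are illustrative assumptions.

```python
# Figures quoted on the slides.
COMPUTE_NODES_PER_IO_NODE = 64   # slide 14: each I/O node coordinates 64 compute nodes
RACKS = 6                        # slide 12
COMPUTE_NODES = 6144             # slide 12
RAM_MB_PER_NODE = 512            # slide 14


def io_nodes_needed(compute_nodes, ratio=COMPUTE_NODES_PER_IO_NODE):
    """Hypothetical helper: I/O nodes required to coordinate `compute_nodes`."""
    return -(-compute_nodes // ratio)  # ceiling division


# Derived values (not stated on the slides).
nodes_per_rack = COMPUTE_NODES // RACKS
io_nodes = io_nodes_needed(COMPUTE_NODES)
total_compute_ram_gb = COMPUTE_NODES * RAM_MB_PER_NODE / 1024

print(nodes_per_rack, io_nodes, total_compute_ram_gb)
```

With these inputs the sketch gives 1024 compute nodes per rack and 96 I/O nodes, which matches the "96 I/O nodes" on slide 12 and the 65536 / 64 = 1024 ratio implied by the LLNL figures on slide 13.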

15 User agent

16 UDBL project: Implement a very high performance stream database manager based on the AmosII DB kernel (http://user.it.uu.se/~udbl/amos/). Utilize the BG/L computing environment for scalable data stream queries involving user-defined computations. Implement specialized query optimization: planning BG/L node configuration for given stream queries; re-configuration when interesting phenomena occur.

17 Thus far (after 4 months). Implementing primitives for data: computation, aggregation, communication, fusion. Proof-of-concept cases: signal processing, peak detection, stream join. Benchmark: based on real LOFAR/LOIS data; performance analysis for stream databases.

18 A simple example: gnuplot(peakdetect(vector_elements(winagg(vector_elements(readlofarvectorfile("temp.DAT")),256,256))));
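To make the shape of the nested query above easier to follow, here is a rough Python sketch that mirrors its composition with generators. It is not the AmosII implementation: the slides do not define the semantics of `winagg`, `peakdetect` or `vector_elements`, so the bodies below are assumptions (the two `256` arguments are read as window size and stride, and a "peak" is taken to be a strict local maximum). The file reader and the plotting step are replaced by an in-memory list and `list()`.

```python
def vector_elements(vectors):
    """Assumed meaning: flatten a stream of vectors into a stream of elements."""
    for vec in vectors:
        yield from vec


def winagg(stream, size, stride):
    """Assumed meaning: emit windows of `size` elements, advancing by `stride`."""
    if not 0 < stride <= size:
        raise ValueError("this sketch assumes 0 < stride <= size")
    buf = []
    for x in stream:
        buf.append(x)
        if len(buf) == size:
            yield list(buf)
            del buf[:stride]  # stride == size gives non-overlapping windows


def peakdetect(stream):
    """Assumed meaning: emit every element strictly greater than both neighbours."""
    prev = cur = None
    for nxt in stream:
        if prev is not None and cur is not None and prev < cur > nxt:
            yield cur
        prev, cur = cur, nxt


# Stand-in for readlofarvectorfile("temp.DAT"): a small in-memory stream of vectors.
vectors = [[0, 1, 0, 2], [5, 2, 0, 3], [1, 0, 0, 0]]

# Same nesting as the slide, with window size and stride 4 instead of 256.
peaks = list(peakdetect(vector_elements(winagg(vector_elements(vectors), 4, 4))))
print(peaks)
```

Because every stage is a generator, elements flow through the pipeline one at a time without materialising the whole input, which is the property a stream query needs. For the toy input above the detected peaks are 1, 5 and 3.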

19 Other application areas: Other space physics research areas (projects at IRFU); network traffic analysis; financial (stock market) information; content analysis of streaming media.

20 Questions?

