1 Problems and Strategies for VIRGO Off-line Computing (Problemi e strategie relativi al calcolo off-line di VIRGO)
Laura Brocco, Università di Roma "La Sapienza" & INFN Roma1, for the VIRGO collaboration

2 Outline
Part I: Data Production; Data Transfer and Storage
Part II: Search for gravitational-wave bursts and quasi-periodic signals; Search for periodic signals; Conclusions

3 I - Data Production and Storage

4 Status of Virgo
CITF commissioning ended in September 2002; 5 three-day Engineering Runs done.
ITF commissioning started in September 2003 (ends in September 2004); 4 Engineering Runs done so far.
Full Virgo locked before the end of 2004.

5 Virgo Data Production
Five different data streams are produced:
- Raw data: time series containing information from the different sub-systems, recorded in 1-second frames. Each file holds 300 frames (~1.8 GByte). Data flow: 6 MByte/sec.
- Processed data: h-recon, quality channels. Stored in 1-second frames. Expected data flow: 0.6 MByte/sec.
- Trend data: slowly acquired information, global information, fast quantities, stored in 1-hour frames. Expected data flow: about 10 kByte/sec.
- 50 Hz data: fast channels down-sampled at 50 Hz for long-term studies. Data flow: 140 kByte/sec.
- Network analysis data: data made available to external collaborations (e.g. LIGO), containing environmental data, h-recon, etc. Expected data flow: ~1 MByte/sec (depending on the data-exchange agreements among the experiments).
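The raw-data numbers on this slide are internally consistent; a quick back-of-the-envelope sketch (using decimal units, 1 GByte = 1000 MByte) checks the quoted file size and shows the implied storage volume:

```python
# Consistency check of the quoted raw-data figures:
# 1-second frames, 300 frames per file, 6 MByte/sec data flow.
FRAME_SECONDS = 1
FRAMES_PER_FILE = 300
RAW_FLOW_MB_PER_S = 6  # MByte/sec

file_size_mb = FRAMES_PER_FILE * FRAME_SECONDS * RAW_FLOW_MB_PER_S
print(f"raw-data file size: {file_size_mb / 1000:.1f} GByte")  # 1.8 GByte, as quoted

# Storage volume implied by continuous running at this rate:
daily_gb = RAW_FLOW_MB_PER_S * 86400 / 1000
print(f"raw data per day: {daily_gb:.0f} GByte (~{daily_gb * 365 / 1000:.0f} TByte/year)")
```

This is why raw data needs both a large local buffer at Cascina and permanent storage at external computing centers.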

6 Data Transfer & Storage I – Present Situation
Data flows from Cascina (VIRGO) to CNAF and to Lyon via bbftp.
- Cascina: 70 TByte storage (data buffer for daily activities) + LTO tapes.
- CNAF: nas1, nas2 & nas3, 9.96 TByte, full with Engineering Run data (from E0 to C3); up to 20 TByte requested for 2004. Transfer performed by the virgo-gateway machine (Dell bi-processor @ 1 GHz); data flow 3 MByte/sec.
- Lyon: data stored with HPSS (from E0 to C3); data flow @ 6.4 MByte/sec.
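A hedged reading of these rates: if raw data were produced continuously at 6 MByte/sec while the Cascina-to-CNAF link drains only 3 MByte/sec, the 70 TByte buffer absorbs the deficit for a limited time. A minimal sketch of that arithmetic (decimal units assumed):

```python
# Buffering margin implied by the rates on this slide: production at
# 6 MByte/s against a 3 MByte/s transfer leaves a 3 MByte/s deficit
# that the 70 TByte Cascina buffer must absorb.
PRODUCTION_MB_S = 6
TRANSFER_MB_S = 3
BUFFER_TB = 70

deficit_mb_s = PRODUCTION_MB_S - TRANSFER_MB_S
seconds_to_fill = BUFFER_TB * 1e6 / deficit_mb_s
print(f"buffer fills in ~{seconds_to_fill / 86400:.0f} days of continuous running")
```

In practice runs are not continuous, so the margin is larger, but the calculation shows why the transfer chain to Bologna and Lyon needs to keep pace with data taking.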

7 Data Transfer & Storage II – Future Plans
[Diagram: planned transfer chain. At Cascina, a bbftp server with SRM clients (C1, C2, C3), a temporary buffer and a MySQL archive feed the local storage and the on-line system; data flows via bbftp to the Bologna-CNAF storage, and from there via bbftp to Lyon; the SRM client C3 updates the book-keeping database (BKDB) at Lyon.]

8 Book-Keeping Data-Base
Oracle data-base, generated by SRM client C3 in Cascina, hosted in Lyon, and replicated in both Bologna and Cascina.
Each file has one entry containing:
- File information: name, size, GPS time, DAQ information, event information.
- One status flag per site directory (Cascina, Bologna, Lyon): 1 = yes, 0 = no/deleted, 2 = in transfer.
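To make the layout concrete, here is an illustrative sketch of such a book-keeping table. The real database is Oracle, so this SQLite snippet, the table and column names, and the example file name are all hypothetical; only the per-site status codes (1 = yes, 0 = no/deleted, 2 = in transfer) follow the slide:

```python
import sqlite3

# Hypothetical sketch of the book-keeping schema (the real DB is Oracle).
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE bookkeeping (
        name       TEXT PRIMARY KEY,  -- file name (hypothetical format below)
        size_bytes INTEGER,
        gps_time   INTEGER,           -- GPS time of the first frame
        cascina    INTEGER,           -- 0 = no/deleted, 1 = yes, 2 = in transfer
        bologna    INTEGER,
        lyon       INTEGER
    )
""")
# Example entry: a file present at Cascina, in transfer to Bologna, absent at Lyon.
db.execute("INSERT INTO bookkeeping VALUES ('raw-file-example', 1800000000, 700000000, 1, 2, 0)")

# The transfer processes can then query for work, e.g. files still in flight:
rows = db.execute("SELECT name FROM bookkeeping WHERE bologna = 2").fetchall()
print(rows)
```

Replicating this table at all three sites lets each transfer step update and consult the same global picture of where every file lives.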

9 II - Off-line Analysis Procedures

10 Data Analysis Requirements I – Search for bursts & coalescing-binary gravitational signals
Bursts: short signals (4-100 ms) of unknown shape, frequencies between 50 Hz and 6000 Hz, and amplitude 10^-25 ≤ h ≤ 10^-20.
Specific burst-oriented software developed:
- Burst Library (BuL): C++ library containing several packages dedicated to the search for burst gravitational waves. BuL is developed on DEC/OSF1 V5.2, Linux/RH 6.1 and Linux/RH 7.2, and all the packages are managed and built using CMT.
- SNAG (Signal and Noise for Gravitational Antennas): MatLab toolbox containing filters to perform burst searches in both the frequency and the time domain. SNAG is developed on Windows & Linux (to be completed).
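This is not BuL or SNAG code, but the core idea of a time-domain burst search on whitened data can be illustrated in a few lines: flag short excursions above a threshold set in units of the noise standard deviation. All numbers here are made up for the example:

```python
import numpy as np

# Toy time-domain burst search: threshold the whitened strain.
rng = np.random.default_rng(0)
fs = 20000                         # sample rate [Hz], as in the Virgo DAQ
noise = rng.normal(0.0, 1.0, fs)   # 1 second of whitened (unit-variance) noise
data = noise.copy()
data[10000:10100] += 10.0          # inject a loud ~5 ms burst at t = 0.5 s

threshold = 6.0                    # in units of the noise standard deviation
triggers = np.flatnonzero(np.abs(data) > threshold)
print(f"first trigger at t = {triggers[0] / fs * 1000:.1f} ms")
```

Real burst filters are far more sophisticated (matched shapes, time-frequency methods), but they all reduce to producing trigger lists like this one for later coincidence analysis.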

11 Data Analysis Requirements I – Search for bursts & coalescing-binary gravitational signals
Preprocessing for burst analysis:
- Whitening: library dedicated to data whitening. There is a C version (LIB_Whitening, the original) and a C++ version (Whitening, interfaced with BuL).
- Ana Batch: C++ framework which provides facilities to extract data from Virgo data files (in Frame format).
- NAP (Noise Analysis Package): C & C++ library containing all the packages dedicated to noise studies and simulations (in development).
Typical job duration: 1 hour of CPU time per hour of data (on a Xeon bi-processor @ 1.7 GHz with 1.5 GByte RAM); from 1/2 to 1 hour of CPU time per 1/2 hour of data in MatLab (Windows), depending on the number of templates and the threshold values.
Some algorithms need a machine cluster (matched filtering with ~1000 templates).
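The whitening step mentioned above can be sketched independently of the LIB_Whitening / Whitening implementations. The simplest frequency-domain version divides the data's Fourier transform by an estimate of the noise amplitude spectrum, leaving a flat spectrum (here the estimate is taken from the data itself, which is cruder than what a real whitening library does):

```python
import numpy as np

# Illustrative spectral whitening (not the actual Virgo whitening code).
rng = np.random.default_rng(1)
n = 4096
white = rng.normal(size=n)
colored = np.convolve(white, np.ones(8) / 8, mode="same")  # correlated noise

spectrum = np.fft.rfft(colored)
asd = np.abs(spectrum)          # crude one-shot amplitude-spectrum estimate
asd[asd == 0] = 1.0             # guard against division by zero
whitened = np.fft.irfft(spectrum / asd, n=n)

# With this estimator every Fourier amplitude of the output is 1,
# i.e. the whitened spectrum is exactly flat:
flat = np.allclose(np.abs(np.fft.rfft(whitened)), 1.0)
print(flat)
```

Production whitening instead fits a smooth noise model (e.g. an autoregressive model) so that real signals, which share bins with the noise, are not suppressed along with it.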

12 Data Analysis Requirements I – Search for bursts & coalescing-binary gravitational signals
Coalescing binary systems: compact stars (NS/NS, NS/BH, BH/BH) emitting a "chirp" signal. The exact shape of the signal is accurately predictable, but depends on the two masses of the stars, on their spin rates, and on several relativistic effects.

13 Data Analysis Requirements I – Search for bursts & coalescing-binary gravitational signals
Coalescing binary systems: matched-filtering techniques have been developed, with banks of thousands of filters (average template size 4 MByte):
1. Single-frequency-band analysis (Flat Search), running in the Merlino framework (written in ANSI C, communication based on MPI, on a Beowulf cluster).
2. Two-frequency-band analysis (Multi-Band Template Analysis), with the same template grids for all frequency bands.
3. Dynamic matched-filter techniques (Price Algorithm).
4. Hierarchical strategies using ALE (Adaptive Line Enhancer filters).
High computing power is needed (~300 Gflops for in-time analysis, 3 times more for off-line analysis), together with a distributed framework for parallel computation.
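The matched-filtering technique behind all four strategies can be sketched in a few lines: correlate the data against a known template, efficiently via the FFT, and look for a correlation peak at the signal's arrival time. The template and numbers below are invented for the illustration and are not a real chirp:

```python
import numpy as np

# Toy matched filter (not the Merlino / Flat Search code).
rng = np.random.default_rng(2)
n = 8192
# Stand-in "template": a short windowed sinusoid instead of a real chirp.
template = np.sin(2 * np.pi * 0.05 * np.arange(256)) * np.hanning(256)

data = rng.normal(0.0, 1.0, n)
t0 = 3000
data[t0:t0 + 256] += 2.0 * template   # signal buried in the noise

# Cross-correlation of data with the template, computed in the frequency
# domain: one FFT per template instead of an O(n) sum per time shift.
corr = np.fft.irfft(np.fft.rfft(data) * np.conj(np.fft.rfft(template, n)), n=n)
print(f"loudest correlation at sample {np.argmax(corr)} (signal injected at {t0})")
```

The ~300 Gflops figure comes from repeating exactly this operation over thousands of mass/spin templates in real time, which is why the analysis is distributed over a cluster.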

14 Data Analysis Requirements I – Search for bursts & coalescing-binary gravitational signals
Scheme for burst and coalescing-binary detection (to be implemented @ Bologna):
raw data from storage → h reconstruction (2 signals @ 20 kHz) → lines removal → whitening → decimation/re-sampling → burst filters and C.B. filters → selected events to storage.

15 Data Analysis Requirements II – Search for periodic gravitational signals
Periodic gravitational signals are emitted, e.g., by asymmetric rotating neutron stars.
The amplitude of the signals is very low, so long integration times (~months) are needed.
A hierarchical strategy has been developed, based on alternating "coherent" and "incoherent" steps.
Large computing resources are needed for the analysis: the Tflops range. The larger the computing power we can access, the wider the portion of source parameter space we can explore.
Low granularity: the analysis method is well suited to a distributed computing environment. Two main computing centers, Bologna and Lyon, plus Napoli and Roma.

16 [Diagram: periodic-source analysis workflow on the grid.]
- Input files: typical size ~1.2 MB for 6 months of data; replicated among Storage Elements (SEs).
- Incoherent steps, run on the GRID: ~10^5 jobs sent in 3 months; typical job duration ~5-10 hours on a 2.4 GHz Xeon processor, depending on the source frequency.
- Candidates: typical output file size ~200 kB, ~2·10^4 candidates; copied back to a local machine for the further steps of the analysis.
- Coherent steps: performed locally, against the computing-center storage.
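The job numbers quoted above imply a sizeable sustained farm. A rough sketch of that estimate, taking the midpoint of the quoted 5-10 hour job duration and ~90 days of wall-clock time:

```python
# Back-of-the-envelope: how many processors do ~1e5 incoherent-step jobs
# of ~5-10 hours each require, if they must finish within ~3 months?
jobs = 1e5
hours_per_job = 7.5          # midpoint of the quoted 5-10 hour range
wall_hours = 90 * 24         # ~3 months of wall-clock time

processors_needed = jobs * hours_per_job / wall_hours
print(f"~{processors_needed:.0f} processors kept busy around the clock")
```

Since each job is independent (low granularity), this load maps naturally onto a grid of a few hundred processors spread over the participating sites.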

17 We are carrying out test activities on the data analysis software in two computing environments: local batch systems (PBS) and grid (INFN-Grid).
Main activities so far:
- Adaptation of the data analysis procedures to work in a distributed environment.
- Tests of the "incoherent" part of the analysis pipeline (several software versions) using simulated data (thousands of jobs submitted). Machines used: Roma, Bologna, Napoli (about 30 machines) within INFN-Grid; Lyon (25 processors) as a classic batch system.
- Full-scale test of the "coherent" part of the analysis (28 processors for ~3 months, 24 hours/day; farms in Bologna and Roma).
Results: very good scaling of performance with the number of nodes involved (but only small-scale tests done up to now); grid software more and more stable and reliable.

18 Conclusions
The Virgo experiment will complete commissioning in 2004.
Data production: 5 kinds of data will be produced, with data flows from 10 kByte/sec (trend data) up to 6 MByte/sec (raw data); typical raw-data file size 1.8 GByte.
Storage: two permanent storage sites, Bologna-CNAF and Lyon, plus Cascina. Automatic processes to transfer data from Cascina to Bologna and from Bologna to Lyon are in development.
Data analysis: several filters have been developed to search for gravitational waves; all the filtering techniques need high computing power and parallel computation. 4 Mock Data Challenges (productions) performed so far, the next foreseen in June. GRID tests have been performed using the Roma, Bologna and Napoli farms; larger-scale tests will be performed in the next months.
The analysis of scientific data will start in 2005.


22 Merlino Framework (by Leone B. Bosi)
Distributed framework for parallel data analysis, composed of 4 main processes.
Written in ANSI C; communication based on MPI; runs on a Beowulf cluster.
Customization via "plug-in" functions (dynamic libraries) and a customizable data flow.
Plug-ins currently used, tested, or under development:
- Matched Filter
- Inspiral generator
- Mean Filter
- PC
- Damped Sine Filter
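Merlino's plug-in mechanism (filters loaded as dynamic libraries and selected at run time) can be sketched language-independently. This Python registry is only an analogy: the names follow the slide, but the function bodies are placeholders, not the real filters:

```python
# Sketch of run-time plug-in selection, in the spirit of Merlino's
# dynamic-library filters. Implementations below are placeholders.
PLUGINS = {}

def plugin(name):
    """Register a filter function under a plug-in name."""
    def register(fn):
        PLUGINS[name] = fn
        return fn
    return register

@plugin("MeanFilter")
def mean_filter(samples):
    return sum(samples) / len(samples)

@plugin("MatchedFilter")
def matched_filter(samples):
    # Placeholder: a real matched filter correlates against a template.
    return max(samples)

def run(plugin_name, samples):
    # The framework dispatches to whichever filter was requested,
    # without knowing its internals: the data flow stays generic.
    return PLUGINS[plugin_name](samples)

print(run("MeanFilter", [1.0, 2.0, 3.0]))
```

The payoff is the same as in Merlino: new filters can be added or swapped without touching the MPI communication and data-distribution machinery.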

23 Next steps (in 2004):
- Integration and validation of the whole analysis software.
- Larger-scale grid tests (up to ~100 processors and more involved).

24 [Diagram by Antonia Ghiselli: grid deployment, scenario 1. Virgo-I sites (Roma, Napoli, CNAF) and Virgo-F site (Lyon), each with CE/SE, GIIS and GRIS services; MDS Virgo-I and MDS Virgo-F information systems; RLS @ CNAF; Virgo BDII and Resource Broker (RB).]

