Presentation on theme: "Benefits of the MAGIC Grid: Status report of an EGEE generic application" (H. Kornmayer, EGAAP meeting, Athens, 21 April 2005) — Presentation transcript:

1 Benefits of the MAGIC Grid: Status report of an EGEE generic application
H. Kornmayer, EGAAP meeting, Athens, 21 April 2005 (Enabling Grids for E-sciencE)
Harald Kornmayer, Ariel Garcia (Forschungszentrum Karlsruhe); Toni Coarasa (Max-Planck-Institut für Physik, München); Ciro Bigongiari (INFN, Padua); Esther Accion, Gonzalo Merino, Andreu Pacheco, Manuel Delfino (PIC, Barcelona); Mirco Mazzucato (CNAF/INFN Bologna); in cooperation with the MAGIC collaboration

2 Outline
- Introduction: What kind of MAGIC?
- The idea of a MAGIC Grid
- Grid added value: expectations vs. reality?
- Data challenges
- Experience
- Conclusion and outlook

3 Introduction: The MAGIC Telescope
- Ground-based air Cherenkov telescope for gamma rays from 30 GeV up to TeV energies
- La Palma, Canary Islands (28° North, 18° West), 17 m mirror diameter
- In operation since autumn 2003 (still in commissioning)
- Collaborators: IFAE Barcelona, UAB Barcelona, Humboldt U. Berlin, UC Davis, U. Lodz, UC Madrid, MPI München, INFN / U. Padova, U. Potchefstroom, INFN / U. Siena, Tuorla Observatory, INFN / U. Udine, U. Würzburg, Yerevan Physics Inst., ETH Zürich
- Physics goals: origin of VHE gamma rays, Active Galactic Nuclei, Supernova Remnants, unidentified EGRET sources, Gamma Ray Bursts

4 Ground-based γ-ray astronomy
[Schematic: a gamma ray initiates a particle shower at ~10 km altitude; the Cherenkov light (opening angle ~1°) illuminates a pool of ~120 m diameter on the ground; for comparison, a satellite detector such as GLAST has ~1 m² collection area.]
From the image of the particle shower in the telescope camera one can reconstruct the arrival direction and energy and reject the hadronic background.

5 MAGIC – Why the Grid?
- MAGIC is an international collaboration with partners distributed all over Europe
- The amount of data (up to 200 GB per night) can NOT be handled by one partner alone
- Access to data and computing needs to become more efficient
- MAGIC will build a second telescope
- The analysis is based on Monte Carlo simulations (CORSIKA code), which are CPU consuming: one night of hadronic background needs 20,000 days on 70 computers
- Lowering the energy threshold of the MAGIC telescope requires new methods based on MC simulations
- More CPU power needed!

6 Developments – Requirements
- MAGIC needs a lot of CPU to simulate the hadronic background and explore the energy range 10 GeV – 100 GeV
- MAGIC needs a coordinated effort for the Monte Carlo production
- MAGIC needs an easily accessible system (Where are the data from run_1002 and run_1003?)
- MAGIC needs a scalable system (as MAGIC II will come in 2007)
- MAGIC needs the possibility to access data from other experiments (HESS, Veritas, GLAST, PLANCK(?)) for multi-wavelength campaigns

7 The infrastructure idea
- Use three national Grid centres, CNAF, PIC and GridKa (all EGEE members), to run the central services
- Connect MAGIC resources to enable collaboration (get resources for free!)
- Two subsystems: MC (Monte Carlo) and Analysis
- Start with MC first!

8 Development – MC Workflow
Example request: "I need 1.5 million hadronic showers with energy E, direction (theta, phi), ... as background sample for the observation of the Crab Nebula."
The production chain (see the sketch after this list):
- Run the MAGIC Monte Carlo Simulation (MMCS) and register the output data
- Simulate the telescope geometry with the reflector program for all relevant MMCS files and register the output data
- Simulate the starlight background for a given position in the sky and register the output data
- Simulate the response of the MAGIC camera for all relevant reflector files and register the output data
- Merge the shower simulation and the starlight simulation and produce a Monte Carlo data sample
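A minimal sketch of how such a production chain could be wired together. The program names (mmcs, reflector, starlight, camera) follow the slide; the command-line arguments, file names and the register_on_grid() helper are illustrative assumptions, not the actual MAGIC scripts:

    # Minimal sketch of the MC production chain described above.
    import subprocess

    def run(cmd):
        """Run one production step and fail loudly if it breaks."""
        subprocess.run(cmd, check=True)

    def register_on_grid(local_file, lfn):
        # Placeholder: the real system copies the file to a storage element
        # and registers it in the replica catalogue.
        print(f"register {local_file} as {lfn}")

    def produce_sample(run_id, energy, theta, phi):
        run(["mmcs", f"--run={run_id}", f"--energy={energy}",
             f"--theta={theta}", f"--phi={phi}"])            # air-shower simulation
        register_on_grid(f"cer{run_id}", f"lfn:mmcs_cer{run_id}")

        run(["reflector", f"cer{run_id}"])                    # telescope geometry
        register_on_grid(f"ref{run_id}", f"lfn:reflector_{run_id}")

        run(["starlight", f"--theta={theta}", f"--phi={phi}"])  # star field background
        run(["camera", f"ref{run_id}"])                       # camera response
        run(["merge", f"cam{run_id}", "starfield.out"])       # final MC data sample
        register_on_grid(f"mc{run_id}", f"lfn:mc_sample_{run_id}")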

9 Implementation
Three main components:
- Meta data base: bookkeeping of the requests, their jobs and the data
- Requestor: the user defines the parameters by inserting the request into the meta data base
- Executor: creates Grid jobs by checking the meta data base frequently (cron) and generating the input files (a sketch follows below)
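A minimal sketch of what such an executor cron job could look like, assuming a simple SQLite meta data base with a requests table and using the edg-job-submit command shown later in the talk; the table layout, JDL fields and file names are illustrative, not the actual MAGIC schema:

    # Executor sketch: poll the meta data base for pending requests,
    # write a JDL file for each and submit it as a Grid job.
    import sqlite3, subprocess

    DB = "magic_metadata.db"   # hypothetical location of the meta data base

    def make_jdl(run_id):
        jdl = f"mmcs_{run_id}.jdl"
        with open(jdl, "w") as f:
            f.write(f'Executable = "mmcs_run.sh";\n'
                    f'Arguments  = "{run_id}";\n'
                    f'OutputData = {{ [ OutputFile = "data/cer{run_id}"; '
                    f'LogicalFileName = "lfn:mmcs_cer{run_id}"; '
                    f'StorageElement = "castorgrid.pic.es"; ] }};\n')
        return jdl

    def main():
        db = sqlite3.connect(DB)
        pending = db.execute(
            "SELECT run_id FROM requests WHERE status = 'new'").fetchall()
        for (run_id,) in pending:
            jdl = make_jdl(run_id)
            # submit as the VO production manager (command as on the slides);
            # the job URL is contained in the command output
            out = subprocess.run(["edg-job-submit", "--vo", "magic", jdl],
                                 capture_output=True, text=True, check=True)
            db.execute("UPDATE requests SET status = 'submitted', job_id = ? "
                       "WHERE run_id = ?", (out.stdout.strip(), run_id))
        db.commit()

    if __name__ == "__main__":
        main()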

10 Grid added value
Expectations vs. reality:
- Collaboration (-): complex software and a limited number of supported operating systems and batch systems make the integration of new sites of MAGIC collaborators difficult. The final integration of one cluster (SUSE, SGE batch system, AFS, firewall) took too long (9 months).
- Speed-up of MC production (+): the reliable infrastructure and the good support from many sites made that possible! Many thanks to sk, bg, pl, uk, gr, it, es, de, ...
- The service offered was overall good, with problems whenever new releases appeared (every time! :-( ) and with problems in keeping a sustainable configuration (for the VO, the replica service, ...). The central services run by EGEE were stable!

11 Grid added value II
Expectations vs. reality II:
- Persistent storage (+)
  - of Monte Carlo data: some problems during the first runs (too many small files on a tape system is equivalent to /dev/null). We learnt that lesson!
  - of observation data: the automated production transfer of real observation data from La Palma to PIC, Barcelona started in November 2001; 3.2 TB of real data are available on the Grid now
- Improvements of data availability (?): the replica mechanisms need to be tested and measurements are needed in the future. Ongoing work!

12 Grid added value III
Expectations vs. reality III:
- Cost reduction (-): additional implementations were necessary
  - MAGIC implemented its own prototype meta data base system, to monitor the status of the many jobs of a mass production and to check the "status" of a job (more on this later)
  - MAGIC implemented its own rudimentary workflow system, as nothing was available at the beginning
- GGUS definitely reduced the costs (+): the MAGIC Grid participants appreciated the support structure of the GGUS portal
- Every new middleware release forced (-) a downtime of the system and a re-customization of the system

13 Data challenges
Past experience: three MMCS data challenges
- Mar/Apr 2005: 10% failure
- July 2005: 3.9% failure
- Sept 2005: 3.4% failure
- Improvements: underlying middleware, operation of services
- Many lessons learnt: data management, additional checks
Last data challenge (December until today):
- Successful (data available, MMCS output registered on the Grid): 13500
- Failed: 4567
- Done (Failed): 249, Done (Success): 2830, Scheduled: 86, Submitted: 9, Aborted: 930, Waiting: 473 (?)

14 Useless status of jobs
The data storage site is selected by the JDL:

    ...
    OutputData = {
      [
        OutputFile      = "data/cer012345";
        LogicalFileName = "lfn:mmcs_cer012345";
        StorageElement  = "castorgrid.pic.es";
      ],
    ...

The WMS should register the file automatically on the Grid. BUT: if the job fails (RLS service down, SE not available, ...), the WMS still reports the status as "Done (Successful)". "Done (Successful)" has NO meaning for the output data specified in the JDL! A more sophisticated system is necessary for a production system; we developed it on our own (as every VO does?), see the sketch below. Can we get a WMS that takes the data output into account?
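A minimal sketch of the kind of post-job check the slide argues for: a job only counts as successful if its output file really got registered on the Grid, not merely because the WMS says "Done (Successful)". The list_replicas() helper is a placeholder for whatever replica-catalogue lookup the site provides, and the requests table is the hypothetical meta data base schema used above:

    # Sketch: verify that a job's output file really got registered on the Grid,
    # instead of trusting the WMS state "Done (Successful)".
    import sqlite3, subprocess

    def list_replicas(lfn):
        # Placeholder: call the locally available catalogue client and return
        # the list of physical replicas for the logical file name.
        raise NotImplementedError

    def job_url(db, run_id):
        (url,) = db.execute("SELECT job_id FROM requests WHERE run_id = ?",
                            (run_id,)).fetchone()
        return url

    def verify_job(db, run_id):
        lfn = f"lfn:mmcs_cer{run_id}"
        state = subprocess.run(["edg-job-status", job_url(db, run_id)],
                               capture_output=True, text=True).stdout
        registered = bool(list_replicas(lfn))
        # Only "Done (Successful)" AND a registered replica count as success.
        ok = "Done (Success" in state and registered
        db.execute("UPDATE requests SET status = ? WHERE run_id = ?",
                   ("verified" if ok else "needs_resubmit", run_id))
        return ok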

15 Missing VO support in WMS
A mass production is managed by one member of the VO, the VO production manager (no need to be a Grid expert!). Every job is assigned to him exclusively:

    edg-job-submit --vo magic mmcs_012345.jdl

NO other member of the VO can get information
- about the status of the job: edg-job-status https://theUniqueIdentifierOfTheJob
- about the stdout/stderr of the job: edg-job-get-output https://theUniqueIdentifierOfTheJob
The basic commands MUST have more VO support! (A sketch of the workaround we use follows below.)
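In practice the meta data base mentioned earlier served as the workaround: the production manager, the only identity allowed to query the jobs, periodically writes the middleware state into the shared data base, where every collaborator can read it. A minimal sketch under the same illustrative assumptions as above:

    # Sketch: the VO production manager copies the WMS job states into the
    # shared meta data base.  Table layout and column names are illustrative.
    import sqlite3, subprocess

    DB = "magic_metadata.db"   # hypothetical shared meta data base

    def refresh_job_states():
        db = sqlite3.connect(DB)
        rows = db.execute("SELECT run_id, job_id FROM requests "
                          "WHERE job_id IS NOT NULL").fetchall()
        for run_id, job_url in rows:
            out = subprocess.run(["edg-job-status", job_url],   # command as on the slide
                                 capture_output=True, text=True).stdout
            state = out.splitlines()[-1].strip() if out else "unknown"
            db.execute("UPDATE requests SET wms_state = ? WHERE run_id = ?",
                       (state, run_id))
        db.commit()

    if __name__ == "__main__":
        refresh_job_states()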

16 Meta data base
The output data files should be stored and registered on the Grid. But the files are only useful if "content describing" information can be attached to them: "From storage to knowledge!", "from Grid to e-Science". We implemented a "separate" meta data base that links this information to the file URI (a sketch of such a link table follows below). One extensible framework for replica and meta data services would be nice!
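A minimal sketch of what linking content-describing information to a file URI can look like, here with SQLite; the column names (particle, energy, theta, phi) are examples of "content describing" metadata, not the actual MAGIC schema:

    # Sketch: a tiny meta data base that attaches physics parameters to the
    # logical file name and replica URI, so files can be found by content.
    import sqlite3

    db = sqlite3.connect("magic_metadata.db")   # hypothetical file name
    db.execute("""
        CREATE TABLE IF NOT EXISTS mc_files (
            lfn        TEXT PRIMARY KEY,  -- logical file name on the Grid
            uri        TEXT,              -- physical replica (storage element URI)
            particle   TEXT,              -- e.g. 'gamma' or 'proton'
            energy_gev REAL,
            theta_deg  REAL,
            phi_deg    REAL
        )""")
    db.execute("INSERT OR REPLACE INTO mc_files VALUES (?,?,?,?,?,?)",
               ("lfn:mmcs_cer012345", "srm://castorgrid.pic.es/data/cer012345",
                "proton", 100.0, 20.0, 0.0))

    # "From storage to knowledge": select files by physics content, not by name.
    rows = db.execute("SELECT lfn FROM mc_files WHERE particle = 'proton' "
                      "AND energy_gev BETWEEN 10 AND 100").fetchall()
    print(rows)
    db.commit()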

17 Workflows
The MAGIC Monte Carlo system is a good example of a scientific workflow: 1000 jobs can be started in parallel (embarrassingly so!). MAGIC looked for a middleware tool which supports workflows, using a standard workflow description and with support for self-recovery of failed jobs: about 3% of the jobs "fail", i.e. 30 out of 1000, and without this feature NO workflow will succeed! There are tools around, but we need something like a "best practice guide" for one tool. We don't want to program it on our own on top of the meta data base! (A sketch of such a self-recovery loop is given below.)
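A minimal sketch of the self-recovery behaviour the slide asks for, i.e. resubmitting the few percent of jobs that fail until the whole batch is done; the retry limit, polling interval and helper functions are illustrative assumptions:

    # Sketch: keep resubmitting failed jobs until every job of the batch has
    # succeeded or a retry limit is reached.  submit() and has_succeeded()
    # are placeholders for the submission and verification steps shown earlier.
    import time

    MAX_RETRIES = 5

    def submit(run_id):
        raise NotImplementedError   # e.g. edg-job-submit --vo magic <jdl>

    def has_succeeded(run_id):
        raise NotImplementedError   # e.g. output file registered on the Grid

    def run_batch(run_ids):
        retries = {r: 0 for r in run_ids}
        pending = set(run_ids)
        for r in pending:
            submit(r)
        while pending:
            time.sleep(600)                      # poll every 10 minutes
            for r in list(pending):
                if has_succeeded(r):
                    pending.discard(r)
                elif retries[r] < MAX_RETRIES:   # ~3% fail, so resubmit those
                    retries[r] += 1
                    submit(r)
                else:
                    print(f"run {r}: giving up after {MAX_RETRIES} retries")
                    pending.discard(r)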

18 Experience – reliability
2005: three different data challenges
- March/April: 10.4% failed jobs
- July: 3.8% failed jobs
- September: 3.1% failed jobs
The EGEE infrastructure became more reliable!
Mass production started in December, after a training of the users at FZK. There is always a reason for failure, and deployment is a challenge too: New Year, Christmas in Spain, LCG 2.7.

19 MAGIC Grid is reality
- Production of MC using MAGIC Grid resources started in December!
  - We plan to ask (temporarily) for more CPUs for stress testing
- The MAGIC collaboration will put their real data on the Grid
- The challenges for computing will increase with the second telescope
[Diagram: EGEE – MAGIC Grid]

20 MAGIC Grid – future prospects
MAGIC is a good example of how to do e-Science, to use the e-Infrastructure and to exploit Grid technology. What about a "GRID" of different VHE gamma-ray observatories: "Towards a virtual observatory for VHE γ-rays"?
[Map: HESS (EU/Africa), Veritas (US), MAGIC (EU), CANGAROO (AUS/JP)]

21 Experience – InputProcOutput
How to submit a job? With JDL. The JDL should specify: get the input (InputSandBox, InputData), run the program (Executable), store the output (OutputSandBox, OutputData).
- File on the UI via InputSandBox: OK!
- File on the Grid via InputData: no file transfer!
- File to the UI via OutputSandBox: OK!
- File to the Grid via OutputData
Answer from the experts: write a script that copies the file from a SE to the WN (a sketch is shown below anyway). BUT: I don't want to implement a WORKAROUND for basic grid functionality!
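For completeness, a minimal sketch of the suggested workaround: a wrapper that stages the InputData file from a storage element to the worker node before running the executable. The lcg-cp invocation with an lfn: source is an assumption about the locally available data-management client, and the payload command is illustrative:

    # Sketch of the workaround: stage the input file from a storage element
    # to the worker node, then run the real program on it.
    # The lcg-cp call is an assumption about the available client tools.
    import subprocess, sys

    def stage_in(lfn, local_path):
        subprocess.run(["lcg-cp", "--vo", "magic", lfn, f"file:{local_path}"],
                       check=True)

    def main():
        lfn = sys.argv[1]                     # e.g. lfn:mmcs_cer012345
        stage_in(lfn, "input.dat")
        subprocess.run(["./reflector", "input.dat"], check=True)  # illustrative payload

    if __name__ == "__main__":
        main()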

22 Experience – Execution
Data challenge Grid-1: 12 M hadron events, 12000 jobs needed, started March 2005, up to now ~4000 jobs. First tests with manual submission via the GUI.
Reasons for failure: network problems, RB problems, queue problems. A job counts as successful when its output file is registered at PIC.
Diagnostics: no tools found, complex and time consuming. So: use the meta data base, log the failure, resubmit and don't care.
170/3780 jobs failed, i.e. 4.5% failure.

