DIANE: Distributed Analysis Environment for semi-interactive simulation and analysis in Physics
Jakub T. Moscicki, CERN/IT
CHEP 03

The need for distribution
- run the analysis/simulation job in parallel tasks to speed up the work, using powerful, worldwide distributed computational resources
- access the data in mass storage systems, otherwise too big to fit on your laptop

Practical Example
example: simulation with analysis
- each task produces a file with histograms
- job result = sum of the histograms produced by the tasks
master-worker model:
- client starts a job
- workers perform tasks and produce histograms
- master integrates the results
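As an illustration (not part of the original slides), a minimal Python sketch of the master-worker pattern described above; multiprocessing stands in for distributed workers, and a histogram is simplified to a list of bin counts:

```python
# Minimal master-worker sketch: each worker task produces a histogram,
# the master sums them. multiprocessing stands in for remote workers.
from multiprocessing import Pool
import random

NBINS = 10

def run_task(task_id, events=1000):
    """Worker: simulate some events and fill a histogram (list of bin counts)."""
    rng = random.Random(task_id)   # seed per task for reproducibility
    hist = [0] * NBINS
    for _ in range(events):
        hist[rng.randrange(NBINS)] += 1
    return hist

def add_histograms(h1, h2):
    """Master: integrate results by summing bin contents."""
    return [a + b for a, b in zip(h1, h2)]

if __name__ == "__main__":
    with Pool(4) as pool:                        # 4 workers
        results = pool.map(run_task, range(20))  # 20 tasks
    total = [0] * NBINS
    for h in results:
        total = add_histograms(total, h)
    print("job result:", total)                  # sum of all task histograms
```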

Tools at hand: local batch queue
- clusters/farms of PCs running batch queues
- use LSF or PBS to submit parallel analysis tasks producing histograms
- collect and post-process the results by hand: add all the resulting histogram files

```
> foreach i ( )
> bsub -q 8nh run-worker
> end
Job is submitted to queue ...
> ls
LSFJOB_  LSFJOB_  LSFJOB_250975
```

Tools at hand: global batch queue
- federation of clusters, also known as a GRID
- use the EDG Resource Broker to submit tasks

```
> dg-job-submit worker.jdl
Connecting to host grid014.ct.infn.it, port 7771
Logging to host grid014.ct.infn.it, port
******************************************************************************************
JOB SUBMIT OUTCOME
The job has been successfully submitted to the Resource Broker.
Use dg-job-status command to check job current status.
Your job identifier (dg_jobId) is: -
******************************************************************************************
```

Comments
- using the middleware directly requires a lot of manual work:
  - integration of task results
  - keeping track of failed tasks and resubmitting them
- not easy to monitor the job progress and cancel jobs
- only one task per worker: very inefficient if the worker initialization time is long
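To make the last point concrete, an illustrative calculation with assumed numbers (not from the original slides): the useful fraction of a worker's time is task_time / (init_time + task_time), so reusing a worker for many tasks amortizes the initialization cost.

```python
# Illustrative efficiency calculation (assumed numbers, not from the talk):
# one task per worker vs. a worker reused for many tasks.
init_time = 60.0   # seconds to initialize a worker (e.g. load geometry, libraries)
task_time = 10.0   # seconds of useful work per task
n_tasks   = 100

one_task_per_worker = task_time / (init_time + task_time)
reused_worker = (n_tasks * task_time) / (init_time + n_tasks * task_time)

print(f"one task per worker: {one_task_per_worker:.0%} useful time")  # ~14%
print(f"worker reused for {n_tasks} tasks: {reused_worker:.0%}")      # ~94%
```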

User Wishlist
- automatic integration of task results
- monitoring of job progress and individual tasks
- automatic error-recovery policies
- the granularity (size) of the tasks may change independently of the number of workers: natural load-balancing and optimization of performance
- performance fine tuning: workers may be mapped to threads, processes or machines depending on the context
- uniform, transparent and easy user interface and API, hiding the complexity of the underlying middleware mechanisms
- the same API and UI is used when running local jobs and GRID jobs
- batch, interactive and semi-interactive operation modes
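As an illustration of what such a uniform API could look like (a hypothetical sketch; run_job, LocalBackend and GridBackend are invented names for this example, not the actual DIANE API):

```python
# Hypothetical sketch of a uniform job API: the same call works for
# local and GRID execution; only the backend object changes.
class LocalBackend:
    def submit(self, task):
        print(f"running {task} in a local process")

class GridBackend:
    def submit(self, task):
        print(f"submitting {task} through the middleware")

def run_job(tasks, backend):
    """Same entry point regardless of where the workers actually run."""
    for task in tasks:
        backend.submit(task)

tasks = [f"task-{i}" for i in range(3)]
run_job(tasks, LocalBackend())  # development / debugging on a laptop
run_job(tasks, GridBackend())   # production on the GRID, same user code
```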

Wishlist (cont'd)
- a lightweight "add-on" framework which drives the execution of parallel jobs in the master-worker model over any specific middleware implementation
- application oriented: targets common HEP use cases, independent from any particular analysis tool
- layered and modular architecture, easy to adapt to new environments: important for middleware transitions
- integrated in a modern scripting environment: e.g. Python
- using standards: e.g. exploit AIDA for analysis, making it easy to plug in your favourite analysis tool
To address these issues the DIANE project was set up in CERN/IT.
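One way to picture the "plug in your favourite analysis tool" idea (a hypothetical sketch under assumed names; ResultIntegrator and HistogramSumIntegrator are illustrative, not the real DIANE interfaces):

```python
# Hypothetical sketch of a modular integration layer: the framework only
# knows an abstract "integrator"; concrete analysis tools plug in behind it.
from abc import ABC, abstractmethod

class ResultIntegrator(ABC):
    """Framework-side contract: how task results are merged into the job result."""
    @abstractmethod
    def add(self, task_result): ...
    @abstractmethod
    def finalize(self): ...

class HistogramSumIntegrator(ResultIntegrator):
    """Example plug-in: merge per-task histograms by summing bin contents."""
    def __init__(self, nbins):
        self.total = [0] * nbins
    def add(self, task_result):
        self.total = [a + b for a, b in zip(self.total, task_result)]
    def finalize(self):
        return self.total

def integrate(results, integrator):
    """Master loop, written once against the abstract interface."""
    for r in results:
        integrator.add(r)
    return integrator.finalize()

print(integrate([[1, 2], [3, 4]], HistogramSumIntegrator(nbins=2)))  # [4, 6]
```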

DIANE Overview
- DIANE R&D project started in 2001 in CERN/IT with very limited resources (~1 FTE)
- collaboration with Geant4 groups at CERN, INFN, ESA
- successful prototypes running on LSF and EDG

Applications of DIANE
Examples of interdisciplinary applications:
- Geant4 simulation and analysis: speed-up factor ~ 30 times
- LHC: ntuple analysis and simulation
- radiotherapy: brachytherapy, IMRT
- space missions: ESA Bepi Colombo, LISA
more at cern.ch/diane

DIANE for HEP workgroup clusters
features:
- many users, many jobs
- diverse applications: ntuple analysis, simulation, ...
- interactive ... semi-interactive ... batch
- ~ 100s of machines
- dynamic environment: users may submit their analysis code
- mixed CPU and I/O intensive
- some applications may be preconfigured: general analysis (e.g. ntuple projections) or experiment-specific apps
- load balancing important

DIANE for Simulation in Medical Apps
example: brachytherapy (optimization of the treatment planning by MC simulation)
features:
- CPU intensive
- few users, few jobs
- one preconfigured application
- interactive: seconds .. minutes
- ~ 10s of machines
ongoing joint collaboration with Geant4 and hospital units in Torino, Italy

DIANE for Simulation in Space Science
- LISA: MC simulation for the gravitational waves experiment
- Bepi Colombo mission: HERMES experiment
features:
- CPU intensive
- big jobs (10 processor-years)
- preconfigured applications
- batch: days
- machines
requirements:
- error recovery important
- monitoring and diagnostics

DIANE Prototype and Testing
scalability tests:
- 70 worker nodes
- 140 million Geant4 events

DIANE Screenshot

```
Sun Mar 16 14:58: : DIANE.JobMaster.workerReady : worker 5 now ready
Sun Mar 16 14:58: : DIANE.JobMaster.run : number of tasks to finish: 1
len(self.master.job_progress) : 5
len(self.master.ready_workers) : 9
len(self.master.busy_workers) : 1
len(self.master.registered_workers) : 10
Sun Mar 16 14:58: : DIANE.JobMaster.receiveTaskResult : recieved result, taskid = 3 status: ok
Processing file task-output2.hbk
Adding histogram 10
Adding histogram 20
Scanned all IDs from 0 to 100, other HBOOK ids (if any) were ignored
Sun Mar 16 14:58: : DIANE.JobMaster.run : job completed ok, quitting control loop
DIANE.JobMaster.notifyJobFinished : starting notification
DIANE.JobMaster.notifyJobFinished : deactivating master
DIANE.JobMaster.workerReady : master not activated
DIANE.JobMaster.sendResultToClient : terminated...
terminating JobMaster server process
u s 15: % 0+0k 0+0io 5835pf+0w
[1] Done start_master
```

DIANE Web Interface
(screenshot of the web monitoring page; image not reproduced in the transcript)

References
more information: cern.ch/diane, aida.freehep.org

The end