Grid Tools: Working Prototype of a Distributed Computing Infrastructure for Physics Analysis
Andrey Shevel, SUNY, 10 March 2003


Overview
- Conditions
- Aims
- Required features
- Running projects
- Working prototype
- Conclusion

Our Conditions
- Relatively small physics team: about 10 people in chemistry plus some 20 in the physics department.
- The most active part of the team is involved in physics analysis.
- Needs:
  - Replica/file catalog
  - Resource status (lookup)
  - Job submission
  - Data moving
  - Interfaces (Web, CLI)

Main Aims
- Install and tune existing Grid tools into a robust and flexible distributed computing platform for physics analysis by remote physics teams (such as SUNY). We need a distributed infrastructure because we want access to as much computing power as possible, and our goal is to run it with close to zero maintenance effort.
- We consider SUNY a more or less typical example, so our experience should be applicable to other small remote teams.

General scheme: jobs go where the data are, and to less loaded clusters
[Diagram: the SUNY cluster at Stony Brook (RAM) holds a partial data replica; the main data repository is at RCF (BNL); jobs are routed between the two sites.]
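
The routing rule can be sketched roughly as below. This is an illustration only, not the actual gsub logic: the helper catalog_has, its cache file, and the use of remote qstat output as a load measure are all assumptions.

```bash
#!/bin/sh
# Illustrative sketch of the routing rule: prefer the cluster that already
# holds the input file, otherwise the less loaded one. Not the real gsub code.

# catalog_has SITE LFN -- hypothetical helper: checks a locally cached dump
# of the replica catalog (one "SITE LFN" pair per line).
catalog_has() {
    grep -q "^$1 $2$" "$HOME/.gsuny/catalog_cache" 2>/dev/null
}

pick_cluster() {
    FILE="$1"
    if   catalog_has SUNY   "$FILE"; then echo rserver1.i2net.sunysb.edu
    elif catalog_has PHENIX "$FILE"; then echo stargrid01.rcf.bnl.gov
    else
        # Rough load measure: number of lines qstat prints at each site.
        SUNY_LOAD=$(globus-job-run rserver1.i2net.sunysb.edu /usr/bin/qstat | wc -l)
        RCF_LOAD=$(globus-job-run stargrid01.rcf.bnl.gov /usr/bin/qstat | wc -l)
        if [ "$SUNY_LOAD" -le "$RCF_LOAD" ]; then
            echo rserver1.i2net.sunysb.edu
        else
            echo stargrid01.rcf.bnl.gov
        fi
    fi
}
```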

Replica/File Catalog
- Needs to maintain information about our files in their different locations (on our own computers, at BNL, etc.). The expected total number of files is about 10**5 to 10**7 (currently about 2*10**4).
- Needs to keep the catalog more or less up to date.
- We use an adapted version of MAGDA (our catalog is available at ...) and are also trying to adopt ARGO (PHENIX).
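
A minimal sketch of a catalog lookup, assuming a MAGDA-like MySQL backend. The host, database, table, and column names (magda_files: lfn, site, path) are hypothetical placeholders, not the real MAGDA schema.

```bash
#!/bin/sh
# Hypothetical lookup against a MAGDA-like MySQL replica catalog.
# All names below (host, database, table, columns) are assumptions.
LFN="$1"                       # logical file name to look up
mysql -h catalog.example.edu -u reader -p"$CATALOG_PW" magda <<EOF
SELECT site, path
FROM   magda_files
WHERE  lfn = '$LFN';
EOF
```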

Computing Resource Status and Job Submission
- We need a simple and reliable tool to see the current status of the available computing resources (graphical and CLI).
- After testing several Globus versions, I have prepared a set of simple scripts that use the Globus Toolkit in our concrete environment.
- We are still looking for a reliable and flexible graphical interface.
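
A minimal sketch of how such a status script might wrap the Globus Toolkit: run qstat on the remote gateway through GRAM with the GT2 command globus-job-run. The qstat path on the gateway is an assumption.

```bash
#!/bin/sh
# Sketch of a gjobs-like status check (assumed behaviour, not the real
# gjobs-s/gjobs-p scripts): run qstat on the remote gateway via Globus GRAM.
GATEWAY=rserver1.i2net.sunysb.edu      # stargrid01.rcf.bnl.gov for PHENIX
globus-job-run "$GATEWAY" /usr/bin/qstat "$@"
```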

Known Systems Under Development
- Grid Access Portal for Physics Applications (GRAPPA): "a method (portal) for physicists to easily submit requests to run high throughput computing jobs on remote machines."
- Clarens: "The Clarens Remote Dataserver is a wide-area network system for remote analysis of data generated by the Compact Muon Solenoid (CMS) detector at the European Organization for Nuclear Research, CERN."

Known Systems (cont.)
- AliEn: a Grid prototype created by the ALICE Offline Group (Alice Environment).
- AliEn consists of: a distributed catalogue, an authentication server, a queue server, computing elements, storage elements, and an information server.
- None of these systems is trivial; they all include many components.
- It therefore seems wise to be sure of the base structure first.

Initial Configuration
- In our case we used the two computing clusters available to us:
  - At SUNY (ram); the Globus gateway is rserver1.i2net.sunysb.edu
  - At BNL PHENIX (RCF); the Globus gateway is stargrid01.rcf.bnl.gov (thanks to Jerome and Dantong)
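
A sketch of the shared settings such wrapper scripts might source; the file name gsuny.conf and everything other than the two gateway host names are assumptions.

```bash
# gsuny.conf -- hypothetical shared settings sourced by the wrapper scripts.
# Only the gateway host names come from the slide; the rest is illustrative.
SUNY_GW=rserver1.i2net.sunysb.edu
PHENIX_GW=stargrid01.rcf.bnl.gov

# A valid Grid proxy is required before any Globus command will work.
grid-proxy-info -exists 2>/dev/null || grid-proxy-init
```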

Submission Commands
- gsub-s job-script: submit the job to SUNY.
- gsub-p job-script: submit the job to PHENIX.
- gsub job-script: submit the job to the less loaded cluster.
- gsub job-script filename: submit the job to the cluster where the file named filename is located.
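
A sketch of what a gsub-s-like wrapper might look like, assuming it stages the job script to the gateway and submits it via Globus GRAM. The jobmanager name (jobmanager-pbs) is an assumption; depending on the site's batch system it could be jobmanager-lsf or similar.

```bash
#!/bin/sh
# Sketch of a gsub-s-like wrapper (assumed behaviour, not the real script).
GATEWAY=rserver1.i2net.sunysb.edu          # stargrid01.rcf.bnl.gov for gsub-p
JOBSCRIPT="$1"
[ -r "$JOBSCRIPT" ] || { echo "usage: gsub-s job-script" >&2; exit 1; }

# -s stages the local script to the remote host; globus-job-submit returns
# a job contact string that the retrieval commands can use afterwards.
globus-job-submit "$GATEWAY/jobmanager-pbs" -s "$JOBSCRIPT"
```

The printed job contact is what the retrieval commands on the next slide would take as jobID.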

Job Retrieval
- gstat [jobID]: show the status of job jobID.
- gjobs-s [qstat parameters]: show the job queue status at SUNY.
- gjobs-p [qstat parameters]: show the job queue status at PHENIX.
- gget [jobID]: retrieve the output of the job.
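
A minimal sketch of what gstat and gget might do under the hood, assuming jobID is the GRAM job contact printed at submission time; the real scripts may store and resolve contacts differently.

```bash
#!/bin/sh
# Sketch of gstat/gget-like wrappers (assumed behaviour, not the real scripts).
# "$1" is the GRAM job contact printed by globus-job-submit.
case "$(basename "$0")" in
    gstat) globus-job-status "$1" ;;        # PENDING / ACTIVE / DONE / FAILED
    gget)  globus-job-get-output "$1" ;;    # fetch stdout of a finished job
esac
```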

Data Moving
- Our conditions: from time to time we need to transfer a group of files (from about 10**2 to 10**4 files) between different locations (between SUNY and BNL). Newly copied files need to be registered in the replica/file catalog, and some trace of all our data transfers is required as well.
- This is currently realized in two ways: a home-made set of scripts using 'bbftp', and our MAGDA instance.
- The SUNY data catalog based on our MAGDA distribution is available at ...
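
A sketch of a bulk transfer from SUNY to BNL with bbftp, one 'put' per file, followed by catalog registration. The remote user, destination directory, and the catalog schema (the same hypothetical magda_files table as above) are assumptions, and the bbftp option usage should be checked against the installed version.

```bash
#!/bin/sh
# Sketch of a bulk transfer with bbftp plus catalog registration.
# Remote user, directory, and catalog details are illustrative assumptions.
REMOTE_HOST=stargrid01.rcf.bnl.gov
REMOTE_DIR=/phenix/data/replica        # hypothetical destination directory

for f in "$@"; do
    bbftp -u shevel -e "put $f $REMOTE_DIR/$(basename "$f")" "$REMOTE_HOST" &&
    # Record the new replica in the (hypothetical) catalog table used earlier.
    mysql -h catalog.example.edu -u writer -p"$CATALOG_PW" magda \
      -e "INSERT INTO magda_files (lfn, site, path)
          VALUES ('$(basename "$f")', 'BNL', '$REMOTE_DIR/$(basename "$f")');"
done
```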

Minimum Requirements to Deploy the Prototype
To deploy this prototype of a computing infrastructure for physics analysis, you need:
- a PC running Linux 7.2/7.3 (the configuration that was tested);
- Globus Toolkit 2.2.3;
- the two tarballs with the scripts (including the SUNY distribution of MAGDA): magda-client.tar.gz and gsuny.tar.gz (see ...);
- MySQL (the server only if required) + MySQL++ (client) + the Perl interface;
- Globus certificates (obtained through ...).
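
A sketch of the deployment steps on a fresh node. The tarball names come from the slide; the unpack location is an assumption, and grid-cert-request is the standard Globus Toolkit command for requesting a user certificate from a CA.

```bash
#!/bin/sh
# Sketch of deploying the prototype scripts on a prepared Linux/Globus node.
cd "$HOME"

# Unpack the script distributions mentioned above (location is an assumption).
tar xzf magda-client.tar.gz
tar xzf gsuny.tar.gz

# Request a user certificate from the certificate authority; the CA itself
# is reached through the URL given on the slide.
grid-cert-request
```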

Conclusion
- The transition to a Grid architecture can only follow once all the people involved understand the Grid computing model.
- Special training sessions for end users are required.
- Of course, the Grid tools have to be publicly available and supported on centralized computing resources (they are now available at RCF).