PHENIX and the data grid
>400 collaborators on 3 continents, plus Israel and Brazil
Hundreds of TB of data per year
Complex data with multiple disparate physics goals
Barbara Jacak, Stony Brook

Grid use that could help in PHENIX
- Data management
  - Replica management to/from remote sites
  - Management of simulated data
  - Replica management within RCF
- Job management
  - Simulated event generation and analysis
  - Centralized analysis of summary data at remote sites

Replica management: export to remote sites
- Export of PHENIX data (file-based, file size < 2 GB)
  - Data are sent by network or "FedEx-net" to Japan, France (IN2P3), Israel, and US collaborator sites
  - Network to Japan via APAN using bbftp
  - Network to France and Israel using bbftp
  - Network within the US using bbftp and globus-url-copy
  - Transfers are currently initiated and logged by scripts; all transfers use an NFS-mounted disk buffer (not a problem)
- Goals
  - Automate data export and logging into the replica catalog; aim for "pull" mode (a sketch of such an automated transfer step follows this list)
  - Transfer data from the most convenient site, rather than only from the central repository at RCF; add Q/A checks (size, checksums)
  - Provide an inter-site staging utility to allow non-BNL copies
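A minimal sketch of what one automated export step could look like: checksum the file for the Q/A check, copy it with globus-url-copy, then record the replica. The destination URL layout and the catalog_log helper are assumptions for illustration, not the actual PHENIX tools.

    import hashlib
    import subprocess
    from pathlib import Path

    def md5sum(path: Path) -> str:
        """Compute an MD5 checksum for the Q/A check on the transferred file."""
        h = hashlib.md5()
        with path.open("rb") as f:
            for block in iter(lambda: f.read(1 << 20), b""):
                h.update(block)
        return h.hexdigest()

    def export_file(local: Path, remote_url: str, catalog_log) -> None:
        """Copy one file from the NFS-mounted buffer to a remote site and log it.

        remote_url is a GridFTP destination such as gsiftp://host/path/ (illustrative).
        catalog_log is a hypothetical callable standing in for the replica-catalog update.
        """
        checksum = md5sum(local)
        # Basic globus-url-copy invocation: source URL, then destination URL.
        subprocess.run(
            ["globus-url-copy", f"file://{local.resolve()}", remote_url + local.name],
            check=True,
        )
        catalog_log(logical_name=local.name, destination=remote_url,
                    size=local.stat().st_size, md5=checksum)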

Simulated data management
- Simulations are performed at CC-J (RIKEN/Wako), Vanderbilt, UNM, LLNL, USB, and WI
  - Other sites will be added, including IN2P3 for Run 3
- Simulated hits data were imported to RCF
  - Detector response, reconstruction, and analysis at RCF and CC-J
  - Simulation projects are managed by C. Maguire; the actual simulation jobs are run by an expert at each site
  - Data transfers to RCF are initiated by scripts
- Goals
  - Automate import/archive/cataloging of simulated data ("push" mode); a sketch follows this list
  - Merge data movement with a centralized job submission utility
  - Export the PHENIX software effectively, to allow detector response and reconstruction at remote sites
  - Collect usage statistics
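A minimal sketch of the "push"-side cataloging of simulated files arriving at RCF. The incoming directory, the JSON-lines manifest, and the record fields are placeholders; in production the record would go into the PHENIX replica catalog rather than a flat file.

    import hashlib
    import json
    import time
    from pathlib import Path

    INCOMING = Path("/data/buffer/sim_incoming")        # illustrative buffer directory
    MANIFEST = Path("/data/buffer/sim_manifest.jsonl")   # stand-in for the replica catalog

    def register_incoming(site: str, project: str) -> None:
        """Catalog every file pushed from a remote simulation site."""
        with MANIFEST.open("a") as out:
            for f in sorted(INCOMING.glob("*.root")):
                record = {
                    "logical_name": f.name,
                    "site": site,          # e.g. "CC-J" or "Vanderbilt"
                    "project": project,    # simulation project identifier
                    "size": f.stat().st_size,
                    # Whole-file read is fine for a sketch; stream for multi-GB files.
                    "md5": hashlib.md5(f.read_bytes()).hexdigest(),
                    "registered": time.strftime("%Y-%m-%d %H:%M:%S"),
                }
                out.write(json.dumps(record) + "\n")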

Replica management within RCF
- A VERY important short-term goal!
- Some important PHENIX tools exist
  - Replica catalog plus DAQ/production/QA conditions
  - A lightweight POSTGRES version as well as Objectivity (Objy)
  - Logical/physical filename translator, with integration into the PHENIX framework (a sketch of the translation follows this list)
- Goals
  - Use and optimize the existing tools at RCF
  - Investigate implementing Globus middleware
    - Support use of files and conditions from the catalog
    - Relation to GDMP, Magda?
    - Database user authentication, firewall issues?
  - Collect statistics for optimization
  - Integrate into job management/submission
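A minimal sketch of the logical-to-physical filename translation. sqlite3 is used here only to keep the example self-contained; the real catalog is the lightweight POSTGRES (or Objectivity) catalog mentioned above, and the table layout is purely illustrative.

    import sqlite3

    def physical_paths(catalog: sqlite3.Connection, logical_name: str) -> list[str]:
        """Translate a logical filename into all known physical locations."""
        rows = catalog.execute(
            "SELECT site || ':' || path FROM replicas WHERE logical_name = ?",
            (logical_name,),
        )
        return [r[0] for r in rows]

    # Example usage with an in-memory stand-in catalog:
    cat = sqlite3.connect(":memory:")
    cat.execute("CREATE TABLE replicas (logical_name TEXT, site TEXT, path TEXT)")
    cat.execute("INSERT INTO replicas VALUES "
                "('dst_0001.root', 'RCF', '/phenix/dst/dst_0001.root')")
    print(physical_paths(cat, "dst_0001.root"))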

Job management
- Currently we use scripts and batch queues at each site (a sketch of the kind of thin per-site submission wrapper involved follows this list)
- We have two kinds of jobs we should manage better:
  - Simulations
  - User analysis jobs
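A minimal sketch of a per-site submission wrapper that hides each site's local batch command. The site-to-command mapping is an assumption for illustration (an LSF-style bsub at one site, PBS-style qsub at others), not a record of what each site actually runs.

    import subprocess

    # Assumed mapping from site name to local batch submit command; illustrative only.
    BATCH_SUBMIT = {
        "RCF": ["bsub"],   # LSF-style submission
        "USB": ["qsub"],   # PBS-style submission
        "UNM": ["qsub"],
    }

    def submit(site: str, job_script: str) -> None:
        """Submit one job script to the batch system of the chosen site."""
        cmd = BATCH_SUBMIT[site] + [job_script]
        subprocess.run(cmd, check=True)

    # submit("RCF", "run_simulation.sh")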

Requirements for simulation jobs
- Job specifications (a sketch of such a specification follows this list)
  - Beam (ion, impact parameter) and particle types to simulate
  - Number of events
  - Singles vs. embedding into real events (multiplicity effects)
- I/O requirements
  - Input: database access for run-number ranges and detector geometry
  - Output is the big requirement: send files to RCF for further processing; import hits + DST results to RCF
- Job sequence requirements
  - Initially rather small; the only inter-job interaction is the random-number seed
  - Eventually: hits generation -> response -> reconstruction
- Site selection criteria
  - CPU cycles! Also buffer disk space and access by experts
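A minimal sketch of a simulation job specification capturing the fields listed above. The field names, types, and example values are illustrative assumptions; only the items themselves (beam, particle types, event count, embedding flag, run range, seed, and the three-stage sequence) come from the slide.

    from dataclasses import dataclass

    @dataclass
    class SimulationJobSpec:
        beam_ion: str                  # e.g. "Au+Au"
        impact_parameter: str          # impact-parameter / centrality selection
        particle_types: list[str]      # particles to simulate
        n_events: int
        embed_in_real_events: bool     # singles vs. embedding (multiplicity effects)
        run_range: tuple[int, int]     # database lookup of run numbers / geometry
        random_seed: int               # initially the only inter-job interaction
        sequence: tuple[str, ...] = ("hits", "response", "reconstruction")

    spec = SimulationJobSpec(
        beam_ion="Au+Au",
        impact_parameter="central",
        particle_types=["pi0"],
        n_events=10000,
        embed_in_real_events=False,
        run_range=(80000, 81000),
        random_seed=12345,
    )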

Requirements for analysis jobs
- Job specifications
  - Run list (Q/A decisions already included)
  - ROOT steering macro and analysis module/macro
- I/O requirements
  - Input: nDST files, possibly several types together
  - Output: ntuples, histograms, PHENIX data nodes, ROOT trees
- Job sequence requirements
  - Can require multiple passes on the same file or files
- Site selection criteria (a toy sketch follows this list)
  - Data residence (bandwidth limitations!)
  - Batch queue length / CPU cycle availability
- Analysis is relatively lightweight; information management and getting jobs through the system is the challenge
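A toy illustration of the site-selection criteria just listed: prefer sites that already hold the needed nDST files, then break ties by the shorter batch queue. The site records, file names, and queue numbers are made up for the example.

    # Illustrative site records: which nDST files each site holds and its queue depth.
    SITES = {
        "RCF": {"files": {"ndst_0001.root", "ndst_0002.root"}, "queued_jobs": 400},
        "USB": {"files": {"ndst_0001.root"}, "queued_jobs": 20},
        "UNM": {"files": set(), "queued_jobs": 5},
    }

    def choose_site(needed_files: set[str]) -> str:
        """Prefer sites holding all needed files; break ties by the shorter queue."""
        local = [s for s, info in SITES.items() if needed_files <= info["files"]]
        candidates = local or list(SITES)   # fall back to any site if none has the data
        return min(candidates, key=lambda s: SITES[s]["queued_jobs"])

    print(choose_site({"ndst_0001.root"}))   # -> "USB": has the file, shorter queue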

Summary of job management goals
- Create a software validation suite for remote sites (a sketch follows this list)
- Design and implement a web-based user interface
  - Authenticate to (multiple) sites
  - Display the file/conditions catalog, data residence, and Q/A and other conditions (for user run-list selection)
  - Automate job submission
- Exercise GRID middleware (3 target sites: BNL, USB, UNM)
- Chain-test the web portal + GRID middleware
- Define desired usage statistics; implement them in the web portal
- Exercise by a group of "beta testers"; then extend to more collaborators and sites
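A minimal sketch of what a software-validation check for a remote site could look like. The required commands and the environment variable are assumptions for illustration; the real suite would also verify the PHENIX software installation, libraries, and database connectivity.

    import os
    import shutil

    REQUIRED_COMMANDS = ["root", "bbftp", "globus-url-copy"]   # illustrative list
    REQUIRED_ENV = ["PHENIX_SOFT"]                             # hypothetical variable

    def validate_site() -> bool:
        """Return True if this site passes the basic software checks."""
        ok = True
        for cmd in REQUIRED_COMMANDS:
            if shutil.which(cmd) is None:
                print(f"MISSING command: {cmd}")
                ok = False
        for var in REQUIRED_ENV:
            if var not in os.environ:
                print(f"MISSING environment variable: {var}")
                ok = False
        return ok

    if __name__ == "__main__":
        raise SystemExit(0 if validate_site() else 1)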

So, what's first?
- Data management
  - Use and optimize the existing tools at RCF
  - Integrate ROOT TChains with the replica catalog (a sketch follows this list)
  - Statistics collection
  - Investigate coupling the file catalog to Globus middleware
  - Develop an inter-site staging utility with Q/A checks
- Job management
  - Create a software validation suite for remote sites
  - Define the user web portal
  - Exercise GRID middleware (3 target sites: BNL, USB, UNM); an important first step for PHENIX
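A minimal sketch of the TChain/catalog integration named above, assuming PyROOT is available. The resolve_physical function is a placeholder for the replica-catalog lookup, and the tree name and file names are illustrative.

    import ROOT  # PyROOT; assumes a ROOT installation

    def resolve_physical(logical_name: str) -> str:
        """Stand-in for the catalog lookup (logical -> physical filename)."""
        # In production this would query the PHENIX replica catalog; here it is a
        # trivial placeholder mapping into a local directory.
        return f"/phenix/data/{logical_name}"

    def build_chain(run_list: list[str], tree_name: str = "T") -> ROOT.TChain:
        """Build a TChain over the physical files behind a user's run list."""
        chain = ROOT.TChain(tree_name)   # tree name is illustrative
        for logical in run_list:
            chain.Add(resolve_physical(logical))
        return chain

    # chain = build_chain(["ndst_0001.root", "ndst_0002.root"])
    # print(chain.GetEntries())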