1 PHENIX and the data grid
Barbara Jacak, Stony Brook
- >400 collaborators
- 3 continents, plus Israel and Brazil
- Hundreds of TB of data per year
- Complex data with multiple disparate physics goals

2 Grid use that could help in PHENIX
- Data management
  - Replica management to/from remote sites
  - Management of simulated data
  - Replica management within RCF
- Job management
  - Simulated event generation and analysis
  - Centralized analysis of summary data at remote sites

3 Replica management: export to remote sites
- Export of PHENIX data (file-based, file size < 2 GB)
  - Send data by network or FedEx-net to Japan, France (IN2P3), Israel, and US collaborator sites
  - Network to Japan via APAN using bbftp
  - Network to France and Israel using bbftp
  - Network within the US using bbftp and globus-url-copy
  - Currently transfers are initiated and logged by scripts; all transfers use an NFS-mounted disk buffer (not a problem)
- Goals (a transfer sketch follows this slide)
  - Automate data export and logging into the replica catalog; aim for "pull" mode
  - Transfer data from a convenient site, rather than only the central repository at RCF; Q/A checks (size, checksums)
  - Inter-site staging utility to allow non-BNL copies
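
The goals above amount to a small transfer-and-catalog loop. A minimal sketch, assuming globus-url-copy is on the PATH (bbftp could be substituted at sites that use it); the hostnames, paths, and flat-file "catalog" format are illustrative stand-ins, and the checksum is recorded to support the size/checksum Q/A check named above.

```python
import hashlib
import os
import subprocess

def md5sum(path, blocksize=1 << 20):
    """Checksum recorded alongside the file size for the Q/A check."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(blocksize), b""):
            h.update(block)
    return h.hexdigest()

def export_file(local_path, remote_url, catalog_log="replica_exports.log"):
    """Copy one file to a remote site and record the new replica."""
    size = os.path.getsize(local_path)
    checksum = md5sum(local_path)
    # Plain two-argument source/destination usage of globus-url-copy.
    subprocess.run(["globus-url-copy",
                    "file://" + os.path.abspath(local_path), remote_url],
                   check=True)
    # Append to the transfer log so the replica catalog can be updated.
    with open(catalog_log, "a") as log:
        log.write(f"{local_path} {remote_url} {size} {checksum}\n")

# Hypothetical destination; a real export would loop over a file list.
export_file("/buffer/phenix/dst_run1234.root",
            "gsiftp://ccj.riken.go.jp/phenix/dst_run1234.root")
```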

4 Simulated data management
- Simulations are performed at CC-J (RIKEN/Wako), Vanderbilt, UNM, LLNL, USB, and WI
  - Will add other sites, including IN2P3, for Run 3
- Simulated hits data were imported to RCF
  - Detector response, reconstruction, and analysis at RCF and CC-J
  - Simulation projects managed by C. Maguire; actual simulation jobs run by an expert at each site
  - Data transfers to RCF initiated by scripts
- Goals
  - Automate import/archive/cataloging of simulated data ("push"; see the sketch after this slide)
  - Merge data movement with a centralized job submission utility
  - Export PHENIX software effectively to allow remote-site detector response and reconstruction
  - Collect usage statistics
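
The "push" goal is essentially an unattended import step at RCF. A minimal sketch assuming a drop-box layout; the directory names and the register() stand-in (in practice a replica-catalog insert) are invented for illustration.

```python
import shutil
from pathlib import Path

BUFFER = Path("/buffer/sim_import")    # where remote sites push files
ARCHIVE = Path("/archive/phenix/sim")  # archive area (illustrative)

def register(path, origin_site):
    """Stand-in for the real catalog insert (e.g. a POSTGRES row)."""
    print(f"CATALOG: {path} from {origin_site}")

def import_pending(origin_site="CC-J"):
    """Archive and catalog every simulated-data file waiting in the buffer."""
    for f in sorted(BUFFER.glob("*.root")):
        dest = ARCHIVE / f.name
        shutil.move(str(f), str(dest))  # archive first...
        register(dest, origin_site)     # ...then catalog, so no orphan entries

if __name__ == "__main__":
    import_pending()
```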

5 Replica management within RCF
- A VERY important short-term goal!
- Some important PHENIX tools exist
  - Replica catalog plus DAQ/production/QA conditions; lightweight POSTGRES version as well as Objectivity
  - Logical/physical filename translator, integrated into the PHENIX framework (sketched after this slide)
- Goals
  - Use and optimize existing tools at RCF
  - Investigate implementing Globus middleware
    - Support use of files and conditions from the catalog; relation to GDMP, Magda?
    - Database user authentication, firewall issues?
  - Collect statistics for optimization
  - Integrate into job management/submission
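
To make the translator concrete, here is a minimal sketch of the logical-to-physical lookup it performs; the in-memory table and the disk-before-tape preference are invented for illustration (the real catalog lives in POSTGRES/Objectivity).

```python
# Logical filename -> physical replicas, in order of preference (invented).
REPLICAS = {
    "dst_run1234.root": [
        "/phenix/data01/dst_run1234.root",        # RCF disk copy
        "hpss:/archive/phenix/dst_run1234.root",  # tape copy
    ],
}

def logical_to_physical(lfn, prefer_disk=True):
    """Return the best physical location registered for a logical filename."""
    copies = REPLICAS.get(lfn)
    if not copies:
        raise KeyError(f"no replica registered for {lfn}")
    if prefer_disk:
        for pfn in copies:
            if not pfn.startswith("hpss:"):
                return pfn  # avoid a tape stage when a disk copy exists
    return copies[0]

print(logical_to_physical("dst_run1234.root"))
```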

6 Job management
- Currently use scripts and batch queues at each site
- Two kinds of jobs we should manage better:
  - Simulations
  - User analysis jobs

7 Requirements for simulation jobs
- Job specifications (see the sketch after this slide)
  - Beam (ion, impact parameter) and particle types to simulate
  - Number of events
  - Singles vs. embedding into real events (multiplicity effects)
- I/O requirements
  - Input: database access for run-number ranges and detector geometry
  - Output is the big requirement: send files to RCF for further processing; import hits + DST results to RCF
- Job sequence requirements
  - Initially rather small; the only inter-job interaction is the random-number seed
  - Eventually: hits generation -> response -> reconstruction
- Site selection criteria
  - CPU cycles! Also buffer disk space and access by experts
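
Taken together, these requirements suggest a compact job-specification record that a centralized submission utility could carry. A sketch under that assumption; every field name here is illustrative.

```python
from dataclasses import dataclass

@dataclass
class SimulationJob:
    beam: str                 # ion species, e.g. "Au+Au"
    impact_parameter: str     # e.g. "minimum bias" or "0-5 fm"
    particle_type: str        # particle type(s) to simulate
    n_events: int
    embed_into_real: bool = False  # singles vs. embedding into real events
    random_seed: int = 0           # initially the only inter-job interaction
    run_range: tuple = (0, 0)      # for geometry/conditions database access

job = SimulationJob(beam="Au+Au", impact_parameter="minimum bias",
                    particle_type="pi0", n_events=100_000, random_seed=42)
print(job)
```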

8 Requirements for analysis jobs
- Job specifications
  - Run list (includes Q/A decisions already)
  - ROOT steering macro and analysis module/macro
- I/O requirements
  - Input: nDST files, possibly several types together
  - Output: ntuples, histograms, PHENIX data nodes, ROOT trees
- Job sequence requirements
  - Can require multiple passes over the same file or files
- Site selection criteria (see the sketch after this slide)
  - Data residence (bandwidth limitations!)
  - Batch queue length / CPU cycle availability
- Analysis is relatively lightweight; information management and getting jobs through the system is the challenge
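
The two site-selection criteria compose naturally: prefer a site that already holds the inputs (avoiding the bandwidth-limited transfer), then break ties on queue length. A minimal sketch; the site table and residence map are invented for illustration.

```python
# Queue lengths and file residence are invented placeholder data.
SITES = {"BNL": {"queue": 250}, "USB": {"queue": 40}, "UNM": {"queue": 10}}
RESIDENCE = {"ndst_run1234.root": {"BNL", "USB"}}

def pick_site(input_files):
    """Choose a site by data residence first, then batch queue length."""
    holding = [s for s in SITES
               if all(s in RESIDENCE.get(f, set()) for f in input_files)]
    candidates = holding or list(SITES)  # fall back if no site holds them all
    return min(candidates, key=lambda s: SITES[s]["queue"])

print(pick_site(["ndst_run1234.root"]))  # USB: holds the data, shorter queue
```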

9 Summary of job management goals
- Create a software validation suite for remote sites
- Design and implement a web-based user interface
  - Authenticate to (multiple) sites
  - Display file/conditions catalog, data residence, and Q/A and other conditions (for user run-list selection)
  - Automate job submission
- Exercise Grid middleware at 3 target sites (BNL, USB, UNM)
- Chain-test the web portal plus Grid middleware
- Define desired usage statistics; implement them in the web portal
- Exercise with a group of "beta testers"; extend to more collaborators and sites

10 So, what's first?
- Data management
  - Use and optimize existing tools at RCF
  - Integrate ROOT TChains with the replica catalog (sketched after this slide)
  - Statistics collection
  - Investigate coupling the file catalog to Globus middleware
  - Develop an inter-site staging utility with Q/A checks
- Job management
  - Create a software validation suite for remote sites
  - Define the user web portal
  - Exercise Grid middleware at 3 target sites (BNL, USB, UNM): an important first step for PHENIX
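
As a picture of the TChain integration: resolve each logical filename through the replica catalog, then add the physical paths to the chain. A sketch assuming PyROOT is available; the tree name and catalog contents are invented, and the lookup would really go through the filename translator shown earlier.

```python
import ROOT  # PyROOT bindings

# Invented logical -> physical mapping standing in for the replica catalog.
CATALOG = {"dst_run1234.root": "/phenix/data01/dst_run1234.root",
           "dst_run1235.root": "/phenix/data02/dst_run1235.root"}

def chain_from_catalog(tree_name, logical_files):
    """Build a TChain from logical filenames resolved via the catalog."""
    chain = ROOT.TChain(tree_name)
    for lfn in logical_files:
        chain.Add(CATALOG[lfn])  # physical location chosen by the catalog
    return chain

chain = chain_from_catalog("T", ["dst_run1234.root", "dst_run1235.root"])
```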

