Alexandre A. P. Suaide VI DOSAR workshop, São Paulo, 2005 STAR grid activities and São Paulo experience.


1 Alexandre A. P. Suaide VI DOSAR workshop, São Paulo, 2005 STAR grid activities and São Paulo experience

2
BNL (2 sites)
  – ~1100 CPU
  – ~400 TB
  – LSF batch
NERSC-PDSF
  – ~500 CPU
  – ~150 TB
  – SGE batch
São Paulo
  – Test cluster: 10 CPU, 3 TB
  – SGE batch
  – Upgrade project: ~50 CPU and ~40 TB

3 The size of the raw data
STAR Au+Au event statistics (raw)
  – ~2-3 MB/event
  – ~20-40 events/s
  – Total 2004 Au+Au: 20-30 M events, ~65 TB
Cu+Cu run
  – ~70 M events @ 200 GeV
  – ~40 M events @ 62 GeV
  – ~4 M events @ 22 GeV
Plus all the p+p, d+Au and previous runs
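The Au+Au numbers on this slide can be cross-checked with simple arithmetic. The sketch below is only an illustration: the function name `raw_volume_tb` is hypothetical, and it assumes decimal units (1 TB = 10^6 MB). The quoted ~65 TB falls inside the range obtained from the per-event size and the event count.

```python
def raw_volume_tb(n_events, mb_per_event):
    """Raw data volume in TB, assuming decimal units (1 TB = 1e6 MB)."""
    return n_events * mb_per_event / 1e6


# Bounds from the slide: 20-30 M events at 2-3 MB/event.
low_tb = raw_volume_tb(20e6, 2.0)    # 40.0 TB
high_tb = raw_volume_tb(30e6, 3.0)   # 90.0 TB
mid_tb = raw_volume_tb(25e6, 2.5)    # 62.5 TB, close to the quoted ~65 TB

# Implied raw data rate while taking data: 20-40 events/s at 2-3 MB/event.
low_rate_mb_s = 20 * 2.0             # 40.0 MB/s
high_rate_mb_s = 40 * 3.0            # 120.0 MB/s

print(f"Au+Au 2004 raw volume: {low_tb:.0f}-{high_tb:.0f} TB (midpoint {mid_tb:.1f} TB)")
print(f"Raw data rate: {low_rate_mb_s:.0f}-{high_rate_mb_s:.0f} MB/s")
```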

4 The reconstruction, simulation, etc.
Reconstruction
  – Basically done at BNL
  – Au+Au is estimated to take 18 months (only 60% is complete)
      Compare with 1 new run every year
  – A physics-ready production needs ~2 production rounds (calibrations, improvements, etc.)
Simulation and embedding
  – Done at PDSF
  – Simulation is transferred to BNL
STAR takes more data than it can currently make available for analysis

5 Analysis
Real data analysis is done in RCF
Simulation and embedding analysis is done in PDSF
Small fractions of datasets are scattered over many institutions mainly for analysis development
@ PDSF

6 Why do we need grid?
If STAR wants to keep production and analysis running at a speed compatible with data taking, other institutions need to share computing power
  – Next run STAR will take at least one order of magnitude more events than last year
  – The RCF/PDSF farm does not grow at the same rate
The user point of view
  – More time available for physics
      Data will be available earlier
  – More computing power for analysis
      Analysis will run faster
  – Submit the jobs from your home institution and get the output there
      No need to know where the data is
      No need to log on to RCF or PDSF
      You manage your disk space

7 STAR grid
Three level structure
Tier0 sites (BNL)
  – Dedicated to reconstruction, simulation and analysis
Tier1 sites (PDSF)
  – Runs reconstruction on demand
  – Receives all the reconstructed files for analysis
  – Simulations and embedding
Tier2 sites (all other facilities, including São Paulo)
  – Receives a fraction of files for analysis
  – Eventually runs reconstruction depending on demand

8 Needs
Reconstruction and file distribution
  – Tier0 production
      ALL EVENT files get copied on HPSS at the end of a job
      Strategy implies dataset IMMEDIATE replication
        – As soon as a file is registered, it becomes available for “distribution”
        – 2 levels of data distribution: local and global
  – Local
      All analysis files are on disks
        – Notions of distributed disk: cost-effective solution
  – Global
      Tier1 (all) and tier2 (partial) sites
Cataloging is fundamental
  – Must know where the files are
  – The only central connection between users and files
  – Central and local catalogs
      Database should be updated right after file transfer
Customized scheduler
  – Find out where data is upon user request
  – Redirect jobs to cluster where data is saved
  – Job submission should not be random but highly coordinated with other users' requests
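The catalog and scheduler requirements above can be sketched in a few lines. This is not the STAR file catalog or scheduler: the class `FileCatalog`, the function `plan_jobs`, the file names and the least-loaded tie-breaking rule are all assumptions made for illustration. The sketch shows a catalog that records a replica right after a transfer, and a planner that sends each file's job to a site that already holds the file.

```python
from collections import defaultdict


class FileCatalog:
    """Hypothetical catalog mapping each file name to the sites holding a replica."""

    def __init__(self):
        self._replicas = defaultdict(set)

    def register(self, filename, site):
        # Called right after a transfer completes, so the catalog stays current.
        self._replicas[filename].add(site)

    def sites_for(self, filename):
        # Return a copy so callers cannot modify the catalog by accident.
        return set(self._replicas.get(filename, set()))


def plan_jobs(catalog, filenames):
    """Assign each file to a site that holds it, preferring the least-loaded site.

    Returns (plan, missing): plan maps site -> list of files to process there,
    missing lists files with no known replica.
    """
    plan = defaultdict(list)
    missing = []
    for name in filenames:
        sites = catalog.sites_for(name)
        if not sites:
            missing.append(name)
            continue
        # Fewest files assigned so far wins; ties are broken alphabetically.
        site = min(sites, key=lambda s: (len(plan[s]), s))
        plan[site].append(name)
    # Drop sites that were only looked at during load comparison.
    return {s: files for s, files in plan.items() if files}, missing


catalog = FileCatalog()
catalog.register("a.root", "BNL")
catalog.register("a.root", "PDSF")
catalog.register("b.root", "BNL")
catalog.register("c.root", "SP")

plan, missing = plan_jobs(catalog, ["b.root", "a.root", "c.root", "d.root"])
print(plan)     # {'BNL': ['b.root'], 'PDSF': ['a.root'], 'SP': ['c.root']}
print(missing)  # ['d.root']
```

A real scheduler would also weigh queue occupancy and other users' requests, which is the coordination the slide asks for; the load counter here only stands in for that.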

9 What is STAR doing on grid?
For STAR, grid computing is in production use EVERY DAY
  – Data transfer using SRM, RRS, ..
  – We run simulation production on the grid (easy)
  – Resources reserved for DATA production (still done traditionally)
      No real technical difficulties
      Mostly fears related to uncoordinated access and massive transfers
  – User analysis
      Chaotic in nature; requires accounting, quotas, privileges, etc.
      Increased interest from some institutions
        – Already successful under controlled conditions

10 STAR jobs in the grid

11 Accomplishments in the last few months
Full database mirrors over many institutions
  – Hold detector conditions, calibrations, status, etc.
  – Heavily used during user analysis
File catalog and scheduler available outside BNL
  – Users can query files and submit jobs using the grid
      Still some pitfalls for general user analysis
Integration between sites
  – Tools to keep grid certificates, batch systems and local catalogs updated
  – Library distribution done automatically using AFS or a local copy (updated on a daily basis)
Full integration of the 3 sites (BNL, PDSF and SP) with OSG

12 User analysis in the grid
STAR analysis schema
  – 99% based on ROOT applications
  – User develops personal analysis code that processes the data
Steps to properly submit analysis jobs in the grid
  – Select the proper cluster in the grid
  – Transfer the analysis code to that cluster and compile it there
  – Use the file catalog to select the files
  – Run the jobs (as many as necessary)
      The node where each job runs and the number of jobs are defined by the scheduler and depend on the cluster size, number of events and time to process each event. All this information is managed by the file catalog
  – Transfer the output to the local site
Many of these steps are not yet fully functional but are progressing fast
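The slide says the number of jobs follows from the number of events and the time to process each event. One way this could work is sketched below; it is not the actual STAR scheduler logic. The function `split_into_jobs`, the per-job time limit and the file list are hypothetical. Files are packed in order into jobs so that each job's estimated run time stays under the limit.

```python
def split_into_jobs(files, sec_per_event, max_job_seconds):
    """Pack files into jobs so each job's estimated run time fits the limit.

    files: list of (filename, n_events) pairs, as a catalog query might return.
    A single file that exceeds the limit on its own still gets its own job.
    """
    jobs = []
    current, current_time = [], 0.0
    for name, n_events in files:
        file_time = n_events * sec_per_event
        # Close the current job if adding this file would exceed the limit.
        if current and current_time + file_time > max_job_seconds:
            jobs.append(current)
            current, current_time = [], 0.0
        current.append(name)
        current_time += file_time
    if current:
        jobs.append(current)
    return jobs


# Hypothetical selection: four files, 2 s per event, at most 6000 s per job.
selected = [("f1.root", 1000), ("f2.root", 1500), ("f3.root", 2000), ("f4.root", 500)]
jobs = split_into_jobs(selected, sec_per_event=2.0, max_job_seconds=6000)
print(jobs)  # [['f1.root', 'f2.root'], ['f3.root', 'f4.root']]
```

The cluster size, which the slide also lists as an input, would then determine how many of these jobs can run at once.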

13 Current status and to do list
The grid between PDSF and RCF works quite well
  – Mainly used for simulation jobs
São Paulo, BNL and LBL are fully integrated
  – Libraries, file catalog, scheduler, OSG, etc.
  – Being used to test user analysis on the grid
Activities for the next few months
  – Integrate the SGE batch system in the grid framework
      Still some problems reporting the right numbers to gridCat
      Problems keeping jobs alive after a few hours
  – Development of authentication tools
      RCF (BNL) and PDSF (LBL) are part of DOE labs
  – User analysis

