Slide 1: RHIC, STAR computing towards distributed computing on the Open Science Grid. Jérôme LAURET, RHIC/STAR
Slide 2: Outline (Jérôme LAURET, IWLSC, Kolkata, India, 2006)
- The RHIC program, complex and experiments
- An overview of the RHIC Computing Facility
  - Expansion model
  - Local resources, remote usage
- Disk storage, a "distributed" paradigm
  - PHENIX and STAR
- STAR Grid program & tools
  - SRM / DataMover
  - SUMS
  - GridCollector
- Brief overview of the Open Science Grid
- STAR on OSG
Slide 3: Outline (section divider: The RHIC program, complex and experiments; full outline as on slide 2)
Slide 4: The Relativistic Heavy Ion Collider (RHIC) complex & experiments
- A world-leading scientific program in heavy-ion and spin physics; the largest running NP experiment
- Located on Long Island, New York, USA
- Flexibility is key to understanding complicated systems:
  - Polarized protons, sqrt(s) = 50-500 GeV
  - Nuclei from d to Au, sqrt(s_NN) = 20-200 GeV
- Physics runs to date:
  - Au+Au @ 20, 62, 130, 200 GeV
  - Polarized p+p @ 62, 200 GeV
  - d+Au @ 200 GeV
- RHIC is becoming the world leader in the scientific quest toward understanding how mass and spin combine into a coherent picture of the fundamental building blocks nature uses for atomic nuclei. It also provides unique insight into how quarks and gluons behaved collectively at the very first moments of our universe.
Slide 5: The experiments. [Diagram: the 1.2 km RHIC ring and its experiments: STAR, PHENIX, PHOBOS, BRAHMS & PP2PP]
Slide 6: Outline (section divider: An overview of the RHIC Computing Facility; full outline as on slide 2)
Slide 7: The RHIC Computing Facility (RCF)
- The RCF at BNL is the Tier0 for the RHIC program:
  - Online recording of raw data
  - Production reconstruction of all (most) raw data
  - Facility for data selection (mining) and analysis
  - Long-term archiving and serving of all data
  - ... but not sized for Monte Carlo generation
- Equipment refresh funding (~25% annual replacement) addresses obsolescence and results in important collateral capacity growth
Slide 8: Tier1, Tier2, ... remote facilities
- Remote facilities are the primary source of Monte Carlo data, with significant analysis activity (equal to BNL's in the case of STAR)
- Top operational sites:
  - STAR: NERSC/PDSF (LBNL); Wayne State University; São Paulo
  - PHENIX: RIKEN, Japan; Center for High Performance Computing, University of New Mexico; VAMPIRE cluster, Vanderbilt University
- Grid computing is a promising new direction in remote (distributed) computing; STAR and, to a lesser extent, PHENIX are now active in it
Slide 9: Key sub-systems
- Mass Storage System: hierarchical storage management by HPSS; 4 StorageTek robotic tape silos (~4.5 PB); 40 StorageTek 9940b tape drives (~1.2 GB/s); change of technology to LTO drives this year
- CPU: Intel/Linux dual-processor racked systems; ~2300 CPUs for ~1800 kSPECint2000; a mix of Condor- and LSF-based LRMS
- Central disk: 170 TB of RAID 5 storage (other storage solutions: Panasas, ...); 32 Sun/Solaris SMP NFS servers (~1.3 GB/s)
- Distributed disk: ~400 TB, 2.3x more than the centralized storage!
Slide 10: What does it look like? Not like these... although...
Slide 11: MSS, CPUs, central store... but like these or similar (the chairs do not seem any more comfortable)
Slide 12: Data recording rates. Run 4 set a first record: [chart of STAR and PHENIX recording rates, reaching 120 MB/s]
Slide 13: DAQ rates compared
- See the very good CHEP04 talk by Martin Purschke, "Concepts and technologies used in contemporary DAQ systems"
- [Chart: approximate DAQ rates across experiments, all in MB/s: ~25, ~40, ~100, 150, ~300, ~1250]
- Heavy-ion experiments are in the >100 MB/s range
- STAR is moving to 10x these capabilities in the outer years (2008+)
Slide 14: Mid- to long-term computing needs
- Computing projection model: the goal is to estimate CPU, disk, mass storage, and network capacities
  - Model based on raw-data scaling; Moore's law used for cost recession
- Feedback from the experimental groups: annual meetings, model refined if necessary (it has been stable for a while)
- Estimates based on beam-use plans; these may be offset by experiment or by year, but the integral is consistent
- Inputs include a maturity factor for the codes, the number of reconstruction passes, and a "richness" factor for the data (density of interesting events)
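The projection approach sketched on this slide (capacity scaling with raw data volume, modulated by reconstruction passes, code maturity and data "richness", with Moore's law discounting the cost per unit) can be illustrated as follows. All function names and numbers here are hypothetical illustrations, not the actual RCF model:

```python
def project_needs(raw_tb_per_year, passes, maturity, richness):
    """Capacity need scales with raw data volume, the number of
    reconstruction passes, a code-maturity factor, and the 'richness'
    (density of interesting events)."""
    return raw_tb_per_year * passes * maturity * richness

def projected_cost(capacity_units, year, base_year=2006,
                   unit_cost=1.0, halving_years=1.5):
    """Moore's-law cost recession: assume unit cost halves every
    ~1.5 years (the halving period is an illustrative guess)."""
    return capacity_units * unit_cost * 0.5 ** ((year - base_year) / halving_years)

# Example: if need grows ~10x by 2008 (DAQ1000 era), the falling unit
# cost only partially offsets it -- hence the "tough years" on slide 16.
need_2006 = project_needs(500, passes=1.5, maturity=1.2, richness=0.8)
need_2008 = project_needs(5000, passes=1.5, maturity=1.0, richness=0.8)
```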
Slide 15: Projected needs. [Charts: projected computing needs for STAR and PHENIX]
Slide 16: Discussion of the model
- The data-volume estimate is accurate to within 20%; the model was adjusted to the 20% lower end, and the upper end has a larger impact in the outer years
- DAQ1000 for STAR, enabling billions-of-events capabilities, is a major (cost) factor driven by physics demand
- Cost will exceed current provision: tough years start as soon as 2008, then it gets better in the outer years as Moore's law catches up (though uncertainties grow with time)
- Cost versus Moore's law implies "aggressive" technology upgrades (HPSS, for example)
- The strategy is heavily based on low-cost distributed disk (cheap, CE-attached)
Slide 17: Outline (section divider: Disk storage, a "distributed" paradigm; full outline as on slide 2)
Slide 18: Disk storage, a distributed paradigm
- The ratio is striking: 2.3x now, moving to 6x in the outer years
- This requires an SE strategy, given:
  - The CPU shortfall
  - Tier1 use (PHENIX, STAR)
  - Tier2 user analysis and data on demand (STAR)
Slide 19: PHENIX, dCache model
- Tier1 / CC-J / RIKEN; MSS = HPSS; central stores
- A Tier0-Tier1 model
- Provides scalability for centralized storage and a smooth(er) distributed-disk model
Slide 20: PHENIX, data transfer to RIKEN. Network transfer rates of 700-750 Mbit/s could be achieved (i.e. ~90 MB/s)
Slide 21: STAR: SRM, GridCollector, Xrootd
- A different approach: a large (early) pool of distributed disks, early adoption of the dd model
- The dd model was too home-grown and did not scale well when mixing dd and central disks
- Tier0-TierX (X = 1 or 2) model: need something easy to deploy and easy to maintain
- Leveraging SRM experience; data on demand; embryonic event-level access (GridCollector)
- Xrootd could benefit from an SRM back-end
- [Diagram: xrootd/olbd hierarchy with a Manager (head node), Supervisors (intermediate nodes), and Data Servers (leaf nodes)]
Slide 22: STAR dd evolution: from this...
- [Diagram: the DataCarousel restore chain: a client script adds records; pftp to local disk on the distributed-disk nodes; FileCatalog management updates file locations, marks files {un-}available, spiders and updates; control nodes. Where does this data go??]
- VERY HOMEMADE, VERY "STATIC"
Slide 23: STAR dd evolution: ... to that
- The entire cataloguing layer is gone; the layer restoring from MSS to dd is gone
- Data on demand: pftp to local disk
- XROOTD provides load balancing, possibly scalability, and a way to avoid LFN/PFN translation... but does NOT fit within our invested SRM directions... AND IS IT REALLY SUFFICIENT!!??
Slide 24: Coordination of requests needed
- Un-coordinated requests to the MSS are a disaster; this applies to ANY SE-related tool
- It gets worse if the environment combines technologies (shared infrastructure)
- The effect on performance is drastic
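The coordination argument can be illustrated with a toy scheduler that groups pending restore requests by tape, so each cartridge is mounted once and read sequentially instead of thrashing between mounts. This is the general idea behind DataCarousel-style coordination; the data model and function names here are invented for illustration:

```python
from collections import defaultdict

def coordinate(requests):
    """Group pending MSS restore requests by tape so each tape is
    mounted once.  `requests` is a list of (tape_id, file) pairs; in a
    real system the tape lookup would come from the MSS metadata, and
    'on-tape order' would be the physical position, not sorted names."""
    by_tape = defaultdict(list)
    for tape, f in requests:
        by_tape[tape].append(f)
    # Serve one tape at a time, files in (stand-in) on-tape order.
    schedule = []
    for tape in sorted(by_tape):
        schedule.extend((tape, f) for f in sorted(by_tape[tape]))
    return schedule

# Four requests arriving interleaved across two tapes:
reqs = [("T2", "b.root"), ("T1", "a.root"), ("T2", "a.root"), ("T1", "c.root")]
```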
Slide 25: Outline (section divider: STAR Grid program & tools; full outline as on slide 2)
Slide 26: STAR Grid program: motivation
- Tier0 production: ALL event files are copied to HPSS at the end of a production job
- Data reduction: DAQ to Event to micro-DST; all MuDSTs are on "disks": one copy temporarily on centralized storage (NFS), one permanently in HPSS
- A script checks consistency (job status, presence of files in one and the other); if the "sanity" checks pass (integrity / checksum), the files are registered in the Catalog
- Re-distribution: once registered, MuDSTs may be "distributed" to distributed disk on Tier0 sites, to Tier1 (LBNL), and to Tier2 sites ("private" resources for now); SRM in use since 2003...
- The strategy implies IMMEDIATE dataset replication: it allows balancing of analysis between Tier0 and Tier1, and data on demand enables Tier2 sites with capabilities
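The consistency-check and registration step described above might look roughly like this sketch. The checksum choice, catalog structure, and function names are all hypothetical, not STAR's actual production scripts:

```python
import hashlib
import os

def checksum(path):
    """Streaming file checksum (MD5 here purely for illustration;
    the real workflow may use a different integrity check)."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def register_if_sane(path, archived_checksum, catalog):
    """Register a MuDST file in the (toy, dict-based) catalog only if
    it exists on disk and matches the checksum recorded when the copy
    was archived to HPSS."""
    if not os.path.exists(path):
        return False
    if checksum(path) != archived_checksum:
        return False
    catalog[os.path.basename(path)] = {"path": path, "available": True}
    return True
```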
Slide 27: Needed for immediate exploitation of resources
- Short/medium-term strategy: distribute data and take advantage of static data (schedulers, workflow, ...). Tools: SRM / DataMover; the STAR Unified Meta-Scheduler
- Advanced strategies: data on demand (planner, dataset balancing, data placement, ...); selection of sub-sets of data (datasets of datasets, ...); a consistent strategy (interoperability? publishing?). Tools: Xrootd, ..., GridCollector; SRM back-ends would enable Xrootd with objects on demand
- Less naive considerations: job tracking, packaging, automatic error recovery, help desk, networking, advanced workflow, ... Will leverage existing or forthcoming middleware, or address these one by one
Slide 28: SRM / DataMover
- SRMs are middleware components whose function is to provide dynamic space allocation and file management of shared storage components on the Grid
- [Diagram: user applications and Grid middleware talking to SRM interfaces in front of diverse back-ends: Enstore, JASMine, dCache, Castor, Unix-based disks, and the CCLRC RAL SE]
- http://osg-docdb.opensciencegrid.org/0002/000299/001/GSM-WG-GGF15-SRM.ppt
Slide 29: SRM / DataMover
- A layer on top of SRM, in use for BNL-LBNL data transfers for years; all MuDSTs are moved to Tier1 this way
- Extremely reliable: "set it, and forget it!"; several 10k files transferred, multiple TB over days, no losses
- The project was (IS) extremely useful, in production usage in STAR
- Data available at the remote site as it is produced; we need this NOW: faster analysis means better science, sooner, plus data safety
- Caveat/addition in STAR: RRS (Replica Registration Service); 250k files, 25 TB transferred AND catalogued, 100% reliability; project deliverables on time
Slide 30: SRM / DataMover: flow diagram (NEW)
- Being deployed at Wayne State University and São Paulo
- DRM used in a data-analysis scenario as a lightweight SE service (deployable on the fly)
- All the benefits of SRM (advance reservation, ...): if we know there IS storage space, we can take it; no heavy-duty SE deployment
Slide 31: CE/SE decoupling
- srm-copy from the execution-site DRM back to the submission site: the submission-site DRM is called from the execution-site WN
- Requires an outgoing, but not incoming, connection on the WN
- srm-copy callback disabled (asynchronous transfer); the batch slot is released immediately after the srm-copy call
- The final destination of the files is HPSS or disk, owned by the user
- [Diagram: job execution site (/scratch) pushing to the submission-site DRM cache via the DRM client]
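The fire-and-forget pattern on the worker node could be sketched like this; the copy command shape is a placeholder for the real SRM client invocation, and the function name is invented:

```python
import subprocess

def ship_output_async(local_path, dest_url, copy_cmd=("srm-copy",)):
    """Start the transfer from the WN's local scratch back to the
    submission-site DRM and return immediately (callback disabled,
    transfer proceeds asynchronously), so the batch slot frees right
    away.  Only an outgoing connection from the WN is needed.
    `copy_cmd` is a hypothetical client command; substitute the real
    one for your SRM deployment."""
    proc = subprocess.Popen(list(copy_cmd) + [local_path, dest_url])
    return proc  # the job wrapper exits without waiting on proc
```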
Slide 32: SUMS, the STAR Unified Meta-Scheduler
- Gateway to user batch-mode analysis: the user writes an abstract job description; the scheduler submits where the files are, where the CPU is, ...; it collects usage statistics
- Users DO NOT need to know about the RMS layer
- Dispatcher and policy engines: dataset-driven, full catalog implementation, Grid-aware
- Throttles IO resources, avoids contention, optimizes on CPU; avoids specifying data locations
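The abstract job description SUMS consumes is an XML document, roughly of the following shape. The element and attribute names below are approximate recollections for illustration only, not exact U-JDL syntax, and the catalog query values are invented:

```xml
<!-- Illustrative only: approximate U-JDL shape, not exact SUMS syntax -->
<job>
  <!-- SUMS fills $FILELIST with the slice of the dataset assigned
       to each job it dispatches -->
  <command>root4star -b -q doEvents.C\(\"$FILELIST\"\)</command>
  <stdout URL="file:./sched$JOBID.log"/>
  <!-- The dataset is selected by a catalog query, not by physical
       location: the scheduler decides where the jobs actually run -->
  <input URL="catalog:star.bnl.gov?production=P04ie,filetype=MuDst" nFiles="all"/>
</job>
```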
Slide 33: SUMS (continued)
- [Charts: farm IO/CPU usage BEFORE (very choppy, as NFS would impact computational performance) and AFTER (smoother, modulo remaining farm instability)]
Slide 34: SUMS, the next generation...
- NEW FEATURES: RDL in addition to the U-JDL
- Testing of grid submission is OVER: SUMS is production- and user-analysis-ready; light SRM helping tremendously; scalability tests still needed
- Made aware of multiple packaging methods (from ZIP archives to PACMAN); tested for simple analysis; mixed archiving technology still to be finalized (a detail)
- Versatile configuration: a site can "plug and play"; possibility of multi-VO support within ONE install
- An issue since we run multi-10k jobs/day NOW, with spikes at 100k (valid) jobs from nervous users...
Slide 35: GridCollector: "Using an Event Catalog to Speed up User Analysis in a Distributed Environment"
- A STAR event catalog based on TAGS produced at reconstruction time
- Rests on the now well-tested and robust SRM (DRM+HRM) deployed in STAR anyhow: immediate access and a managed SE; files are moved transparently, by delegation to the SRM service, BEHIND THE SCENES
- Easier to maintain, and the prospects are enormous: "smart" IO-related improvements and home-made formats are no faster than using GridCollector (a priori); physicists can get back to physics, and STAR technical personnel are better off supporting GC
- It is a WORKING prototype of a Grid interactive analysis framework, and VERY POWERFUL: event-"server" based (no longer files); GAIN ALWAYS > 1, regardless of selectivity
- Example invocation:
  root4star -b -q doEvents.C'(25,"select MuDst where Production=P04ie and trgSetupName=production62GeV and magScale=ReversedFullField and chargedMultiplicity>3300 and NV0>200", "gc,dbon")'
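GridCollector's event-level selection rests on bitmap indexes built over the event tags. The principle can be shown with a toy sketch in which each cut becomes a bitmap (one bit per event) and the qualifying events are the AND of the bitmaps; the tag names and values below are invented, and real bitmap indexes (e.g. FastBit-style) use compressed bit vectors rather than Python lists:

```python
def build_bitmap(values, predicate):
    """One bit per event: True where the tag value passes the cut."""
    return [predicate(v) for v in values]

def select_events(*bitmaps):
    """AND the per-cut bitmaps; returns the event IDs where every
    cut holds, without touching the event data itself."""
    return [i for i, bits in enumerate(zip(*bitmaps)) if all(bits)]

# Toy per-event tag table (hypothetical values), mirroring the cuts
# in the example query: chargedMultiplicity > 3300 and NV0 > 200.
mult = [3400, 1200, 3500, 3310]
nv0  = [250,  300,  150,  220]

hits = select_events(build_bitmap(mult, lambda m: m > 3300),
                     build_bitmap(nv0,  lambda n: n > 200))
```

Only the events in `hits` then need their files staged, which is why the gain holds regardless of selectivity.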
Slide 36: GridCollector, the next step
- Functionality can be pushed "down": bitmap-index technology in the ROOT framework; make "a" coordinator "aware" of events (i.e. objects)
- Xrootd is a good candidate; the ROOT framework is preferred; both would serve as demonstrators (immediate benefit to a few experiments...)
- Object-on-demand, from files to object management: a Science Application Partnership (SAP) under SciDAC-II
- In the OSG program of work as a leveraging technology to achieve its goals
Slide 37: Outline (section divider: The Open Science Grid and STAR on OSG; full outline as on slide 2)
Slide 38: The Open Science Grid
- In the US, the Grid is moving to the Open Science Grid, an interesting adventure comparable to similar European efforts, with EGEE interoperability at its heart
- Character of the OSG:
  - Distributed ownership of resources; local facility policies, priorities, and capabilities need to be supported
  - A mix of agreed-upon performance expectations and opportunistic resource use
  - Infrastructure deployment based on the Virtual Data Toolkit
  - Will incrementally scale the infrastructure, with milestones, to support stable running of a mix of increasingly complex jobs and data management
  - Peer collaboration of computer and application scientists, facility, technology, and resource providers: an "end to end" approach
  - Support for many VOs, from the large (thousands) to the very small and dynamic (down to the single researcher and the high-school class)
  - A loosely coupled, consistent infrastructure: a "Grid of Grids"
Slide 39: STAR and the OSG
- STAR could not run on Grid3; it ran at PDSF, a Grid3 site set up in collaboration with our resources
- STAR on OSG is a big improvement: the OSG is for Open Science, not as strongly an LHC-only focus; expanding to other sciences means revisiting needs and requirements; more resources; greater stability
- Currently: MC runs on a regular basis (nightly tests, standard MC); recent focus on user analysis (lightweight SRM); helped other sites deploy the OSG stack
- And it shows: the FIRST functional site in Brazil, Universidade de São Paulo, a STAR institution...
  http://www.interactions.org/sgtw/2005/0727/star_saopaulo_more.html
Slide 40: Summary
- The RHIC computing facility provides adequate resources in the short term, but the model is imperfect for long-term projections
  - Problematic years start in 2008, driven by high data throughput and physics demands: a mid-term issue
  - This will impact Tier1s as well, assuming a refresh and planning along the same model; out-sourcing?
- Under data "stress" and increasing complexity, the RHIC experiments have integrated, at one level or another, distributed-computing principles: data distribution and management; job scheduling, selectivity, ...
- STAR intends to take full advantage of the OSG and help bring more institutions into it, and to address the issue of batch-oriented user analysis (opportunistic, ...)