Data Grid projects in HENP
R. Pordes, Fermilab
Presentation transcript:

1 Data Grid projects in HENP
R. Pordes, Fermilab

Many HENP projects are working on the infrastructure for global distributed simulated data production, data processing and analysis. Experiments now taking data include BaBar (0.3 PB/year), D0 (0.5 PB/year in 2002), and STAR.

Funded projects:
- GriPhyN (USA, NSF): $11.9M + $1.6M
- PPDG I (USA, DOE): $2M
- PPDG II (USA, DOE): $9.5M
- EU DataGrid (EU): $9.3M

Just funded or proposed projects:
- iVDGL (USA, NSF): $15M + $1.8M + UK
- DTF (USA, NSF): $45M + $4M/yr
- DataTag (EU, EC): $2M?
- GridPP (UK, PPARC): > $15M

Other national projects:
- UK e-Science (> $100M for 2001-2004)
- Italy, France, (Japan?)

2 Current Data Grid Projects at Fermilab

Fermilab, D0-SAM:
- D0 Run 2 Data Grid. The experiment is starting to take data now: 500 physicists, 80 institutions, 17 countries; 30 TB in the data store.
- Started 3 years ago by the Fermilab Run 2 Joint Offline project.
- Currently ~100 active users; 8 sites in the US and Europe.
- http://d0db.fnal.gov/sam

PPDG:
- Three-year, ten-institution, DOE/SciDAC-funded development and integration effort to deploy data grids for HENP.
- The D0 and CMS Fermilab data handling groups are collaborating and contributing to the project.
- "Vertical integration" of Grid middleware components into HENP experiments' ongoing work.
- A laboratory for "experimental computer science".
- http://www.ppdg.net

3 D0 SAM

- Transparent data replication and caching, with support for multiple transports and interfaces to mass storage systems (bbftp, scp; enstore, SARA).
- Well-developed metadata catalog for data, job and physics parameter tracking.
- Generalized interface to batch systems (LSF, FBS, PBS, Condor). "Economic concepts" to implement collaboration policies for resource management and scheduling.
- Interface for physics data set definition and selection.
- Interfaces to user programs in C++ and Python.
- Support for a robust production service, with restart facilities, self-managing agent servers, and monitoring and logging services throughout.
- Support for simulation production as well as event data processing and analysis.
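A minimal Python sketch of two of the ideas on this slide: a metadata catalog queried by physics parameters for dataset definition, and a station cache that hides the transport (bbftp, scp) and mass storage (enstore, SARA) behind a fetch call. All class and method names here are inventions for illustration, not the real SAM client API.

from dataclasses import dataclass, field

@dataclass
class FileRecord:
    name: str
    size: int                # bytes
    n_events: int
    data_tier: str           # e.g. "raw", "reconstructed"
    trigger: str             # physics parameter used for selection
    locations: list = field(default_factory=list)  # MSS/cache replicas

class MetadataCatalog:
    """Toy stand-in for SAM's metadata catalog."""
    def __init__(self, records):
        self._records = records

    def define_dataset(self, **criteria):
        """Dataset definition: a named query over file metadata."""
        return [f for f in self._records
                if all(getattr(f, k) == v for k, v in criteria.items())]

class Station:
    """Toy station: serves files from cache, staging from MSS on a miss."""
    def __init__(self):
        self.cache = {}

    def fetch(self, record):
        if record.name not in self.cache:
            # The real system would pick a transport (bbftp, scp) and
            # stage the file in from mass storage (enstore, SARA).
            self.cache[record.name] = "staged:" + record.locations[0]
        return self.cache[record.name]

catalog = MetadataCatalog([
    FileRecord("run1234.raw", 2_000_000_000, 50_000, "raw", "L3_muon",
               ["enstore:/pnfs/d0/run1234.raw"]),
])
station = Station()
for f in catalog.define_dataset(data_tier="raw", trigger="L3_muon"):
    print(station.fetch(f))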

4 Simplified Database Schema

[Entity-relationship diagram. Entities shown:]
- Files: ID, Name, Format, Size, # Events
- Events: ID, Event Number, Trigger L1, Trigger L2, Trigger L3, Off-line Filter, Thumbnail
- Run, Volume, Project, Data Tier, Physical Data Stream, Trigger Configuration
- Creation & Processing Info, Event-File Catalog
- Run Conditions: Luminosity, Calibration, Alignment
- Group and User Information, Station Config. & Cache Info, File Storage Locations, MC Process & Decay
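As an illustration, the two entities whose columns the slide spells out (Files and Events) can be expressed as SQL DDL; the sketch below uses Python's sqlite3 so it runs self-contained. Column types and the link table are assumptions; the real D0 schema has many more tables and constraints.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE files (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    format   TEXT,
    size     INTEGER,   -- bytes
    n_events INTEGER
);

CREATE TABLE events (
    id             INTEGER PRIMARY KEY,
    event_number   INTEGER NOT NULL,
    trigger_l1     TEXT,
    trigger_l2     TEXT,
    trigger_l3     TEXT,
    offline_filter TEXT,
    thumbnail      BLOB
);

-- The slide's Event-File Catalog box: a many-to-many link between
-- events and the files that contain them.
CREATE TABLE event_file_catalog (
    event_id INTEGER REFERENCES events(id),
    file_id  INTEGER REFERENCES files(id),
    PRIMARY KEY (event_id, file_id)
);
""")
print("schema created")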

5 D0 Monte Carlo Management System

[Diagram: remote station MC management. Components: MC management DB tables; event generator; review/authorize, establish priorities, assign work; web summaries; input requests; analysis project; admin tools; SAM; tape; station caches at the FNAL data serving station and at remote Ocean/Mountain/Prairie stations with MSS tape. The legend distinguishes existing SAM pieces, new parts, and remote pieces.]
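A toy Python sketch of the workflow the diagram implies: simulation requests are reviewed and authorized with priorities, then assigned to remote stations. All names here are illustrative, not the actual D0 MC management tables or tools.

import heapq

queue = []   # (priority, request) min-heap; lower number = more urgent

def authorize(request, priority):
    """Review/authorize step: accept a request and set its priority."""
    heapq.heappush(queue, (priority, request))

def assign_work(station):
    """Assign the highest-priority authorized request to a remote station."""
    if not queue:
        return None
    priority, request = heapq.heappop(queue)
    return "%s (priority %d) -> %s" % (request, priority, station)

authorize("generate 1M Z->mumu events", priority=1)
authorize("generate 500k QCD background events", priority=2)
print(assign_work("Lancaster"))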

6 Data Added to SAM

[Chart: cumulative data added to SAM: 160k files, 25 TB.]

7 SAM on the Global Scale

An interconnected network of primary cache stations, communicating over the WAN and replicating data (backed by MSS) to where it is needed.

Current active stations:
- Stations at FNAL
- Lyon, FR (IN2P3)
- Amsterdam, NL (NIKHEF)
- Lancaster, UK
- Imperial College, UK
- Others in the US
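A small sketch of the replication idea on this slide, assuming a simple registry of which stations hold which files. The routing choice is illustrative, not SAM's actual algorithm; station names are taken from the slide.

STATIONS = ["FNAL", "Lyon (IN2P3)", "Amsterdam (NIKHEF)",
            "Lancaster", "Imperial College"]

# file name -> set of stations already holding a replica
replicas = {"run1234.raw": {"FNAL"}}

def request_file(filename, station):
    """Serve a file at `station`, replicating it there first if needed."""
    holders = replicas.setdefault(filename, set())
    if station not in holders:
        if not holders:
            raise LookupError(filename + " not held by any station")
        source = next(iter(holders))   # real system: pick the best source
        print("replicating %s: %s -> %s over WAN" % (filename, source, station))
        holders.add(station)
    return filename + "@" + station

print(request_file("run1234.raw", "Lyon (IN2P3)"))
print(request_file("run1234.raw", "Lyon (IN2P3)"))  # second call: cache hit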

8 PPDG – profile and 3-year goals

Funding: US DOE approved; 1/1/3/3/3 $M over 1999-2003
Computer Science: Globus (Foster), Condor (Livny), SDM (Shoshani), Storage Resource Broker (Moore)
Physics: BaBar, D0, STAR, JLAB, ATLAS, CMS
National Laboratories: BNL, Fermilab, JLAB, SLAC, ANL, LBNL
Universities: Caltech, SDSS, UCSD, Wisconsin
Hardware/Networks: no funding

Goals: Experiments successfully embrace and deploy grid services throughout their data handling and analysis systems, based on shared experiences and developments. Computer science groups evolve common interfaces and services so as to serve not only a range of High Energy and Nuclear Physics experiment needs but also other scientific communities.

9 PPDG main areas of work

Extending Grid services:
- Storage resource management and interfacing.
- Robust file replication and information services.
- Intelligent job and resource management.
- System monitoring and information capture.

End-to-end applications:
- Experiments' data handling systems in use now and in the near future, to give real-world requirements, testing and feedback.
- Error reporting and response.
- Fault-tolerant integration of complex components.

Cross-project activities:
- Authentication and certificate authorisation and exchange.
- European DataGrid common project for data transfer (Grid Data Management Pilot).
- SC2001 demo with GriPhyN.
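To make "robust file replication" and "error reporting and response" concrete, here is a hedged Python sketch of a bounded-retry transfer with checksum verification. The fetch callable stands in for whatever transport the real system uses (e.g. GridFTP under the Grid Data Management Pilot); none of this is PPDG's actual code.

import hashlib
import time

def md5sum(data):
    return hashlib.md5(data).hexdigest()

def robust_copy(fetch, expected_md5, retries=3, backoff=2.0):
    """Call fetch() until its payload matches expected_md5, with backoff."""
    for attempt in range(1, retries + 1):
        try:
            data = fetch()
            if md5sum(data) == expected_md5:
                return data
            print("attempt %d: checksum mismatch" % attempt)
        except OSError as err:              # transport failure
            print("attempt %d: %s" % (attempt, err))
        time.sleep(backoff * attempt)       # simple escalating backoff
    raise RuntimeError("replication failed after %d attempts" % retries)

payload = b"event data"
print(len(robust_copy(lambda: payload, md5sum(payload))), "bytes replicated")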

10 Align PPDG milestones to experiment data challenges, first year:

- ATLAS – production distributed data service – 6/1/02
- BaBar – analysis across partitioned dataset storage – 5/1/02
- CMS – distributed simulation production – 1/1/02
- D0 – distributed analyses across multiple workgroup clusters – 4/1/02
- STAR – automated dataset replication – 12/1/01
- JLAB – policy-driven file migration – 2/1/02

A management/coordination challenge.

11 Example data grid hierarchy – CMS Tier 1 and Tier 2

Tier 2 centers are used for simulation production today.

[Diagram: Tier 0 (CERN) at the top; the FNAL Tier 1 and other Tier 1 centers below it; multiple Tier 2 centers attached to each Tier 1.]
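A toy model of the hierarchy in the diagram: Tier 2 sites produce simulation output and ship it up through their Tier 1 to the Tier 0 at CERN. The topology below (one Tier 1, five Tier 2s) is illustrative only.

from collections import defaultdict

parent = {"FNAL": "CERN"}            # Tier 1 -> Tier 0
for i in range(1, 6):                # a handful of Tier 2s under FNAL
    parent["T2-%d" % i] = "FNAL"

archive = defaultdict(list)

def ship_up(site, dataset):
    """Register a dataset at its production site and each level above it."""
    while site is not None:
        archive[site].append(dataset)
        site = parent.get(site)

ship_up("T2-3", "mc_run_0042")
print(dict(archive))   # dataset recorded at T2-3, FNAL, and CERN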

12 [Image-only slide.]

13 PPDG views cross-project coordination as important

Other Grid projects in our field:
- GriPhyN (Grid Physics Network)
- European DataGrid
- Storage Resource Management collaboratory
- HENP Data Grid Coordination Committee

Deployed systems:
- ATLAS, BaBar, CMS, D0, STAR, JLAB experiment data handling systems
- iVDGL (International Virtual Data Grid Laboratory)
- Use DTF computational facilities?

Standards committees:
- Internet2 High Energy and Nuclear Physics Working Group
- Global Grid Forum

