The High Energy Physics Community Grid Project Inside D-Grid ACAT 07 Torsten Harenberg - University of Wuppertal


1 The High Energy Physics Community Grid Project Inside D-Grid ACAT 07 Torsten Harenberg - University of Wuppertal

2 D-Grid organisational structure

3 Technical infrastructure (layer diagram): users and communities reach D-Grid resources through a GridSphere-based portal and the GAT user API. Grid services comprise scheduling and workflow management, accounting and billing, data management, security and VO management, and monitoring, built on the UNICORE, Globus Toolkit V4 and LCG/gLite middleware stacks. Underneath sit the core services, distributed data services, distributed computing resources and the network.

4 HEP Grid efforts since 2001 (timeline): middleware projects EDG, then EGEE, EGEE 2 and a possible EGEE 3; LCG R&D leading into the WLCG ramp-up towards the Mar-Sep pp run and the Oct. heavy-ion run; GridKa / GGUS; and the D-Grid Initiative with DGI, DGI 2 and beyond (???), which contains HEP CG.

5 LHC groups in Germany. ALICE: Darmstadt, Frankfurt, Heidelberg, Münster. ATLAS: Berlin, Bonn, Dortmund, Dresden, Freiburg, Gießen, Heidelberg, Mainz, Mannheim, München, Siegen, Wuppertal. CMS: Aachen, Hamburg, Karlsruhe. LHCb: Heidelberg, Dortmund.

6 German HEP institutes participating in WLCG: Karlsruhe (GridKa & Uni), DESY, GSI, München, Aachen, Wuppertal, Münster, Dortmund, Freiburg.

7 HEP CG participants: Uni Dortmund, TU Dresden, LMU München, Uni Siegen, Uni Wuppertal, DESY (Hamburg & Zeuthen), GSI. Associated partners: Uni Mainz, HU Berlin, MPI f. Physik München, LRZ München, Uni Karlsruhe, MPI Heidelberg, RZ Garching, John von Neumann Institut für Computing, FZ Karlsruhe, Uni Freiburg, Konrad-Zuse-Zentrum Berlin.

8 HEP Community Grid. WP 1: data management (dCache). WP 2: job monitoring and user support. WP 3: distributed data analysis (Ganga). ==> A joint venture between physics and computer science.

9 WP 1: Data management. Coordination: Patrick Fuhrmann. An extensible metadata catalogue for semantic data access: central service for gauge theory (DESY, Humboldt Uni, NIC, ZIB). A scalable storage element: using dCache on multi-scale installations (DESY, Uni Dortmund E5, FZK, Uni Freiburg). Optimized job scheduling in data-intensive applications: data and CPU co-scheduling (Uni Dortmund CEI & E5).
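The metadata catalogue maps physics metadata to file locations in the data grid, so an analysis can ask for data by its physical parameters instead of by file name. A minimal sketch of the idea in Python; the field names, values and logical file names (LFNs) are invented for illustration and are not the project's actual schema:

```python
# Toy metadata catalogue: semantic queries over document metadata.
# All field names and LFNs below are invented examples.

catalogue = [
    {"lfn": "lfn:/gauge/ens_A/conf_0001", "beta": 5.29, "volume": "24x48"},
    {"lfn": "lfn:/gauge/ens_A/conf_0002", "beta": 5.29, "volume": "24x48"},
    {"lfn": "lfn:/gauge/ens_B/conf_0001", "beta": 5.40, "volume": "32x64"},
]

def query(**criteria):
    """Return the LFNs of all documents matching every given metadata field."""
    return [d["lfn"] for d in catalogue
            if all(d.get(k) == v for k, v in criteria.items())]

print(query(beta=5.29))       # both ensemble-A configurations
print(query(volume="32x64"))  # the ensemble-B configuration
```

The returned LFNs would then be resolved to physical replicas by the LCG data grid tools mentioned on the next slide.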

10 WP 1 highlights. Metadata catalogue for gauge theory established: a production service with > documents, tools usable in conjunction with the LCG data grid, well established in the international collaboration. Advancements in data management with new functionality: dCache could become the quasi-standard in WLCG; good documentation and an automatic installation procedure make it usable from small Tier-3 installations up to Tier-1 sites; high throughput for large data streams, optimization of quality and load of disk storage systems, and high-performance access to tape systems.

11 dCache-based scalable storage element. The dCache project is well established. New since HEP CG: professional product management, i.e. code versioning, packaging, user support and test suites. It scales from a single host with ~10 terabytes and zero maintenance up to thousands of pools, >> 1 PB of disk storage and >> 100 file transfers/sec, operated by < 2 FTEs. (dCache.ORG)

12 dCache principle (diagram): clients reach managed disk storage through protocol engines for streaming data ((gsi)FTP, http(g)) and POSIX-like I/O (dCap, xRoot), with SRM for storage control and an information protocol alongside; a dCache controller coordinates the disk pools, and an HSM adapter connects them to backend tape storage. (dCache.ORG)
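The diagram's structure, protocol "doors" in front of a controller that assigns managed disk pools, can be modelled in a few lines of Python. This is a toy illustration of the dispatch idea only, not dCache code; the pool names, load metric and paths are invented:

```python
# Toy model of dCache-style door/pool dispatch (all names invented).
pools = {"pool1": 0, "pool2": 0}             # pool -> open transfers (load)
doors = {"dcap", "gsiftp", "xroot", "http"}  # supported protocol engines

def open_transfer(protocol, path):
    """Admit a transfer through a protocol door and pick the least-loaded pool."""
    if protocol not in doors:
        raise ValueError(f"no door for protocol {protocol!r}")
    pool = min(pools, key=pools.get)  # simplest possible load balancing
    pools[pool] += 1
    return {"path": path, "door": protocol, "pool": pool}

t1 = open_transfer("dcap", "/pnfs/example/file1")
t2 = open_transfer("xroot", "/pnfs/example/file2")
print(t1["pool"], t2["pool"])  # the two transfers land on different pools
```

The real system's pool selection also weighs disk space, pool groups and tape staging, which this sketch deliberately omits.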

13 dCache: connection to the Grid world (diagram): inside the site, the storage element sits behind the firewall next to the compute element (local access via dCap/rfio/root) and the information system; from outside the site, transfers arrive via gsiFtp over FTS (File Transfer Service) channels, negotiated through SRM, the Storage Resource Manager protocol.

14 dCache: achieved goals. Development of the xRoot protocol for distributed analysis. Small sites: automatic installation and configuration ("dCache in 10 minutes"). Large sites (> 1 petabyte): partitioning of large systems, transfer optimization from/to tape systems, and automatic file replication (freely configurable).
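"Freely configurable" replication boils down to a policy that compares the current replica locations of each file with a target replica count and plans extra copies. A minimal sketch of such a policy in Python; the pool names, file names and the fixed target count are invented stand-ins for dCache's actual replica manager configuration:

```python
# Toy replication policy: keep a configurable number of copies per file.
# Pool and file names are invented examples.

def plan_replication(replicas, want=2, pools=("p1", "p2", "p3")):
    """Given current replica locations per file, plan the extra copies
    needed to reach the desired replica count."""
    plan = {}
    for lfn, locations in replicas.items():
        missing = want - len(locations)
        if missing > 0:
            # copy to pools that do not hold this file yet
            targets = [p for p in pools if p not in locations][:missing]
            plan[lfn] = targets
    return plan

state = {"fileA": ["p1"], "fileB": ["p1", "p2"]}
print(plan_replication(state))  # only fileA needs another copy
```

In dCache itself the policy is driven by configuration rather than code, but the decision logic it expresses is of this shape.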

15 dCache: outlook. Current usage: 7 Tier-1 centres with up to 900 TB of disk per centre plus a tape system (Karlsruhe, Lyon, RAL, Amsterdam, FermiLab, Brookhaven, NorduGrid), and ~30 Tier-2 centres, including all of US CMS, planned for US ATLAS. Planned usage: dCache is going to be included in the Virtual Data Toolkit (VDT) of the Open Science Grid as the proposed storage element in the USA; the planned US Tier-1 will break the 2 PB boundary by the end of the year.

16 HEP Community Grid. WP 1: data management (dCache). WP 2: job monitoring and user support. WP 3: distributed data analysis (Ganga). ==> A joint venture between physics and computer science.

17 WP 2: job monitoring and user support. Coordination: Peter Mättig (Wuppertal). Job monitoring and resource-usage visualizer: TU Dresden. Expert system classifying job failures: Uni Wuppertal, FZK, FH Köln, FH Niederrhein. Online job steering: Uni Siegen.

18 Job monitoring and resource-usage visualizer

19 19/27 Integration into GridSphere

20 Job Execution Monitor in LCG. Motivation: thousands of jobs run in LCG each day; a job's status is unknown while it is running; manual error detection is slow and difficult; GridICE and similar tools monitor services and hardware, not the job itself. Conclusion: monitor the job while it is running (JEM); automatic error detection needs an expert system.

21 JEM: Job Execution Monitor. On the gLite/LCG worker node: pre-execution test and script monitoring for Bash and Python. Information exchange via R-GMA; visualization e.g. in GridSphere; expert system for failure classification. Integration into ATLAS, integration into GGUS; after D-Grid I: ...?
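JEM's core idea, run a pre-execution test, then watch the job script step by step and classify any failure, can be sketched compactly. This is an illustrative stand-in written in Python, not JEM code: the commands, the step wrapper and the two classification rules standing in for the expert system are all invented:

```python
import subprocess

# Invented classification rules standing in for JEM's expert system.
RULES = [
    ("No such file", "missing input file"),
    ("command not found", "broken environment"),
]

def classify(stderr):
    """Map an error message to a diagnosis using simple pattern rules."""
    for pattern, diagnosis in RULES:
        if pattern in stderr:
            return diagnosis
    return "unknown failure"

def run_step(cmd):
    """Run one job-script step, record its exit status, classify on failure."""
    r = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    status = "ok" if r.returncode == 0 else classify(r.stderr)
    return {"cmd": cmd, "rc": r.returncode, "status": status}

log = [run_step("echo hello"), run_step("cat /definitely/missing/file")]
for entry in log:
    print(entry["cmd"], "->", entry["status"])
```

In JEM the per-step records would be published via R-GMA for visualization instead of being printed locally.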

22 JEM status. The monitoring part is ready for use; integration into Ganga (the ATLAS/LHCb distributed analysis tool) is ongoing; connection to GGUS is planned.

23 HEP Community Grid. WP 1: data management (dCache). WP 2: job monitoring and user support. WP 3: distributed data analysis (Ganga). ==> A joint venture between physics and computer science.

24 WP 3: distributed data analysis. Coordination: Peter Malzacher (GSI Darmstadt). Ganga: distributed analysis for ATLAS and LHCb. Ganga is an easy-to-use frontend for job definition and management, with a Python, IPython or GUI interface. Analysis jobs are automatically split into subjobs which are sent to multiple sites in the Grid; data management handles in- and output, and the distributed output is collected. It allows simple switching between testing on a local batch system and large-scale data processing on distributed resources (Grid). Developed in the context of ATLAS and LHCb; implemented in Python.

25 Ganga schema (diagram): the user's query and analysis code (myAna.C) go to the job manager, which queries the catalogue, splits the data file list into subjobs, submits them to Grid queues, collects the outputs from storage, and merges them for the final analysis.
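The split/run/merge flow in the schema can be shown in miniature. This is a schematic Python sketch of the workflow, not the Ganga API; the file names, the round-robin splitter and the pretend per-file event count are invented:

```python
# Schematic split/merge as in the Ganga workflow (not the Ganga API).

def split(files, n):
    """Split the input file list into up to n subjob file lists (round robin)."""
    subjobs = [files[i::n] for i in range(n)]
    return [s for s in subjobs if s]

def run_subjob(files):
    """Stand-in for running myAna.C over one subjob's files at a Grid site."""
    return {"events": 100 * len(files)}  # pretend each file holds 100 events

def merge(outputs):
    """Collect the distributed outputs into the final result."""
    return {"events": sum(o["events"] for o in outputs)}

files = [f"data_{i}.root" for i in range(10)]
result = merge(run_subjob(s) for s in split(files, 3))
print(result)  # {'events': 1000}
```

In Ganga the same three roles are played by its splitter, backend and merger components, configured on a job object rather than written by hand.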

26 PROOF schema (diagram): the query (data file list plus myAna.C) goes to a PROOF master, which pulls files from catalogue and storage and schedules them onto workers; the partial outputs are merged into the final result, with feedback returned to the user while the query runs.
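The key difference from the push-style split above is that a PROOF master hands out files on demand: each worker pulls the next file when it becomes free, so faster workers automatically process more files. A toy Python model of that pull scheduling; worker speeds and file names are invented, and real PROOF is a C++/ROOT system:

```python
import heapq

def proof_run(files, worker_speeds):
    """Toy PROOF-style master: assign each file to the next worker to become
    idle (pull scheduling). Returns how many files each worker processed."""
    # priority queue of (time_when_free, worker_id)
    free = [(0.0, w) for w in range(len(worker_speeds))]
    heapq.heapify(free)
    counts = [0] * len(worker_speeds)
    for _ in files:
        t, w = heapq.heappop(free)  # the worker that frees up first
        counts[w] += 1
        # this worker is busy for 1/speed time units per file
        heapq.heappush(free, (t + 1.0 / worker_speeds[w], w))
    return counts

# one fast worker (speed 4) and one slow worker (speed 1)
counts = proof_run([f"f{i}" for i in range(10)], [4, 1])
print(counts)  # the fast worker ends up with most of the files
```

This self-balancing behaviour is why PROOF suits interactive analysis on heterogeneous clusters, whereas a fixed up-front split can leave fast workers idle.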

27 HEPCG summary. Physics departments: DESY, Dortmund, Dresden, Freiburg, GSI, München, Siegen, Wuppertal. Computer science departments: Dortmund, Dresden, Siegen, Wuppertal, ZIB, FH Köln, FH Niederrhein. D-Grid is Germany's contribution to HEP computing: dCache, monitoring, distributed analysis. The effort will continue; 2008: start of LHC data taking, a challenge for the Grid concept ==> new tools and developments needed.

