SAM middleware components. Stefan Stonjek, University of Oxford. 7th GridPP Meeting (GridPP7), 2 July 2003, Oxford.


1 SAM middleware components
Stefan Stonjek, University of Oxford
7th GridPP Meeting, 2 July 2003, Oxford

2 Outline
- Introduction to SAM
- Internals of a SAM station
- Design of a SAM station
- SAM-Grid Architecture
- D0 reconstruction effort
- Outlook and Summary

3 Introduction to SAM
- SAM: Sequential data Access via Meta-data
- SAM is a distributed data handling system
- One SAM station per processing node/cluster/site
  - D0: RAL, IC, Manchester, Lancaster
  - CDF: RAL, Oxford, Glasgow, Scotgrid, UCL, Liverpool

4 SAM – central vs. decentralised
- Each SAM station has a local file cache
- Files are transferred from station to station (no central storage, peer-to-peer)
- A central database keeps track of all files, metadata, users, etc. in the SAM system
- Not fully peer-to-peer yet: peer-to-peer transfers with a central database
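The split described on this slide (local caches, station-to-station transfers, one central catalogue) can be illustrated with a minimal sketch. All class and method names here (`CentralCatalogue`, `Station`, `store`, `get`) are hypothetical and are not SAM's actual interfaces; the station and file names are only example values.

```python
class CentralCatalogue:
    """Hypothetical stand-in for the central database: file name -> stations caching it."""

    def __init__(self):
        self.locations = {}

    def register(self, file_name, station_name):
        self.locations.setdefault(file_name, set()).add(station_name)

    def lookup(self, file_name):
        return sorted(self.locations.get(file_name, set()))


class Station:
    """Hypothetical station with a local cache and no central file storage."""

    def __init__(self, name, catalogue, peers):
        self.name = name
        self.catalogue = catalogue
        self.peers = peers   # shared dict: station name -> Station
        self.cache = {}      # local file cache: file name -> contents
        peers[name] = self

    def store(self, file_name, data):
        # Cache the file locally and record the new location centrally.
        self.cache[file_name] = data
        self.catalogue.register(file_name, self.name)

    def get(self, file_name):
        if file_name in self.cache:
            return self.cache[file_name]
        # Ask the catalogue where the file is, then copy it from a peer station.
        for peer_name in self.catalogue.lookup(file_name):
            if peer_name != self.name:
                data = self.peers[peer_name].cache[file_name]
                self.store(file_name, data)
                return data
        raise FileNotFoundError(file_name)


catalogue = CentralCatalogue()
peers = {}
ral = Station("ral", catalogue, peers)
oxford = Station("oxford", catalogue, peers)
ral.store("run1.raw", b"event data")
fetched = oxford.get("run1.raw")
```

In this sketch only location information passes through the catalogue; the file contents move directly between the two stations.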

5 The SAM Station
- Each station runs one station master process
- This communicates with the outside world
- Local SAM processes talk to the station master
- Station master talks with the central database

6 A SAM Analysis Project
- For every new analysis job a new project is created
- A project corresponds to a list of files
- A project master process keeps track of the status of each file in the project
- A project can have multiple consumers
- Each file goes to only one consumer
- This allows easy processing on farms
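The rule that each file in a project goes to exactly one consumer can be sketched as a shared work queue. This is an illustration only: `ProjectMaster`, `next_file` and `release` are hypothetical names, and the round-robin loop stands in for independent farm workers asking for files.

```python
from collections import deque


class ProjectMaster:
    """Hypothetical project master: tracks the status of every file in one project."""

    def __init__(self, files):
        self.pending = deque(files)
        self.status = {f: "pending" for f in files}

    def next_file(self, consumer):
        # Each file is popped once, so it can reach only one consumer.
        if not self.pending:
            return None
        file_name = self.pending.popleft()
        self.status[file_name] = f"delivered to {consumer}"
        return file_name

    def release(self, file_name):
        self.status[file_name] = "consumed"


project = ProjectMaster(["f1", "f2", "f3", "f4", "f5"])
handed_out = {"worker-a": [], "worker-b": []}
consumers = list(handed_out)
turn = 0
while True:
    consumer = consumers[turn % len(consumers)]
    file_name = project.next_file(consumer)
    if file_name is None:
        break
    handed_out[consumer].append(file_name)
    project.release(file_name)
    turn += 1
```

Because consumers pull files rather than being assigned a fixed share, adding more workers to a farm needs no change to the project definition.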

7 SAM File transfers
- The station initiates file transfers
- The station keeps track of the needs of all projects and transfers files accordingly
- The stager can use different transfer protocols
  - The choice depends on local and remote configuration
- The cache content of each station is kept in the central database
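One way to picture a station tracking the needs of all its projects is to aggregate their file lists and transfer only what is missing from the cache. The slide does not say how transfers are ordered, so the "most requested first" policy below is purely an assumption, and `plan_transfers` is a hypothetical helper.

```python
from collections import Counter


def plan_transfers(project_needs, cached):
    """Return the missing files, most-requested first (assumed policy, not SAM's)."""
    demand = Counter()
    for files in project_needs.values():
        demand.update(set(files))  # count each project at most once per file
    missing = [f for f in demand if f not in cached]
    # Sort by descending demand, then by name to keep the order deterministic.
    return sorted(missing, key=lambda f: (-demand[f], f))


project_needs = {
    "projA": ["f1", "f2", "f3"],
    "projB": ["f2", "f3", "f4"],
    "projC": ["f3"],
}
cached = {"f3"}  # already in the local cache, so no transfer is needed
transfer_order = plan_transfers(project_needs, cached)
```

Here `f3` is skipped because it is already cached, and `f2` comes first because two projects are waiting for it.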

8 SAM Station to database communication
- The station talks to a db-server (= CORBA to SQL translator)
- ORACLE database
- The db-server is the only client of the database
- This reduces the load on the database
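The db-server idea, a single process that turns station requests into SQL so that the database sees only one client, can be sketched as follows. This is a stand-in under stated assumptions: plain Python method calls replace CORBA, an in-memory SQLite database replaces ORACLE, and the table and method names are hypothetical.

```python
import sqlite3


class DbServer:
    """Hypothetical translator: typed requests in, SQL out, one database connection."""

    def __init__(self):
        # The only connection to the database in this sketch.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE cache_content (station TEXT, file_name TEXT)"
        )

    def add_cached_file(self, station, file_name):
        self.conn.execute(
            "INSERT INTO cache_content VALUES (?, ?)", (station, file_name)
        )

    def stations_with_file(self, file_name):
        rows = self.conn.execute(
            "SELECT station FROM cache_content WHERE file_name = ? ORDER BY station",
            (file_name,),
        ).fetchall()
        return [row[0] for row in rows]


db_server = DbServer()
db_server.add_cached_file("ral", "run1.raw")
db_server.add_cached_file("oxford", "run1.raw")
holders = db_server.stations_with_file("run1.raw")
```

Callers never see SQL or hold a database connection, which is the property the slide credits with reducing database load.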

9 Station to Station Transfer
- File transfer is done station to station
- Several possible transfer protocols
  - Negotiated between stations
- Each station has its own cache
- Location information from central database
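Protocol negotiation between two stations can be sketched as picking the first protocol in one station's preference order that the other also supports. The slide does not describe the actual negotiation rule or list the protocols, so both the rule and the protocol names below are illustrative assumptions.

```python
def negotiate_protocol(local_preferences, remote_supported):
    """Return the first locally preferred protocol the remote station supports."""
    for protocol in local_preferences:
        if protocol in remote_supported:
            return protocol
    return None  # no common protocol: the transfer cannot proceed


# Example values only; they are not taken from the slides.
chosen = negotiate_protocol(["gridftp", "bbftp", "rcp"], {"rcp", "bbftp"})
no_match = negotiate_protocol(["gridftp"], {"rcp"})
```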

10 Grid Job and Information Management (JIM)
- Counterpart for the data handling system (SAM)
- Based on existing tools (Globus, Condor etc.)
- Allows brokering based on information from the data-handling system

11 SAM-Grid Architecture

12 Job Handling
- Condor for submission and brokering
- Decision making is based on:
  - Resource information (general and job specific)
  - Job information
- Decision making is interfaced with data handling middleware
  - not just static resource information
  - allows brokering to include data handling considerations
- Decision making is entirely in the Condor framework
  - strong promotion of standards, interoperability
- GRAM protocol to transfer job to execution site
- Authentication via GSI (Grid Security Infrastructure)
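Brokering that includes data handling considerations can be pictured as ranking sites by how many of a job's input files they already cache. In SAM-Grid this decision is made inside the Condor framework; the plain Python below does not use any Condor API, and the ranking rule, the `rank_sites` helper and the site data are all assumptions for illustration.

```python
def rank_sites(job_files, sites):
    """Rank sites with free slots by cached input files, then by free slots.

    sites maps a site name to {"free_slots": int, "cached": set of file names}.
    """
    wanted = set(job_files)
    candidates = []
    for name, info in sites.items():
        if info["free_slots"] <= 0:
            continue  # static resource check: the site has no capacity
        cached_count = len(wanted & info["cached"])  # data handling input
        candidates.append((cached_count, info["free_slots"], name))
    candidates.sort(key=lambda c: (-c[0], -c[1], c[2]))
    return [name for _, _, name in candidates]


sites = {
    "site-a": {"free_slots": 10, "cached": {"f1"}},
    "site-b": {"free_slots": 2, "cached": {"f1", "f2", "f3"}},
    "site-c": {"free_slots": 0, "cached": {"f1", "f2", "f3", "f4"}},
}
ranking = rank_sites(["f1", "f2", "f3", "f4"], sites)
```

With static information alone, `site-a` would win on free slots; once cache content is taken into account, `site-b` is preferred because fewer files need to be transferred.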

13 Job Management

14 JIM Monitoring
- Information Management
- Resource description for brokering
- Infrastructure for monitoring
- Monitors sites, resources and jobs
- Distributed knowledge
- Web-based information retrieval

15 SAM-Grid Logistics

16 Outlook: D0 Reprocessing Challenge
- D0 will reprocess all Run II data
- 1 Sep 2003 – 25 Nov 2003 (86 days), conference deadline
- Lion's share at D0 remote computing facilities, including RAL, IC, Manchester, Lancaster, Karlsruhe, Wuppertal, Lyon, Michigan, NIKHEF etc.
- SAM to move data, runjob for site job management, JIM for submission and monitoring

17 Outlook: D0 Reprocessing Challenge (2)
- 150 million events / 22.5 TByte input data
- Second level to second level
- 25 TByte output data
- SAM routinely handles this data volume
  - Currently mainly on site at Fermilab
- First large-scale, large-volume real data challenge
- First HEP experiment to reprocess data in a distributed fashion
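The figures on the two Outlook slides imply average rates that are easy to check. The short calculation below assumes 1 TByte = 10^12 bytes and continuous running over the full 86 days, aggregated over all participating sites; neither assumption is stated on the slides.

```python
EVENTS = 150e6          # events to reprocess
INPUT_BYTES = 22.5e12   # 22.5 TByte input, assuming 1 TByte = 10**12 bytes
OUTPUT_BYTES = 25e12    # 25 TByte output, same assumption
DAYS = 86               # 1 Sep 2003 to 25 Nov 2003

seconds = DAYS * 24 * 3600

events_per_second = EVENTS / seconds
input_bytes_per_event = INPUT_BYTES / EVENTS
input_mb_per_second = INPUT_BYTES / seconds / 1e6
output_mb_per_second = OUTPUT_BYTES / seconds / 1e6

print(f"{events_per_second:.1f} events/s")
print(f"{input_bytes_per_event / 1e3:.0f} kB per input event")
print(f"{input_mb_per_second:.2f} MB/s in, {output_mb_per_second:.2f} MB/s out")
```

On these assumptions the challenge averages roughly 20 events per second, about 150 kB per input event, and about 3 MB/s each of input and output across the whole system.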

18 Summary
- SAM is a distributed data handling system
- It is used in production
- JIM allows jobs to be brokered based on job-specific information and dynamic resources
- GridPP plays a vital role in the development of SAM-Grid

